INTRODUCTION

Gender differences in employment, education, and income have been explained by a variety of theories, including gender differences in psychological characteristics [Cunha and Heckman 2007; Borghans et al. 2008; Segal 2012] and competitiveness [Gneezy and Rustichini 2004; Niederle and Vesterlund 2007; Croson and Gneezy 2009; Niederle and Vesterlund 2011]. In laboratory experiments, examinations of differences in competitiveness across genders typically find that females are less willing to compete or have less taste for competition [Gneezy and Rustichini 2004; Niederle and Vesterlund 2007; Croson and Gneezy 2009; Niederle and Vesterlund 2011]. Studies have also shown that males are more likely to respond positively to competition [Gneezy and Rustichini 2004]. Gender differences in competition have also been explained by differences in biology, differences in socio-cultural norms/expectations, and differences in opportunities and incentives.

Because of the competitive nature of sports, gender differences in competition have occasionally been examined in sporting contexts. Frick [2011a, 2011b] finds that female distance competitions have become more competitive over the last few decades. This result lends support to the hypothesis that females were less competitive in an earlier time because of socio-cultural reasons rather than differences in physical abilities or biology. Gneezy and Rustichini [2004] examine the performance of elementary school children in a 40 meter footrace. Children were first asked to run alone, without a competitor. Next, children were paired with a closely matched competitor and ran the race again. Boys ran the competitive race faster than the solitary race, and girls ran the competitive race slower than the solitary race. This suggests that males respond more positively to competition.

In addition to gender effects on competition, other work has investigated the role of peers in general on performance. If males and females respond differently to peers, then this may be evidence of differences in competitiveness. Lavy et al. [2012] examine whether good academic peers affect children’s test scores. Interestingly, results show that girls respond positively to good peers but boys experience a small negative effect of good peers. Other research has examined peer effects in sports. Hill [forthcoming] examined the relationship between tournament structure, peers, and performance using data on the major 5,000 meter International Association of Athletics Federations (IAAF) events from 2001 to 2011. Hill [forthcoming] finds that an individual runner’s ability is the most consistent predictor of performance and qualification, but runners’ times are positively affected by the abilities of runners within their heats as well as the abilities of runners competing in other heats. This is in contrast to Guryan et al. [2009], who find that a playing partner’s performance has very little impact on a player’s own performance in professional golf tournaments. Brown [2011] finds that peers do matter, as the results in her paper show that golfers consistently perform better (shoot lower scores) in tournaments with higher quality peers.Footnote 1

Research on tournament theory has also examined what impact tournament structure has on a competitor’s behavior, including effort and performance. As described in Szymanski [2003], the research on tournament theory has taken the foundations from Lazear and Rosen [1981] and applied the concepts to questions of sports economics. Much of the work on tournament design in sporting contests has been concerned with the relationship between the prize structure of the contest and participants’ efforts. While this is not the primary question of this research, it still provides important insights into the determinants of an athlete’s performance.Footnote 2

As described below, the construction of track and field tournaments provides situations in which some runners have more information than others about the performance needed to qualify for additional rounds of a tournament. The lack of information for certain runners may create an incentive for increased effort. Hill [forthcoming] examines how these differences in information influence performance in 5,000 meter tournaments, asking whether runners who compete in the first heat of any given round perform at a higher level given that they have less information than runners in subsequent heats. He finds that certain heats appear to be faster than others, but the information asymmetry does not induce a consistent performance effect across time.

Using data from 2003 to 2012 on the major IAAF track and field events, such as the World Championships in Athletics and the Olympic Games, this paper examines whether the behavior of males and females in the 1,500 meter race differs and if the difference in behavior is indicative of differences in competitiveness. Results indicate that there are some gender differences in the competition and the evidence is suggestive of strategic behavior being stronger among males. Where the gender differences exist, the evidence indicates that the differences arise because of differences in the relationship between ability and performance and between peer effects and performance. These results could potentially be explained by differences in strategic behavior between genders. The correlation between ability and performance is stronger among females. The significant differences in peer effects indicate that males run slower and are less likely to qualify for the next round of the tournament in the presence of slower runners, which may mean that males are more likely to run strategically, even if the results are suboptimal.

TOURNAMENT BACKGROUND

Before discussing the data and empirical methodology, some additional discussion regarding the tournament is required. In the 1,500 meter multiple round tournaments analyzed here, runners can advance to the next round in the tournament in two ways. As described in Hill [forthcoming], in a tournament with μ rounds (where μ=1, 2, or 3), one way to qualify for round μ+1 in a heat with N runners is by finishing in the top δ of the heat (where 1⩽δ<N), so that δ runners qualify from each heat. This type of qualification is referred to as a place qualification and is similar to a rank-order tournament where relative performance, not absolute performance, of the runners in each heat determines qualification. However, those who do not finish in the top δ of the heat can still qualify for the finals if their finishing time from their heat is fast enough. Specifically, the runners from heat h (where h=1–4) who did not qualify by place (N_h−δ of them) have their times compared with other non-place qualifiers from all heats. Runners whose times rank among the λ fastest qualify for the finals, where 1⩽λ⩽∑_h(N_h−δ). This type of qualification is referred to as a time qualification and is similar to the losers’ round of a double elimination tournament.
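The two qualification routes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `qualifiers`, the heat labels, the runner names, and the finishing times are all hypothetical, and ties are not given any special treatment.

```python
from typing import Dict, List, Tuple

def qualifiers(heats: Dict[str, List[Tuple[str, float]]],
               delta: int, lam: int) -> Tuple[List[str], List[str]]:
    """Return (place_qualifiers, time_qualifiers) for one round.

    heats maps a heat label to (runner, finishing_time_in_seconds) pairs.
    delta is the number of place qualifiers per heat; lam is the number
    of time qualifiers taken across all heats.
    """
    by_place: List[str] = []
    leftovers: List[Tuple[str, float]] = []
    for results in heats.values():
        ranked = sorted(results, key=lambda r: r[1])    # fastest first
        by_place.extend(name for name, _ in ranked[:delta])
        leftovers.extend(ranked[delta:])                # non-place qualifiers
    # Pool the non-place qualifiers from every heat and rank them by time.
    leftovers.sort(key=lambda r: r[1])
    by_time = [name for name, _ in leftovers[:lam]]
    return by_place, by_time

# Hypothetical two-heat round with delta = 2 and lambda = 2.
example = {
    "A": [("a1", 220.0), ("a2", 221.0), ("a3", 222.5), ("a4", 230.0)],
    "B": [("b1", 225.0), ("b2", 226.0), ("b3", 227.0), ("b4", 228.0)],
}
place, time_q = qualifiers(example, delta=2, lam=2)
```

In this made-up example the third-place runner of the faster heat A advances by time even though that runner did not place, which is the situation that gives runners in later heats an informational advantage.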

Tournaments following this structure include those organized by the IAAF. These include the Olympic Games and the World Championships in Athletics.Footnote 3 Since 1991, the IAAF has held the World Championships biennially and the Olympics have been held every 4 years since 1912. Track and field tournaments are typically set up with similar structure for both males and females. According to the 2012–2013 rules [IAAF 2011], the Games Committee determines the number and composition of heats according to several guidelines. Table 1 provides the guidelines regarding the structure of the tournament, including how many rounds, heats, and qualifiers by time and place. For example, in 2012 there were 45 female entrants and 44 male entrants. By the guidelines shown in Table 1, each gender competed in a tournament with three rounds. There were three heats within the preliminary round and in this round the top six finishers in each heat qualified by place and the six fastest non-place qualifiers from the round qualified by time. By guidelines, the middle round consisted of two heats for each gender, and the five fastest finishers in each heat qualified by place, while the two fastest non-place qualifiers from the round qualified by time.

Table 1 Round, heat, and qualification rules

When determining heat placement, organizers are required to use all information about a runner’s past performances “so that, normally, the best performers reach the final” [IAAF Competition Rules 2012–2013]. The runners are then organized according to the zigzag method. For example, a competition consisting of 45 athletes, seeded 1–45, would be organized with the following seedings during the preliminary round of the tournament:

[Figure a: zigzag seeding of 45 athletes across the preliminary round heats; illustration not reproduced]

After the heats are assigned, a random drawing determines the order in which the three heats (A, B, or C) will be run.
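A minimal sketch of a zigzag (serpentine) assignment is given here for concreteness. It assumes the usual pattern in which seeds are dealt across the heats left to right, then right to left, and so on; the function name `zigzag_heats` is hypothetical, and the list positions 0, 1, 2 stand in for heats A, B, and C before the random draw of running order.

```python
from typing import List

def zigzag_heats(n_athletes: int, n_heats: int) -> List[List[int]]:
    """Distribute seeds 1..n_athletes over n_heats in a serpentine order."""
    heats: List[List[int]] = [[] for _ in range(n_heats)]
    for seed in range(1, n_athletes + 1):
        row, col = divmod(seed - 1, n_heats)
        # Even-numbered passes run A -> C, odd-numbered passes run C -> A.
        idx = col if row % 2 == 0 else n_heats - 1 - col
        heats[idx].append(seed)
    return heats

heats = zigzag_heats(45, 3)
for label, seeds in zip("ABC", heats):
    print(label, seeds)
```

Under this assumed pattern, each heat receives 15 runners and the sums of the seed numbers in the three heats differ only slightly, which is the point of the method: no heat is systematically stronger than another.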

DATA

Data collected include information on 651 runners comprising 1,122 observations from the 2003, 2005, 2007, 2009, and 2011 World Championships in Athletics as well as the 2004, 2008, and 2012 Olympics. As seen in Table 2, data include information on the runners’ performances (times, probability of qualification, and finishing position), season best time, age, and teammates as well as information on the structure of the tournament. As it is expected that there are differences across rounds and gender, the summary statistics are separated by round and gender. For both male and female runners, the average time declined as the rounds progressed and the average age increased. In the preliminary round, male and female runners qualified for the next round of the tournament 58 percent of the time, and in the middle round, male and female runners qualified for the final round of the tournament 51 and 50 percent of the time, respectively. In the final round, males and females earned medals roughly 25 percent of the time.

Table 2 Variable definitions and summary statistics

The average season best time of males and females decreased as the rounds increased, but the average season best time was always greater than the average time. In the preliminary round, males and females participated in the first and last heat 30–36 percent of the time. In the preliminary round, the maximum number of heats was four for males and three for females, and in the middle round, the maximum number of heats was two for males and females. It is extremely rare for a teammate to be present in a heat during the preliminary round. In fact, this never occurred for females. During the later rounds, teammates become more prevalent.

As seen in Table 2, the average runners’ times varied across rounds. Not surprisingly, as the tournament progressed, runners ran faster times. As described above, there is an information asymmetry with regard to the timing of the heats. Table 3 includes the times and probability of qualification across heats within rounds.

Table 3 Performance by round and heat

In the preliminary round, the times are fastest in the first and last heat, and the probability of qualification is highest in the first and last heat.

Before examining whether there is a gender difference in performance during the tournament, it is important to consider the differences in runner abilities by gender. Figures 1, 2, and 3 show the distributions of season best times across rounds and by gender. While not a formal test of differences in distributions, it appears that runners of both genders face similar competitive pressures across the rounds. The average abilities of runners differ by gender, but the variation across genders is similar. This is also supported by looking at the standard deviations in Table 2. It appears from these figures that males and females face competitive situations which do not differ in an obvious way.

Figure 1 Preliminary round season best times (seconds) by gender.

Figure 2 Middle round season best times (seconds) by gender.

Figure 3 Final round season best times (seconds) by gender.

ECONOMETRIC METHODOLOGY

The principal concern of this paper is to use track and field data to examine whether gender differences in competitiveness exist at the elite level of performance in the 1,500 meter race. To estimate this, the following equation is proposed:

Performance_ihry is measured using a runner’s time, probability of qualification, or probability of medaling as described in Table 3. SB_i is runner i’s season best time and is intended to control for the runner’s ability. Within-Heat Peer Performance_ihry is the average of the performance of all runners in heat h during round r, excluding runner i, which is intended to capture the peer effect of runners within the heat.Footnote 4 Age_i is runner i’s age in years and Age_i^2 is a quadratic term for runner i’s age. FirstHeat_ihry and LastHeat_ihry are dummy variables equal to 1 if a competitor is competing in the first heat or the last heat of the round, respectively. Teammate_ihry is a dummy variable equal to 1 if runner i has an athlete from their own country in their heat during year y. Male_i is a dummy variable equal to 1 if the runner is a male and 0 otherwise. Year is a set of dummy variables for the years. Male*X_i is a set of interaction terms, where Male is interacted with all of the explanatory variables except for the year dummy variables. Finally, e_ihry is the error term for runner i in heat h during round r of year y.
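The variable definitions above suggest the following form for equation (1). This is a sketch assembled from those definitions only; the coefficient symbols (β, θ, γ) and the ordering of terms are assumptions, not necessarily the notation used in the original equation.

```latex
\begin{equation}
\begin{aligned}
\text{Performance}_{ihry} ={}& \beta_0 + \beta_1\,\text{SB}_i
  + \beta_2\,\text{Within-Heat Peer Performance}_{ihry}
  + \beta_3\,\text{Age}_i + \beta_4\,\text{Age}_i^2 \\
&+ \beta_5\,\text{FirstHeat}_{ihry} + \beta_6\,\text{LastHeat}_{ihry}
  + \beta_7\,\text{Teammate}_{ihry} + \beta_8\,\text{Male}_i \\
&+ \text{Year}\,\theta + (\text{Male} \times X_i)\,\gamma + e_{ihry}
\end{aligned}
\end{equation}
```

Here θ is the vector of coefficients on the year dummies and γ is the vector of coefficients on the interactions of Male with each explanatory variable other than the year dummies.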

If a runner’s ability is an important predictor of performance, then it is expected that the coefficient estimate of SB_i will have a positive sign in the time models. A smaller coefficient indicates that runners’ race times track their season best times less closely. As described above [Hill forthcoming; Brown 2011; Guryan et al. 2009], the effect of peers is ambiguous, and the existing literature has found a variety of results. Age_i and Age_i^2 are included because of the expectation that there is a peak age of performance.

FirstHeat_ihry, LastHeat_ihry, and Teammate_ihry are included to control for competition factors. As described in Hill [forthcoming], the effect of information is ambiguous. Running with no information might lead to increased effort, but full information provides a runner with a clear goal to achieve. Runners competing in the first heat of a round have the least information and runners competing in the last heat have full information concerning all heats except their own. The presence of a teammate may provide opportunity for strategic interaction within a race, but potentially provides a runner with information about an opponent, so the expected result is ambiguous.

Male and the Male interaction terms are included to control for gender differences in behavior during competition. If males behave differently in competition, then the interaction terms are expected to be significant. Additional variables are needed to control for aspects of the races that may vary across time, but not across heats, such as weather, location, and crowd. For this reason, the year dummy variables are included.Footnote 5

Prior to estimation, a few econometric issues must be addressed. As described in Ammermueller and Pischke [2009], it is difficult to estimate causal peer effects from equation (1), especially when peers are not randomly assigned. Ammermueller and Pischke [2009] show that one potential solution to the problem is to regress an individual’s performance on peer abilities rather than peer performance. As an individual’s ability is determined prior to the performance, an individual’s performance will not experience common shocks with peer abilities. As a result, the peer performance variable in equation (1) becomes Within-Heat Avg SB_ihry, which measures the average season best time of all within-heat peers, excluding runner i, as a measure of the peers’ abilities.
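The peer ability measure is a leave-one-out average, which can be sketched as follows. The function name `within_heat_avg_sb`, the dictionary keys, and the sample values are hypothetical; in practice the heat identifier would need to combine year, round, and heat so that each race is treated separately.

```python
from collections import defaultdict
from typing import Dict, List

def within_heat_avg_sb(rows: List[dict]) -> Dict[str, float]:
    """Mean season best (seconds) of each runner's heat-mates, excluding the runner.

    rows: dicts with keys 'runner', 'heat', and 'sb'.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for r in rows:
        totals[r["heat"]] += r["sb"]
        counts[r["heat"]] += 1
    out: Dict[str, float] = {}
    for r in rows:
        n_peers = counts[r["heat"]] - 1
        if n_peers > 0:  # a runner alone in a heat has no peers to average
            # Remove the runner's own season best before averaging.
            out[r["runner"]] = (totals[r["heat"]] - r["sb"]) / n_peers
    return out

# Hypothetical heat of three runners.
sample = [
    {"runner": "r1", "heat": "2012-prelim-A", "sb": 215.0},
    {"runner": "r2", "heat": "2012-prelim-A", "sb": 217.0},
    {"runner": "r3", "heat": "2012-prelim-A", "sb": 219.0},
]
peer_sb = within_heat_avg_sb(sample)  # r1 -> 218.0, r2 -> 217.0, r3 -> 216.0
```

Excluding the runner's own season best keeps the peer variable from mechanically duplicating the runner's own ability measure.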

An additional econometric issue is the clustering of data. Runners are behaving strategically, so their results are correlated and it is expected that the error terms are correlated across observations within a heat. To prevent the standard errors from being incorrectly estimated, models are estimated with standard errors clustered by heat.
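To make the idea of clustering concrete, the sketch below computes a cluster-robust standard error for the slope in a regression with a single regressor and an intercept. It is an illustration only: the paper does not state which software or finite-sample correction was used, so this version applies the basic sandwich formula with no small-sample adjustment, and the function name and data are hypothetical.

```python
from collections import defaultdict
from typing import Hashable, List, Tuple

def clustered_slope_se(x: List[float], y: List[float],
                       cluster: List[Hashable]) -> Tuple[float, float]:
    """OLS slope of y on x (with intercept) and its cluster-robust standard error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    xd = [xi - x_bar for xi in x]                  # regressor with intercept partialled out
    sxx = sum(v * v for v in xd)
    slope = sum(v * (yi - y_bar) for v, yi in zip(xd, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Sum the scores (demeaned x times residual) within each cluster,
    # then square the cluster sums: this is the "meat" of the sandwich.
    score = defaultdict(float)
    for v, e, g in zip(xd, resid, cluster):
        score[g] += v * e
    meat = sum(s * s for s in score.values())
    return slope, (meat / sxx ** 2) ** 0.5

# Hypothetical data: four observations in two heats.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 5.0]
slope, se_heat = clustered_slope_se(x, y, ["h1", "h1", "h2", "h2"])
# Treating every observation as its own cluster gives the usual
# heteroskedasticity-robust (HC0) standard error instead.
_, se_indiv = clustered_slope_se(x, y, [0, 1, 2, 3])
```

The two standard errors differ because residuals within a heat are allowed to move together in the first call but are treated as independent in the second.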

RESULTS

Estimation results are reported in Tables 4 through 9. As the tournament progresses, it is possible that explanatory factors differ. As a result, models are estimated separately for each round. Before discussing the gender differences, some general results are worth noting. First, as expected, a runner’s ability (measured with season best time) is positively related to the runner’s time and negatively related to the probability of qualification or medaling. In the preliminary round, runners in the last heat of the round run significantly faster. Information seems to matter: full information seems to be important in improving a runner’s performance. Finally, peer effects, when statistically significant, indicate a positive relationship between peers’ abilities and performance. The remaining discussion of results considers whether these relationships differ for males and females.

Table 4 Preliminary round – Dependent variable=time (in seconds)
Table 5 Middle round – Dependent variable=time (in seconds)
Table 6 Final round – Dependent variable=time (in seconds)
Table 7 Preliminary round – Dependent variable=qualify for middle round (note a)
Table 8 Middle round – Dependent variable=qualify for final round
Table 9 Final round – Dependent variable=medal (top 3 finishers in final round)

The first model predicts performance in terms of time, that is, time is the dependent variable. For each of the three rounds in the 1,500 meter tournaments, a model is estimated including a male dummy variable and male interaction terms for all explanatory variables except for the year dummy variables.Footnote 6 These results are reported in the first column of Tables 4, 5, and 6, one table for each round of the tournaments, and allow a comparison of females and males in each round. For example, the estimated model for females in the preliminary round is:

Alternatively, the estimated model for males in the preliminary round is:

These results can be calculated from the first column of Table 4 by substituting male=0 to obtain the female results and male=1 to obtain the male results. We are also interested in testing for overall gender differences. Formally, the null hypothesis is that there are no gender differences. An F-test is used to test the joint hypothesis that the coefficients on all the interaction terms involving the male dummy variable are equal to 0 (no gender differences other than a different intercept). If the null hypothesis is rejected, then there are jointly significant gender differences. In other words, even without individually significant interaction terms, rejection of the null hypothesis indicates that the model that allows for gender differences is the appropriate model. The final column of each table reports a restricted model where the estimated slope coefficients are restricted to being the same for men and women (note that the intercept is allowed to differ for males and females).
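In general terms, the substitution and the joint test can be written as follows. The symbols are assumptions introduced for illustration: β_k is the coefficient on explanatory variable x_k, γ_k is the coefficient on its interaction with Male, α is the coefficient on the Male dummy, and the year dummies are omitted for brevity.

```latex
\begin{align*}
\text{Male}=0 \text{ (females)}:&\quad
  \widehat{\text{Time}} = \hat\beta_0 + \sum_{k=1}^{K} \hat\beta_k\, x_k \\
\text{Male}=1 \text{ (males)}:&\quad
  \widehat{\text{Time}} = (\hat\beta_0 + \hat\alpha)
  + \sum_{k=1}^{K} (\hat\beta_k + \hat\gamma_k)\, x_k \\
H_0:&\quad \gamma_1 = \gamma_2 = \dots = \gamma_K = 0
\end{align*}
```

Under the null hypothesis the male and female slopes coincide and only the intercepts differ, which corresponds to the restricted model in the final column of each table.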

The first column of Table 4 provides an initial investigation of gender differences in the preliminary round of World Championship and Olympic 1,500 meter tournaments. For females, season best and running in the last heat have significant effects on times in the preliminary round of the tournaments (year dummies aside). The coefficient on season best is significant at the 1 percent level and the coefficient on the last heat dummy variable is significant at the 5 percent level. In addition, looking at the male dummy interaction terms, we find that the effects of season best and within-heat average season best of competitors are individually statistically different for males and females. The difference in the effect of season best for men and women is significant at the 5 percent level (one-sided P-value of 0.029), while the difference in the within-heat peer effects is significant at the 10 percent level. This indicates that there is a gender difference in the relationship between ability and performance and between peer effects and performance. In addition, when the joint restrictions that there are no gender differences in the preliminary round are imposed on the model, the null hypothesis of no gender differences is rejected at the 10 percent level of significance using the F-test (P-value=0.0628). This suggests that the model reported in the first column of Table 4, which allows for gender differences, is the preferred model. The results in Table 4 provide evidence of gender differences in the preliminary round of the 1,500 meter tournaments.

The results for the middle round of the 1,500 meter tournaments are reported in Table 5. The investigation for gender differences in the middle round proceeds in the same manner described above for the preliminary round. For females, only season best appears to have a significant effect (at the 1 percent level) on their times in the middle round of the tournaments. In addition, looking at the male dummy interaction terms, we find that none of the interaction term coefficients is individually statistically significant. Supplementing these findings, when the joint restrictions that there are no (slope) gender differences in the middle round are imposed on the model, the null hypothesis of no gender differences is not rejected using the F-test (P-value=0.5310). The results in Table 5 provide no evidence of gender differences in the middle round of the 1,500 meter tournaments.

Finally, the results for the final round are reported in Table 6. For females, similar to the middle round results, only season best appears to have a significant effect on their times in the final round of the tournaments. In addition, looking at the male dummy interaction terms, we find that the effect of season best is statistically different for males and females at the 10 percent level of significance (one-sided P-value=0.056), but otherwise no individual interaction terms are statistically significant. Again, this indicates that there is a gender difference in the relationship between ability and performance. However, when the joint restrictions that there are no gender differences in the final round are imposed on the model, the null hypothesis of no (slope) gender differences is rejected at the 5 percent significance level using the F-test (P-value=0.0242). The results in Table 6 provide evidence of gender differences in the finals of the 1,500 meter championship tournaments.

In addition to time, probit models are considered for predicting the probability of advancing to the next round of the tournament (qualifying). Since runners in the final round do not advance to any other round, the probit model predicts the probability of winning a medal (top three finishers) for those running in the final round of the tournament. Marginal effects for probit models are reported in Tables 7, 8, 9.Footnote 7

The results for the probability of qualifying in the preliminary round are reported in Table 7. For females, season best significantly affects the probability of advancing beyond the preliminary round of the tournaments at the 1 percent level (year dummies aside). In addition, within-heat peer effects are significant at the 10 percent level for females. Looking at the male dummy interaction terms, we find that the within-heat peer effects are statistically different for males and females at the 1 percent level. This indicates that there are gender differences in the relationship between peer effects and performance. Finally, when the joint restrictions that there are no gender differences in the preliminary round are imposed on the probit model, the null hypothesis of no (slope) gender differences is rejected at the 10 percent level of significance using the χ2 test (P-value=0.0733). The results in Table 7 provide evidence of gender differences in the probability of advancing to the middle round of the 1,500 meter tournaments.

The results for the probability of qualifying for the final round are reported in Table 8. For females, season best, within-heat average season best (peer effects), Age, and Age^2 all affect the probability of advancing beyond the middle round of the tournaments at the 1 percent level of significance. In addition, looking at the male dummy interaction terms, we find that the effects of Age and Age^2 are statistically different for males and females at the 1 percent level of significance, while the effects of season best and running in the first heat are statistically different for men and women at the 10 percent level. Further, this finding of gender differences is strongly supported when the joint restrictions that there are no (slope) gender differences in the middle round are imposed on the probit model. The null hypothesis of no gender differences in the probit model for the middle round is rejected using the χ2 test (P-value=0.0001). The results in Table 8 provide strong evidence of gender differences in the probability of advancing to the finals of the 1,500 meter tournaments.

The results for the probability of winning a medal in the finals are reported in Table 9. For females, season best and peer effects significantly affect the probability of winning a medal in the finals at the 1 percent level of significance. Looking at the male dummy interaction terms, we find that the coefficient of season best is statistically different for men and women at the 10 percent level of significance. Further, the difference in peer effects between males and females narrowly misses the 10 percent threshold, being significant only at the 10.2 percent level. However, the null hypothesis of no gender differences in the probit model for winning a medal in the finals is not rejected using the χ2 test (P-value=0.3226). The results in Table 9 provide some weak evidence of gender differences in the probability of winning a medal in the finals of the 1,500 meter tournaments.

DISCUSSION

Using data from 2003 to 2012 on the major IAAF track and field 1,500 meter events, this paper examines whether the behavior of males and females differs, and if the difference in behavior is indicative of differences in competitiveness. Results indicate males and females do behave differently in competition. The primary gender difference lies in the relationship between ability and performance. In the preliminary and final rounds, the relationship between ability (measured with season best time) and time is represented by a significantly steeper slope for females than for males. For females, a 1 second increase in season best time (a slower season best) is associated with performance that is 0.89 seconds slower in the preliminary round and 0.93 seconds slower in the final round, holding all else constant. In contrast, for males, a 1 second increase in season best time is associated with performance that is 0.80 seconds slower in the preliminary round and 0.40 seconds slower in the final round, holding all else constant. It is important to note that these are differences in magnitude but not direction. While not formally tested here, the weaker link between ability and performance for males may be indicative of more strategic behavior. If more strategic behavior occurs during the race, then it is likely that past times are weaker predictors of current performance.
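The slopes quoted above can be turned into a small worked example. The sketch below uses only the four rounded slopes reported in this paragraph; the function names are hypothetical, and the implied interaction terms are back-of-the-envelope differences of rounded numbers, not the coefficients reported in the tables.

```python
# Slopes on season best time quoted in the discussion
# (seconds of race time per second of season best time).
slopes = {
    "preliminary": {"female": 0.89, "male": 0.80},
    "final": {"female": 0.93, "male": 0.40},
}

def implied_interaction(round_name: str) -> float:
    """Male slope minus female slope, i.e. the implied Male*SB interaction."""
    s = slopes[round_name]
    return round(s["male"] - s["female"], 2)

def predicted_change(round_name: str, gender: str, delta_sb: float) -> float:
    """Change in race time associated with a delta_sb-second change in season best."""
    return round(slopes[round_name][gender] * delta_sb, 2)

print(implied_interaction("preliminary"))        # -0.09
print(implied_interaction("final"))              # -0.53
print(predicted_change("final", "female", 2.0))  # 1.86
print(predicted_change("final", "male", 2.0))    # 0.8
```

In the final round, for instance, a runner whose season best is 2 seconds slower is associated with a race time about 1.86 seconds slower for females but only about 0.8 seconds slower for males, which illustrates how much weaker the link between ability and performance is for males in that round.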

In all but the two models for the preliminary round, the peer effect does not differ by gender. In the preliminary round time model, the average season best of all runners within the heat has a larger positive coefficient for men than for women. In the preliminary round probit model, the average season best of all runners within the heat has a significant positive coefficient for women; the coefficient for men, however, is significantly different and is negative. For males in the preliminary round, the presence of slower runners in their heat is positively related to slower times and negatively related to the probability of qualifying for the middle round. This negative within-heat peer effect might be evidence of more strategic behavior by males. Perhaps the presence of slower runners within the heat leads to an expectation that it will be easier to get a qualifying place with less effort. However, this greatly reduces the chance of qualifying by time from this heat if the runner fails to qualify by place.

As discussed above, there is evidence that there are gender differences in explaining performance (time and qualification) in track and field races. Research has attributed differences in gender outcomes to differences in competitiveness. Results here indicate that the gender differences occur because of differences in the relationships between ability and performance and between peer effects and performance. Further research may provide more explanation for the result, but results can be explained by differences in strategic interaction. Even though males and females face similar competitive pressures (as seen above), they may have different strategies going into the race. Males have participated in the 1,500 meter event at the Olympics since its inception in 1896, whereas females have only participated in the 1,500 meter Olympic event since 1972.Footnote 8 This experience in the event may have allowed men to develop more strategies and plans for progressing through the tournament. For example, males may receive different coaching, training, or race-day strategies that alter their behavior. As females gain more experience in the long distance races, and competition gets tougher throughout their running careers, it is expected that differences in strategic behavior will erode.

Another possible explanation related to the relative experience of male runners is that female competitors may experience larger variance in their running performances, as seen in Table 2. The larger variance in times for female runners may be because male runners are relatively closer to their absolute physical maximum.Footnote 9 This was first described by Gould [1986], and applied to an examination of gender differences in competition in Treber et al. [2013]. The relatively large variance in female runner times will lead to larger coefficient estimates on the season best variable, but as female runner variation decreases over time, the coefficient estimates may be expected to converge.