Run the Numbers — NCAA XC (Part 1)
Back in November, I posted two series of tweets in the week leading up to the Division 1 NCAA Cross Country Championships. Both series analyzed championship results from the preceding seven years to examine which factors may or may not affect a team’s success on the biggest stage. After a couple of months of continually having to look back through those tweets, I decided to put all of the analysis into one post for ease of reference, and on the off chance folks are still interested in it.
Let’s start by looking at the distribution of teams that have finished on the podium (top 4) over the past seven years (does not include 2019).
The first observation that stands out is that the women’s podium sees a greater variety of teams than the men’s. Colorado, New Mexico, Oregon, Providence, and Michigan have all been consistent forces in women’s cross country, but there are also a handful of teams that have made the podium only once.
The men’s podium has been less varied over the years. While 14 different teams have finished on the podium for women, only 10 have for men over that time. Northern Arizona (NAU) has the most appearances of any team, missing the podium only once in the past seven seasons.
Both of the charts above focus only on the podium teams. What about all 31 teams that race?
Above we see the median scores based on the team finish. I chose the median over the average so that an unusually high or low score in a single year would not skew the figures; in general the two were relatively close, so either would have worked. Right away we notice that the first-place team tends to be well clear of the team behind it, and the last-place team well behind the team ahead of it. There’s not a lot to pull specifically from this, but it is interesting to note the fourth and fifteenth place spots. Based on the median total for fourth, it typically takes under 200 points to make the podium. Similarly, to finish in the top half of the field (top 15), it takes right around 400 points.
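For anyone who wants to reproduce this kind of summary, here is a minimal sketch in Python. The rows are made-up placeholder values rather than actual championship results, and the `(year, place, score)` layout and the name `median_score_by_place` are my own assumptions for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (year, team_place, team_score) rows. The numbers are
# placeholders for illustration, not real NCAA results.
results = [
    (2016, 1, 90), (2016, 4, 190), (2016, 15, 410),
    (2017, 1, 60), (2017, 4, 205), (2017, 15, 395),
    (2018, 1, 130), (2018, 4, 180), (2018, 15, 400),
]

def median_score_by_place(rows):
    """Group team scores by finishing place and take the median of each group."""
    by_place = defaultdict(list)
    for _year, place, score in rows:
        by_place[place].append(score)
    return {place: median(scores) for place, scores in sorted(by_place.items())}

medians = median_score_by_place(results)
for place, score in medians.items():
    print(f"place {place}: median score {score}")
```

Swapping `median` for `statistics.mean` gives the average version, which makes it easy to check how close the two are on a real data set.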
Looking at the median provides some insight but does not give the full picture of how well a team can or cannot do and still earn a given place. To get a better perspective, I pulled together the distribution of team scores for each team place.
Using the above visualization, we get a better understanding of just how well a team must run to earn a particular finish. Let’s start by looking at the podium again. We can see that in almost all scenarios, a team must score fewer than 250 points to finish in the top 4. Only one team, the BYU men in 2013, has finished on the podium while exceeding that mark.
Another place of note is 10th place (and drifting to 11th). To finish in the top 10, teams usually need to score under 350 points, and in all but one scenario they must score fewer than 375 points. For perspective, 350 points means a school’s top five runners average 70th (team) place. Team place and overall place are slightly different: team place counts only runners who belong to a qualifying team, whereas overall place also includes runners who qualified individually.
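To make the team place versus overall place distinction concrete, here is a rough sketch of how a team score can be computed. It assumes the finish list is already in overall order and that individual qualifiers are marked with `None` for their team; the function name `team_scores` and the tiny example race are hypothetical, and the sketch does not handle incomplete teams or tie-breaks.

```python
def team_scores(finishers, scorers=5):
    """Sum the team places of each team's first `scorers` runners.

    finishers: list of (runner, team) in overall finish order, where
    team is None for a runner who qualified individually.
    """
    scores = {}
    counts = {}
    team_place = 0
    for _runner, team in finishers:
        if team is None:
            continue  # individual qualifier: has an overall place, no team place
        team_place += 1  # every runner on a qualifying team takes a team place
        counts[team] = counts.get(team, 0) + 1
        if counts[team] <= scorers:
            scores[team] = scores.get(team, 0) + team_place
    return scores

# Tiny made-up race, scoring two runners per team to keep it short.
race = [("a1", "A"), ("x", None), ("b1", "B"), ("a2", "A"), ("b2", "B")]
scores = team_scores(race, scorers=2)
print(scores)

# Five scorers averaging 70th team place add up to 350 points.
example_places = [50, 60, 70, 80, 90]
print(sum(example_places), sum(example_places) / len(example_places))
```

In the made-up race, runner `a2` is 4th overall but takes 3rd team place, because the individual qualifier ahead of them is skipped when team places are handed out.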
With the difference between overall and team place in mind, let’s examine the different runners that make up these teams at NCAAs.
Each team scores five runners at NCAAs, denoted by R1, R2, etc. in the chart above. This visualization displays the average place of each runner against where the team finished overall. For example, the top finishing runner (R1) on the fifth-place team finished an average of 18th (team) place. The fifth runner on the same team typically finished 87th.
Looking at R1, we see that almost every team tends to have one runner who is relatively strong. In fact, through the top 24 teams, each team tends to have at least one runner who is a fringe All-American (top 40). Among the top teams, the biggest difference tends to come down to the fifth runner. The top three teams tend to keep all five of their runners relatively close, with average spreads of 40, 51 and 53 points. The fourth-place team sees a significant drop-off at its fifth runner, who finishes an average of 18 places behind the fifth runner on the third-place team. Across the top four runners of the third and fourth place teams, the largest difference is an average of six places, between the fourth runners on each team. The difference at the fifth runner is triple that.
Above is a visualization of the gap between runners. ‘GAP1’ is the median number of places between the first and second runners on a team, ‘GAP2’ the number between the second and third runners, etc. Here we can see a clear spike in GAP4 for the fourth and fifth place teams, which then levels off again. Notably, the first four runners on the top five teams all tend to maintain roughly the same distance (points-wise) from one another regardless of team finish. From this, it appears that two things really separate the top teams: 1) general talent/ability, where their runners simply finish a little better than runners on other top teams, and 2) the fifth runner, who on the top three teams tends to be substantially better than the fifth runners on the fourth and fifth place teams.
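The gap and spread figures discussed above can be sketched in a few lines of Python. This assumes a spread is the number of places between a team’s first and fifth runners, which is my reading of the term. In the example, only the R1 (18th) and R5 (87th) values come from the fifth-place averages mentioned earlier; the three middle places are invented purely to fill out the list, and the name `gaps_and_spread` is hypothetical.

```python
def gaps_and_spread(places):
    """Return ([GAP1, GAP2, ...], spread) for one team.

    places: team places of the scoring runners, R1 first, in finish order.
    """
    # GAPn is the number of places between runner n and runner n+1.
    gaps = [later - earlier for earlier, later in zip(places, places[1:])]
    # Spread is assumed here to be R5 minus R1.
    spread = places[-1] - places[0]
    return gaps, spread

# R1 = 18 and R5 = 87 come from the post; 35, 50 and 62 are made up.
fifth_place_team = [18, 35, 50, 62, 87]
gaps, spread = gaps_and_spread(fifth_place_team)
print("gaps:", gaps)
print("spread:", spread)
```

Because the gaps are consecutive differences, they always add up to the spread, so a large GAP4 shows up directly as a wider overall spread.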
Most of this analysis focuses on the team side of things. In Part 2, I will look more closely at the individual side and see what trends can be pulled from individual finishes.