How Accurate Was the Boston Marathon Cutoff Time Tracker In the Past?

Feature photo courtesy of erp3d on reddit

A question I frequently get asked is, “How accurate was the Boston Marathon Cutoff Time Tracker in the past?”

It’s impossible to answer that question directly, because the tracker itself is new this year. This is the first time that I’ve collected the results and made a continuing projection throughout the year.

But the underlying assumptions were tested last year, when I collected a large dataset and used it to make a projection for the 2025 Boston Marathon cutoff time. Those assumptions led to a very accurate prediction of the number of applicants – and a relatively accurate prediction of the actual cutoff time.

I’ve written about these things before, but they’re scattered across various places and buried in the archives. So I thought it would be helpful to take a quick walk down memory lane and look back at how last year’s prediction (for the 2025 Boston Marathon) worked out.

The Assumptions Underlying the Model

First, let’s take a minute to establish how the model operates and what the underlying assumptions are.

Working backwards, the cutoff time is a function of the number of runners who apply and the distribution of their buffers (how far under their qualifying standard they ran). There’s a decent amount of data available to estimate the cutoff time required to eliminate a given number of applicants.

Prior to the 2024 Boston Marathon, /u/flatcoke shared an analysis on Reddit based on this concept. There are a few different ways to look at this – the acceptance rate, the number of applicants, the number of rejected applicants – but they all lead to a similar linear relationship. The only problem is that you need to know the number of applicants.

That’s where Joe Drake’s work comes in. He shared a series of projections related to the cutoff time. I forget when I first read his work, but the general idea is that if you know the number of qualifiers out there, you can estimate the number of applicants.

He focused on a subset of races which he called the BIB50 – the top 50 races leading to Boston applicants. Basically, if you know how many qualifiers there are and you combine that with BAA data on past applicant numbers, you can estimate how many applicants will result. Here was his final update before the 2025 cutoff was released.

I took that idea and ran with it. By enlarging the sample, you can account for more variables. So I collected data from ~250 races in the qualifying periods for the 2024 and 2025 Boston Marathon. After some analysis, I realized that the key metric – if you want to boil it down to one thing – is the net change in qualifiers from year to year.

If a similar percentage of qualifiers convert to applicants, then you can take the rate at which the number of qualifiers grew or shrank, apply that rate to the number of applicants, and project the anticipated number of applicants – which then leads to an anticipated cutoff time.
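The arithmetic behind that last step fits in a few lines of Python. This is only a sketch of the idea, not the code behind the tracker: the function name is made up, and the 30,000 prior-year applicants is a placeholder rather than a BAA figure. The 8.3% growth rate is the one from last year's initial projection, discussed below.

```python
def project_applicants(prior_applicants: int, qualifier_growth: float) -> int:
    """Scale last year's applicant count by the year-over-year change in qualifiers.

    Assumes the share of qualifiers who go on to apply stays roughly constant.
    """
    return round(prior_applicants * (1 + qualifier_growth))


# Placeholder input: 30,000 is a made-up prior-year applicant count.
projected = project_applicants(prior_applicants=30_000, qualifier_growth=0.083)
print(projected)  # 32490
```

A negative growth rate works the same way: if qualifiers shrink, the projected applicant pool shrinks with them.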

Where Did All That Math Lead Last Year?

I worked on data collection throughout the month of August, and I published an initial projection on August 30, 2024. I also shared that projection on /r/AdvancedRunning.

In that initial projection, I calculated that the number of qualifiers in my sample increased 8.3%. This was based on de-duplicated results that used a runner’s fastest qualifying time. At the time, I concluded that “this will likely result in a larger pool of qualified applicants [compared to the previous year].”

I then graphed the number of qualifiers in each qualifying period according to their buffer, looked at how many qualifiers led to filling up the field for the 2024 Boston Marathon, and drew an intersecting line to see how deep the cutoff time needed to be to achieve the same number of applicants: 7:03.
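That "intersecting line" step can be approximated in code. The sketch below assumes you have a list of qualifier buffers in seconds and a target count (the number of qualifiers at or above last year's cutoff). The function names and the toy data are hypothetical, and the real analysis was done graphically rather than with this exact code.

```python
def buffer_at_count(buffers_sec: list[int], target_count: int) -> int:
    """Return the buffer (in seconds) of the target_count-th best qualifier.

    Qualifiers are ranked from the biggest buffer down, so the result is
    how deep the cutoff must go to leave exactly target_count runners.
    """
    if target_count <= 0:
        raise ValueError("target_count must be positive")
    if len(buffers_sec) <= target_count:
        return 0  # everyone fits, so no cutoff is needed
    ranked = sorted(buffers_sec, reverse=True)
    return ranked[target_count - 1]


def fmt(seconds: int) -> str:
    """Format a number of seconds as m:ss."""
    return f"{seconds // 60}:{seconds % 60:02d}"


# Toy data: six qualifiers, and we want the buffer that leaves four of them.
toy_buffers = [30, 95, 240, 423, 610, 1250]
print(fmt(buffer_at_count(toy_buffers, target_count=4)))  # 4:00
```

With real data, the list would hold one entry per de-duplicated qualifier, and the target count would come from the previous qualifying period.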

The following week, I published a follow-up that explored a few additional questions. I also shared it on /r/AdvancedRunning. Ultimately, I doubled down on the projection – with some signs indicating it might be a little higher.

After looking more closely at the distribution of buffers, I offered a projection of the number of applicants: 36,248.

I concluded that analysis with: “if these predicted applicant numbers are accurate [it] would put the cut-off time somewhere just under 8 minutes.”

How Accurate Was the Projection?

There are essentially two parts to this question:

  1. How close did the prediction come on the number of applicants?
  2. How close did the prediction come on the cutoff time resulting from those applicants?

On the first count, things were very close.

As I detailed here, my prediction for the number of applicants was 36,248. The actual number released by BAA was 36,406. With a difference of under 1%, I’ll gladly take that and call it a win.
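For anyone who wants to check the "under 1%" claim, here is the arithmetic as a quick Python snippet. The two inputs are the figures quoted above; the variable names are just for illustration.

```python
predicted = 36_248  # projection from the follow-up analysis
actual = 36_406     # number released by BAA

miss = actual - predicted
pct_error = miss / actual * 100
print(f"Missed by {miss} applicants ({pct_error:.2f}%)")  # Missed by 158 applicants (0.43%)
```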

After the applicant numbers were released, I concluded, “I’d feel comfortable narrowing the margin of error down to 6:30 to 7:30.”

But I also indicated that there was some uncertainty about the distribution of buffers among the applicants. If the field of applicants looked different – and people with bigger buffers were more likely to apply – that could lead to a “worst case scenario […] around 8:00 to 8:30. But if I was betting money, I’d cap the likely outcomes at 7:30.”

The actual cutoff time was 6:51.

As I explained in this reflection, that means that my projection was both spot on – and slightly off.

On the one hand, 6:51 is not far from my original projection (7:03) and it sits squarely in my projected range (6:30 to 7:30). On the other hand, this projection was based on an assumption that there would be 22,000 accepted applicants. In the end, 24,000 applicants were accepted.

There was also a shift in how likely specific runners were to apply. People with small buffers were less likely to apply, probably because they anticipated not getting in. Meanwhile, people with moderate (5-10 minute) buffers were more likely to apply. This tracks with my final suggestion that a worst case scenario could be around 8:00.

If the field size had actually been restricted to 22,000 accepted runners, instead of 24,000, the resulting cutoff time would have been approximately 7:55: squarely in line with that worst case scenario I projected.

How Confident Should We Be in the Cutoff Tracker?

This brings us back to the original question. Given past projections, how confident should we be in the current projection in the Boston Cutoff Time Tracker?

The fundamental assumptions underlying the tracker are sound. In past years, other people have used similar methods to come up with projections that were in the right ballpark. And when I put all the pieces together last year, I came up with a solid projection.

The number of applicants was pretty much spot on. If the number of accepted applicants had remained 22,000, I would have slightly underestimated the actual cutoff time. But I was still in the right ballpark, and I anticipated the possibility of a cutoff that high.

So without a doubt, I think you should be very confident in the general contours of the projection. The number of finishers is up. The number of qualifiers is down slightly. The cutoff time will likely be a little lower than last year – but it will almost definitely be above 5:00.

What Remains Unknown or Unaccounted For?

But tracking the cutoff time throughout the year is slightly different from projecting it at the end of the qualifying period – once all of the data is in. From week to week, there are bound to be some fluctuations. And with some results yet to be determined, there could be minor changes in the underlying data.

The dashboard itself also simplifies some of the assumptions and calculations to reduce things to a single number. This all adds up to a little more uncertainty than a final, more rigorous analysis might provide.

One thing that isn’t currently incorporated into the tracker is how a runner’s buffer predicts their likelihood of applying. Past data suggests that runners with big buffers (i.e. 20+ minutes) are far less likely to apply than those with medium or small buffers (like 5-10 minutes).

This year’s qualifying period has some slight variations in the distribution of buffer times. But the biggest difference is in runners with 20+ minute buffers – there are fewer of them. I ran some quick numbers to incorporate this factor, and a cutoff time of 5:00 would leave about 27,000 runners in the pool, while a cutoff time of 7:00 would leave about 22,000 runners in the pool.

If the number of accepted applicants remains 24,000, that means you’ll need a cutoff time that’s a little higher than 5:00 – but not as high as 7:00.
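To get a rough feel for where between 5:00 and 7:00 that lands, you can draw a straight line between the two points above. That straight line is an assumption made purely for illustration: the real distribution of buffers isn't linear, so treat the result as a back-of-the-envelope midpoint rather than a projection. The function name is hypothetical.

```python
def interpolate_cutoff(target: int,
                       low: tuple[int, int],
                       high: tuple[int, int]) -> float:
    """Linearly interpolate the cutoff (seconds) that leaves `target` runners.

    `low` and `high` are (cutoff_seconds, runners_remaining) pairs.
    """
    (c0, n0), (c1, n1) = low, high
    return c0 + (n0 - target) * (c1 - c0) / (n0 - n1)


# 5:00 (300 s) leaves ~27,000 runners; 7:00 (420 s) leaves ~22,000.
cutoff = round(interpolate_cutoff(24_000, low=(300, 27_000), high=(420, 22_000)))
print(f"{cutoff // 60}:{cutoff % 60:02d}")  # 6:12
```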

Another thing that’s unaccounted for is changes in international races.

My sample includes London and Berlin, because those are among the top qualifying races. But I do not track other international races because such a small share of those runners end up applying to Boston. As long as these races were all in last year’s qualifying period and there were no drastic changes, it should all wash out in the data. But if there are any drastic changes, that complicates things.

One example is Tokyo. It was warm this year, and the number of qualifiers was down. Many of them won’t apply to Boston, so the magnitude of that change would unduly sway the projection. But this is a wildcard that points slightly towards a lower cutoff.

The flip side is Sydney. It’s not accounted for in the tracker, either. The relative number of BQs is low, due to the course elevation profile. And the relative share of applicants is likely low, too. But the 2024 Sydney Marathon was much larger than 2023. It also included the Abbott Age Group World Championships, which would likely include many Boston applicants. And the 2025 Sydney Marathon is early enough (August 31) that it’ll also be in this qualifying period.

That’s a wildcard that points to more qualifiers – and a higher cutoff time. And it’s likely much stronger than the downward pressure of Tokyo.

The Projection Will Continue to Gain Confidence

So there you have it.

I’d say that last year’s projection wasn’t perfect, but it was pretty accurate. And I’m confident that the current projection puts things in the right ballpark.

But there are some remaining unknowns. And there are some areas of analysis that have not yet been applied.

As we get closer to September, there will be fewer unknowns. And this summer, I plan to do a more rigorous analysis that improves on the basic assumptions underlying the tracker.

As a result, my confidence in this projection will continue to increase. I expect to refine things somewhat and reduce the level of uncertainty. But I don’t expect anything to change the fundamentals – or the broader window of likely outcomes (5:00 to 7:00).

I’m in the process of cleaning up my dataset, and I plan to publish it (along with updates) to Kaggle by the end of the month. If you’re proficient with data, I’d encourage you to try your hand at your own analysis and projection.

To stay up to date on that and on future analysis, use the form to subscribe to my weekly newsletter.
