Model with METR time horizon ≥8 hours released in 2026? | Manifold

6

Ṁ253

2026

68% chance

This market resolves YES if any model released in 2026 has a METR time horizon (50% reliability) of at least 8 hours. It resolves NO if no such model has been released by the end of 2026.

#Technology

#Technical AI Timelines

#Upcoming releases

https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/

So to reach 8 hours by the end of 2026, the "doubling time" would need to be about 260 days (8.7 months) or shorter if we extrapolate from GPT-5.1 Codex Max (2h42m), or about 280 days (9.4 months) or shorter from GPT-5, if my maths is correct.
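For anyone who wants to check the arithmetic, here is a minimal sketch of that extrapolation. Only the 2h42m figure comes from the comment above. The release dates (2025-11-19 for GPT-5.1 Codex Max, 2025-08-07 for GPT-5), the GPT-5 horizon of 2h17m, and the 2026-12-31 deadline are my assumptions, and `required_doubling_days` is a made-up helper name.

```python
import math
from datetime import date

TARGET_MINUTES = 8 * 60          # the market's 8-hour threshold
DEADLINE = date(2026, 12, 31)    # assumption: latest release date that still counts


def required_doubling_days(horizon_minutes: float, released: date) -> float:
    """Slowest doubling time (in days) that still reaches the target by the deadline,
    assuming the time horizon grows exponentially from the given starting point."""
    doublings_needed = math.log2(TARGET_MINUTES / horizon_minutes)
    days_available = (DEADLINE - released).days
    return days_available / doublings_needed


# Starting points: horizons in minutes, release dates are assumptions (see above).
codex_max = required_doubling_days(2 * 60 + 42, date(2025, 11, 19))
gpt5 = required_doubling_days(2 * 60 + 17, date(2025, 8, 7))

for name, days in [("GPT-5.1 Codex Max", codex_max), ("GPT-5", gpt5)]:
    # 30.0 days per month is a rough conversion for readability only
    print(f"{name}: {days:.0f} days (~{days / 30.0:.1f} months)")
```

Under those assumptions the results land close to the 260-day and 280-day figures quoted above. Any faster doubling time gets to 8 hours earlier in 2026.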

@Jw we've had a doubling time faster than that for a few years now. I default to thinking it's pretty likely that continues.

The other dynamic here, favoring a "no" resolution, is that the benchmark is only designed to measure tasks up to 8 hours, so getting to 8 hours may be unusually difficult without an update to the benchmark.

Related questions

Claude Opus 4.5's METR-50 time horizon

Best AI time horizon by February 2026, per METR?

Will the METR long-horizons have a >6 month doubling time for at least a 4 month period before 2026?

R2's METR 50% time horizon

Will GPT-5.2's METR 50% time horizon exceed 3 hours 30 minutes?

30% chance (+6% 1d)

Grok 4.20's METR 50% time horizon

Will a model achieve a METR 50% time-horizon of 4+ hours by the end of 2025?

31% chance (+6% 1d)

Best AI time horizon by August 2026, per METR?

Will a Google model lead METR's task duration chart by EOY?

Opus 4.5's METR time horizon beats GPT-5.1's?
