Will Al achieve 95% or higher score on the FrontierMath benchmark before 2030?
Will Al achieve 95% or higher score on the FrontierMath benchmark before 2030?
➕
Plus
3
Ṁ85
2029
73%
chance

From a recent arXiv preprint,

We introduce FrontierMath, a benchmark of hundreds of original, exceptionally challenging mathematics problems crafted and vetted by expert mathematicians. The questions cover most major branches of modern mathematics -- from computationally intensive problems in number theory and real analysis to abstract questions in algebraic geometry and category theory. Solving a typical problem requires multiple hours of effort from a researcher in the relevant branch of mathematics, and for the upper end questions, multiple days. FrontierMath uses new, unpublished problems and automated verification to reliably evaluate models while minimizing risk of data contamination. Current state-of-the-art AI models solve under 2% of problems, revealing a vast gap between AI capabilities and the prowess of the mathematical community. As AI systems advance toward expert-level mathematical abilities, FrontierMath offers a rigorous testbed that quantifies their progress.

Resolution Criteria:

This question resolves to YES if the state-of-the-art average accuracy score on the FrontierMath benchmark, as reported prior to January 1, 2030 is 95% or higher for any fully-automated computer method. Credible reports include but are not limited to blog posts, arXiv preprints, and papers. Otherwise, this question resolves to NO.

Fine Print:

There is no maximum inference budget for this benchmark - models can use any amount of compute to solve the problems.

Get
Ṁ1,000
and
S1.00

What is this?

What is Manifold?
Manifold is the world's largest social prediction market.
Get accurate real-time odds on politics, tech, sports, and more.
Win cash prizes for your predictions on our sweepstakes markets! Always free to play. No purchase necessary.
Are our predictions accurate?
Yes! Manifold is very well calibrated, with forecasts on average within 4 percentage points of the true probability. Our probabilities are created by users buying and selling shares of a market.
In the 2022 US midterm elections, we outperformed all other prediction market platforms and were in line with FiveThirtyEight’s performance. Many people who don't like trading still use Manifold to get reliable news.
How do I win cash prizes?
Manifold offers two market types: play money and sweepstakes.
All questions include a play money market which uses mana Ṁ and can't be cashed out.
Selected markets will have a sweepstakes toggle. These require sweepcash S to participate and winners can withdraw sweepcash as a cash prize. You can filter for sweepstakes markets on the browse page.
Redeem your sweepcash won from markets at
S1.00
→ $1.00
, minus a 5% fee.
Learn more.