Will Gemini outperform GPT-4 at mathematical theorem-proving?
Ṁ428 Jan 1
62% chance
Based on speculation from https://youtu.be/tkqD9W5U9F4?t=468
To operationalize this, the question will resolve based on the LeanDojo benchmark (https://leandojo.org/), specifically the Pass@1 metric, under which "The prover is given only one attempt and must find the proof within a wall time limit of 10 minutes."
GPT-4 is reported to achieve an accuracy of 28.8% on the "random" split of the test data in Table 2 of the LeanDojo paper (https://arxiv.org/pdf/2306.15626.pdf).
This question closes when an evaluation of Gemini's performance on this task is brought to my attention.
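The resolution rule described above can be sketched roughly as follows. This is only an illustration, not part of the official resolution criteria: the `Attempt` record and the `pass_at_1` and `resolves_yes` helpers are hypothetical names, and reading "outperform" as a strictly higher Pass@1 than GPT-4's reported 28.8% is an assumption.

```python
from dataclasses import dataclass

# GPT-4's reported Pass@1 on the "random" split (Table 2 of the LeanDojo paper).
GPT4_PASS_AT_1 = 0.288
# Wall time limit per theorem quoted in the question: 10 minutes.
WALL_TIME_LIMIT_S = 600.0


@dataclass
class Attempt:
    """Hypothetical record of a single proof attempt on one test theorem."""
    theorem: str
    proved: bool
    wall_time_s: float


def pass_at_1(attempts: list[Attempt]) -> float:
    """Fraction of theorems proved in one attempt within the time limit."""
    if not attempts:
        raise ValueError("need at least one attempt")
    solved = sum(
        1 for a in attempts if a.proved and a.wall_time_s <= WALL_TIME_LIMIT_S
    )
    return solved / len(attempts)


def resolves_yes(gemini_attempts: list[Attempt]) -> bool:
    """Assumption: 'outperform' means a strictly higher Pass@1 than GPT-4."""
    return pass_at_1(gemini_attempts) > GPT4_PASS_AT_1
```

A proof found after the 10-minute limit counts as a failure here, matching the quoted definition of Pass@1.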
Related questions
Will "Gemini [Ultra, 1.0] smash GPT-4 by 5x"?
18% chance
Will Google Gemini do as well as GPT-4 on Sparks of AGI tasks?
76% chance
Will GPT-5 perform better than o1 (not preview) at AIME 2024, Codeforces elo, GPQA, or the 2024 ioi?
91% chance
Will an open-source LLM beat or match GPT-4 by the end of 2024?
83% chance
Will any open source LLM with <20 billion parameters outperform GPT-4 on most language benchmarks by the end of 2024?
13% chance
Will Google Gemini be able to answer the simple geometry/number theory question in the description?
14% chance
Will an open source model beat GPT-4 in 2024?
76% chance
Which, if any, GPT-n will outperform AlphaGeometry merely via prompting, by 2030?