Will Gemini outperform GPT-4 at mathematical theorem-proving?
20 · Ṁ428 · Jan 1 · 62% chance

Based on speculation from https://youtu.be/tkqD9W5U9F4?t=468

To operationalize this, the question will resolve based on the Pass@1 metric of the LeanDojo benchmark (https://leandojo.org/), where "The prover is given only one attempt and must find the proof within a wall time limit of 10 minutes."

GPT-4 is reported to achieve an accuracy of 28.8% on the "random" split of the test data in Table 2 of the LeanDojo paper (https://arxiv.org/pdf/2306.15626.pdf).
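
For concreteness, here is a minimal sketch of how the comparison could be computed, assuming each test theorem yields a single pass/fail outcome from its one attempt within the time limit. The function names are hypothetical, and the strict "greater than" rule is my assumption about what "outperform" means, not something stated in the question.

```python
# Baseline quoted above: GPT-4 Pass@1 on the "random" split (LeanDojo paper, Table 2).
GPT4_PASS_AT_1 = 0.288


def pass_at_1(proved: list[bool]) -> float:
    """Fraction of theorems proved, given one attempt per theorem.

    Each entry is True if the single attempt found a proof within the
    time limit, and False otherwise.
    """
    if not proved:
        raise ValueError("need at least one theorem outcome")
    return sum(proved) / len(proved)


def outperforms_gpt4(proved: list[bool]) -> bool:
    """Assumed resolution rule: strictly higher Pass@1 than the GPT-4 baseline."""
    return pass_at_1(proved) > GPT4_PASS_AT_1
```

Under this reading, a Gemini evaluation that proves exactly 28.8% of the split would not count as outperforming.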

This question closes when an evaluation of Gemini's performance on this task is brought to my attention.

bought Ṁ30 YES

This has happened I think?

Or maybe no one's applied it to that benchmark yet

Which Gemini version?