
Which benchmarks will OpenAI show GPT-5 results on when it is announced?
21
Ṁ9104
Jan 1
99% SimpleQA
99% HumanEval
99% MMLU
99% GPQA
99% SWE-Bench
99% ARC-AGI-2
16% MATH
14% Big-Bench-Hard
12% DROP
12% MGSM
8% GSM8K
There is some flexibility on variations of specific benchmarks, e.g. SWE-Bench-Hard would resolve SWE-Bench YES.
Update 2025-05-11 (PST) (AI summary of creator comment): The benchmarks must be those that GPT-5 is benchmarked against by OpenAI.
The results must be shown during or around the time of the announcement, on roughly the same day. If there are several announcements over multiple days, all of those times are acceptable for the purpose of this market.
@bbb Idk if I was actually able to change the settings back then, but since then I've learned how to do it, so I added ARC-AGI-2