Which of these models will have an Elo rating on LMArena (formerly known as LMSYS) by the end of January 2025?
💎
Premium
6
Ṁ28k
Feb 1
Gemini 2 (flagship): 70%
DeepSeek's r1: 52%
OpenAI's o1 Pro: 32%
OpenAI's o1: Resolved YES

If a model has a score on the LMArena leaderboard on or before January 31, 2025, its market resolves YES.

Gemini 2.0 (flagship) resolves YES if the model with a score is the one Google DeepMind implies is its best Gemini 2.0 version, whatever it is called.


OpenAI's o1
bought Ṁ9,000 OpenAI's o1 YES

@MP resolves Yes

opened a Ṁ103 OpenAI's o1 NO at 90% order

o1 has a rating now!

bought Ṁ500 Gemini 2 (flagship) YES

Another question: You say the Gemini 2.0 resolves according to the "best" model. Does this include the "Thinking Mode"? So if Gemini 2.0 has an Elo, but a "Gemini 2.0 Thinking Mode" (analogous to "Gemini 2.0 Flash Thinking Mode") has been announced but does not have an Elo yet, will the Gemini 2 question resolve Yes or No?

bought Ṁ350 Gemini 2 (flagship) YES

For r1, would "r1-lite" count? Would "r1-preview" count?

For Gemini, would "Gemini-2.0-Exp" count, or does it have to be "Gemini-2.0" without the "-Exp" marker?

@MP Friendly ping! Would be lovely to get a clarification on the resolution criteria.