Will Llama-4 be (open sourced and) as good as GPT-4?
87% chance

This will be based on whatever Meta calls Llama-4, whether or not it deserves that name. If Meta renames its next larger LLM so that the name does not include 'llama', I will use my best judgment on whether it counts. If Meta does not release a relevant model by EOY 2025, this resolves to NO. If the model is not open sourced, it does not count.

By default I will judge based on the leaderboard here: https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard

Clarification: This will compare to GPT-4 versions that existed at market creation. At this point, this is 99% a market on whether Llama-4 will exist and be an open model; I would be super surprised if it wasn't good enough on Arena.

If the result is close, I will resolve once the model has been on the leaderboard for 7 days, to allow ratings to settle. If the resolution is obvious in either direction for any reason, I will resolve on that basis. If I feel the leaderboard is clearly wrong, or it is not available at the time and the answer is non-obvious, I will consult experts and/or use a Twitter poll.
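The resolution rules above can be sketched roughly in code. This is only an illustration, not the creator's actual procedure: the function name `resolve_market`, its parameters, and the `"JUDGMENT"` outcome are hypothetical, and which GPT-4 rating serves as the baseline is left to the caller. Treating a tie as good enough follows the creator's comment in the thread.

```python
from datetime import date
from typing import Optional

# "EOY 2025" from the market description.
DEADLINE = date(2025, 12, 31)


def resolve_market(
    release_date: Optional[date],
    weights_available: bool,
    llama4_elo: Optional[float],
    gpt4_baseline_elo: float,
) -> str:
    """Hypothetical sketch of the stated resolution rules.

    gpt4_baseline_elo is an assumption: the Arena rating of whichever
    GPT-4 version (existing at market creation) is used for comparison.
    """
    # No relevant model released by the deadline resolves NO.
    if release_date is None or release_date > DEADLINE:
        return "NO"
    # A model that is not open (weights not obtainable) does not count.
    if not weights_available:
        return "NO"
    # Leaderboard missing or distrusted: falls back to experts or a poll.
    if llama4_elo is None:
        return "JUDGMENT"
    # "As good" only requires tying the Elo number.
    return "YES" if llama4_elo >= gpt4_baseline_elo else "NO"
```

For example, `resolve_market(date(2025, 4, 5), True, 1250.0, 1250.0)` returns `"YES"` under these assumptions, since a tie counts; the ratings here are made-up numbers, not leaderboard values.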


Currently, there are multiple GPT-4s being ranked by Elo in the arena. Which are we comparing to Llama 4?

• ChatGPT-4o-latest (2024-09-03)
• GPT-4o-2024-05-13
• GPT-4o-mini-2024-07-18

[1] Are future GPT-4 models included in the comparison or just one of the existing ones being ranked?

[2] Will you compare the highest GPT-4 Elo against the highest Llama Elo, or lowest against lowest, or the lowest GPT-4 against the highest Llama 4?

[3] Please specify. Also, is there a tie-breaker in the rare case that the models are tied in Elo?

Thank you, and please add these instructions to the market to clear any confusion.

@nixtoshi I have made this very clear now.

(And not that it is going to happen, but 'as good' means it only has to tie the Elo number)

How is this only 55? Llama 3 405B should be GPT-4 level, so Llama 4 should obviously be much better.

Because it has to be better AND open-source.

Which GPT-4? The GPT-4 that's serving now is miles ahead on all of the benchmarks compared to what was originally released.

@jonsimon See note below.

Note the dispute in the Llama-3 market. I will use whatever is decided there here, as well. Which means that this is now effectively 'Will Llama-4 be open sourced?'

Do you consider the current Llama 2 to be "open sourced" even though it contains a non-commercial clause?

@AdamTreat https://github.com/facebookresearch/llama/blob/main/LICENSE

"2. Additional Commercial Terms. If, on the Llama 2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee's affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights."

@AdamTreat Yes. If I can get the weights it counts.

@ZviMowshowitz ok, then the question is do you have (or have any expectation of having) greater than 700 million monthly active users ;)