ARC-AGI Prize 2024 (Dwarkesh Podcast) - Which of these scores will be achieved during the 2024 competition?

Ṁ140k

resolved Dec 4

≥50%: Resolved YES

≥70%: Resolved

≥85%: Resolved

Dwarkesh Patel's podcast on June 11, 2024 had guests Francois Chollet and Mike Knoop launching the $1M ARC-AGI Prize.

https://www.dwarkeshpatel.com/p/francois-chollet

https://arcprize.org/

This market resolves independently as "YES" or "NO" for each of the three thresholds based on whether they are achieved during the 2024 competition.
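As a rough illustration of the independent-threshold rule above, here is a minimal Python sketch. The function name `resolve_thresholds` and the idea of passing in a single "best valid score" are assumptions made for illustration; the actual resolution depends on the contest rules and the sponsors' judgement, not on this code.

```python
from typing import Dict, Iterable

# The three thresholds listed in this market, in percent.
THRESHOLDS = (50.0, 70.0, 85.0)


def resolve_thresholds(
    best_valid_score: float,
    thresholds: Iterable[float] = THRESHOLDS,
) -> Dict[str, str]:
    """Resolve each threshold independently as YES or NO.

    best_valid_score is a hypothetical input: the highest score (in percent)
    judged valid during the competition.
    """
    return {
        f">={t:g}%": "YES" if best_valid_score >= t else "NO"
        for t in thresholds
    }


if __name__ == "__main__":
    # Example input only; not an official result.
    print(resolve_thresholds(53.0))
```

Each threshold is checked on its own, so one answer resolving YES says nothing about the others beyond the ordering of the thresholds themselves.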

ARC Prize measures AGI progress using the ARC-AGI private evaluation set; scores are posted on the leaderboard. Validity of scores will be based on the contest rules and the judgement of the ARC Prize sponsors.

https://www.kaggle.com/competitions/arc-prize-2024/rules

#AI

#Technical AI Timelines

#AI Benchmarks

#AGI

#ARC-AGI

15 Comments

Why was this resolved? It's still 2 days before they announce the winners and score. Was it announced somewhere else and I missed it?

53% was just achieved

https://www.kaggle.com/competitions/arc-prize-2024/leaderboard

49.5% was just achieved

https://www.kaggle.com/competitions/arc-prize-2024/leaderboard

49% was just achieved

https://www.kaggle.com/competitions/arc-prize-2024/leaderboard

48% was just achieved

I’m surprised “>50%” is only at 64% right now. Someone’s already managed to get 50% on ARC-AGI with GPT-4o using specific prompts, and it probably won’t be that hard to replicate.

The result on the leaderboard is 42%/43%

Yea because the admins haven’t managed to replicate it yet, but they’ll probably hear the news and replicate it soon

Sorry, are you talking about https://www.lesswrong.com/posts/Rdwui3wHxCeKb7feK/getting-50-sota-on-arc-agi-with-gpt-4o or something else? The former is the first entry on ARC-AGI-Pub at https://arcprize.org/leaderboard

There are a bunch of limitations for the prize, one of which is that you can't use a closed-source model. Now with Llama 3.1 405B the chance is probably higher, but Kaggle's compute is limited, and unless the rules change it won't, afaik, be enough to run this model.

bought Ṁ500 ≥70% YES

https://redwoodresearch.substack.com/p/getting-50-sota-on-arc-agi-with-gpt
ineligible for the prize though

Neat benchmark.

IMO not that hard to solve: just generate synthetic data for it and train on that, like ChessGPT or anything similar.

They wouldn't count that.
