Does OpenAI's Q* 'breakthrough' represent a significant advance in AI capabilities?


Ṁ16k

Jan 1

84% chance


Resolves to the opinion of the AI safety community once information that could resolve this comes out and consensus is reached, based on my judgement, or on the judgement of a moderator resolution council if even a single person disputes it.


8 Comments


This benchmark shows it only marginally improves the score. Sure, it is better, but it also thinks for much longer. Comparing it to traditional benchmarks is also misleading, because it uses multi-step thinking, which could be trivially added to e.g. Claude as well using Auto GPT or similar. It would be interesting to see that comparison.

https://aider.chat/2024/09/12/o1.html

bought Ṁ30 YES

https://www.theinformation.com/articles/openai-races-to-launch-strawberry-reasoning-ai-to-boost-chatbot-business

bought Ṁ500 YES

@jacksonpolack for the purposes of this question, you count o1 as being Q*, right? OpenAI doesn't need to explicitly mention the old name?

I think that maybe the "AI safety community" isn't the best authority on what constitutes a breakthrough in AI capabilities.

Could you clarify what you think should or shouldn't count as a breakthrough?

predicts NO

Hm.

In spirit, the idea is whether it's something worth being interested in or nervous about in terms of AI capabilities. That is, if I'm thinking about AI safety, or the general rate of AI advancement, should I pay any attention to what Q* is? This is obviously pretty fuzzy, but I don't think there's a less soft way to make a market on the topic, considering I don't know much about what the thing is or what it accomplished.

Some potentially useful clarifications :)
- Does it need to have ultimately been related to the firing?
- Some anchors (e.g. would Transformers, GPT-3, GPT-4, RLHF, AlphaGo, AlphaZero, OpenAI Five, etc. be counted as 'significant capabilities advances'?)
- If Q* is a model, does it matter if the underlying approach needs to be subsequently scaled up?

1) No, edited title
2) Transformers, GPT-3 and GPT-4, AlphaGo, AlphaZero, and RLHF would count. No position on OpenAI Five.
3) No

