GPT-5 plus scaffolding and inference-compute ~= training compute will achieve capabilities advance >= (GPT-4 to GPT-5).
Ṁ2377 · 2026 · 79% chance

Question written out without the abbreviations for clarity:

GPT-5, if given scaffolding and inference-compute approximately equal to its training compute, will achieve a capabilities advance of magnitude similar to, or greater than, the advance from GPT-4 to GPT-5.

Important! This question is assuming that the capabilities increase from GPT-4 to GPT-5 is at least as large as the increase from GPT-3 to GPT-4. If it is widely agreed that the capabilities increase from GPT-4 to GPT-5 is significantly smaller (e.g. because LLM scaling hits a ceiling), then the question will resolve N/A.
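As a rough illustration of the criterion (a sketch only: it assumes capabilities could be collapsed into a single benchmark score, which is a simplification on my part; actual resolution will be a judgment call, not a formula):

```python
# Minimal sketch of the resolution logic, assuming "capabilities" could be
# summarized by a single benchmark score. All scores are hypothetical
# placeholders, not real measurements.
def resolve(score_gpt3: float, score_gpt4: float,
            score_gpt5: float, score_gpt5_scaffolded: float) -> str:
    # Precondition: the GPT-4 -> GPT-5 advance must be at least as large as
    # the GPT-3 -> GPT-4 advance; otherwise the market resolves N/A.
    if (score_gpt5 - score_gpt4) < (score_gpt4 - score_gpt3):
        return "N/A"
    base_delta = score_gpt5 - score_gpt4                 # GPT-4 -> GPT-5 jump
    scaffold_delta = score_gpt5_scaffolded - score_gpt5  # added by scaffolding + inference compute
    return "YES" if scaffold_delta >= base_delta else "NO"

# Example with made-up numbers: resolve(40, 60, 85, 112) returns "YES".
```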

This question is related to, but different from, my other question here: https://manifold.markets/NathanHelmBurger/will-gpt5-be-capable-of-recursive-s

The discussion in the comments section on that question will give you more insight into my thinking, if that's something you want.


Very difficult to read, but I think it means:

base_delta = "base gpt-5" - "base gpt-4"

scaffold_bonus = "gpt-5 scaffolded" - "base gpt-4"

bool market_outcome = (scaffold_bonus > base_delta)

So I guess this is trying to compare intelligence improvements vs. tool use? (Though a smarter model should recognize when a tool is the more effective option.)

Also, tool use should be fully integrated.

GPT-5 may be natively multimodal and have Python interpreter access and reference-material access at all times during training. I assume that if there is no way to benchmark the model without scaffolding, the market resolves N/A?

@NathanHelmBurger Interesting. Note that if this works as well as the paper claims, you can bake it into the model itself during the RL phase. The scaffolding becomes all internal, with the model weights adjusted to use this tool effectively.

This would express itself as zero improvement from using this method on GPT-5, since the model is already doing something similar.

@GeraldMonroe Yes, I agree that the more powerful way to use this scaffolding is to apply it in the RL phase. I expect that things like this (probably including the ideas in the paper under discussion) will be included in the RL phase for GPT-5, which would mean that this exact scaffolding might not prove to be of much help on top of GPT-5.

Nevertheless, I think that NEW scaffolding will be devised in the future that will be of use on top of GPT-5. Hence my heavy betting on YES.

For instance, an API by which an LLM could run relatively complicated ML experiments and receive nicely formatted data back once the experiment completes. I don't think anyone has published about this yet, but I do expect it will be tried by at least one of the frontier labs.
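To make that concrete, something along these lines is what I have in mind (a sketch only; the names, fields, and functions are illustrative and do not correspond to any real lab's API):

```python
# Illustrative sketch of a hypothetical experiment-running tool an LLM could call.
# Everything here is made up for illustration; it is not a real API.
from dataclasses import dataclass, field

@dataclass
class ExperimentSpec:
    description: str                 # natural-language summary of the experiment
    training_script: str             # code the experiment runner should execute
    gpu_hours_budget: float          # compute cap for the run
    metrics: list = field(default_factory=list)  # metric names to report back

@dataclass
class ExperimentResult:
    status: str                      # "completed", "failed", or "out_of_budget"
    metrics: dict                    # e.g. {"val_loss": 1.23, "accuracy": 0.87}
    logs_excerpt: str                # trimmed logs for the model to read

def submit_experiment(spec: ExperimentSpec) -> str:
    """Queue the experiment on a cluster and return a job id (stub)."""
    raise NotImplementedError

def fetch_result(job_id: str) -> ExperimentResult:
    """Return nicely formatted results once the run has finished (stub)."""
    raise NotImplementedError
```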

predicts NO

Can you somehow put the conditional in the title? I missed it at first because I didn't read the full description.

predicts YES

@JacobJacob Sorry, I already ran into the max question length as is. Hopefully careful interactors will read the description or see your comment here.