Will a language model that runs locally on a consumer cellphone beat GPT4 by EOY 2026?
72% chance

The GPT-4 baseline is the gpt-4-0314 checkpoint.

For the locally run model, we refer to the language model alone, not augmented with search, RAG, or function calling. It must sustain a minimum throughput of 4 tokens/second.

I am not sure what benchmarks people will use in 2026, but let's say LMSYS Arena for the moment. This may change depending on the trend.

Current SOTA:

I am not sure Phi-3 (3.8B) can fit on a phone. If not, the current best candidates are MiniCPM and Gemma 2B.
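Whether a model "fits on a phone" mostly comes down to arithmetic on parameter count and quantization. A rough sketch (the 1.2x overhead multiplier and the 4 GB reserved for the OS are assumptions, not measurements; KV cache and runtime overhead vary by implementation):

```python
def model_ram_gb(params_billion: float, bits_per_weight: int,
                 overhead: float = 1.2) -> float:
    """Approximate RAM needed to hold the weights, in GB.

    overhead: flat multiplier standing in for KV cache and runtime
    allocations (an assumed value, not a measured one).
    """
    bytes_for_weights = params_billion * 1e9 * bits_per_weight / 8
    return bytes_for_weights / 1e9 * overhead


def fits_on_phone(params_billion: float, bits_per_weight: int,
                  phone_ram_gb: float, os_reserved_gb: float = 4.0) -> bool:
    """Check the estimate against RAM left over after the OS."""
    return model_ram_gb(params_billion, bits_per_weight) <= phone_ram_gb - os_reserved_gb


# Phi-3-mini (3.8B) at 4-bit quantization on an 8 GB phone:
print(round(model_ram_gb(3.8, 4), 2))   # ~2.28 GB for weights
print(fits_on_phone(3.8, 4, 8.0))       # True under these assumptions
```

Under these assumptions a 3.8B model quantized to 4 bits is comfortably within reach of an 8 GB phone; the binding constraint is more likely throughput than memory.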


Gemma 2 9B (instruct) is already higher on the LMSYS Arena, though only by 2 points.

Update: oops, sorry, I was thinking the criterion was <10B; I confused this with a different question.

bought Ṁ20 NO

I'm guessing this means any consumer cellphone? E.g., if a model that fits in 32 GB of RAM beats GPT-4 and only one or two phones have that much RAM in 2026 (the current record is 24 GB), this resolves YES.

@JoshYou yes. Any consumer cellphone

bought Ṁ10 NO

Runs at what rate? If it's one token per minute, does it count?

@0482 Let’s say 4 tokens/s
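The 4 tokens/second threshold can be checked with a stopwatch around decoding. A minimal sketch, where `generate` is a hypothetical callable standing in for whatever on-device inference runtime is used (assumed to return the list of generated tokens):

```python
import time


def tokens_per_second(generate, prompt: str, n_tokens: int = 64) -> float:
    """Measure decode throughput: tokens produced per wall-clock second.

    generate: hypothetical callable (prompt, n_tokens) -> list of tokens;
    swap in the real runtime's API when measuring on a device.
    """
    start = time.perf_counter()
    produced = generate(prompt, n_tokens)
    elapsed = time.perf_counter() - start
    return len(produced) / elapsed


def meets_threshold(rate: float, minimum: float = 4.0) -> bool:
    """The market's minimum throughput criterion."""
    return rate >= minimum
```

In practice one would warm up the model first and average over several runs, since the first generation usually pays one-time load and compile costs.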

If it is allowed to browse the internet, then for sure. If we are talking about encoding all the knowledge locally, then probably not.

@Magnus_ Great question...

How should we specify this? I am thinking it could use RAG over everything stored on the phone, but with no internet connection. What do you think?

Currently, a phone can hold up to 512 GB of storage. That is a lot of information, but not the whole internet.

This criterion captures the "local" aspect.

What do you think?

@Magnus_ Another option is to count the language model only, with no local RAG.

@Magnus_ I thought about it again. For a fair comparison, we should hold the local mobile LM to the same standard, since GPT-4 is not using any RAG, search, or tools. I have updated the criterion.