What is true about gpt2-chatbot?

Mini

Ṁ54k

resolved Mar 19

Resolved

YES

Its knowledge cutoff is in or before Jan 2024

Resolved

YES

Available on ChatGPT by June 1st

Resolved

YES

Made by OpenAI

Resolved

YES

We will learn who created it by June 1, 2024

Resolved

YES

Its full form has modalities other than text and vision input. (e.g., audio input, vision output, etc.)

Resolved

YES

It will be mentioned on OpenAI Live Event on Monday May 13

Resolved

YES

>70% on MATH (few-shot)

Resolved

N/A

It will get matched/surpassed by a LLM from another company/organization by EOY 2024

Resolved

N/A

Mixture of Experts

Resolved

N/A

It runs on H200 GPUs

Resolved

N/A

It generates audio in a discrete manner (next token prediction + vqvae) instead of continuous (next patch prediction, diffusion)

Resolved

N/A

It generates images in a similar fashion to diffusion models

Resolved

N/A

The "also-good-chat-model" version is a smaller variant than the "good-chat-model."

Resolved

N/A

> 15% on SWEBench (few shot or zero shot)

Resolved

N/A

Mixture of Depth/Dynamic allocation of computation to each token depending on difficulty

Resolved

N/A

It uses Hierarchy Alignment (https://arxiv.org/abs/2404.13208)

Resolved

N/A

It used RAG/search/function-calls during initial debut on LMSys

Resolved

N/A

A prompt that enables it to solve easy Sudoku puzzles will be found by EOY 2024 (same rule as https://manifold.markets/Mira_/will-a-prompt-that-enables-gpt4-to) (resolve N/A if not released)

Resolved

Sam Altman will tweet about it again before the day of launch (currently twice) https://twitter.com/sama/status/1787222050589028528

Resolved

> 1315 on Lmsys Arena Leaderboard (2 weeks after first appearance) (for the most capable of gpt2, also, and good variants)

I reserve the right to N/A any added options.

Clarification: gpt2-chatbot includes the three variants, i.e. "gpt2-chatbot", "i-am-a-good-gpt2-chatbot", and "i-am-also-a-good-gpt2-chatbot"

Related markets:

What is “gpt2-chatbot”?

What made gpt2-chatbot smarter?

27 Comments

@Sss19971997 @mods There was no news indicating >50% was synthesized, so this answer can resolve No.

@Primer Okay, I found similar results, resolving No.

@traders This has been in the mod queue for 15 days and no other moderators showed up. I tried to look at what is going on but it's all nonsense to me.

This is way too much to expect the mods to just show up and figure out the answer to 20 questions. If you think some should resolve, explain each one concisely with some rationale, and ping the mods again.

@Eliza Maybe best to just N/A everything. Those zombie markets are no use to anyone.

@Primer I think there are probably at least some which can resolve but if traders aren't very interested it seems like a lot will N/A. Let's @mods tag it so someone will notice soon.

@Eliza There's already chris, nikki and you involved in the latest comments. Thought you all were mods?

@Primer Chris has/had tens of thousands of mana on various options, maybe doesn't want to interfere. I'm guessing for nikki and myself we both have no idea what any of this stuff means and/or not enough context to resolve 20+ options. Maybe some other moderator will.

Currently the mod queue seems to grow faster than the current set of moderators can handle.

@Eliza Yeah, I was thinking this market should have enough mod attention already.

As for the mod queue: I can only imagine. I suppose the current approach isn't very sustainable. I've been trying to argue towards more acceptance for N/A, which at least might make some tasks less time-consuming. Also, I keep wondering why Manifold doesn't incentivize resolution criteria: /Primer/will-resolution-criteria-be-mandato-h1il222gib

I got other ideas, if people are interested.

Also, sorry: about 30 of the mod pings in the last few hours are mine. I went through my closed but unresolved trades.

@mods bump

@Sss19971997 now that o1-preview has been released, and is a reasonable match to descriptions of what Q* was supposed to be, can "It is Q star" resolve NO?

Creator is a deleted account, @mods other than me, I'd like to unlock my mana from this market, any chance of a resolution for this option (and perhaps some others)?

It will get matched/surpassed by a LLM from another company/organization by EOY 2024

@Sss19971997 I might resolve this after the Lmsys score of Claude 3.5 Sonnet comes out.

Sonnet 3.5 is statistically significantly worse than 4o. I am not resolving now.

Mixture of Experts

Is there actually any evidence that it is a mixture of experts? Why is this so high?

@benshindel

It explicitly seems like it’s NOT a MoE model

@benshindel The phrase "a single model" is used to show the difference from previous multimodal solutions, where for vision, people assembled a CLIP with an LLM through projection, and for audio, people used three models that go speech-to-text (Whisper), text reasoning (GPT4), and TTS at the output end.

@benshindel You can bet according to your opinion for sure.

Resolution of GPT-5 and Q* answers @Sss19971997?

@chrisjbillington It might make sense to resolve GPT-5 to NO now, but do we really have any idea what Q* is and how it is definitely not GPT4o?

@Sss19971997 If you're resolving stuff, "It is close to GPT-2 (e.g., the same architecture but with different instruction tuning)" should probably also resolve NO.

@MugaSofer I have learned that deferring resolution is always a good idea.

Its knowledge cutoff is in or before Jan 2024

bought Ṁ150 Its knowledge cutoff... YES

The system prompt says it's October 2023. (I just ran a few tests to confirm, and its guesses about events after that point were complete hallucinations.)

@MugaSofer I am debating if I should resolve according to GPT4o or wait till we find out what the other two variants (gpt2-chatbot, i-am-a-good-gpt2-chatbot) are?

@MugaSofer For example, what if any of the two variants have a later knowledge cutoff? (though very unlikely)

@Sss19971997 Obviously it would benefit me to have it resolved immediately.

How would you resolve it if some fit the criterion and some didn't? (For any question.) You did already resolve several options based on things that we only know for sure apply to one of the models.

@MugaSofer Any variant fitting counts.

@Sss19971997 In that case you're safe to resolve YES; at least the main variant's knowledge cutoff "is in or before Jan 2024".

Related questions
