Did OpenAI use μP for zero-shot hyperparameter transfer in GPT-4?
81% chance · Ṁ169 · Dec 31
Maximal Update Parametrization (μP) is a technique published last year by Yang et al. at Microsoft for tuning hyperparameters on a small model and transferring them zero-shot to a large one: https://arxiv.org/abs/2203.03466
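For readers unfamiliar with the technique, here is a rough PyTorch sketch of the flavor of μP's rules for Adam: hidden and readout weights are initialized with variance 1/fan_in, readout logits are divided by the width multiplier, and matrix-like weights get their learning rate scaled down by that multiplier. This is a simplified illustration, not the paper's full prescription (see Table 3 there) and certainly not anything OpenAI has confirmed using; `BASE_WIDTH`, `MuMLP`, and the layer names are made up for this sketch. Microsoft also ships a `mup` package that applies the real rules automatically.

```python
# Rough sketch of muP-style scaling for an MLP trained with Adam.
# Illustrative only -- see Table 3 of Yang et al. (arXiv:2203.03466)
# or Microsoft's `mup` package for the authoritative rules.
import math
import torch
import torch.nn as nn

BASE_WIDTH = 128  # width at which hyperparameters were tuned (assumed)

class MuMLP(nn.Module):
    def __init__(self, d_in: int, width: int, d_out: int):
        super().__init__()
        self.mult = width / BASE_WIDTH          # width multiplier m
        self.inp = nn.Linear(d_in, width)       # input layer: base lr, base init
        self.hidden = nn.Linear(width, width)   # hidden layer
        self.out = nn.Linear(width, d_out)      # readout layer
        # Hidden/readout weights: init std ~ 1/sqrt(fan_in), as in muP.
        nn.init.normal_(self.hidden.weight, std=1 / math.sqrt(width))
        nn.init.normal_(self.out.weight, std=1 / math.sqrt(width))

    def forward(self, x):
        h = torch.relu(self.inp(x))
        h = torch.relu(self.hidden(h))
        # Readout scaled by 1/m so logits stay O(1) as width grows.
        return self.out(h) / self.mult

def mu_adam_param_groups(model: MuMLP, lr: float):
    """Per-parameter Adam learning rates in the spirit of muP:
    matrix-like hidden/readout weights get lr / m; input weights
    and biases keep the base lr."""
    m = model.mult
    scaled = {"hidden.weight", "out.weight"}
    return [
        {"params": [model.hidden.weight, model.out.weight], "lr": lr / m},
        {"params": [p for n, p in model.named_parameters() if n not in scaled],
         "lr": lr},
    ]

# The same base lr (tuned at BASE_WIDTH) is reused at any width.
model = MuMLP(d_in=32, width=512, d_out=10)
opt = torch.optim.Adam(mu_adam_param_groups(model, lr=1e-3))
```

The point of the question is whether GPT-4's hyperparameters were transferred this way from smaller proxy models rather than tuned at full scale.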
@firstuserhere interesting that it is in the bibliography, although the reference in the first image is from a different section of the report with its own bibliography (that [16] actually refers to "DALL·E 2 Preview - Risks and Limitations.").
So the muP paper is in the bibliography, but not referenced anywhere.
@Stefan yep, and even then it's not clear μP was actually used in GPT-4 itself; the report only mentions that the red team used the paper?
Related questions
Will OpenAI's autonomous agent be based on GPT-4?
19% chance
Will OpenAI release GPT-4.5 before GPT-5?
58% chance
Will OpenAI change their naming scheme (GPT-X) with the successor to GPT-4? (Ṁ200 subsidy!)
14% chance
Will OpenAI's next major LLM (after GPT-4) surpass 70% accuracy on the GPQA benchmark?
66% chance
Will OpenAI abandon discrete GPT releases in favor of continuous updates?
44% chance
Has OpenAI intentionally made ChatGPT lazy to save inference costs?
21% chance
Will the next LLM released by OpenAI be worse than GPT-4 at MMLU?
16% chance
Will there be evidence in 2025 that in April 2023, OpenAI had a GPT-4.5 or higher model?
16% chance
Did OpenAI intentionally handicap GPT-4's image modality's ability to identify people?
83% chance
Did OpenAI transcribe YouTube videos to train a GPT model, as claimed by the NYT?
89% chance