Did OpenAI use μP for zero-shot hyperparameter transfer in GPT-4?
81% chance · Ṁ169 · Dec 31
Maximal Update Parametrization (μP) is a technique published last year by Yang et al. at Microsoft for tuning hyperparameters on a small model and transferring them zero-shot to a large one: https://arxiv.org/abs/2203.03466
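For readers unfamiliar with the technique, here is a rough PyTorch sketch of the flavor of μP's rules for Adam: hidden and readout weights are initialized with variance 1/fan_in, readout logits are divided by the width multiplier, and matrix-like weights get their learning rate scaled down by that multiplier. This is a simplified illustration, not the paper's full prescription (see Table 3 there) and certainly not anything OpenAI has confirmed using; `BASE_WIDTH`, `MuMLP`, and the layer names are made up for this sketch. Microsoft also ships a `mup` package that applies the real rules automatically.

```python
# Rough sketch of muP-style scaling for an MLP trained with Adam.
# Illustrative only -- see Table 3 of Yang et al. (arXiv:2203.03466)
# or Microsoft's `mup` package for the authoritative rules.
import math
import torch
import torch.nn as nn

BASE_WIDTH = 128  # width at which hyperparameters were tuned (assumed)

class MuMLP(nn.Module):
    def __init__(self, d_in: int, width: int, d_out: int):
        super().__init__()
        self.mult = width / BASE_WIDTH          # width multiplier m
        self.inp = nn.Linear(d_in, width)       # input layer: base lr, base init
        self.hidden = nn.Linear(width, width)   # hidden layer
        self.out = nn.Linear(width, d_out)      # readout layer
        # Hidden/readout weights: init std ~ 1/sqrt(fan_in), as in muP.
        nn.init.normal_(self.hidden.weight, std=1 / math.sqrt(width))
        nn.init.normal_(self.out.weight, std=1 / math.sqrt(width))

    def forward(self, x):
        h = torch.relu(self.inp(x))
        h = torch.relu(self.hidden(h))
        # Readout scaled by 1/m so logits stay O(1) as width grows.
        return self.out(h) / self.mult

def mu_adam_param_groups(model: MuMLP, lr: float):
    """Per-parameter Adam learning rates in the spirit of muP:
    matrix-like hidden/readout weights get lr / m; input weights
    and biases keep the base lr."""
    m = model.mult
    scaled = {"hidden.weight", "out.weight"}
    return [
        {"params": [model.hidden.weight, model.out.weight], "lr": lr / m},
        {"params": [p for n, p in model.named_parameters() if n not in scaled],
         "lr": lr},
    ]

# The same base lr (tuned at BASE_WIDTH) is reused at any width.
model = MuMLP(d_in=32, width=512, d_out=10)
opt = torch.optim.Adam(mu_adam_param_groups(model, lr=1e-3))
```

The point of the question is whether GPT-4's hyperparameters were transferred this way from smaller proxy models rather than tuned at full scale.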
@firstuserhere interesting that it is in the bibliography, although the reference in the first image is from a different section of the report with its own bibliography (that [16] actually refers to "DALL·E 2 Preview - Risks and Limitations.").
So the muP paper is in the bibliography, but not referenced anywhere.
@Stefan yep, and even then it's not clear μP was actually used in GPT-4 itself; the report only mentions that the red team used the paper?
Related questions
Will OpenAI's autonomous agent be based on GPT-4?
19% chance
Will OpenAI release GPT-4.5 before GPT-5?
58% chance
Will OpenAI change their naming scheme (GPT-X) with the successor to GPT-4? (Ṁ200 subsidy!)
14% chance
Will OpenAI's next major LLM (after GPT-4) surpass 70% accuracy on the GPQA benchmark?
66% chance
Will OpenAI abandon discrete GPT releases in favor of continuous updates?
44% chance
Has OpenAI intentionally made ChatGPT lazy to save inference costs?
21% chance
Will the next LLM released by OpenAI be worse than GPT-4 at MMLU?
16% chance
Will there be evidence in 2025 that in April 2023, OpenAI had a GPT-4.5 or higher model?
16% chance
Did OpenAI intentionally handicap GPT-4's image modality's ability to identify people?
83% chance
Did OpenAI transcribe YouTube videos to train a GPT model, as claimed by the NYT?
89% chance