Will a Mamba-based LLM of GPT 3.5 quality or greater be open sourced in 2024?
Jan 1 · 79% chance

Mamba is a next-generation architecture that aims to address the main shortcomings of transformers: limited context size, attention's quadratic compute cost in sequence length, and an inference memory footprint that grows with every generated token. https://arxiv.org/abs/2312.00752
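
To make that concrete, here is a minimal sketch, assuming a toy diagonal state-space recurrence (not Mamba's actual selective scan), of why SSM inference memory stays flat while a transformer's KV cache grows with every token:

```python
import numpy as np

# Toy diagonal linear state-space model:
#   h_t = a * h_{t-1} + b * x_t,   y_t = c . h_t
# Real Mamba makes a, b, c input-dependent ("selective"); this only
# illustrates the constant-memory recurrence, not the full model.
d_state = 16
rng = np.random.default_rng(0)
a = rng.uniform(0.9, 0.99, d_state)   # per-channel decay
b = rng.normal(size=d_state)          # input projection
c = rng.normal(size=d_state)          # output projection

h = np.zeros(d_state)                 # fixed-size state: O(1) memory in sequence length
for x_t in rng.normal(size=100_000):  # stream a long sequence token by token
    h = a * h + b * x_t               # update the state in place
    y_t = c @ h                       # per-token output; nothing accumulates with t
```

A transformer generating the same 100k tokens would keep keys and values for every previous token in its cache, which is the memory-growth problem the description refers to.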

YES resolution requires the open-sourced Mamba-based LLM to match or beat GPT-3.5 on at least 5 popular benchmarks.
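
As a hedged sketch of how that criterion might be counted (the function and its inputs are hypothetical; real scores would come from published evals):

```python
def resolves_yes(model_scores: dict[str, float],
                 gpt35_scores: dict[str, float],
                 required: int = 5) -> bool:
    """Count benchmarks where the candidate matches or beats GPT-3.5."""
    shared = model_scores.keys() & gpt35_scores.keys()  # only compare common benchmarks
    wins = sum(model_scores[b] >= gpt35_scores[b] for b in shared)
    return wins >= required
```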

Does this count as Mamba-based? It's open and easily above GPT-3.5 quality.

https://www.ai21.com/blog/announcing-jamba-model-family

Here's a Mamba-Transformer-MoE hybrid that's about as good as GPT-3.5: ai21.com/jamba.
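
For anyone curious what "Mamba-Transformer-MoE hybrid" means structurally, a rough sketch of a Jamba-style layer layout; the 1-in-8 attention ratio and alternating MoE layers follow AI21's blog post, but treat the details as assumptions rather than their shipped config:

```python
from dataclasses import dataclass

@dataclass
class LayerSpec:
    mixer: str  # "attention" or "mamba" token mixer
    moe: bool   # mixture-of-experts MLP instead of a dense MLP

def jamba_like_stack(n_layers: int = 32) -> list[LayerSpec]:
    # Mostly Mamba layers, with occasional attention for global lookups
    # and MoE on alternating layers for capacity at fixed active compute.
    return [
        LayerSpec(mixer="attention" if i % 8 == 0 else "mamba",
                  moe=(i % 2 == 1))
        for i in range(n_layers)
    ]
```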

bought Ṁ10 NO

Looks like Gemini 1.5 uses standard transformers rather than Mamba, while still getting around these shortcomings (1M-token context). I expect this will cause interest in Mamba to wane, which lowers the chance that someone bothers training and evaluating a Mamba LLM at GPT-3.5 level.

@adele I remain unconvinced that the transformer architecture will be the long-term winner, given its compute- and memory-hungry nature. These are great improvements, though.