๐Ÿ• Will A.I. Achieve Significantly Higher Performance Over "General Conceptual Skills" by end of 2024?
Mini
8
Ṁ180
Jan 1
30% chance

Continuation of:

https://manifold.markets/PatrickDelaney/will-ai-achieve-significantly-highe

I reserve the right to change the metrics if those used in the market above have grown stale. I aim to finalize this by the end of January 2024.

predicts YES

The currently available flagship models (PaLM 2, GPT-4, and Gemini Pro) have not yet been evaluated on BIG-Bench. As far as I can tell, the largest model evaluated is the original PaLM, not PaLM 2. Additionally, it is GPT-3, not GPT-4V, that is being evaluated. You can verify this in their published paper.

This is because OpenAI stated in the GPT-4 technical report that they did not evaluate on BIG-Bench because "portions of BIG-Bench were inadvertently mixed into the training set..." (pg. 6).

Given that the question is trying to gauge whether AI has advanced significantly this year with respect to "general conceptual skills", I would argue we need a new metric that includes the current state-of-the-art models.

I don't think you can fairly resolve this market by carrying over the old metric of achieving a 60 on BIG-Bench Lite to another test. I propose resolving this market N/A and remaking it with the Massive Multitask Language Understanding (MMLU) benchmark.