if this market resolves YES following the first run of the model against the benchmark MMLU-Pro. They need to get within 7% of the announced benchmark result in the DeepSeek v3 paper, as per the market description (if that changes or I described it wrong, the precise criterion is that the linked market resolves YES after the first benchmark run).
Update 2025-04-01 (PST) (AI summary of creator comment):
- Resolution Conditions:
  - If the other market resolves NO, this market resolves NO.
  - If the other market resolves YES, this market resolves YES only if the evaluation was completed and the market was resolved YES on the first run.
- Evaluation Timing:
  - The evaluation must be completed before January 5 to resolve the main market.
  - Evaluations completed after January 5 will not be sufficient to resolve this market.
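The resolution conditions summarized above can be read as a small decision rule. The following is only a sketch of that reading, not an official resolution procedure; the function name `resolve_first_run_market` and its parameters are hypothetical labels introduced here for illustration.

```python
def resolve_first_run_market(
    main_market_yes: bool,
    first_run_completed: bool,
    first_run_resolved_main_yes: bool,
) -> str:
    """Sketch of the stated rule for this market (all names are hypothetical).

    main_market_yes: whether the other (main) market resolved YES.
    first_run_completed: whether the first evaluation run finished.
    first_run_resolved_main_yes: whether that first run is the one that
        resolved the main market YES.
    """
    # If the other market resolves NO, this market resolves NO.
    if not main_market_yes:
        return "NO"
    # If the other market resolves YES, this one resolves YES only if the
    # first run completed and was what resolved the main market YES.
    if first_run_completed and first_run_resolved_main_yes:
        return "YES"
    return "NO"
```

For instance, if the main market resolves YES but only on a second evaluation run, this reading gives NO for this market.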
@summer_of_bliss If the other market resolves NO, this one definitely resolves NO. If the other one resolves YES, this one resolves YES iff the eval was completed and resolved the market YES the first time it was run. So it depends on whether that eval completed after Jan 5 is enough to resolve the main market or not (I would guess it would not be enough?)
@Bayesian ah yep. so this resolves no and the main market yes only if they do at least 2 evals before close
(i mean there’s some chance they get a >400x speed up in the next 6 hours so that they can run the eval exactly twice in the remaining 24 hours right? gotta hedge that)