What years will AlphaGeometry win IMO bronze (or higher)?
2030
2025: 11%
2026: 11%
2027: 10%
2028: 12%
2029: 13%
2024: Resolved NO

A year resolves YES if a program that is called AlphaGeometry in official Google DeepMind communication gets enough points to win at least bronze at the International Mathematical Olympiad of that year. The years are resolved completely independently of each other.

Criteria:

  • The program does not have to be the exact same code as that of the paper published on Jan 17, 2024.

  • The program has to be called AlphaGeometry by Google DeepMind. If a program is called AlphaGeometry 2.0, that is not sufficient.

  • If the program is called AlphaGeometry but also something else (to distinguish it from the original version), that is fine (cf. AlphaGo Fan and AlphaGo Lee).

  • AlphaGeometry has to be actually run on the problems of an IMO and receive enough points for a bronze medal. If nobody publicly announces that AlphaGeometry successfully ran on the IMO problems of a particular year, that year resolves NO.

I will not bet on this market.


@FlorisvanDoorn I am confused about what names are allowed. You say in one bullet point

If a program is called AlphaGeometry 2.0, that is not sufficient.

But then you immediately say

If the program is called AlphaGeometry but also something else (to distinguish it from the original version), that is fine

What is "2.0" if not "something else to distinguish from the original version"?

The blog post never calls this program AlphaGeometry, but consistently AlphaGeometry 2, so they clearly want to emphasize that this is a different program. This differs from the situation with AlphaGo, where the versions playing against Fan Hui and Lee Sedol were very different, but DeepMind called both of them AlphaGo. Therefore, AlphaGeometry 2 does not count for this market.

I added this condition as a proxy for "didn't increase capability by too much".

About AlphaGeometry ceasing to exist: it is open source, and a credible claim in these comments explaining that the code successfully got bronze would be sufficient for a YES resolution on this market.

Ok thanks, I had forgotten it was open sourced.

bought Ṁ200 2024 NO

Barring everything else I think 2024 should resolve "NO". Looks like the bronze cutoff was around 17, and there was only one geo problem, making this impossible.

And I am very confused about what names count per the resolution criteria, but "AlphaGeometry 2" sounds a lot like "AlphaGeometry 2.0" which is explicitly not allowed, so perhaps we should be betting things down on the basis that AlphaGeometry versions will be numbered going forwards and software packages that meet the criteria will cease to exist?

bought Ṁ10 2029 YES

alphageometry solves like 1 or 2 of the 6 problems, and the other AI solves the rest right? idk why we would expect alphageometry to solve the non-geometry problems

bought Ṁ30 2027 NO

Man, reading the criteria I am suddenly super confused: "AlphaGeometry 2" is not allowed but "AlphaGeometry Fan" would be? Why draw that distinction?

Perhaps I'm misunderstanding something here but:

  • A system so specifically named seems unlikely to be able to solve problems that are not geometry problems

  • Typically there are only one or two geo problems on an IMO, making for a maximum of 7 or 14 points if it gets the problems right.

  • Bronze cutoffs are typically around 14 and always higher than 7.

So this seems pretty unlikely across the board, even if future versions of the system are named the same and do always get the geo problems right.
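The arithmetic behind the bullets above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, the 7 points per problem is the standard IMO scoring, and the cutoffs of 14 and 17 are the rough figures quoted in these comments, not official data. It also assumes the system scores nothing on non-geometry problems.

```python
POINTS_PER_PROBLEM = 7  # each IMO problem is scored out of 7


def max_geometry_score(num_geo_problems: int) -> int:
    """Best possible score if only the geometry problems are solved, all fully."""
    return num_geo_problems * POINTS_PER_PROBLEM


def can_reach_bronze(num_geo_problems: int, bronze_cutoff: int) -> bool:
    """True if perfect geometry scores alone meet the (assumed) bronze cutoff."""
    return max_geometry_score(num_geo_problems) >= bronze_cutoff


if __name__ == "__main__":
    # Cutoffs are the approximate values mentioned in this thread (assumptions).
    for num_geo in (1, 2):
        for cutoff in (14, 17):
            verdict = "possible" if can_reach_bronze(num_geo, cutoff) else "impossible"
            print(f"{num_geo} geo problem(s), cutoff {cutoff}: "
                  f"max {max_geometry_score(num_geo)} points, bronze {verdict}")
```

Under these assumptions, bronze is only reachable in a year with two geometry problems and a cutoff of 14 or lower, and only if both are solved for full marks.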

@BoltonBailey Lest people be confused by the impressive-looking chart at the top of their blog post, I am pretty sure that chart is only saying they have performance equivalent to a silver-medalist just on the geo problems. On the non-geo problems, it can't perform as well because it doesn't know how to solve those problems.

@BoltonBailey

About 15% total with no changes to the software.

Possibly even less likely in future years, since I think there is maybe more of a chance that it will get folded into a new software package with a new name and no one will run the old one.

@BoltonBailey I agree with your analysis.

The 5/6 success rate might be an overestimate, since the current version of AlphaGeometry doesn't even attempt geometric inequalities.