More specifically: "Will a frontier AI lab release a generally available computer use agent product by April 1st 2025?"
It must be a generally available, promptable system that can click buttons on your computer, your browser, or a virtual/remote machine or browser.
You don't need to be able to watch it interact, but it must use the GUI (doesn't count if it only uses the command line/APIs).
It must be a consumer product, not just an API - so Anthropic's computer use developer preview doesn't count.
It must be generally available, not in beta - e.g. for an OpenAI product, all ChatGPT users must be able to access it (at least in some regions and on some payment plans).
The agent must be able to use the browser and/or machine in a broad and flexible way. Some restrictions are fine. But if, for example, the agent is restricted to only a small number of allowed websites, then that wouldn't count for the purpose of this market.
It must be made by the lab itself, not just built on its model (e.g. Dia browser won't count unless it gets acquired by a frontier lab).
Bloomberg reported on a leak from OpenAI that they plan to release a research preview of an Operator computer use product in January.
"Research prototype" from GDM:
https://deepmind.google/technologies/project-mariner/
Possibly relevant: our agent demo at theaidigest.org/agent can use GPT-4o or the new Claude 3.5 Sonnet. Both are decent but fail fairly frequently.
Copilot Vision rolling out from Microsoft: can it click buttons? https://www.microsoft.com/en-us/microsoft-copilot/blog/2024/12/05/copilot-vision-now-in-preview-a-new-way-to-browse/
@jellyberg I can't actually tell from the announcement, but this page mentions that it can "interact": https://www.microsoft.com/en-us/microsoft-copilot/for-individuals/copilot-labs?form=MO12KM&OCID=MO12KM