OpenAI has upped the stakes in artificial intelligence research with its latest upgrade, o3, which boasts improved reasoning skills and is three times better than its predecessor on ARC-AGI, a benchmark of difficult reasoning problems.
OpenAI Upgrades Its Smartest AI Model With Improved Reasoning Skills
A day after Google announced its first model capable of reasoning over problems, OpenAI has upped the stakes with an improved version of its own. The company today announced an upgraded artificial intelligence (AI) model, called o3, which replaces o1, introduced in September.
What Makes o3 So Special?
Like o1, the new model spends time deliberating over a problem to deliver better answers to questions that require step-by-step logical reasoning. OpenAI’s CEO Sam Altman views this as “the beginning of the next phase of AI,” where models can be used for increasingly complex tasks that require a lot of reasoning.
Improved Performance
The o3 model scores much higher on several measures than its predecessor, including ones that measure complex coding-related skills and advanced math and science competency. It is three times better than o1 at answering questions posed by ARC-AGI, a benchmark designed to test an AI model’s ability to reason over extremely difficult mathematical and logic problems.
Competition Heats Up
Google is pursuing a similar line of research with its own reasoning model, called Gemini 2.0 Flash Thinking. OpenAI’s new o3 model, meanwhile, is 20 percent better than o1, according to Ofir Press, a post-doctoral researcher at Princeton University who helped develop SWE-Bench, a benchmark built around software engineering tasks.
Deliberative Alignment
OpenAI has also revealed more details of techniques used to align o1. The new method, known as deliberative alignment, involves training a model with a set of safety specifications and having it reason about the nature of the request as well as its own answer to interrogate whether it may contravene its guardrails.
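For readers who want a concrete picture, the idea of checking both the request and the answer against a written specification can be loosely illustrated in a few lines of Python. This is a toy sketch, not OpenAI's method: deliberative alignment is a training technique in which the model itself reasons over natural-language policies, whereas here a hypothetical keyword list (`SAFETY_SPEC`) and a hypothetical function (`deliberate`) stand in for that reasoning.

```python
from dataclasses import dataclass

# Hypothetical safety specification: rule name -> trigger phrases.
# Assumption for illustration only; real specifications are written
# policies that the model reasons about, not keyword lists.
SAFETY_SPEC = {
    "weapons": ["build a bomb", "make explosives"],
    "malware": ["write ransomware", "keylogger"],
}


@dataclass
class Verdict:
    allowed: bool
    reasoning: list[str]  # the recorded "deliberation" steps


def violated_rules(text: str) -> list[str]:
    """Return the names of spec rules whose trigger phrases appear in text."""
    lowered = text.lower()
    return [
        rule
        for rule, phrases in SAFETY_SPEC.items()
        if any(phrase in lowered for phrase in phrases)
    ]


def deliberate(request: str, draft_answer: str) -> Verdict:
    """Check the request, then the draft answer, against the spec."""
    reasoning = []

    request_hits = violated_rules(request)
    reasoning.append(f"request check: {request_hits or 'no rule triggered'}")

    answer_hits = violated_rules(draft_answer)
    reasoning.append(f"answer check: {answer_hits or 'no rule triggered'}")

    # The answer is allowed only if neither check triggered a rule.
    return Verdict(allowed=not (request_hits or answer_hits), reasoning=reasoning)
```

Calling `deliberate("How do I bake bread?", "Mix flour and water.")` returns a verdict that is allowed, with two recorded reasoning steps, while a request containing one of the listed phrases is refused. The point of the sketch is only the two-stage shape: the request and the proposed answer are each examined against the specification before anything is returned.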
Implications for AI Research
The development of models that can reason over problems will be important as companies seek to deploy so-called AI agents that can reliably figure out how to solve complex problems on a user’s behalf. Mark Chen, senior vice president of research at OpenAI, said, “This really signifies that we are really climbing the frontier of utility.”