OpenAI recently announced a new family of large language models (LLMs) called “o1,” aimed at tasks in science, technology, engineering, and math (STEM). The announcement surprised many who had been expecting a model named “Strawberry” or even GPT-5.
The o1 family introduces two models: o1-preview and the smaller o1-mini. Both are currently available to ChatGPT Plus users and, through OpenAI’s paid API, to developers, who can test them in applications that call for deep reasoning.
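For developers who want to try the models over the API, the snippet below is a minimal sketch of a chat-completions request to o1-preview using the openai Python SDK. The prompt is purely illustrative, and it assumes an OPENAI_API_KEY environment variable is set.

```python
# Minimal sketch: calling o1-preview through the chat completions API.
# Assumes the openai Python SDK and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# At launch, o1 requests reportedly accept plain user messages; options such
# as system prompts, temperature, and streaming were restricted.
response = client.chat.completions.create(
    model="o1-preview",
    messages=[
        {
            "role": "user",
            "content": "Plan a step-by-step approach to scheduling 12 staff "
                       "across 3 shifts so that no one works back-to-back shifts.",
        }
    ],
)

print(response.choices[0].message.content)
```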
OpenAI describes the o1 models as having advanced reasoning capabilities, with the ability to “try different strategies, recognize mistakes, and engage in a full thinking process,” as explained by Michelle Pokrass, OpenAI’s API Tech Lead.
According to Nikunj Handa, a product lead at OpenAI, the models reportedly perform on par with PhD students on challenging benchmarks and stand out on reasoning-heavy tasks when compared with the GPT series.
The o1 models are currently limited to text inputs and outputs, lacking the multimodal capabilities of GPT-4o, which can process image and file inputs. They also cannot browse the web, so they rely on knowledge up to their training cutoff of October 2023.
Although the models are slower to respond, often taking more than a minute to produce output, developers with early access have reported significant improvements in coding tasks and in drafting complex documents, suggesting the models could be valuable for specific applications despite these limitations.
OpenAI recommends that developers interested in reasoning tasks experiment with the o1 models, especially for complex problems that can tolerate longer response times. However, they caution that for tasks requiring faster responses or multimodal inputs, GPT-4o remains a better choice. Developers are encouraged to test o1-preview and o1-mini on tasks like coding challenges and provide feedback to OpenAI to improve the models.
Pricing for the o1 models is notably higher than for OpenAI’s other models. The main o1-preview model is the most expensive, costing $15 per 1 million input tokens and $60 per 1 million output tokens, compared to GPT-4o’s $5 and $15 respectively. The o1-mini model is more affordable, priced at $3 per 1 million input tokens and $12 per 1 million output tokens. OpenAI plans to adjust pricing over time based on feedback and usage patterns.
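As a rough illustration of how these rates add up, the sketch below prices a hypothetical request of 10,000 input tokens and 5,000 output tokens under each model’s published rates. The token counts are made up for this example, and o1’s hidden reasoning tokens, which are reportedly billed as output tokens, would push its real cost higher.

```python
# Rough cost comparison for a single hypothetical request, using the
# per-million-token rates quoted above. Token counts are illustrative only,
# and o1's hidden reasoning tokens (billed as output) would add to its cost.
RATES = {                      # (input $/1M tokens, output $/1M tokens)
    "o1-preview": (15.00, 60.00),
    "o1-mini":    (3.00, 12.00),
    "gpt-4o":     (5.00, 15.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    in_rate, out_rate = RATES[model]
    return (input_tokens / 1_000_000) * in_rate + (output_tokens / 1_000_000) * out_rate

for model in RATES:
    cost = request_cost(model, input_tokens=10_000, output_tokens=5_000)
    print(f"{model}: ${cost:.4f}")
```

On these illustrative numbers, an o1-preview request costs several times as much as the same request on GPT-4o, which is consistent with OpenAI’s advice to keep latency- and cost-sensitive workloads on the GPT series.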
The o1 models have a 128,000-token context window, comparable to GPT-4o. o1-preview can generate up to 32,768 tokens in a single output, while o1-mini can generate up to 65,536, double that amount.
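Because the hidden reasoning tokens count toward the output budget, output length on o1 requests is reportedly capped with a max_completion_tokens parameter rather than the older max_tokens. The sketch below, again assuming the openai Python SDK, shows how a developer might set that cap and inspect token usage; the prompt and limit are illustrative.

```python
# Sketch: capping output length on an o1 request and checking token usage.
# Assumes the openai Python SDK; o1 reportedly uses max_completion_tokens
# because hidden reasoning tokens also count against the output budget.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-mini",
    messages=[{"role": "user", "content": "Outline a REST API for staff scheduling."}],
    max_completion_tokens=4_000,  # well under o1-mini's larger output ceiling
)

usage = response.usage
print("prompt tokens:", usage.prompt_tokens)
print("completion tokens (includes reasoning):", usage.completion_tokens)
```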
Within 24 hours of release, developers had tested the o1 models on a variety of applications, including generating detailed plans and white papers with citations, optimizing organizational workflows, quickly building apps and games, and even completing request-for-proposal (RFP) documents autonomously. While still early days, the o1 family has already shown it can handle sophisticated reasoning tasks with high accuracy.
Developers can access the new o1 models through OpenAI’s public API, Microsoft Azure OpenAI Service, Azure AI Studio, and GitHub Models. While not suitable for all use cases, the o1 models offer exciting opportunities for developers working on complex, reasoning-driven applications. OpenAI plans to continue enhancing both the o1 family and the GPT series, giving developers ample tools to build new and innovative solutions.