OpenAI begins releasing its next generation of reasoning models with o3-mini

OpenAI released its newest reasoning model, called o3-mini, on Friday. OpenAI says the model delivers more intelligence than OpenAI’s first small reasoning model, o1-mini, while maintaining o1-mini’s low price and speed. The company says o3-mini excels in science, math, and coding problems.

Developers can access o3-mini through an API, and can select between three levels of reasoning intensity. The lowest setting, for example, might be best for less difficult problems where speed of response is a factor. ChatGPT Plus, Team, and Pro users can access OpenAI o3-mini starting today, OpenAI says, while enterprise users will get access in a week.

The announcement comes at the end of a week in which the Chinese company DeepSeek dominated headlines after releasing a pair of surprisingly powerful and cost-effective AI models called DeepSeek-V3 and DeepSeek-R1. The latter, a reasoning model, scored close to, and sometimes above, OpenAI’s o1 in a set of recognized benchmark tests.

“We’re shifting the entire cost‑intelligence curve,” OpenAI researcher Noam Brown said of o3-mini on X. “Model intelligence will continue to go up, and the cost for the same intelligence will continue to go down.” He said o3-mini even outperforms the full-sized o1 model in a number of evaluations.

OpenAI CEO Sam Altman said in December that the o3 series models demonstrate significantly higher levels of intelligence than the o1 models, including in computer coding and problem solving requiring advanced mathematics. The largest version of o3 also achieved the highest score yet of any AI system on a test called ARC-AGI, a logic and reasoning test designed to measure progress toward artificial general intelligence, meaning AI that’s as smart or smarter than humans at most tasks. The o3 model scored 87.5% on the test (humans can score around 85%).

OpenAI originally announced o3, along with a smaller version called o3-mini, in December, but said it would complete its internal safety testing, and get feedback from a group of outside safety and security testors, before launching the models. OpenAI said it would release o3-mini this month, and gave no release timeframe for the larger o3 model.

OpenAI chose not to expose the o1 models’ chain of thought, and the same holds true for o3-mini. Researchers have shown that generating chain-of-thought can sometimes confuse models and pull them off focus. DeepSeek-R1, however, is trained to show its chain of thought, and Google announced in December a new experimental model called Gemini 2.0 Flash Thinking that also shows its “thinking.”

Reasoning models represent a new chapter in developing generative AI models. From 2020-2023 AI labs won almost all of their performance increases by pre-training their models with more data and computing power. That “brute force” approach began to show diminishing returns in 2024, so the AI labs–OpenAI chief among them–began to teach models to do more reasoning (and use more computing power) at inference time just after the user has asked a question or posed a problem. The model might generate multiple streams of tokens at once, then choose which one leads to the best answers. Or it might follow a certain branch of logic then iteratively backtrack after hitting a dead end. The model generates a lot of tokens, which must all be stored in a “context window” while the problem is being solved. This requires a lot of memory and a lot of computing power.

OpenAI’s first try at reasoning models with the o1 series wasn’t perfect. The largest o1 model is very expensive to run and its needs a long time to reach an answer. The o3 models are said to do more reasoning at inference time, but return answers faster using less computing power.

https://www.fastcompany.com/91271011/openai-begins-releasing-its-next-generation-of-reasoning-models-with-o3-mini?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss

Vytvořeno 4mo | 31. 1. 2025 21:20:04

Chcete-li přidat komentář, přihlaste se

Ostatní příspěvky v této skupině

‘We’re on the cusp of more widespread adoption’: Laura Shin on Trump, stablecoins, and the global rise of cryptocurrency

With the first family actively engaged in memecoin ventures, speculation about the future of cryptocurrency has never been hotter. Laura Shin, crypto expert and host of the podcast Unchained

12. 6. 2025 11:10:06 | Fast company - tech

Thanks to AI, the one-person unicorn is closer than you think

When Mike Krieger helped launch Instagram in 2010 as a cofounder, building something as simple as a photo filter took his team wee

12. 6. 2025 11:10:04 | Fast company - tech

Gen Alpha side hustles: How kids are earning big online before they can even drive

If Gen Z is known as the side hustle generation, Gen Alpha may soon take the crown.

A survey of 2,002 U.S. Gen Alpha and Gen Z individuals (ages 12 to 28) by social commerce platform

12. 6. 2025 6:30:03 | Fast company - tech

Gavin Newsom is having his social media moment

“Fuck around” and “find out,” read a TikTok post that followed a screenshot announcing that California is suing President Donald Trump for deploying

11. 6. 2025 23:30:05 | Fast company - tech

‘This was peak technology’: Gen Z is bringing back the BlackBerry

It’s 2009. Everyone is rocking ankle socks. “TikTok,” is just a Ke$ha song. You pull out your BlackBerry Bold 9700 and update your BlackBerry Messenger (BBM) status. All is well.

Before

11. 6. 2025 21:10:05 | Fast company - tech

Space and defense tech firm Voyager raises $382.8 million in IPO

Voyager Technologies raised $382.8 million in its U.S. initial public offering, the space and defense tech company said on Tuesday, amid a

11. 6. 2025 18:50:03 | Fast company - tech

Hinge is teaming up with Esther Perel to rethink dating prompts

Need help sparking conversation on Hinge? Esther Perel has some questions for you.

The renowned ps

11. 6. 2025 14:20:05 | Fast company - tech

Tomas_r2