DeepSeek has called into question Big AI’s trillion-dollar assumption

Recently, Chinese startup DeepSeek created state-of-the-art AI models using far less computing power and capital than anyone thought possible. It then showed its work in published research papers and by allowing its models to explain the reasoning process that led to this answer or that. It also scored at or near the top in a range of benchmark tests, besting OpenAI models in several skill areas. The surprising work seems to have let some of the air out of the AI industry’s main assumption—that the best way to make models smarter is to give them more computing power, so that the AI lab with the most Nvidia chips will have the best models and the shortest route to artificial general intelligence (AGI—AI that’s better than humans at most tasks). 

No wonder some Nvidia investors are questioning their faith in the unlimited demand for the most powerful AI chips in the future. And no wonder some in AI circles are questioning the world view and business strategy of OpenAI CEO Sam Altman, the biggest evangelist for the “brute force” approach to ever-smarter models. 

“The assumption behind all this investment is theoretical . . . the so-called scaling laws where when you double compute, the quality of your models increases in kind of the same way—it’s kind of a new Moore’s Law,” says Abhishek Nagaraj, a professor at the University of California–Berkeley’s Haas School of Business. (Moore’s Law observed that the number of transistors on a microchip doubles roughly every two years, letting developers count on predictably more powerful hardware.) 
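The scaling-law bet Nagaraj describes can be made concrete with a toy power law: loss falls as compute grows, and every doubling of compute shrinks the reducible loss by the same fixed factor. The sketch below illustrates the shape of that claim; the constants are invented for illustration, not fitted to any real model.

```python
# Toy illustration of a compute scaling law: loss(C) = E + A / C**alpha.
# E, A, and alpha here are invented for illustration only; real values
# must be fitted empirically to training runs.

def loss(compute: float, E: float = 1.7, A: float = 100.0, alpha: float = 0.3) -> float:
    """Irreducible loss E plus a power-law term that shrinks with compute."""
    return E + A / compute**alpha

# Every doubling of compute shrinks the reducible part of the loss
# (loss minus E) by the same factor, 2**alpha -- that is the
# "predictable improvement" the scaling laws promise.
c = 1e6
ratio_1 = (loss(c) - 1.7) / (loss(2 * c) - 1.7)
ratio_2 = (loss(4 * c) - 1.7) / (loss(8 * c) - 1.7)
print(round(ratio_1, 4), round(ratio_2, 4))  # both ≈ 1.2311, i.e. 2**0.3
```

Under this assumption, whoever can afford the most doublings of compute gets the best model, which is the logic behind the infrastructure race described below.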

“And so if that holds, it effectively means that whoever controls the infrastructure will control a lot of the market,” adds Nagaraj. That’s why companies like OpenAI, Anthropic, and X are building data centers as fast as they can. OpenAI CEO Sam Altman last year said he needs to raise $7 trillion to build the data centers needed to reach AGI. OpenAI, Microsoft, SoftBank, and Oracle said recently they’ll spend up to $500 billion over the next five years to build new data centers for AI in Texas. 

Attracting the money to do that, however, is something only “closed-source” companies like OpenAI can do, Nagaraj points out. OpenAI’s private equity backers (such as Andreessen Horowitz) and big tech backers (such as Microsoft) are willing to bankroll the AI infrastructure (chips, software, data centers, electricity) that OpenAI says it needs only so long as the company keeps the recipes of its models secret. That secrecy is the “moat” around their investment, after all. Establishing such a moat was the main reason OpenAI stopped being an “open” AI company back in 2019. 

DeepSeek shares the weights of its models (the numerical parameters learned at each connection in their neural networks) and allows any developer to build with them. After essentially giving away its research and eschewing a moat, DeepSeek was never going to attract the private equity funding needed to bankroll hundreds of thousands of Nvidia chips. Adding to its challenge were the U.S. chip bans that reserved the most powerful AI chips for U.S. companies. So DeepSeek found ways to build state-of-the-art models using far less computing power. In doing so, it appears to have collapsed Altman’s assumption that massive computing power is the only route to AGI.

Not everybody thinks so, of course. Particularly in OpenAI circles. “I would never bet against compute as the upper bound for achievable intelligence in the long run,” says Andrej Karpathy, one of the original founders of OpenAI, in an X post. “Not just for an individual final training run, but also for the entire innovation/experimentation engine that silently underlies all the algorithmic innovations.”

Altman, too, seemed undeterred. “We will obviously deliver much better models and also it’s legit invigorating to have a new competitor! We will pull up some releases . . . ,” he posted breezily on X. “But mostly we are excited to continue to execute on our research roadmap and believe more compute is more important now than ever before to succeed at our mission.” OpenAI’s “mission” is AGI. 

Lots of powerful chips will still be needed, Altman added, if only because general demand for AI services is going to grow exponentially. More data centers will be required just to answer calls from the millions of AI-infused apps built on OpenAI’s APIs.

Some have suggested that DeepSeek’s discovery of ways to build more compute-efficient advanced AI models could lower the barrier to entry and allow far more developers to build such models of their own, thereby pushing up demand for AI chips.

For example, DeepSeek’s most recent model, DeepSeek-R1, provided the open-source world with a reasoning model that appears comparable to OpenAI’s state-of-the-art o1 series, which applies more computing power at inference time, when the model is reasoning through various routes to a good answer. In a statement Monday, Nvidia gave DeepSeek props for creating reasoning models using “widely available” Nvidia GPUs, and added that such models require “significant numbers” of those GPUs as well as fast chip-to-chip networking technology. 
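Neither OpenAI nor DeepSeek has published its full inference recipe, but one well-known way to turn extra inference-time compute into better answers is self-consistency: sample several independent reasoning chains and take a majority vote on the final answer. The sketch below uses an invented deterministic stub in place of a real model, purely to show the mechanism.

```python
from collections import Counter

def sample_answer(i: int) -> str:
    """Stand-in for one sampled reasoning chain from a model.

    A real model would be sampled with nonzero temperature; this
    deterministic stub simply answers correctly two times out of three.
    """
    return "42" if i % 3 != 2 else "41"

def majority_vote(n_samples: int) -> str:
    """Spend more inference-time compute (more sampled chains),
    then return the most common final answer."""
    votes = Counter(sample_answer(i) for i in range(n_samples))
    return votes.most_common(1)[0][0]

# Any single chain can be wrong (chain 2 answers "41"), but the vote
# across many chains settles on the majority answer, "42".
print(sample_answer(2), majority_vote(9))
```

This is why inference-time reasoning is GPU-hungry even after training is done: each query may fan out into many sampled chains, consistent with Nvidia’s note that such models need “significant numbers” of GPUs.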

The latest DeepSeek models have only been available to developers for a short time. Just like when Meta introduced its open-source Llama models, it will take some time to understand the real economics of building new models and apps based on the DeepSeek models. It’s possible that more widely distributing the ability to build cutting-edge models could put more brains to work on finding novel routes to AGI and, later, superintelligence. That’s the good news. The bad news may be that powerful models, and the means to build them, will become more available to people who might use them maliciously, or who may not be fastidious about using accepted safety guardrails. 

But DeepSeek is not perfect. The DeepSeek chatbot has in anecdotal cases emphatically misidentified itself as the creation of OpenAI or Microsoft. Nor can the chatbot speak freely on all subjects. “Like all Chinese AI companies, DeepSeek operates within the People’s Republic of China’s regulatory framework, which includes restrictions on how language models handle politically sensitive topics,” says David Bader, a professor at the New Jersey Institute of Technology. “These constraints are evident in how their models respond to queries about historical events and government policies.” If you ask the chatbot about the Tiananmen Square protests, for example, it responds with, “Let’s talk about something else.”  

https://www.fastcompany.com/91268664/deepseek-called-into-question-big-ai-trillion-dollar-assumption-openai?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss

Created 5mo | 29. 1. 2025 2:10:04

