Why we may be headed for a generative AI winter

Welcome to AI Decoded, Fast Company’s weekly newsletter that breaks down the most important news in the world of AI. You can sign up to receive this newsletter every week here.

The generative AI winter, part two

The threat of a new “AI winter” may dominate the AI conversation in the latter half of 2024.

AI companies and their investors have been telling us for more than a year now that generative AI will create untold amounts of wealth by increasing worker productivity. There’s no doubt that generative AI is already increasing productivity in some areas, such as graphic design and legal research work. But there’s little evidence that the technology is broadly unleashing enough new productivity to push up company earnings or lift stock prices.

Many large companies dove into building and piloting generative AI infrastructure and apps last year. Many of them found the undertaking complex and time-consuming, and many are still struggling to find the productivity returns on their investments. “You see very few people out there today saying, ‘Yes, we use this everyday, and it’s absolutely essential to what we do,’” said AI expert and ex-NYU professor Gary Marcus during a recent interview with The Agenda.

Have the companies leading the AI boom overhyped the technology? OpenAI’s Sam Altman, the face of the AI boom, said earlier this year that AI models that can do most revenue-generating tasks better than humans will arrive in the “reasonably close-ish” future. He suggested, in front of Congress, that AI systems will become so powerful as to threaten humanity itself, and that regulators might consider requiring AI companies to get a license from the government to develop very large models. But artificial general intelligence (that is, a point where AI systems can learn to accomplish any intellectual task that human beings can perform) still seems far away right now.

Marcus says the big leaps forward in AI model performance are coming less frequently. “Everybody got super excited last year, but we are running out of improvements; at least for a little while, things are slowing down,” Marcus said. In fact, some of the biggest gains in performance are coming from open-source models, such as Meta’s Llama 3 and others, which are now catching up with closed, proprietary models like OpenAI’s.

OpenAI COO Brad Lightcap said on a recent 20VC podcast that the expectations born in the wake of ChatGPT have now outpaced what the technology is able to do. “These models are not that good,” he said. “I think, very quickly, expectations will start to come down after people come in contact with today’s models.” He was quick to add that OpenAI expects its models to get very good “very quickly,” and that he expects to see “this inversion of expectations versus reality where all of sudden expectations have to catch up.”

And Meta’s Yann LeCun has always said that, while transformer-based LLMs have made tremendous strides over the past five years, they’re still “not that smart.”

Lightcap told me during an interview last October that his company was trying to make AI models more “agentic.” Indeed, the AI industry at large is trying to push LLMs beyond ChatGPT-style text drafting and summarizing tools to become autonomous agents that can reason and make plans. “Emerging startups are now focusing on AI agents, which will act more independently from humans and complete tasks on behalf of humans on the internet and in other apps, without the human being present,” says Jeremiah Owyang of Blitzscaling Ventures.

Agent Lunar, for example, is developing an AI platform that offers businesses “AI teammates,” or agents that specialize in different aspects of the business. Many agent companies will build their platforms on top of foundation models like GPT-4, and the agents will become more capable as new models come online. Still, building more “agentic” behaviors into LLMs is a difficult problem, and some experts believe that the first truly useful agents won’t show up until 2026 or 2027.

In the meantime, the fiery glow around the generative AI space will likely begin to fade.

AI email apps remain limited—and expensive

I’ve long been on the lookout for generative AI apps that actually help me in my job as a journalist. I’ve not found many, partly because LLMs still hallucinate. One real pain point is the time I spend editing interview transcripts, for example, but I’ve had only limited success using consumer chatbots to ingest an interview transcript and turn it into a readable (and quotable) text.

I could also use some help parsing my email inbox. But my search for a consumer product that could scan my inbox and quickly offer up complex searches on its contents (for example, “find me three experts who can explain the TikTok ban”) has been mostly unsuccessful. I talked to a startup called Floode at a recent AI event; the founder said the tool can ingest an inbox full of data and do retrieval on it (and do a lot of other things like plan events and create to-do lists). But Floode is aimed at enterprise execs, and there’s a waiting list to try it.

I also tried a well-reviewed consumer product in the “email parser” space called Airparser, which uses OpenAI LLMs to intelligently extract information from your email (and attachments). But the results were mixed. For about three-fourths of the emails, the tool extracted the information I wanted, but some 25% of the extracted information contained errors; sometimes, for example, the tool identified me as the subject expert.

Price presents another big hurdle with Airparser and many of the other AI email parser tools I saw. It cost me 10 credits to do one extraction from my 10 emails—and I was allotted only 30 credits in the free trial. So to dig through hundreds of emails for expert sources would require a lot of credits—probably more than what’s offered by Airparser’s largest publicized plan, which is a $250 Premium tier with 5,000 credits. A starter plan ($32.50/month) gets you 100 credits.

Perplexity becomes a unicorn

The AI search company Perplexity, one of just a few consumer-facing AI startups having success and demonstrating real potential, just raised another $62.7 million in its fourth round of funding. The San Francisco-based company raised the funding at a $1.04 billion valuation, putting it in the unicorn club, with total funding of $165 million. The round itself was nothing special in terms of the dollar amount; it’s the list of investors that’s notable. Jeff Bezos, Nvidia, and NEA doubled down in the new round, and OpenAI cofounder Andrej Karpathy invested for the first time. Some of the money will likely go to hiring more researchers and builders to add to the company’s roughly 55-person head count. I also learned recently that the company will soon move from the shared workspace it currently occupies into an entire floor of a downtown San Francisco office building.

Perplexity is one of just a handful of startups that have dared to challenge Google in internet search (and potentially go after a piece of the $120 billion search advertising business). The company started out with a free AI “answer engine” that creates a narrative answer with linked citations to the sources it called upon. It later added a $20/month Pro tier that includes a search copilot and direct access to powerful third-party large language models, including OpenAI’s GPT-4 and Meta’s Llama 3. It’s now preparing to launch a $40/month enterprise version that offers enhanced security and other features.