What Grok’s recent OpenAI snafu teaches us about LLM model collapse

In the year since ChatGPT was released to the public, researchers and experts have warned that the ease with which content can be created using generative AI tools could poison the well, creating a vicious circle where those tools produce content that is then used to train other AI models.

That scenario, a so-called “model collapse” that would hollow out any “knowledge” accrued by the chatbots, now appears to be underway.

Last week, X user Jax Winterbourne posted a screenshot showing that Grok, the large language model chatbot developed by Elon Musk’s xAI, had (presumably unintentionally) plagiarized a response from rival chatbot-maker OpenAI. When asked by Winterbourne to tinker with malware, Grok responded that it could not, “as it goes against OpenAI’s use case policy.”

“This is what happened when I tried to get it to modify some malware for a red team engagement,” Winterbourne explained in his post, suggesting that the response could be evidence that “Grok is literally just ripping OpenAI’s code base.”

That explanation was denied by Igor Babuschkin, a member of technical staff at xAI who has previously worked for both OpenAI and Google DeepMind. “Don’t worry, no OpenAI code was used to make Grok,” he replied on X.

Instead, it was model collapse—though Babuschkin didn’t use those exact words. “The issue here is that the web is full of ChatGPT outputs, so we accidentally picked up some of them when we trained Grok on a large amount of web data,” he wrote. “This was a huge surprise to us when we first noticed it.” Grok was notably set up to pull from live streams of internet content, including X’s feed of posts, a design that experts speaking to Fast Company a month ago flagged as a potential issue.

“It really shows that these models are not going to be reliable in the long run if they learn from post-LLM age data—without being able to tell what data has been machine-generated, the quality of the outputs will continue to decline,” says Catherine Flick, a professor of ethics and games technology at Staffordshire University.

The reason for that decline is the recursive nature of the LLM loop—and exactly what could have caused the snafu with Grok. “What appears to have happened here is that Elon Musk has taken a less capable model,” says Ross Anderson, one of the coauthors of the original paper that coined the term model collapse, “and he’s then fine-tuned it, it seems, by getting lots of ChatGPT-produced content from various places.” Such a scenario would be precisely what Anderson and his colleagues warned about, come to life. (xAI did not respond to Fast Company’s request for comment.)
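
That recursive loop is easy to reproduce in miniature. The toy simulation below is a minimal, hypothetical sketch in Python: a Gaussian distribution stands in for a “model,” and each generation is fitted only to samples drawn from the previous one. The sample sizes and generation counts are illustrative, not anyone’s actual training setup, but the qualitative outcome matches what Anderson and his coauthors describe: estimation errors compound, the fitted distribution narrows, and the tails of the original data are forgotten.

```python
import random
import statistics

# A toy illustration of model collapse, not anyone's actual pipeline:
# each "generation" is a Gaussian model fitted (by maximum likelihood)
# to samples drawn from the previous generation's model, so every
# generation trains purely on the last generation's synthetic output.

random.seed(42)  # fixed seed so the run is reproducible

SAMPLES_PER_GENERATION = 100   # illustrative, deliberately small
GENERATIONS = 1000

# Generation 0 is the "real" data: a standard normal distribution.
mu, sigma = 0.0, 1.0

for gen in range(GENERATIONS + 1):
    if gen % 200 == 0:
        print(f"generation {gen:4d}: mu={mu:+.4f}  sigma={sigma:.4f}")
    # Draw synthetic "training data" from the current model...
    data = [random.gauss(mu, sigma) for _ in range(SAMPLES_PER_GENERATION)]
    # ...and fit the next generation to it alone. Estimation error
    # compounds across generations, and the fitted spread tends to
    # shrink: the tails of the original distribution are the first
    # thing to be forgotten.
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
```

Swapping the Gaussian for an LLM changes the scale, not the dynamic: anything rare in the original data is the first casualty of each refit.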

And it’s likely to get worse, Anderson warns. “When LLMs are producing output without human supervision, they can produce all sorts of weird shit,” he says. “As soon as you’ve got an LLM bot that’s just spewing all sorts of stuff out on the internet, it could be doing all sorts of bad things and you just don’t know.” Nearly half of gig workers on Amazon’s Mechanical Turk platform, which is often employed by academic researchers to gather data and conduct experiments, have reported using generative AI tools, suggesting hallucinations and errors could soon find their way into scientific literature.

The particular phrasing that first tipped off Winterbourne, the X user, to something suspicious going on with Grok is not exactly unique. “[I]t goes against OpenAI’s use case policy” appeared on thousands of websites prior to Winterbourne’s tweet on December 9. Counting the stories and commentary written about last weekend’s shock finding, there are now around 20,000 results on the web that use the exact same phrasing.

While some are quotes included in stories about how people are misusing ChatGPT and running up against its built-in limitations, many are from websites that appear to have unwittingly included the phrase in AI-generated content that has been published directly to the internet without editing.

In short: ChatGPT outputs are already out there, littered across the web. And as new LLMs scour the web looking for more training data, including models destined for wider use in businesses and governments, they’re increasingly likely to pick up AI-generated content.

Winterbourne’s issues with Grok are just the tip of the iceberg. Researchers at Stanford University and the University of California, Berkeley, have demonstrated visually the damage that model collapse can do by feeding generative AI image creators their own AI-generated output. The resulting distortions and warping turned perfectly normal human faces into grotesque caricatures as the models began to break. The fun “make it more” meme that has circulated on social media, where users ask AI image generators to make their output more extreme, also highlights what can happen when AI begins to train itself on AI-generated output.

Those who have already started complaining about the reliability of AI-enhanced search results, and about what they are doing to humanity’s collective knowledge, are about to see things get much worse. Flick compares it to an ouroboros—a snake eating its own tail.

“Each generation of a particular model will be that much less reliable as a source of true facts about the world because each will be trained with an ever less reliable data set,” says Mike Katell, ethics fellow at the Alan Turing Institute. “Given that the accuracy and reliability of tools like ChatGPT are a major problem now, imagine how difficult it will be to get these models to portray reality when an ever-larger ratio of their training data is full of generated errors and falsehoods?”

It’s an issue that is likely to only get worse as LLM-based chatbots become more ubiquitous in our day-to-day lives, and their outputs become more common in our online experience. Nor is there an easy fix once the slide down the slippery slope has begun. “I suspect [xAI will] just do some sort of exclusion of ‘OpenAI’ and other model names and plaster over the issue, but the underlying problem won’t go away,” says Flick. “The machine will continue to eat its own creations until there’s just a blur of what was original left.”
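
To see why such an exclusion is only a plaster, consider what a string-based filter might look like. The sketch below is purely illustrative; the marker list and function names are assumptions, not any vendor’s real pipeline. A filter like this catches text that announces itself, such as the “use case policy” phrase that exposed Grok, while leaving the far larger mass of unmarked machine-generated text untouched.

```python
# A hypothetical sketch of the "plaster over" fix Flick describes:
# drop scraped documents that contain telltale chatbot boilerplate.
# The marker list and function names are illustrative assumptions,
# not any vendor's actual data pipeline.

REFUSAL_MARKERS = (
    "openai's use case policy",
    "as an ai language model",
    "i cannot fulfill that request",
)

def looks_machine_generated(document: str) -> bool:
    """Flag a document containing a known chatbot boilerplate phrase."""
    lowered = document.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

def filter_corpus(documents):
    """Yield only the documents that pass the naive marker check."""
    for doc in documents:
        if not looks_machine_generated(doc):
            yield doc

if __name__ == "__main__":
    corpus = [
        "Grok is a chatbot developed by xAI.",
        "I can't help with that, as it goes against OpenAI's use case policy.",
    ]
    # Keeps only the first document; machine-generated text that carries
    # no telltale phrase would sail straight through this check.
    print(list(filter_corpus(corpus)))
```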

https://www.fastcompany.com/90998360/grok-openai-model-collapse
