An AI watchdog accused OpenAI of using copyrighted books without permission

An artificial intelligence watchdog is accusing OpenAI of training its default ChatGPT model on copyrighted book content without permission.

In a new paper published this week, the AI Disclosures Project alleges that OpenAI likely trained its GPT-4o model using nonpublic material from O’Reilly Media. The researchers used a legally obtained dataset of 34 copyrighted O’Reilly books and found that GPT-4o showed “strong recognition” of the company’s paywalled content. By contrast, GPT-3.5 Turbo appeared more familiar with publicly accessible O’Reilly book samples.

“These results highlight the urgent need for increased corporate transparency regarding pre-training data sources as a means to develop formal licensing frameworks for AI content training,” the authors wrote in the paper. Tim O’Reilly, one of the paper’s authors, is a cofounder and CEO of O’Reilly Media.

An OpenAI spokesperson didn’t immediately respond to Fast Company‘s request for comment.

Training data lies at the heart of all artificial intelligence models. Large language models (LLMs) require an incredible amount of information that it uses to guide back on when it churns out text or images for users.

OpenAI has struck up some licensing deals to be able to train their models on certain content. But the company, which recently fundraised and is worth $300 billion, has also come under fire for sourcing certain content. The New York Times, for example, is leading a charge against OpenAI and minority owner Microsoft over alleged copyright infringement.

The researchers acknowledged limitations in their study but argued that the issue is likely part of a broader systemic problem in how large language models are developed.

“Sustainable ecosystems need to be designed so that both creators and developers can benefit from generative AI,” the authors wrote. “Otherwise, model developers are likely to rapidly plateau in their progress, especially as newer content becomes produced less and less by humans.”

https://www.fastcompany.com/91310223/an-ai-watchdog-accused-openai-of-using-copyrighted-books-without-permission?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss

Creado 5mo | 2 abr 2025, 20:30:07

Inicia sesión para agregar comentarios

Otros mensajes en este grupo.

AI gives students more reasons to not read books. It’s hurting their literacy

A perfect storm is brewing for reading.

AI arrived as both

17 ago 2025, 10:20:08 | Fast company - tech

Older Americans like using AI, but trust issues remain, survey shows

Artificial intelligence is a lively topic of conversation in schools and workplaces, which could lead you to believe that only younger people use it. However, older Americans are also using

17 ago 2025, 10:20:06 | Fast company - tech

From ‘AI washing’ to ‘sloppers,’ 5 AI slang terms you need to know

While Sam Altman, Elon Musk, and other AI industry leaders can’t stop

16 ago 2025, 11:10:08 | Fast company - tech

AI-generated errors set back this murder case in an Australian Supreme Court

A senior lawyer in Australia has apologized to a judge for

15 ago 2025, 16:40:03 | Fast company - tech

This $200 million sports streamer is ready to take on ESPN and Fox

Recent Nielsen data confirmed what many of us had already begun to sense: Streaming services

15 ago 2025, 11:50:09 | Fast company - tech

This new flight deck technology is making flying safer, reducing delays, and curbing emissions

Ever wondered what goes on behind the scenes in a modern airliner’s cockpit? While you’re enjoying your in-flight movie, a quiet technological revolution is underway, one that’s

15 ago 2025, 11:50:07 | Fast company - tech

The case for personality-free AI

Hello again, and welcome to Fast Company’s Plugged In.

For as long as there’s been software, upgrades have been emotionally fraught. When people grow accustomed to a pr

15 ago 2025, 11:50:07 | Fast company - tech

Tomas_r2