An AI watchdog accused OpenAI of using copyrighted books without permission

An artificial intelligence watchdog is accusing OpenAI of training its default ChatGPT model on copyrighted book content without permission.

In a new paper published this week, the AI Disclosures Project alleges that OpenAI likely trained its GPT-4o model using nonpublic material from O’Reilly Media. The researchers used a legally obtained dataset of 34 copyrighted O’Reilly books and found that GPT-4o showed “strong recognition” of the company’s paywalled content. By contrast, GPT-3.5 Turbo appeared more familiar with publicly accessible O’Reilly book samples.

“These results highlight the urgent need for increased corporate transparency regarding pre-training data sources as a means to develop formal licensing frameworks for AI content training,” the authors wrote in the paper. Tim O’Reilly, one of the paper’s authors, is a cofounder and CEO of O’Reilly Media.

An OpenAI spokesperson didn’t immediately respond to Fast Company’s request for comment.

Training data lies at the heart of all artificial intelligence models. Large language models (LLMs) require enormous amounts of information, which they draw on when they churn out text or images for users.

OpenAI has struck some licensing deals that allow it to train its models on certain content. But the company, which recently raised funding and is valued at $300 billion, has also come under fire for how it sources certain content. The New York Times, for example, is leading a charge against OpenAI and minority owner Microsoft over alleged copyright infringement.

The researchers acknowledged limitations in their study but argued that the issue is likely part of a broader systemic problem in how large language models are developed.

“Sustainable ecosystems need to be designed so that both creators and developers can benefit from generative AI,” the authors wrote. “Otherwise, model developers are likely to rapidly plateau in their progress, especially as newer content becomes produced less and less by humans.”

https://www.fastcompany.com/91310223/an-ai-watchdog-accused-openai-of-using-copyrighted-books-without-permission?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss

Created 4mo ago | 02.04.2025, 20:30:07

