An AI watchdog accused OpenAI of using copyrighted books without permission

An artificial intelligence watchdog is accusing OpenAI of training its default ChatGPT model on copyrighted book content without permission.

In a new paper published this week, the AI Disclosures Project alleges that OpenAI likely trained its GPT-4o model using nonpublic material from O’Reilly Media. The researchers used a legally obtained dataset of 34 copyrighted O’Reilly books and found that GPT-4o showed “strong recognition” of the company’s paywalled content. By contrast, GPT-3.5 Turbo appeared more familiar with publicly accessible O’Reilly book samples.

“These results highlight the urgent need for increased corporate transparency regarding pre-training data sources as a means to develop formal licensing frameworks for AI content training,” the authors wrote in the paper. Tim O’Reilly, one of the paper’s authors, is a cofounder and CEO of O’Reilly Media.

An OpenAI spokesperson didn’t immediately respond to Fast Company‘s request for comment.

Training data lies at the heart of all artificial intelligence models. Large language models (LLMs) require an incredible amount of information that it uses to guide back on when it churns out text or images for users.

OpenAI has struck up some licensing deals to be able to train their models on certain content. But the company, which recently fundraised and is worth $300 billion, has also come under fire for sourcing certain content. The New York Times, for example, is leading a charge against OpenAI and minority owner Microsoft over alleged copyright infringement.

The researchers acknowledged limitations in their study but argued that the issue is likely part of a broader systemic problem in how large language models are developed.

“Sustainable ecosystems need to be designed so that both creators and developers can benefit from generative AI,” the authors wrote. “Otherwise, model developers are likely to rapidly plateau in their progress, especially as newer content becomes produced less and less by humans.”


https://www.fastcompany.com/91310223/an-ai-watchdog-accused-openai-of-using-copyrighted-books-without-permission?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss

Létrehozva 3mo | 2025. ápr. 2. 20:30:07


Jelentkezéshez jelentkezzen be

EGYÉB POSTS Ebben a csoportban

Five truths about being a female founder in 2025

Rarely has Silicon Valley experienced a more profound period of transformation than it has in the past handful of years. The big VC boom of 2020–2021. The great VC hangover starting in 2022. The g

2025. júl. 8. 10:40:05 | Fast company - tech
YouTube to Hollywood: We are going to eat you

A YouTube executive needed only 27 minutes to make the case that the company is taking over all aspects of how people create and consume video online.

That was the length of a recent tal

2025. júl. 8. 10:40:04 | Fast company - tech
How AI is advancing even faster than sci-fi visionaries imagined

Every time I read about another advance in AI technology, I feel like another figment

2025. júl. 8. 8:20:06 | Fast company - tech
This planet is drawing huge flares from its young star

Scientists are tracking a large gas planet experiencing quite a quandary as it orbits extremely close to a young star – a predicament never previously observed.

This exoplanet, as

2025. júl. 7. 20:40:06 | Fast company - tech
5 lesser-known Google Pixel phone tricks to make your life a little easier

Journey with me back to the good old days, if you will. There was a time that, when you’d buy a gadget, it’d come with a sometimes verbose but often helpful “instruction manual.”

Not a q

2025. júl. 7. 18:30:04 | Fast company - tech
Everything you need to know about Elon Musk’s ‘America Party’

After more than a week of threats, Elon Musk formally launched the America

2025. júl. 7. 18:30:02 | Fast company - tech