An AI watchdog accused OpenAI of using copyrighted books without permission

An artificial intelligence watchdog is accusing OpenAI of training its default ChatGPT model on copyrighted book content without permission.

In a new paper published this week, the AI Disclosures Project alleges that OpenAI likely trained its GPT-4o model using nonpublic material from O’Reilly Media. The researchers used a legally obtained dataset of 34 copyrighted O’Reilly books and found that GPT-4o showed “strong recognition” of the company’s paywalled content. By contrast, GPT-3.5 Turbo appeared more familiar with publicly accessible O’Reilly book samples.

“These results highlight the urgent need for increased corporate transparency regarding pre-training data sources as a means to develop formal licensing frameworks for AI content training,” the authors wrote in the paper. Tim O’Reilly, one of the paper’s authors, is a cofounder and CEO of O’Reilly Media.

An OpenAI spokesperson didn’t immediately respond to Fast Company’s request for comment.

Training data lies at the heart of all artificial intelligence models. Large language models (LLMs) require enormous amounts of information, which they draw on when they generate text or images for users.

OpenAI has struck some licensing deals that allow it to train its models on certain content. But the company, which recently raised funding and is valued at $300 billion, has also come under fire over how it sources other content. The New York Times, for example, is leading a charge against OpenAI and minority owner Microsoft over alleged copyright infringement.

The researchers acknowledged limitations in their study but argued that the issue is likely part of a broader systemic problem in how large language models are developed.

“Sustainable ecosystems need to be designed so that both creators and developers can benefit from generative AI,” the authors wrote. “Otherwise, model developers are likely to rapidly plateau in their progress, especially as newer content becomes produced less and less by humans.”


https://www.fastcompany.com/91310223/an-ai-watchdog-accused-openai-of-using-copyrighted-books-without-permission?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss

Created 3mo | 02.04.2025, 20:30:07


