An AI watchdog accused OpenAI of using copyrighted books without permission

An artificial intelligence watchdog is accusing OpenAI of training its default ChatGPT model on copyrighted book content without permission.

In a new paper published this week, the AI Disclosures Project alleges that OpenAI likely trained its GPT-4o model using nonpublic material from O’Reilly Media. The researchers used a legally obtained dataset of 34 copyrighted O’Reilly books and found that GPT-4o showed “strong recognition” of the company’s paywalled content. By contrast, GPT-3.5 Turbo appeared more familiar with publicly accessible O’Reilly book samples.

“These results highlight the urgent need for increased corporate transparency regarding pre-training data sources as a means to develop formal licensing frameworks for AI content training,” the authors wrote in the paper. Tim O’Reilly, one of the paper’s authors, is a cofounder and CEO of O’Reilly Media.

An OpenAI spokesperson didn’t immediately respond to Fast Company‘s request for comment.

Training data lies at the heart of all artificial intelligence models. Large language models (LLMs) require an incredible amount of information that it uses to guide back on when it churns out text or images for users.

OpenAI has struck up some licensing deals to be able to train their models on certain content. But the company, which recently fundraised and is worth $300 billion, has also come under fire for sourcing certain content. The New York Times, for example, is leading a charge against OpenAI and minority owner Microsoft over alleged copyright infringement.

The researchers acknowledged limitations in their study but argued that the issue is likely part of a broader systemic problem in how large language models are developed.

“Sustainable ecosystems need to be designed so that both creators and developers can benefit from generative AI,” the authors wrote. “Otherwise, model developers are likely to rapidly plateau in their progress, especially as newer content becomes produced less and less by humans.”


https://www.fastcompany.com/91310223/an-ai-watchdog-accused-openai-of-using-copyrighted-books-without-permission?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss

Établi 5mo | 2 avr. 2025, 20:30:07


Connectez-vous pour ajouter un commentaire

Autres messages de ce groupe

5 excellent free podcast apps for iOS and Android

I’m going to go out on a limb and assume you’ve been on the internet before. If so, you’ve likely stumbled upon a podcast or two. There are almost 5 million of them out there, after all.

<p

18 août 2025, 23:30:03 | Fast company - tech
Philips CEO Jeff DiLullo on how AI is changing healthcare today

AI is quietly reshaping the efficiency, power, and potential of U.S. h

18 août 2025, 21:10:07 | Fast company - tech
How satellites and orbiting weapons make space the new battlefield

As Russia held its Victory Day parade this year, hackers backing the Kremlin hijacked an orbiting satel

18 août 2025, 21:10:06 | Fast company - tech
Meta spent $27 million protecting Mark Zuckerberg last year, more than any other CEO

The targeted murder of United Healthcare CEO Brian Thompson last December put the business w

18 août 2025, 21:10:05 | Fast company - tech
Tesla lowers monthly lease fee due to UK sales slump

British motorists can now lease a Tesla

18 août 2025, 21:10:05 | Fast company - tech
Google fined $36 million for anticompetitive deals with Australia’s largest telcos

Google has agreed to pay a 55 million Australian dollar ($36 million) fine for signing anticompetitive deals with Australia’s two largest telecommun

18 août 2025, 18:50:02 | Fast company - tech
‘Pips,’ a new logic puzzle from New York Times Games, might just be your next ‘Wordle’

On an average day, tens of millions of people visit The New York Times Games section to solve the latest crossword puzzle, keep their

18 août 2025, 16:30:05 | Fast company - tech