Why data will always be a precious commodity in the AI world

The New York Times lawsuit against OpenAI late last year over the tech company’s use of the newspaper’s journalism to train its large language model (LLM) represented a major move in unprecedented times. It also could portend a shift in the Big Tech/content creator relationship—one that was fraught to begin with and might now turn increasingly litigious. At the heart of the suit is the question of data, and whether the companies behind LLMs can claim “fair use” in gobbling up that data.

When we think about the amount of data that is needed to train LLMs it stands to reason that organizations will be protective over how their proprietary data is used and credited. LLMs require vast amounts of data, and despite OpenAI CEO Sam Altman’s recent claims, OpenAI and ChatGPT need access to a wide range of data to strengthen the model—and this may include both proprietary and non-copyright work. The high quality and reliability of the New York Times content precisely strengthens ChatGPT outputs.

That was the company’s own position three weeks ago, according to The Telegraph, which shared a submission from OpenAI to the House of Lords communications and digital select committee. In the submission, the company admitted that it could not train LLMs like ChatGPT without access to copyrighted work. In fact, it would be “impossible.”

Data is the backbone of AI and all models rely on patterns and correlations established by vast amounts of training data. Generative AI tools need high quality training data—like copyrighted content from the New York Times and other notable publishers—to provide high-quality and enough quantity of training data also reduces hallucinations, actually making responses relevant.

While the New York Times’ case against Open AI and Microsoft is probably the most visible challenge involving intellectual property implications of AI, it is hardly the only one. Plaintiffs have filed multiple lawsuits claiming the training process for AI programs infringed upon their copyrights in written and visual works. These include lawsuits by the Authors Guild and authors Paul Tremblay, Michael Chabon, Sarah Silverman, and others against OpenAI. Michael Chabon, Sarah Silverman, and other content creators have also initiated suits against Meta. There are proposed class action lawsuits against Alphabet Inc., Stability AI, and Midjourney, as well as a lawsuit by Getty Images against Stability AI.

As AI use continues to proliferate, there will be increasing pressure to resolve these copyright issues. And litigation involving intellectual property rights is just the tip of the iceberg. The number of cases centered on AI-related accuracy, safety, and discrimination are likely to rise.

Given the complexity and sheer volume of all of these cases, it will likely take years before these matters are resolved. For now, all we can say for sure is that ordinary companies rolling out AI tools would be wise to exercise care to track and monitor their use of the tech. Should a particular AI tool come under regulatory or judicial scrutiny and thus come off the market, companies will want to be able to adapt quickly and smoothly.

https://www.fastcompany.com/91021586/why-data-will-always-be-a-precious-and-protected-commodity-in-ai?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss

Creată 1y | 2 feb. 2024, 21:20:04

Autentifică-te pentru a adăuga comentarii

Alte posturi din acest grup

This new app makes using your iPhone camera tons more fun

I have not found much joy in iPhone photography of late. Between the flat,

7 iul. 2025, 11:30:04 | Fast company - tech

Here’s how far-right extremists hide in TikTok’s earworms

Far-right extremists are exploiting TikTok’s “use-this-sound” feature as a Trojan

7 iul. 2025, 11:30:03 | Fast company - tech

Plane yoga is going viral on EasyJet and Spirit Airlines

The last place you’d think of doing a downward dog? An airplane.

That might soon change, as plane yoga is apparently now a thing.

6 iul. 2025, 12:20:03 | Fast company - tech

How AI is transforming corporate finance

The role of the CFO is evolving—and fast. In today’s volatile business environment, finance leaders are navigating everything from unpredictable tariffs to tightening regulations and rising geopol

5 iul. 2025, 13:10:03 | Fast company - tech

Want to move data between Apple and Google Maps? Try this workaround

In June, Google released its newest smartphone operating system, Android 16. The same month, Apple previewed its next smartphone oper

5 iul. 2025, 10:40:07 | Fast company - tech

Tally lets you design great free surveys in 60 seconds

This article is republished with permission from Wonder Tools, a newsletter that helps you discover the most useful sites and apps.

4 iul. 2025, 13:50:03 | Fast company - tech

How China is leading the humanoid robots race

I’ve worked at the bleeding edge of robotics innovation in the United States for almost my entire professional life. Never before have I seen another country advance so quickly.

4 iul. 2025, 09:20:03 | Fast company - tech

Tomas_r2