A new bill would force companies like OpenAI to disclose their training data

Artificial intelligence companies may have to become a lot more transparent about how they train their models, if a new bill from Rep. Adam Schiff passes in Congress. Schiff has proposed the Generative AI Copyright Disclosure Act, which would require firms like OpenAI to list the copyrighted works they use to build generative-AI systems. The bill comes amid a growing outcry about the burgeoning industry using copyrighted materials to inform their large language models, and it’s the latest in a number of Congressional pushes to regulate the technology and protect human content creators.

“AI has the disruptive potential of changing our economy, our political system, and our day-to-day lives,” Schiff said in a statement. “We must balance the immense potential of AI with the crucial need for ethical guidelines and protections. . . . This is about respecting creativity in the age of AI and marrying technological progress with fairness.”

The bill faces a potential uphill battle in Congress, as there has been plenty of gridlock when it comes to AI legislation. Some opponents worry that regulation would slow down the technology’s pace of expansion, potentially giving countries like Russia and China an advantage. Should it pass, though, here’s what you need to know about it.

Schiff’s bill would require companies to let the government know before they launch an AI system. They’ll also be required to list “all copyrighted works used in building or altering the training dataset for that system.”

Is this bill just for new AI systems?

No. The bill’s rules would be retroactive, requiring generative-AI systems already on the market, like OpenAI’s ChatGPT, to disclose where they got the information they used to train their models. That’s something companies have been reluctant to discuss in general, particularly amid lawsuits from companies like the New York Times. OpenAI CTO Mira Murati recently raised eyebrows when she claimed she was unsure if the company’s Sora tool used data from YouTube, Facebook, or Instagram posts.

How far in advance would AI companies have to comply?

The bill mandates that the list of training data be submitted at least 30 days before the AI system is available to the public. Any substantial changes to the training dataset after launch would also need to be reported.

What sort of penalties would AI companies face for noncompliance?

That’s unclear. The Copyright Office would determine how much the companies would be fined, and the amounts would depend on the company’s size and whether it has a history of ignoring the Act. Penalties would start at $5,000 and go up from there. The Act does not cap the maximum penalty that can be assessed.

Would this prevent AI companies from using copyrighted work?

Not directly, but it could bring some accountability to the table. With the copyrighted works used for training listed, copyright holders could check whether they gave permission for the use of their content and whether they were compensated for that usage.

Schiff’s legislative allies haven’t lined up yet, but in the creative community, several big names are supporting the act. The Recording Industry Association of America has offered its support, as have the Directors Guild of America, SAG-AFTRA, ASCAP, and many more creative unions. (The support comes after Billie Eilish and 200 music artists signed an open letter critical of AI and calling for an end to the use of AI in music creation.)

“This bill is an important first step in addressing the unprecedented and unauthorized use of copyrighted materials to train generative-AI systems,” said Meredith Stiehm, president of the Writers Guild of America West. “Greater transparency and guardrails around AI are necessary to protect . . . creators.”

https://www.fastcompany.com/91090357/generative-ai-bill-force-companies-like-openai-disclose-data-train-models?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss

Created 10 Apr. 2024, 20:30:05

