A new bill would force companies like OpenAI to disclose their training data

Artificial intelligence companies may have to become a lot more transparent about how they train their models if a new bill from Rep. Adam Schiff passes in Congress. Schiff has proposed the Generative AI Copyright Disclosure Act, which would require firms like OpenAI to list the copyrighted works they use to build generative-AI systems. The bill comes amid a growing outcry over the burgeoning industry’s use of copyrighted materials to inform its large language models, and it’s the latest in a number of congressional pushes to regulate the technology and protect human content creators.

“AI has the disruptive potential of changing our economy, our political system, and our day-to-day lives,” Schiff said in a statement. “We must balance the immense potential of AI with the crucial need for ethical guidelines and protections. . . . This is about respecting creativity in the age of AI and marrying technological progress with fairness.”

The bill faces a potential uphill battle in Congress, as there has been plenty of gridlock when it comes to AI legislation. Some opponents worry that regulation would slow down the technology’s pace of expansion, potentially giving countries like Russia and China an advantage. Should it pass, though, here’s what you need to know about it.

What would the Generative AI Copyright Disclosure Act require AI companies to do?

Schiff’s bill would require companies to let the government know before they launch an AI system. They’ll also be required to list “all copyrighted works used in building or altering the training dataset for that system.”

Is this bill just for new AI systems?

No. The bill’s rules would be retroactive, requiring the makers of generative-AI systems already on the market, like OpenAI’s ChatGPT, to disclose where they got the information they used to train their models. That’s something companies have been reluctant to discuss in general, particularly amid lawsuits from companies like the New York Times. OpenAI CTO Mira Murati recently raised eyebrows when she claimed she was unsure if the company’s Sora tool used data from YouTube, Facebook, or Instagram posts.

How far in advance would AI companies have to comply?

The bill mandates that the list of training data be submitted at least 30 days before the AI system is available to the public. Any substantial changes to the training dataset post-launch would also need to be reported.

What sort of penalties would AI companies face for noncompliance?

That’s unclear. The Copyright Office would determine how much companies would be fined, and the amounts would depend on a company’s size and whether it has a history of ignoring the Act. Penalties would start at $5,000 and go up from there. The Act does not cap the maximum penalty that can be assessed.

Would this prevent AI companies from using copyrighted work?

Not directly, but it could bring some accountability to the table. With the copyrighted works used for training listed publicly, copyright holders could check whether they gave permission for the use of their content and whether they were compensated for that usage.

Who is backing the Generative AI Copyright Disclosure Act?

Schiff’s legislative allies haven’t lined up yet, but in the creative community, several big names are supporting the act. The Recording Industry Association of America has offered its support, as have the Directors Guild of America, SAG-AFTRA, ASCAP, and many more creative unions. (The support comes after Billie Eilish and 200 music artists signed an open letter critical of AI and calling for an end to the use of AI in music creation.)

“This bill is an important first step in addressing the unprecedented and unauthorized use of copyrighted materials to train generative-AI systems,” said Meredith Stiehm, president of the Writers Guild of America West. “Greater transparency and guardrails around AI are necessary to protect . . . creators.”

https://www.fastcompany.com/91090357/generative-ai-bill-force-companies-like-openai-disclose-data-train-models?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss

Created 1y | Apr. 10, 2024, 20:30:05
