OpenAI promises greater transparency on model hallucinations and harmful content

OpenAI has launched a new web page called the safety evaluations hub to publicly share information such as the hallucination rates of its models. The hub will also show whether a model produces harmful content, how well it follows instructions and how it holds up against attempted jailbreaks.

The tech company claims this new page will provide additional transparency on OpenAI, a company that, for context, has faced multiple lawsuits alleging it illegally used copyrighted material to train its AI models. Oh, yeah, and it's worth mentioning that The New York Times claims the tech company accidentally deleted evidence in the newspaper's plagiarism case against it.

The safety evaluations hub is meant to expand on OpenAI's system cards. Those only outline a model's safety measures at launch, whereas the hub should provide ongoing updates.

"As the science of AI evaluation evolves, we aim to share our progress on developing more scalable ways to measure model capability and safety," OpenAI states in its announcement. "By sharing a subset of our safety evaluation results here, we hope this will not only make it easier to understand the safety performance of OpenAI systems over time, but also support community efforts to increase transparency across the field." OpenAI adds that it's working to have more proactive communication in this area throughout the company.

Introducing the Safety Evaluations Hub—a resource to explore safety results for our models.

While system cards share safety metrics at launch, the Hub will be updated periodically as part of our efforts to communicate proactively about safety. https://t.co/c8NgmXlC2Y
— OpenAI (@OpenAI) May 14, 2025

Interested parties can look at each of the hub's sections and see information on relevant models, such as GPT-4.1 through 4.5. OpenAI notes that the information provided in this hub is only a "snapshot" and that interested parties should look at its system cards, assessments and other releases for further details.

One of the big caveats to the entire safety evaluations hub is that OpenAI is the entity running these tests and choosing what information to share publicly. As a result, there isn't any way to guarantee that the company will share all of its issues or concerns with the public.

This article originally appeared on Engadget at https://www.engadget.com/ai/openai-promises-greater-transparency-on-model-hallucinations-and-harmful-content-184545691.html?src=rss

Created 1mo | 14.05.2025, 19:10:06
