Why Anthropic’s decision to share the rules behind its Claude 3 chatbot is a big deal—sort of

The latest step forward in the development of large language models (LLMs) took place earlier this week, with the release of a new version of Claude, the LLM developed by AI company Anthropic—whose founders left OpenAI in late 2020 over concerns about the company’s pace of development.

But alongside the release of Claude 3, which sets new records in popular tests used to assess the prowess of LLMs, there was a second, more unusual innovation. Two days after Anthropic released Claude 3 to the world, Amanda Askell, a philosopher and ethicist researching AI alignment at Anthropic, and who worked on the LLM, shared the model’s system prompt on X (formerly Twitter).

Claude’s system prompt is just over 200 words, but outlines its worldview. “It should give concise responses to very simple questions, but provide thorough responses to more complex and open-ended questions,” the prompt reads. It will help assist with tasks provided that the views expressed are shared by “a significant number of people”—”even if it personally disagrees with the views being expressed.” And it doesn’t engage in stereotyping, “including the negative stereotyping of majority groups.”

In addition to sharing the text, Askell went on to contextualize the decisions the company made in writing the system prompt. The paragraph encouraging Claude to help, provided a significant number share the same viewpoint, was specifically inserted because Claude was a little more likely to refuse tasks if the user expressed right-wing views, Askell admitted.

Rumman Chowdhury, cofounder and CEO of Humane Intelligence, welcomes the transparency behind sharing the system prompt and thinks more companies ought to outline the foundational principles behind how their models are coded to respond. “I think there’s an appropriate ask for transparency and it’s a good step to be sharing prompts,” she says.

Others are also pleasantly surprised by Anthropic’s openness. “It’s really refreshing to see one of the big AI vendors demonstrate more transparency about how their system works,” says Simon Willison, a British programmer and AI watcher. “System prompts for other systems such as ChatGPT can be read through prompt-leaking hacks, but given how useful they are for understanding how best to use these tools, it’s frustrating that we have to use advanced tricks to read them.”

Anthropic, the maker of Claude 3, declined to make Askell available for an interview and is the only major LLM developer to share its system prompt.

Mike Katell, ethics fellow at the Alan Turing Institute, is cautiously supportive of Anthropic’s decision. “It is possible that system prompts will help developers implement Claude in more contextually sensitive ways, which could make Claude more useful in some settings,” he says. However, Katell says “this doesn’t do much to address the underlying problems of model design and training that lead to undesirable outputs, such as the racism, misogyny, falsehoods, and conspiracy-theory content that chat agents frequently spew out.”

Katell also worries that such radical transparency has an ulterior motive—either deliberately or accidentally. “Making system prompts available also clouds the lines of responsibility for such outputs,” he says. “Anthropic would like to shift all responsibility for the model onto downstream users and developers, and providing the appearance of configurability is one way to do that.”

On that front, Chowdhury agrees. While this is transparency of a type—and anything is better than nothing—it’s far from the whole story when it comes to how these models work. “It’s good to know what the system prompt is but it’s not a complete picture of model activity,” says Chowdhury. As with everything to do with the current set of generative-AI tools, it’s far more complicated than that, she explains: “Much of it will be based on training data, fine tuning, safeguards, and user interaction.”

https://www.fastcompany.com/91053339/anthropic-claude-3-system-prompt-transparency?partner=rss&utm_source=rss&utm_medium=feed&utm_campaign=rss+fastcompany&utm_content=rss

Created 1y | Mar 8, 2024, 11:10:03 PM


Login to add comment

Other posts in this group

Yahoo Creators platform hits record revenue as publisher bets big on influencer-led content

Yahoo’s bet on creator-led content appears to be paying off. Yahoo Creators, the media company’s publishing platform for creators, had its most lucrative month yet in June.

Launched in M

Jul 11, 2025, 5:30:04 PM | Fast company - tech
GameStop’s Nintendo Switch 2 stapler sells for more than $100,000 on eBay after viral mishap

From being the face of memestock mania to going viral for inadvertently stapling the screens of brand-new video game consoles, GameStop is no stranger to infamy.

Last month, during the m

Jul 11, 2025, 12:50:04 PM | Fast company - tech
Don’t take the race for ‘superintelligence’ too seriously

The technology industry has always adored its improbably audacious goals and their associated buzzwords. Meta CEO Mark Zuckerberg is among the most enamored. After all, the name “Meta” is the resi

Jul 11, 2025, 12:50:02 PM | Fast company - tech
Why AI-powered hiring may create legal headaches

Even as AI becomes a common workplace tool, its use in

Jul 11, 2025, 12:50:02 PM | Fast company - tech
Gen Zers are posting their unemployment era on TikTok—and it’s way too real

Finding a job is hard right now. To cope, Gen Zers are documenting the reality of unemployment in 2025.

“You look sadder,” one TikTok po

Jul 11, 2025, 10:30:04 AM | Fast company - tech
The most effective AI tools for research, writing, planning, and creativity

This article is republished with permission from Wonder Tools, a newsletter that helps you discover the most useful sites and apps. 

Jul 11, 2025, 10:30:04 AM | Fast company - tech