Why Anthropic’s decision to share the rules behind its Claude 3 chatbot is a big deal—sort of

The latest step forward in the development of large language models (LLMs) took place earlier this week, with the release of a new version of Claude, the LLM developed by AI company Anthropic—whose founders left OpenAI in late 2020 over concerns about the company’s pace of development.

But alongside the release of Claude 3, which sets new records in popular tests used to assess the prowess of LLMs, there was a second, more unusual innovation. Two days after Anthropic released Claude 3 to the world, Amanda Askell, a philosopher and ethicist researching AI alignment at Anthropic who worked on the LLM, shared the model’s system prompt on X (formerly Twitter).

Claude’s system prompt runs just over 200 words but outlines the chatbot’s worldview. “It should give concise responses to very simple questions, but provide thorough responses to more complex and open-ended questions,” the prompt reads. Claude will assist with tasks provided the views expressed are shared by “a significant number of people”; it will do so “even if it personally disagrees with the views being expressed.” And it doesn’t engage in stereotyping, “including the negative stereotyping of majority groups.”
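
For context on what a system prompt is in practice: it is a block of developer-written instructions sent alongside every conversation, separate from the user’s own messages. The snippet below is a minimal sketch using Anthropic’s Python SDK; the system string is an illustrative stand-in paraphrasing the themes above, not Claude’s actual prompt.

```python
# Minimal sketch: supplying a system prompt to Claude via Anthropic's
# Python SDK (pip install anthropic). The system text here is an
# illustrative stand-in, not the real Claude 3 system prompt.
import anthropic

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

response = client.messages.create(
    model="claude-3-opus-20240229",  # the largest Claude 3 model at launch
    max_tokens=512,
    # The system prompt sits outside the conversation itself and steers
    # the model's tone and behavior for every exchange that follows.
    system=(
        "Give concise answers to very simple questions, but thorough "
        "answers to complex and open-ended ones. Avoid stereotyping."
    ),
    messages=[
        {"role": "user", "content": "Summarize what a system prompt does."}
    ],
)

print(response.content[0].text)
```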

In addition to sharing the text, Askell went on to contextualize the decisions the company made in writing the system prompt. The clause instructing Claude to assist as long as a significant number of people share a given viewpoint was inserted, Askell admitted, because the model was slightly more likely to refuse tasks when users expressed right-wing views.

Rumman Chowdhury, cofounder and CEO of Humane Intelligence, welcomes the transparency of sharing the system prompt and thinks more companies ought to outline the foundational principles governing how their models respond. “I think there’s an appropriate ask for transparency and it’s a good step to be sharing prompts,” she says.

Others are also pleasantly surprised by Anthropic’s openness. “It’s really refreshing to see one of the big AI vendors demonstrate more transparency about how their system works,” says Simon Willison, a British programmer and AI watcher. “System prompts for other systems such as ChatGPT can be read through prompt-leaking hacks, but given how useful they are for understanding how best to use these tools, it’s frustrating that we have to use advanced tricks to read them.”

Anthropic declined to make Askell available for an interview. The company remains the only major LLM developer to have shared its system prompt.

Mike Katell, ethics fellow at the Alan Turing Institute, is cautiously supportive of Anthropic’s decision. “It is possible that system prompts will help developers implement Claude in more contextually sensitive ways, which could make Claude more useful in some settings,” he says. However, Katell says “this doesn’t do much to address the underlying problems of model design and training that lead to undesirable outputs, such as the racism, misogyny, falsehoods, and conspiracy-theory content that chat agents frequently spew out.”

Katell also worries that such radical transparency could serve an ulterior purpose, whether by design or by accident. “Making system prompts available also clouds the lines of responsibility for such outputs,” he says. “Anthropic would like to shift all responsibility for the model onto downstream users and developers, and providing the appearance of configurability is one way to do that.”

On that front, Chowdhury agrees. While this is transparency of a sort, and anything is better than nothing, it is far from the whole story when it comes to how these models work. “It’s good to know what the system prompt is but it’s not a complete picture of model activity,” says Chowdhury. As with everything to do with the current generation of generative-AI tools, it is far more complicated than that, she explains: “Much of it will be based on training data, fine-tuning, safeguards, and user interaction.”

https://www.fastcompany.com/91053339/anthropic-claude-3-system-prompt-transparency
