Anthropic's Claude AI now has the ability to end 'distressing' conversations

Anthropic's latest feature for two of its Claude AI models could be the beginning of the end for the AI jailbreaking community. The company announced in a post on its website that the Claude Opus 4 and 4.1 models now have the power to end a conversation with users. According to Anthropic, this feature will only be used in "rare, extreme cases of persistently harmful or abusive user interactions."

To clarify, Anthropic said those two Claude models could exit harmful conversations, like "requests from users for sexual content involving minors and attempts to solicit information that would enable large-scale violence or acts of terror." With Claude Opus 4 and 4.1, these models will only end a conversation "as a last resort when multiple attempts at redirection have failed and hope of a productive interaction has been exhausted," according to Anthropic. However, Anthropic claims most users won't experience Claude cutting a conversation short, even when talking about highly controversial topics, since this feature will be reserved for "extreme edge cases."

Anthropic's example of Claude ending a conversation
Anthropic

In the scenarios where Claude ends a chat, users can no longer send any new messages in that conversation, but can start a new one immediately. Anthropic added that if a conversation is ended, it won't affect other chats and users can even go back and edit or retry previous messages to steer towards a different conversational route.

For Anthropic, this move is part of its research program that studies the idea of AI welfare. While the idea of anthropomorphizing AI models remains an ongoing debate, the company said the ability to exit a "potentially distressing interaction" was a low-cost way to manage risks for AI welfare. Anthropic is still experimenting with this feature and encourages its users to provide feedback when they encounter such a scenario.

This article originally appeared on Engadget at https://www.engadget.com/ai/anthropics-claude-ai-now-has-the-ability-to-end-distressing-conversations-201427401.html?src=rss https://www.engadget.com/ai/anthropics-claude-ai-now-has-the-ability-to-end-distressing-conversations-201427401.html?src=rss
Creato 22h | 17 ago 2025, 22:20:12


Accedi per aggiungere un commento

Altri post in questo gruppo

Substack turns on iOS in-app payment option for all paid newsletters

Substack now lets users subscribe to any paid publication via an in-app purchase from t

18 ago 2025, 19:20:10 | Engadget
NordVPN will discontinue Meshnet on December 1

NordVPN announced today in a

18 ago 2025, 19:20:08 | Engadget
First images from Fallout season 2 tease New Vegas

The second season of Prime Video's Fallout

18 ago 2025, 19:20:05 | Engadget
Workday says hackers used social engineering to access personal data during a breach

Human resources technology company Workday has confirmed that a data breach has affected its third-party CRM platform. In a

18 ago 2025, 19:20:04 | Engadget
Google announces first nuclear site to power its data centers

Big Tech's foray into nuclear power continues as

18 ago 2025, 16:50:15 | Engadget