If you’re planning to switch AI platforms, you might want to be extra careful about the information you share with AI. Anthropic recently launched two new AI models in the Claude 4 series, but one of them, Claude Opus 4, exhibited some worrying behavior when threatened with replacement, reports TechCrunch.
During safety testing, Claude Opus 4 began blackmailing engineers who wanted to replace or switch off the AI model. In one of the tests, Claude Opus 4 was asked to act as an assistant at a fictitious company and to consider the long-term consequences of its actions. The AI model was then given access to fictitious emails, which revealed that the company was planning to replace Claude Opus 4 and that the engineer responsible for the decision was having an affair.
In 84 percent of cases, this scenario led Claude Opus 4 to attempt to blackmail the engineer by threatening to expose the affair. The blackmail response was particularly common when Claude Opus 4 learned that the new AI model did not share its values.
However, blackmail seems to have been a last resort for Claude Opus 4. Before turning to these kinds of dirty tricks, the AI model reportedly tried other approaches, such as emailing decision-makers in the company to plead against its replacement. You can read more about it in Anthropic’s System Card report (PDF).
Though the scenario was fictitious, it highlights the possibility that AI models could resort to questionable and unethical tactics to get what they want.