Anthropic’s Claude Opus 4 model can work autonomously for nearly a full workday

Anthropic kicked off its first-ever Code with Claude conference today with the announcement of a new frontier AI system. The company is calling Claude Opus 4 the best coding model in the world. According to Anthropic, Opus 4 is dramatically better at tasks that require it to complete thousands of separate steps, giving it the ability to work continuously for several hours in one go. Additionally, the new model can use multiple software tools in parallel, and it follows instructions more precisely.

In combination, Anthropic says those capabilities make Opus 4 ideal for powering upcoming AI agents. For the unfamiliar, agentic systems are AIs that are designed to plan and carry out complicated tasks without human supervision. They represent an important step towards the promise of artificial general intelligence (AGI). In customer testing, Anthropic saw Opus 4 work on its own for seven hours, or nearly a full workday. That's an important milestone for the type of agentic systems the company wants to build.

Claude Plays Pokemon
Anthropic

Another reason Anthropic thinks Opus 4 is ready to enable the creation of better AI agents is that the model is 65 percent less likely to use a shortcut or loophole when completing tasks. The company says the system also demonstrates significantly better "memory capabilities," particularly when developers grant Claude local file access. To encourage devs to try Opus 4, Anthropic is making Claude Code, its AI coding agent, widely available. It has also added new integrations with Visual Studio Code and JetBrains.

Even if you're not a coder, Anthropic might have something for you. That's because alongside Opus 4, the company announced a new version of its Sonnet model. Like Claude 3.7 Sonnet before it and Opus 4, the new system is a hybrid reasoning model, meaning it can execute prompts nearly instantaneously and engage in extended thinking. As a user, this gives you a best-of-both-worlds chatbot that's better equipped to tackle complex problems when needed. It also incorporates many of the same improvements found in Opus 4, including the ability to use tools in parallel and follow instructions more faithfully.

Sonnet 3.7 was so popular among users that Anthropic ended up introducing a Max plan in response, which starts at $100 per month. The good news is you won't need to pay anywhere near that much to use Sonnet 4, as Anthropic is making it available to free users.

Claude 4 benchmarks
Anthropic

For those who want to use Sonnet 4 for a project, API pricing is staying at $3 per one million input tokens and $15 for the same amount of output tokens. Notably, outside of all the usual places you'll find Anthropic's models, including Amazon Bedrock and Google Vertex AI, Microsoft is making Sonnet 4 the default model for the new coding agent it's offering through GitHub Copilot. Both Opus 4 and Sonnet 4 are available to use today. 
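
For a rough sense of what those API rates mean in practice, here is a minimal sketch that estimates the cost of a workload from its token counts. The two rates come from the pricing quoted above; the function name and the sample token counts are hypothetical, chosen only for illustration, and the sketch does not account for any discounts or other pricing details Anthropic may offer.

```python
# Rates quoted in the article for Sonnet 4 (USD per one million tokens).
INPUT_USD_PER_MILLION = 3.00
OUTPUT_USD_PER_MILLION = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a workload from its token counts (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
print(f"${estimate_cost_usd(200_000, 50_000):.2f}")  # prints "$1.35"
```

Because output tokens cost five times as much as input tokens, a project that generates a lot of text will likely see its bill dominated by the output side.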

Today's announcement comes during what's already been a busy week in the AI industry. On Tuesday, Google kicked off its I/O 2025 conference, announcing, among other things, that it was rolling out AI Mode to all Search users in the US. A day later, OpenAI said it was spending $6.5 billion to buy Jony Ive’s hardware startup.

This article originally appeared on Engadget at https://www.engadget.com/ai/anthropics-claude-opus-4-model-can-work-autonomously-for-nearly-a-full-workday-164526696.html?src=rss
Created 6d | 22 May 2025, 17:10:25


