Claude Sonnet 4.6: Benchmark performance, how to try it

Anthropic has just released its latest large language model (LLM), Claude Sonnet 4.6. Tuesday’s release quickly follows the launch of Claude Opus 4.6, the company’s premium AI model, on February 5.
According to Anthropic, “Claude Sonnet 4.6 is our highest performing Sonnet model to date.” The company says that Sonnet 4.6 has a 1 million token context window in beta. Importantly, Anthropic reports that Sonnet 4.6 performed well in internal safety testing, showing a low tendency toward hallucinations and sycophancy.
“Sonnet 4.6 brings much-improved coding skills to more of our users,” Anthropic said, referring to Claude’s popularity among developers who use AI to code.
If you want to use Anthropic’s latest AI model, the company has made it very easy. Here’s how to access Claude Sonnet 4.6.
How to use Claude Sonnet 4.6
For Free and Pro users, Claude Sonnet 4.6 is now available as the default model on claude.ai and Claude Cowork. Anthropic has also made the model available through its API and all major cloud platforms.
Free users get limited usage that depends on current demand. Limits reset every five hours. For those who need higher limits, Claude Sonnet 4.6 costs the same as the previous model. The Claude Pro plan costs $20 per month, or $17 per month if paid annually. If you go through the API, Claude Sonnet 4.6 starts at $3 per million input tokens and $15 per million output tokens.
Claude Sonnet 4.6 benchmark performance
According to Anthropic’s benchmark tests, Claude Sonnet 4.6 is the company’s most powerful model for agentic financial analysis and office tasks, beating out competitors like Google’s Gemini 3 Pro and OpenAI’s GPT 5.2.
On these tasks, Claude Sonnet 4.6 also beats Opus 4.6, Anthropic’s most powerful AI model.
In its release announcement, Anthropic stated that many developers with early access to Claude Sonnet 4.6 preferred it not only to its predecessor, Claude Sonnet 4.5, but also to Claude Opus 4.5. According to the Sonnet 4.6 system card, the new model improves on key benchmarks such as Humanity’s Last Exam, although Claude Opus 4.6 scored higher.
Benchmark performance
- GPQA Diamond: 89.9 percent
- ARC-AGI-2: 58.3 percent
- MMMLU: 89.3 percent
- SWE-bench Verified: 79.6 percent
- HLE (Humanity’s Last Exam): 49.0 percent with tools, 33.2 percent without tools
AI-powered insurance company Pace told VentureBeat that Sonnet 4.6 scored the highest among all Claude models on its complex insurance computer use benchmark.
These results are notable because Anthropic’s Opus models are generally the most intelligent and the preferred choice for complex reasoning.
Claude Sonnet 4.6 is not only more powerful than some Opus models, but also more affordable. As mentioned previously, Claude Sonnet 4.6 is priced at $3/$15 per million input/output tokens, while Opus 4.6 is priced at $5/$25.
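To make the price gap concrete, here is a minimal Python sketch that estimates API cost from the per-million-token prices quoted above. The dictionary keys, the `estimate_cost` helper, and the example workload of 2 million input tokens and 500,000 output tokens are illustrative assumptions, not official model identifiers or Anthropic tooling, and the estimate ignores any discounts or surcharges that may apply.

```python
# Prices in USD per million tokens, as quoted in this article.
# The keys are labels for this example, not official API model IDs.
PRICES = {
    "sonnet-4.6": {"input": 3.00, "output": 15.00},
    "opus-4.6": {"input": 5.00, "output": 25.00},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a given token volume."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for model in PRICES:
        cost = estimate_cost(model, 2_000_000, 500_000)
        print(f"{model}: ${cost:.2f}")
```

For this hypothetical workload, the sketch gives $13.50 for Sonnet 4.6 and $22.50 for Opus 4.6.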



