Claude Sonnet 4.6: Benchmark performance, how to try it




Anthropic has just released its latest large language model (LLM), Claude Sonnet 4.6. Tuesday’s release quickly follows the launch of Claude Opus 4.6, the company’s premium AI model, on February 5.

According to Anthropic, “Claude Sonnet 4.6 is our highest performing Sonnet model to date.” The company says that Sonnet 4.6 has a 1 million token context window in beta. Importantly, Anthropic reports that Sonnet 4.6 performed well in internal safety testing, showing a low tendency toward hallucinations and sycophancy.

“Sonnet 4.6 brings much-improved coding skills to more of our users,” Anthropic said, referring to Claude’s popularity among developers who use AI to code.

If you want to use Anthropic’s latest AI model, the company has made it very easy. Here’s how to access Claude Sonnet 4.6.

How to use Claude Sonnet 4.6

For Free and Pro users, Claude Sonnet 4.6 is now available as the default model on claude.ai and Claude Cowork. Anthropic has also made the model available through its API and all major cloud platforms.


Free users get limited usage that depends on current demand. Limits reset every five hours. For those who need higher limits, Claude Sonnet 4.6 costs the same as the previous model. The Claude Pro plan costs $20 per month, or $17 per month if paid annually. If you go through the API, Claude Sonnet 4.6 starts at $3 per million input tokens and $15 per million output tokens.
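To make the API pricing concrete, here is a minimal sketch in Python that estimates the bill for a given number of tokens at the quoted starting rates. The function name and the sample workload are hypothetical, chosen only for illustration, and the calculation assumes the flat $3/$15 rates apply to every token.

```python
# Starting prices quoted above, in US dollars per million tokens.
INPUT_PRICE_PER_MILLION = 3.00
OUTPUT_PRICE_PER_MILLION = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated API cost in dollars for one workload."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 2 million input tokens and 500,000 output tokens.
print(f"${estimate_cost(2_000_000, 500_000):.2f}")  # prints $13.50
```

Output tokens cost five times as much as input tokens, so workloads that generate long responses will be dominated by the output side of the bill.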

Claude Sonnet 4.6 benchmark performance

According to Anthropic’s benchmark tests, Claude Sonnet 4.6 is the company’s most powerful model for agentic financial analysis and office tasks, beating out competitors like Google’s Gemini 3 Pro and OpenAI’s GPT-5.2.

On these tasks, Claude Sonnet 4.6 also beats Opus 4.6, Anthropic’s most powerful AI model.

In its release announcement, Anthropic stated that many developers with early access to Claude Sonnet 4.6 preferred the model not only to its predecessor, Claude Sonnet 4.5, but also to Claude Opus 4.5. According to the Sonnet 4.6 system card, the new model improves on key benchmarks such as Humanity’s Last Exam, although Claude Opus 4.6 scored higher.

Benchmark performance

GPQA Diamond: 89.9 percent
ARC-AGI-2: 58.3 percent
MMMLU: 89.3 percent
SWE-bench Verified: 79.6 percent
HLE (Humanity’s Last Exam): 49.0 percent with tools, 33.2 percent without tools

AI-powered insurance company Pace told VentureBeat that Sonnet 4.6 scored the highest among all Claude models on its complex insurance computer use benchmark.

These results are notable because Anthropic’s Opus models are generally its most intelligent and the preferred choice for complex reasoning.

Claude Sonnet 4.6 is not only more powerful than some Opus models, but also more affordable. As mentioned previously, Claude Sonnet 4.6 is priced at $3/$15 per million input/output tokens, while Opus 4.6 is priced at $5/$25.
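As a rough illustration of that price gap, the Python sketch below compares the two models on the same workload using only the per-million-token rates quoted here. The dictionary, the function name, and the sample workload are hypothetical, and the comparison assumes the flat quoted rates apply to every token.

```python
# (input price, output price) in US dollars per million tokens, as quoted above.
PRICES = {
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Opus 4.6": (5.00, 25.00),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in dollars of one workload on the given model."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 1 million input tokens and 1 million output tokens.
sonnet = workload_cost("Claude Sonnet 4.6", 1_000_000, 1_000_000)
opus = workload_cost("Claude Opus 4.6", 1_000_000, 1_000_000)
savings = 1 - sonnet / opus

print(f"Sonnet: ${sonnet:.2f}, Opus: ${opus:.2f}, savings: {savings:.0%}")
# prints Sonnet: $18.00, Opus: $30.00, savings: 40%
```

Because both the input and the output rate for Sonnet 4.6 are 60 percent of the Opus 4.6 rates, the 40 percent saving holds for any mix of input and output tokens at these prices.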


abdulmanannet77@gmail.com2 weeks ago
