OpenAI’s GPT-5.5 vs Claude Opus 4.7: Which is better?


OpenAI released its latest model, GPT-5.5, on April 23, just a week after Anthropic introduced Claude Opus 4.7.

As two flagship models from the two leading AI labs, we wanted to see how the new models compare.

Spoiler alert: We think Claude Opus 4.7 has an advantage in advanced and agentic coding, but GPT-5.5 performs better on most benchmarks.



GPT-5.5 and Opus 4.7: rankings

GPT-5.5 is not yet listed on every AI leaderboard, but it should be very competitive with Claude Opus 4.7. On verified benchmark leaderboards such as ARC Prize, GPT-5.5 beats Opus 4.7 (more on that below).

In the popular Arena leaderboard, which is based on user votes, Claude Opus 4.7 Thinking takes first place overall. Interestingly, Opus 4.7 currently ranks below Opus 4.6, although this will likely change over time. The new Anthropic models currently occupy the first four places in the overall ranking. Additionally, Anthropic’s unreleased Claude Mythos is not yet rated, and Anthropic says it’s even better than Opus 4.7.

In the Epoch Capabilities Index (ECI) ranking, GPT-5.4 Pro currently has the best score. (ECI combines multiple benchmarks into a single score.) You’ll find Gemini 3.1 Pro and GPT-5.4 in second and third positions.


GPT-5.5 and Opus 4.7: benchmarks

How do the new models perform on common benchmarks? For most of these tests, we must rely on self-reported scores from OpenAI and Anthropic. Both models score highly, as you would expect, but GPT-5.5 has the edge on most of them.

Here’s how they compare on some of the most widely used AI benchmarks:

  • SWE-Bench Pro: GPT-5.5 scored 58.6%; Opus 4.7 scored 64.3%

  • Terminal-Bench 2.0: GPT-5.5 scored 82.7%; Opus 4.7 scored 69.4%

  • Humanity’s Last Exam: GPT-5.5 scored 40.6%; Opus 4.7 scored 31.2%*

  • Humanity’s Last Exam (with tools): GPT-5.5 scored 52.2%; Opus 4.7 scored 54.7%

  • BrowseComp: GPT-5.5 scored 84.4%; Opus 4.7 scored 79.3%

  • GPQA Diamond: GPT-5.5 scored 93.6%; Opus 4.7 scored 94.2%

  • ARC-AGI-1 (Verified): GPT-5.5 (High) scored 94.5%; Opus 4.7 (High) scored 92%**

  • ARC-AGI-2 (Verified): GPT-5.5 (High) scored 83.3%; Opus 4.7 (High) scored 68.3%**

*For Humanity’s Last Exam, we cite HLE results verified by Artificial Analysis. Notably, Anthropic reports that Opus 4.7 scored 46.9% on this test.

**See the full results on the ARC Prize website.

GPT-5.5 and Opus 4.7: availability and pricing

OpenAI claims that GPT-5.5 is “our smartest and most intuitive model to use yet.” Claude Opus 4.7 is the most advanced Anthropic model available to Claude users, although Anthropic claims that the unreleased Claude Mythos Preview is its most capable model overall.

As such, only paid subscribers can access these frontier models.

GPT-5.5 is only available to OpenAI Plus, Pro, Business, and Enterprise users in ChatGPT and Codex (sorry, ChatGPT Go users). Pro, Business, and Enterprise users can also access GPT-5.5 Pro, while Plus, Pro, Business, and Enterprise customers can access GPT-5.5 Thinking.

OpenAI is increasing prices for GPT-5.5 in its API, although the company claims the model is more token-efficient. API pricing starts at “$5 per million input tokens and $30 per 1 million output tokens,” with a 1-million-token context window.

Opus 4.7 is available to Pro and Max customers; via the API, it costs “$5 per million input tokens and $25 per million output tokens.”
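To make the per-token rates above concrete, here is a minimal sketch of how a request’s cost works out under each pricing scheme. The token counts in the example are hypothetical, and the prices are simply the ones quoted in this article:

```python
# Rough cost comparison using the per-token API prices quoted above.
# Prices are in dollars per million input/output tokens.

PRICES = {
    "GPT-5.5": {"input": 5.00, "output": 30.00},
    "Claude Opus 4.7": {"input": 5.00, "output": 25.00},
}

def api_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one request for the given model."""
    p = PRICES[model]
    return (input_tokens / 1_000_000) * p["input"] + \
           (output_tokens / 1_000_000) * p["output"]

# Example: a 100,000-token prompt producing a 20,000-token response.
for model in PRICES:
    print(f"{model}: ${api_cost(model, 100_000, 20_000):.2f}")
# GPT-5.5: $1.10
# Claude Opus 4.7: $1.00
```

The takeaway: input pricing is identical, so the gap only shows up on output-heavy workloads, where Opus 4.7’s lower output rate makes it slightly cheaper per request.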

GPT-5.5 and Opus 4.7: feature set

OpenAI claims that GPT-5.5 brings notable improvements in “agent coding, computer usage, knowledge work, and early scientific research.” Anthropic claims that Claude Opus 4.7 improves advanced coding, visual intelligence, and document analysis.

ChatGPT and Claude have similar overall feature sets, with a few exceptions. Generally speaking, you can use these two AI chatbots for research, coding, creative projects, and daily professional work. You can also use the new models in OpenAI’s and Anthropic’s coding platforms, Codex and Claude Code.

It’s easier to talk about the differences than the similarities. Although GPT-5.5 is not an image model, in ChatGPT you can use the new ChatGPT Images 2.0 model. Anthropic recently rolled out Claude Design, but it only offers data visualizations, charts, and slides, not full image generation. So if you need to generate images or interactive graphics for a project, GPT-5.5 has more tools available.

[Image: an interactive visualization of the orbits of Orion, the Moon, and the Sun, created by ChatGPT]
GPT-5.5 can be used to create complex and interactive data visualizations.
Credit: OpenAI

ChatGPT offers more app integrations and shopping, but with its recent acquisition of OpenClaw, Anthropic has the edge in agentic capabilities.

TL;DR: If we had to choose one of these models for everyday professional work, GPT-5.5 would have the edge thanks to ChatGPT’s broader feature set. However, for advanced and agentic coding, we would opt for Claude Opus 4.7.


Disclosure: Ziff Davis, the parent company of Mashable, filed a lawsuit in April 2025 against OpenAI, alleging that it had violated Ziff Davis’ copyrights in the training and operation of its AI systems.
