Claude Opus 4.7 API Review: What Actually Changed, Real Costs, and Whether to Upgrade

TL;DR: Opus 4.7 is a real upgrade: 87.6% SWE-bench Verified (up from 80.8%), a major vision overhaul (98.5% benchmark accuracy, images up to 3.75MP), and a new xhigh effort level that is now the default in Claude Code. The part getting less attention: the new tokenizer means the same prompt costs 0-35% more in practice, with code-heavy workloads at the top of that range. Same sticker price, higher real bill. Worth it for most teams, but test before migrating production.

What Anthropic Actually Shipped

Anthropic released Claude Opus 4.7 on April 16, 2026. Two days later it’s already the default Opus route on most API platforms, including ofox.ai.

| Benchmark | Opus 4.6 | Opus 4.7 | Change |
|---|---|---|---|
| SWE-bench Verified | 80.8% | 87.6% | +6.8pp |
| SWE-bench Pro | 53.4% | 64.3% | +10.9pp |
| CursorBench | 58% | 70% | +12pp |
| Vision accuracy | 54.5% | 98.5% | +44pp |
| Max image resolution | ~1MP | 3.75MP | 3.75x |

The vision jump stands out. Opus 4.6 was mediocre at reading screenshots, diagrams, and dense charts. Opus 4.7 handles them well — 98.5% accuracy on the standard vision benchmark, accepting images up to 3.75 megapixels. If you’ve been routing vision tasks to GPT-5.4 because Claude’s image handling was unreliable, that’s worth revisiting.

On coding, 64.3% on SWE-bench Pro puts Opus 4.7 ahead of GPT-5.4 (57.7%) and Gemini 3.1 Pro (54.2%) on real-world GitHub issue resolution. SWE-bench Pro uses actual open-source repositories, not synthetic tasks.

The Tokenizer Problem

Opus 4.7 ships with a new tokenizer. Anthropic’s migration guide says it uses “roughly 1.0 to 1.35x as many tokens” as 4.6 for the same content. That range matters a lot depending on what you’re building.

  • Natural language prose: ~1.0-1.05x (negligible)
  • Mixed code and text: ~1.1-1.2x (10-20% more tokens)
  • Dense code, especially Python or TypeScript: ~1.2-1.35x (20-35% more tokens)

The list price is unchanged at $5/$25 per million tokens. But code-heavy workloads will cost more. A team spending $2,000/month on Opus 4.6 for a code review pipeline should budget $2,200-2,700/month for the same volume on 4.7. The performance gains likely justify it. Just don’t get surprised by the invoice.
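That budgeting arithmetic can be sketched directly. The multipliers are the ranges from Anthropic's migration guide; the workload split below is a hypothetical example, not measured data:

```python
# Rough estimator for the effective cost increase from the 4.7 tokenizer.
# Multipliers come from Anthropic's published 1.0-1.35x range; the
# workload mix is a hypothetical split for a code-review pipeline.

def effective_monthly_cost(base_cost, workload_mix):
    """base_cost: current monthly spend on Opus 4.6 (USD).
    workload_mix: {token_multiplier: fraction_of_spend}, fractions sum to 1.0."""
    return sum(base_cost * frac * mult for mult, frac in workload_mix.items())

# Mostly dense code, some mixed content, a little plain prose (hypothetical).
mix = {1.30: 0.7, 1.15: 0.2, 1.0: 0.1}
print(round(effective_monthly_cost(2000, mix)))  # prints 2480 — inside the $2,200-2,700 range
```

The point of the sketch is that the blended multiplier depends entirely on your mix; a prose-heavy workload at the same spend would barely move.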

The xhigh Effort Level

Opus 4.7 adds a new reasoning tier: xhigh. It sits between high and max and is now the default in Claude Code for all plans.

  • high: Fast, limited thinking budget
  • xhigh: Up to 100K thinking tokens, balances depth and latency
  • max: Uncapped thinking, slowest and most expensive

For most coding work, xhigh is the right setting. It gives the model enough room to work through multi-step problems without the latency of max effort. Anthropic’s testing shows xhigh at 100K tokens matches medium-effort 4.6 on quality while being faster.

To match Claude Code’s default behavior when calling the API directly:

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=128000,  # must exceed the thinking budget
    thinking={"type": "enabled", "budget_tokens": 100000},
    messages=[{"role": "user", "content": "..."}],
)

Four Breaking Changes Before You Migrate

Opus 4.7 isn’t a drop-in replacement. Anthropic documented four API changes that can break existing integrations.

More literal instruction-following. Opus 4.7 interprets prompts more literally than 4.6. If your system prompt says “respond in JSON,” 4.7 will respond in JSON even when a brief explanation would have been more useful. Prompts that relied on 4.6’s tendency to add helpful context may need adjustment.

Stricter output format adherence. Related — 4.7 is less likely to deviate from specified formats, even when deviation would improve the response. Good for structured output pipelines, potentially annoying for conversational use cases.

Tokenizer change affects cache hit rates. If you’re using prompt caching, hit rates will drop initially after migration because the tokenized representation of your prompts has changed. Cache rebuilds over time, but expect higher costs during the transition.
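To watch that transition rather than discover it on the invoice, you can track the hit rate from the usage data the API already returns. The field names (`cache_read_input_tokens`, `cache_creation_input_tokens`) follow Anthropic's prompt-caching documentation; the aggregation helper itself is an illustrative sketch:

```python
# Sketch: monitor prompt-cache hit rate during a tokenizer migration.
# Each record is the `usage` object from an API response, reduced to a dict.

def cache_hit_rate(usage_records):
    """Fraction of cacheable input tokens served from cache."""
    read = sum(u.get("cache_read_input_tokens", 0) for u in usage_records)
    written = sum(u.get("cache_creation_input_tokens", 0) for u in usage_records)
    total = read + written
    return read / total if total else 0.0

# Right after migration, every prompt re-tokenizes and rewrites the cache;
# hits only resume once the new representations are stored.
records = [
    {"cache_read_input_tokens": 0, "cache_creation_input_tokens": 5000},
    {"cache_read_input_tokens": 4800, "cache_creation_input_tokens": 200},
]
print(f"{cache_hit_rate(records):.0%}")  # prints 48%
```

A week of this metric tells you when the cache has rebuilt and costs have settled.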

Vision input handling. The new 3.75MP image support changes how the model processes image tokens. If you have hardcoded token estimates for image inputs, recalculate.

Pricing in Context

At $5/$25 per million tokens, Opus 4.7 is the premium tier. Here’s where it sits:

| Model | Input / 1M | Output / 1M | SWE-bench Verified |
|---|---|---|---|
| Claude Opus 4.7 | $5.00 | $25.00 | 87.6% |
| Claude Opus 4.6 | $5.00 | $25.00 | 80.8% |
| GPT-5.4 | $2.50 | $15.00 | ~57.7% |
| Gemini 3.1 Pro | $1.25 | $10.00 | ~54.2% |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 79.6% |

Prices via ofox.ai/models, April 2026.

Sonnet 4.6 is worth a second look here. It costs 40% less on both input and output, and scores 79.6% on SWE-bench Verified — 8 points behind Opus 4.7. For most production workloads, that gap doesn’t justify the price difference. Opus 4.7 earns its premium on the hardest tasks: complex multi-file refactoring, long autonomous agent runs, and vision-heavy workflows.

If you’re on Opus 4.6 and happy with results, upgrading to 4.7 gets you better performance at the same sticker price — just account for the tokenizer overhead. If you’re on Sonnet 4.6 and considering a step up, the question is whether your workload actually hits the ceiling where Opus’s extra capability pays off.

When to Upgrade vs. When to Wait

Upgrade immediately if you’re doing vision-heavy work. The 3.75x resolution improvement is the biggest single change in this release, and if you’ve been working around Claude’s image limitations, that workaround is now unnecessary. Same if you’re starting a new project — there’s no reason to start on 4.6.

Test before migrating if you have production prompts tuned for 4.6’s behavior, or if you’re using prompt caching. Cache hit rates will drop temporarily after migration. Run a representative sample of your prompts through 4.7 first and check the output quality and token counts before committing.
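One concrete way to run that check is to count tokens for the same prompt set under both models and summarize the inflation. Anthropic's Messages API exposes a token-counting endpoint you could use to get the counts; the helper below takes plain integers so the comparison logic stands on its own, and the sample numbers are hypothetical:

```python
# Sketch: summarize per-prompt token inflation between two tokenizers.
# The counts would come from counting the same prompts against
# claude-opus-4-6 and claude-opus-4-7; here they are plain ints.

def tokenizer_inflation(old_counts, new_counts):
    """Return (min, mean, max) of new/old token ratios across paired prompts."""
    ratios = [new / old for old, new in zip(old_counts, new_counts)]
    return min(ratios), sum(ratios) / len(ratios), max(ratios)

# Hypothetical sample: three prose-heavy prompts, two code-heavy ones.
old = [1000, 1200, 900, 4000, 5200]
new = [1020, 1230, 930, 5000, 6760]
lo, mean, hi = tokenizer_inflation(old, new)
print(f"range {lo:.2f}-{hi:.2f}x, mean {mean:.2f}x")  # range 1.02-1.30x, mean 1.13x
```

If the mean for your real prompt set lands near the top of Anthropic's range, that is the number to feed back into the budget before flipping production traffic.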

Stay on 4.6 for now if you’re cost-sensitive and the performance gains don’t justify 10-20% higher effective costs on code-heavy workloads. The model is better, but “better” doesn’t always mean “worth paying more for.”

Accessing Opus 4.7 via ofox.ai

The model ID is anthropic/claude-opus-4.7. Through ofox.ai, it’s available on the same OpenAI-compatible endpoint as every other model — no separate Anthropic account or billing required.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.ofox.ai/v1",
    api_key="your-ofox-key"
)

response = client.chat.completions.create(
    model="anthropic/claude-opus-4.7",
    messages=[{"role": "user", "content": "Review this code..."}]
)

For thinking/xhigh features, use the Anthropic native protocol:

import anthropic

client = anthropic.Anthropic(
    base_url="https://api.ofox.ai/anthropic",
    api_key="your-ofox-key"
)

Going through an aggregator also makes A/B testing easier. You can compare Opus 4.7 against 4.6 or Sonnet 4.6 with the same API key and endpoint, without juggling multiple billing accounts. Useful during the migration period when you’re figuring out whether 4.7’s improvements justify the tokenizer overhead for your specific workload.

Verdict

Opus 4.7 is the best coding model available right now. The SWE-bench numbers are real, the vision upgrade is substantial, and xhigh effort is a better default than anything 4.6 offered.

“Same price” is technically accurate and practically misleading. Budget for 10-20% higher costs on code-heavy workloads, test your existing prompts before migrating production, and watch cache hit rates for the first week after switching.

For new projects, start on 4.7. For existing production systems, a careful migration over a week or two is the right call.


Related: Claude Opus 4.6 API Review — the predecessor that set the bar. Claude vs GPT vs Gemini: How to Pick the Right Model — full comparison across all frontier models. Best AI Model for Coding 2026 — where Opus 4.7 fits in the coding model landscape.