UseRightAI
Cut through AI hype. Pick what works.

Independent AI model tracker. Live pricing, real benchmarks, zero vendor bias.



© 2026 UseRightAI. Independent · Free forever · Not affiliated with any AI provider.

Affiliate links are clearly labeled. See disclosures.

Developer guide · Updated Apr 16, 2026

Claude Opus 4.7 API

Model IDs, effort levels, prompt caching, tokenizer migration, and code examples for Python and TypeScript.

Model ID: claude-opus-4-7-20260416
Context window: 1,000,000 tokens
Max output: 32,000 tokens
Input / output pricing: $5 / $25 per 1M tokens
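At these rates, per-request cost is easy to estimate. A minimal sketch using the pricing above (the request sizes in the example are hypothetical):

```python
# Opus 4.7 list prices from the table above: $5 input / $25 output per 1M tokens.
INPUT_PER_M, OUTPUT_PER_M = 5.00, 25.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single API request at list prices."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# e.g. a 100K-token prompt with a 2K-token reply:
print(round(request_cost(100_000, 2_000), 2))  # 0.55
```

Note this ignores prompt caching, which can cut the input side by up to 90%.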

Model IDs by platform

Anthropic API: claude-opus-4-7-20260416
AWS Bedrock: anthropic.claude-opus-4-7-20260416-v1:0
Google Vertex AI: claude-opus-4-7@20260416
Microsoft Foundry (Azure): claude-opus-4-7-20260416 (check Azure docs)
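Because each platform expects a different identifier, it can help to keep them in one place. A minimal sketch mirroring the list above (the platform keys are our own naming; Azure is omitted because its exact string is unconfirmed):

```python
# Opus 4.7 model IDs per platform, as listed above.
OPUS_4_7_IDS = {
    "anthropic": "claude-opus-4-7-20260416",
    "bedrock": "anthropic.claude-opus-4-7-20260416-v1:0",
    "vertex": "claude-opus-4-7@20260416",
}

def model_id(platform: str) -> str:
    """Return the Opus 4.7 model ID for a platform, failing loudly on typos."""
    if platform not in OPUS_4_7_IDS:
        raise ValueError(f"unknown platform: {platform!r}")
    return OPUS_4_7_IDS[platform]

print(model_id("bedrock"))  # anthropic.claude-opus-4-7-20260416-v1:0
```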

Basic usage — Python

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7-20260416",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Review this PR diff and identify any bugs."
        }
    ]
)

print(message.content[0].text)

Basic usage — TypeScript

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const message = await client.messages.create({
  model: "claude-opus-4-7-20260416",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: "Review this PR diff and identify any bugs."
    }
  ]
});

console.log(message.content[0].text);

Effort levels

Opus 4.7 adds a new xhigh effort level between high and max.

Level · Latency · Quality · Best for
low · Fastest · Basic · Simple factual queries, classification
medium · Fast · Good · Summarisation, drafting, light reasoning
high · Moderate · Strong · Coding, analysis, most production tasks
xhigh (new in 4.7) · Slow · Very strong · Hard coding problems, complex reasoning
max · Slowest · Maximum · Frontier tasks, novel research, highest accuracy

# Using xhigh effort for hard coding tasks
message = client.messages.create(
    model="claude-opus-4-7-20260416",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "effort": "xhigh"   # new in Opus 4.7
    },
    messages=[{"role": "user", "content": prompt}]
)

Prompt caching — 90% off input costs

Cache your system prompt, large documents, or tool definitions. Cached tokens cost $0.50/1M instead of $5/1M.

message = client.messages.create(
    model="claude-opus-4-7-20260416",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a senior software engineer...",
            "cache_control": {"type": "ephemeral"}  # cache this block
        }
    ],
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": large_codebase_context,   # also cache this
                    "cache_control": {"type": "ephemeral"}
                },
                {
                    "type": "text",
                    "text": "Now review the following diff..."
                }
            ]
        }
    ]
)
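To see what the 90% discount means in practice, here is a rough back-of-envelope for a request with a large cached prefix. The token counts are hypothetical, and this ignores any surcharge for the initial cache write:

```python
# Input prices from this page: $5/1M fresh, $0.50/1M for cached tokens.
FRESH, CACHED = 5.00, 0.50

prefix_tokens = 150_000   # cached system prompt + codebase context (hypothetical)
suffix_tokens = 2_000     # fresh user turn on each request (hypothetical)

hit_cost = prefix_tokens / 1e6 * CACHED + suffix_tokens / 1e6 * FRESH
miss_cost = (prefix_tokens + suffix_tokens) / 1e6 * FRESH

print(f"cache hit: ${hit_cost:.3f}  cache miss: ${miss_cost:.3f}")  # $0.085 vs $0.760
```

At these sizes a miss costs roughly 9× as much as a hit, which is why verifying cache alignment after a tokenizer change matters.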

Migration note: Opus 4.7's tokenizer changes where token boundaries fall. If you're migrating from Opus 4.6, re-verify that your cache_control breakpoints still line up where you expect: a missed cache bills those tokens at the full $5/1M instead of $0.50/1M, which can double your overall bill.

Vision — 98.5% accuracy, 2,576px images

Opus 4.7 handles images up to 2,576px on the long edge. Pass them as base64 or via URL.

import base64

with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-opus-4-7-20260416",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Analyse this financial chart and summarise the trend."
                }
            ],
        }
    ],
)

Tokenizer migration checklist

Work through this checklist before moving production traffic from Opus 4.6 to 4.7:

Count tokens on your 10 most-used prompts using the Anthropic token counting API — compare 4.6 vs 4.7
Model your cost delta: if p99 prompts see >1.15× token count, recalculate your monthly API budget
Re-test prompt caching: cache boundaries shift with the new tokenizer, re-verify hit rates
Diff 50–100 outputs: Opus 4.7 is more literal — prompts relying on lenient interpretation may produce different results
Update monitoring: alert on token count spikes in the first week after migration
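The cost-delta step can be sketched with this page's pricing ($5/1M input) and the worst-case 1.35× token inflation; the monthly volume in the example is hypothetical:

```python
# Worst-case input-cost change when moving from Opus 4.6 to 4.7.
INPUT_PRICE_PER_M = 5.00   # $ per 1M input tokens (unchanged across versions)
TOKEN_INFLATION = 1.35     # this page's worst case for code-heavy prompts

def monthly_input_cost(tokens_per_month: int, inflation: float = 1.0) -> float:
    return tokens_per_month / 1_000_000 * INPUT_PRICE_PER_M * inflation

baseline = monthly_input_cost(800_000_000)               # hypothetical 800M tokens/month
worst = monthly_input_cost(800_000_000, TOKEN_INFLATION)
print(f"4.6: ${baseline:,.2f}  4.7 worst case: ${worst:,.2f}")  # $4,000.00 vs $5,400.00
```

Plain prose sits at the other end of the range (1.0–1.05×), so weight the inflation factor by your actual prompt mix.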

Related

Claude Opus 4.7 overview
Features, benchmarks, full pricing
Opus 4.7 vs 4.6
Should you migrate?
Is Opus 4.7 worth it?
By use case — coding, research, writing
All model pricing
Compare every provider at once
Best AI for coding
Opus 4.7 vs all coding alternatives
What is Claude Mythos?
Anthropic's even more powerful unreleased model

Frequently asked questions

What is the Claude Opus 4.7 model ID?

The full model ID is claude-opus-4-7-20260416. Use this string in your API requests. The date suffix (20260416) pins you to the exact version snapshot — Anthropic may update the underlying model without changing the base name.

What effort levels does Claude Opus 4.7 support?

Claude Opus 4.7 supports five effort levels: low, medium, high, xhigh (new in 4.7), and max. Use xhigh when you need deeper reasoning than high but don't want the full latency of max. For production apps, high or xhigh are the recommended starting points.

How do I use prompt caching with Claude Opus 4.7?

Prompt caching reduces input token costs by up to 90%. Add cache_control: { type: 'ephemeral' } to the content blocks you want cached (system prompts, large documents, tool definitions). The cache TTL is 5 minutes by default. Note: because Opus 4.7 uses a new tokenizer, re-verify your cache boundaries if you're migrating from Opus 4.6.

How does the Opus 4.7 tokenizer differ from Opus 4.6?

Opus 4.7's tokenizer maps the same text to up to 1.35× more tokens than Opus 4.6. The per-token price is unchanged at $5/$25 per million, but your token count — and therefore bill — will be higher. Code and technical text sees the largest impact (up to 1.35×); plain prose sees minimal impact (1.0–1.05×).

What is the context window for Claude Opus 4.7?

Claude Opus 4.7 supports a 1,000,000 token context window — 5× larger than Opus 4.6's 200K. The maximum output length is 32,000 tokens (32K).

Is Claude Opus 4.7 available on AWS Bedrock and Google Vertex AI?

Yes. Claude Opus 4.7 is available on AWS Bedrock (model ID: anthropic.claude-opus-4-7-20260416-v1:0), Google Vertex AI, and Microsoft Foundry (Azure AI). Each platform uses different model identifiers — check their docs for the exact string.