Model IDs, effort levels, prompt caching, tokenizer migration, and code examples for Python and TypeScript.
| Spec | Value |
| --- | --- |
| Model ID | `claude-opus-4-7-20260416` |
| Context window | 1,000,000 tokens |
| Max output | 32,000 tokens |
| Input / output pricing | $5 / $25 per 1M tokens |
## Model IDs by platform

| Platform | Model ID |
| --- | --- |
| Anthropic API | `claude-opus-4-7-20260416` |
| AWS Bedrock | `anthropic.claude-opus-4-7-20260416-v1:0` |
| Google Vertex AI | `claude-opus-4-7@20260416` |
| Microsoft Foundry (Azure) | `claude-opus-4-7-20260416` (check the Azure docs for the exact string) |
## Basic usage — Python

```python
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7-20260416",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Review this PR diff and identify any bugs."
        }
    ]
)
print(message.content[0].text)
```
## Basic usage — TypeScript

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const message = await client.messages.create({
  model: "claude-opus-4-7-20260416",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: "Review this PR diff and identify any bugs."
    }
  ]
});
console.log(message.content[0].text);
```
## Effort levels

Opus 4.7 adds a new `xhigh` effort level between `high` and `max`.

| Level | Latency | Quality | Best for |
| --- | --- | --- | --- |
| `low` | Fastest | Basic | Simple factual queries, classification |
| `medium` | Fast | Good | Summarisation, drafting, light reasoning |
| `high` | Moderate | Strong | Coding, analysis, most production tasks |
| `xhigh` (new) | Slow | Very strong | Hard coding problems, complex reasoning |
| `max` | Slowest | Maximum | Frontier tasks, novel research, highest accuracy |
```python
# Using xhigh effort for hard coding tasks
message = client.messages.create(
    model="claude-opus-4-7-20260416",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "effort": "xhigh"  # new in Opus 4.7
    },
    messages=[{"role": "user", "content": prompt}]
)
```
## Prompt caching — 90% off input costs

Cache your system prompt, large documents, or tool definitions. Cached tokens cost $0.50/1M instead of $5/1M.

```python
message = client.messages.create(
    model="claude-opus-4-7-20260416",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a senior software engineer...",
            "cache_control": {"type": "ephemeral"}  # cache this block
        }
    ],
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": large_codebase_context,  # also cache this
                    "cache_control": {"type": "ephemeral"}
                },
                {
                    "type": "text",
                    "text": "Now review the following diff..."
                }
            ]
        }
    ]
)
```
**Migration note:** Opus 4.7's tokenizer changes where token boundaries fall. If you migrated from Opus 4.6, re-verify that your `cache_control` breakpoints still align with where you expect — cache misses can double your costs.
## Vision — 98.5% accuracy, 2,576px images

Opus 4.7 handles images up to 2,576px on the long edge. Pass them as base64 or via URL.
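As a sketch, a base64 image request can be assembled like this. The helper name and the placeholder bytes are illustrative, not part of the SDK; the content-block shape (`type: "image"` with a `base64` source) follows the standard Anthropic Messages API format:

```python
import base64


def build_image_message(image_bytes: bytes, question: str) -> dict:
    """Build a user message pairing a base64-encoded PNG with a text prompt."""
    return {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": base64.standard_b64encode(image_bytes).decode("ascii"),
                },
            },
            {"type": "text", "text": question},
        ],
    }


# In real code you would read the bytes from disk or an upload,
# e.g. open("diagram.png", "rb").read(); placeholder bytes shown here.
msg = build_image_message(b"\x89PNG\r\n\x1a\n", "What does this diagram show?")
```

Pass the resulting dict in the `messages` list of `client.messages.create(...)` exactly as in the basic-usage example above.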
The full model ID is `claude-opus-4-7-20260416`. Use this string in your API requests. The date suffix (`20260416`) pins you to the exact version snapshot — Anthropic may update the underlying model without changing the base name.
### What effort levels does Claude Opus 4.7 support?

Claude Opus 4.7 supports five effort levels: `low`, `medium`, `high`, `xhigh` (new in 4.7), and `max`. Use `xhigh` when you need deeper reasoning than `high` but don't want the full latency of `max`. For production apps, `high` or `xhigh` are the recommended starting points.
### How do I use prompt caching with Claude Opus 4.7?

Prompt caching reduces input token costs by up to 90%. Add `cache_control: {"type": "ephemeral"}` to the content blocks you want cached (system prompts, large documents, tool definitions). The cache TTL is 5 minutes by default. Note: because Opus 4.7 uses a new tokenizer, re-verify your cache boundaries if you're migrating from Opus 4.6.
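To see the scale of the savings, here is a back-of-the-envelope calculation using only the prices quoted above ($5/1M uncached vs $0.50/1M cached). It deliberately ignores output tokens and any cache-write surcharge, so treat it as a lower bound on your real bill, not an exact quote:

```python
def input_cost_usd(prompt_tokens: int, requests: int, cached: bool) -> float:
    """Input-token cost for `requests` calls that each resend the same prompt."""
    price_per_million = 0.50 if cached else 5.00  # $/1M input tokens
    return prompt_tokens * requests * price_per_million / 1_000_000


# A 100K-token context reused across 1,000 requests:
uncached = input_cost_usd(100_000, 1_000, cached=False)  # $500
cached = input_cost_usd(100_000, 1_000, cached=True)     # $50
```

At that volume the cached run costs a tenth of the uncached one, which is where the "90% off" headline figure comes from.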
### How does the Opus 4.7 tokenizer differ from Opus 4.6?

Opus 4.7's tokenizer maps the same text to up to 1.35× more tokens than Opus 4.6. The per-token price is unchanged at $5/$25 per million, but your token count — and therefore bill — will be higher. Code and technical text see the largest impact (up to 1.35×); plain prose sees minimal impact (1.0–1.05×).
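For budgeting, the multiplier can be folded into an "effective" price per million Opus-4.6-equivalent tokens. This sketch assumes the worst-case 1.35× figure from above; the multiplier for your own traffic will vary and is best measured empirically:

```python
def effective_price(base_price_per_million: float, token_multiplier: float) -> float:
    """Effective $/1M Opus-4.6-equivalent tokens after the tokenizer change."""
    return base_price_per_million * token_multiplier


# Worst case for code-heavy prompts: same text, 1.35x the tokens.
input_price = effective_price(5.00, 1.35)    # ~ $6.75 per 1M 4.6-equivalent tokens
output_price = effective_price(25.00, 1.35)  # ~ $33.75
```

In other words, even though list prices are unchanged, code-heavy workloads migrated from Opus 4.6 should budget for up to ~35% higher token spend.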
### What is the context window for Claude Opus 4.7?

Claude Opus 4.7 supports a 1,000,000-token context window — 5× larger than Opus 4.6's 200K. The maximum output length is 32,000 tokens (32K).
### Is Claude Opus 4.7 available on AWS Bedrock and Google Vertex AI?

Yes. Claude Opus 4.7 is available on AWS Bedrock (model ID: `anthropic.claude-opus-4-7-20260416-v1:0`), Google Vertex AI, and Microsoft Foundry (Azure AI). Each platform uses different model identifiers — check their docs for the exact string.