UseRightAI
Cut through AI hype. Pick what works.

Independent AI model tracker. Live pricing, real benchmarks, zero vendor bias.



© 2026 UseRightAI. Independent · Free forever · Not affiliated with any AI provider.

Affiliate links are clearly labeled. See disclosures.

Developer guide · Updated Apr 16, 2026

Claude Opus 4.7 API

Model IDs, effort levels, prompt caching, tokenizer migration, and code examples for Python and TypeScript.

Model ID: claude-opus-4-7-20260416
Context window: 1,000,000 tokens
Max output: 32,000 tokens
Input / output pricing: $5 / $25 per 1M tokens
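At these rates, per-request cost is easy to estimate. A minimal sketch using the pricing above (the request sizes in the example are hypothetical):

```python
# Opus 4.7 list prices from the table above: $5 input / $25 output per 1M tokens.
INPUT_PER_M, OUTPUT_PER_M = 5.00, 25.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single API request at list prices."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# e.g. a 100K-token prompt with a 2K-token reply:
print(round(request_cost(100_000, 2_000), 2))  # 0.55
```

Note this ignores prompt caching, which can cut the input side by up to 90%.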

Model IDs by platform

Anthropic API: claude-opus-4-7-20260416
AWS Bedrock: anthropic.claude-opus-4-7-20260416-v1:0
Google Vertex AI: claude-opus-4-7@20260416
Microsoft Foundry (Azure): claude-opus-4-7-20260416 (check Azure docs)
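Because each platform expects a different identifier, it can help to keep them in one place. A minimal sketch mirroring the list above (the platform keys are our own naming; Azure is omitted because its exact string is unconfirmed):

```python
# Opus 4.7 model IDs per platform, as listed above.
OPUS_4_7_IDS = {
    "anthropic": "claude-opus-4-7-20260416",
    "bedrock": "anthropic.claude-opus-4-7-20260416-v1:0",
    "vertex": "claude-opus-4-7@20260416",
}

def model_id(platform: str) -> str:
    """Return the Opus 4.7 model ID for a platform, failing loudly on typos."""
    if platform not in OPUS_4_7_IDS:
        raise ValueError(f"unknown platform: {platform!r}")
    return OPUS_4_7_IDS[platform]

print(model_id("bedrock"))  # anthropic.claude-opus-4-7-20260416-v1:0
```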

Basic usage — Python

import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-7-20260416",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Review this PR diff and identify any bugs."
        }
    ]
)

print(message.content[0].text)

Basic usage — TypeScript

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const message = await client.messages.create({
  model: "claude-opus-4-7-20260416",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: "Review this PR diff and identify any bugs."
    }
  ]
});

console.log(message.content[0].text);

Effort levels

Opus 4.7 adds a new xhigh effort level between high and max.

Level · Latency · Quality · Best for
low · Fastest · Basic · Simple factual queries, classification
medium · Fast · Good · Summarisation, drafting, light reasoning
high · Moderate · Strong · Coding, analysis, most production tasks
xhigh (new in 4.7) · Slow · Very strong · Hard coding problems, complex reasoning
max · Slowest · Maximum · Frontier tasks, novel research, highest accuracy

# Using xhigh effort for hard coding tasks
message = client.messages.create(
    model="claude-opus-4-7-20260416",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "effort": "xhigh"   # new in Opus 4.7
    },
    messages=[{"role": "user", "content": prompt}]
)

Prompt caching — 90% off input costs

Cache your system prompt, large documents, or tool definitions. Cached tokens cost $0.50/1M instead of $5/1M.

message = client.messages.create(
    model="claude-opus-4-7-20260416",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "You are a senior software engineer...",
            "cache_control": {"type": "ephemeral"}  # cache this block
        }
    ],
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": large_codebase_context,   # also cache this
                    "cache_control": {"type": "ephemeral"}
                },
                {
                    "type": "text",
                    "text": "Now review the following diff..."
                }
            ]
        }
    ]
)
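To see what the 90% discount means in practice, here is a rough back-of-envelope for a request with a large cached prefix. The token counts are hypothetical, and this ignores any surcharge for the initial cache write:

```python
# Input prices from this page: $5/1M fresh, $0.50/1M for cached tokens.
FRESH, CACHED = 5.00, 0.50

prefix_tokens = 150_000   # cached system prompt + codebase context (hypothetical)
suffix_tokens = 2_000     # fresh user turn on each request (hypothetical)

hit_cost = prefix_tokens / 1e6 * CACHED + suffix_tokens / 1e6 * FRESH
miss_cost = (prefix_tokens + suffix_tokens) / 1e6 * FRESH

print(f"cache hit: ${hit_cost:.3f}  cache miss: ${miss_cost:.3f}")  # $0.085 vs $0.760
```

At these sizes a miss costs roughly 9× as much as a hit, which is why verifying cache alignment after a tokenizer change matters.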

Migration note: Opus 4.7's tokenizer changes where token boundaries fall. If you're migrating from Opus 4.6, re-verify that your cache_control breakpoints still line up where you expect: a missed cache bills those tokens at the full $5/1M instead of $0.50/1M, which can double your overall bill.

Vision — 98.5% accuracy, 2,576px images

Opus 4.7 handles images up to 2,576px on the long edge. Pass them as base64 or via URL.

import base64

with open("chart.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-opus-4-7-20260416",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Analyse this financial chart and summarise the trend."
                }
            ],
        }
    ],
)

Tokenizer migration checklist

Work through this checklist before moving production traffic from Opus 4.6 to 4.7:

Count tokens on your 10 most-used prompts using the Anthropic token counting API — compare 4.6 vs 4.7
Model your cost delta: if p99 prompts see >1.15× token count, recalculate your monthly API budget
Re-test prompt caching: cache boundaries shift with the new tokenizer, re-verify hit rates
Diff 50–100 outputs: Opus 4.7 is more literal — prompts relying on lenient interpretation may produce different results
Update monitoring: alert on token count spikes in the first week after migration
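The cost-delta step can be sketched with this page's pricing ($5/1M input) and the worst-case 1.35× token inflation; the monthly volume in the example is hypothetical:

```python
# Worst-case input-cost change when moving from Opus 4.6 to 4.7.
INPUT_PRICE_PER_M = 5.00   # $ per 1M input tokens (unchanged across versions)
TOKEN_INFLATION = 1.35     # this page's worst case for code-heavy prompts

def monthly_input_cost(tokens_per_month: int, inflation: float = 1.0) -> float:
    return tokens_per_month / 1_000_000 * INPUT_PRICE_PER_M * inflation

baseline = monthly_input_cost(800_000_000)               # hypothetical 800M tokens/month
worst = monthly_input_cost(800_000_000, TOKEN_INFLATION)
print(f"4.6: ${baseline:,.2f}  4.7 worst case: ${worst:,.2f}")  # $4,000.00 vs $5,400.00
```

Plain prose sits at the other end of the range (1.0–1.05×), so weight the inflation factor by your actual prompt mix.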

Related

Claude Opus 4.7 overview
Features, benchmarks, full pricing
Opus 4.7 vs 4.6
Should you migrate?
Is Opus 4.7 worth it?
By use case — coding, research, writing
All model pricing
Compare every provider at once
Best AI for coding
Opus 4.7 vs all coding alternatives
What is Claude Mythos?
Anthropic's even more powerful unreleased model

Frequently asked questions

What is the Claude Opus 4.7 model ID?

The full model ID is claude-opus-4-7-20260416. Use this string in your API requests. The date suffix (20260416) pins you to the exact version snapshot — Anthropic may update the underlying model without changing the base name.

What effort levels does Claude Opus 4.7 support?

Claude Opus 4.7 supports five effort levels: low, medium, high, xhigh (new in 4.7), and max. Use xhigh when you need deeper reasoning than high but don't want the full latency of max. For production apps, high or xhigh are the recommended starting points.

How do I use prompt caching with Claude Opus 4.7?

Prompt caching reduces input token costs by up to 90%. Add cache_control: { type: 'ephemeral' } to the content blocks you want cached (system prompts, large documents, tool definitions). The cache TTL is 5 minutes by default. Note: because Opus 4.7 uses a new tokenizer, re-verify your cache boundaries if you're migrating from Opus 4.6.

How does the Opus 4.7 tokenizer differ from Opus 4.6?

Opus 4.7's tokenizer maps the same text to up to 1.35× more tokens than Opus 4.6. The per-token price is unchanged at $5/$25 per million, but your token count — and therefore bill — will be higher. Code and technical text sees the largest impact (up to 1.35×); plain prose sees minimal impact (1.0–1.05×).

What is the context window for Claude Opus 4.7?

Claude Opus 4.7 supports a 1,000,000 token context window — 5× larger than Opus 4.6's 200K. The maximum output length is 32,000 tokens (32K).

Is Claude Opus 4.7 available on AWS Bedrock and Google Vertex AI?

Yes. Claude Opus 4.7 is available on AWS Bedrock (model ID: anthropic.claude-opus-4-7-20260416-v1:0), Google Vertex AI, and Microsoft Foundry (Azure AI). Each platform uses different model identifiers — check their docs for the exact string.