Every major AI model scored across 6 capability dimensions — coding, writing, research, images, budget efficiency, and long-context handling. Filter by provider or tier, sort any column, and find the right model for your specific use case.
Scores (0–100) are derived from public benchmarks, reported provider data, and independent testing · Updated when models change
How to use the capability matrix
The AI capability matrix shows every major model — GPT-5.4, Claude Sonnet 4.6, Gemini 2.0 Pro, Grok 3, Llama 4, Mistral Medium, DeepSeek V3, and more — scored from 0 to 100 on coding, writing, research, image generation, budget efficiency, and long-context performance. Use the filters to narrow by provider (OpenAI, Anthropic, Google, xAI, Meta, Mistral, DeepSeek) or price tier (Budget, Balanced, Premium). Click any column header to sort. The "Strong at" filter shows only models scoring 70+ in your chosen capability.
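For illustration, the "Strong at" filter and the column sort described above behave like this sketch. The model names and scores below are made up for the example, not the live matrix data; only the 70+ threshold comes from the page itself.

```python
# Hypothetical sketch of the matrix's "Strong at" filter and column sort.
# Sample data is illustrative, not the live matrix.
models = [
    {"name": "Model A", "provider": "OpenAI", "coding": 88, "writing": 90},
    {"name": "Model B", "provider": "Anthropic", "coding": 95, "writing": 96},
    {"name": "Model C", "provider": "DeepSeek", "coding": 68, "writing": 60},
]

def strong_at(models, capability, threshold=70):
    """Keep only models scoring at or above the threshold in one capability."""
    return [m for m in models if m[capability] >= threshold]

def sort_by(models, column):
    """Sort descending by a column, mirroring a column-header click."""
    return sorted(models, key=lambda m: m[column], reverse=True)

ranked = sort_by(strong_at(models, "coding"), "coding")
# Model C (68) is filtered out; Model B (95) sorts ahead of Model A (88).
```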
Capability score methodology
Coding scores incorporate SWE-bench Verified (real GitHub issue resolution), HumanEval (Python code generation pass@1), and observed refactoring quality. Writing scores weight tone control, coherence, long-form output, and independent writing evaluations. Research scores reflect document synthesis, GPQA (graduate-level science), and multi-step reasoning. Image scores cover generation quality and vision understanding. Budget scores weight output quality against per-token cost, so cheaper models with comparable quality score higher. Long-context scores reflect context window size and coherence maintenance across long inputs.
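As a rough mental model, a capability score can be read as a weighted blend of its inputs. The weights below are hypothetical; the page does not publish its exact weighting.

```python
# Hypothetical weighted blend for a coding score from three 0-100 inputs.
# The weights are illustrative assumptions, not the matrix's real methodology.
def coding_score(swe_bench, humaneval, refactor_quality,
                 weights=(0.5, 0.3, 0.2)):
    """Blend three benchmark-style inputs into a single 0-100 score."""
    parts = (swe_bench, humaneval, refactor_quality)
    return round(sum(w * p for w, p in zip(weights, parts)), 1)
```

Because the weights sum to 1.0, a model scoring the same on every input keeps that score unchanged after blending.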
Each score (0–100) reflects how well a model performs for that use case, derived from benchmark results, real-world testing, and community data. 88+ is best-in-class, 72–87 is strong, 55–71 is capable, below 55 is limited for that task.
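The four bands above map directly to score ranges, which can be sketched as a simple lookup (the band names and cutoffs come from the page; the function itself is just an illustration):

```python
def score_band(score):
    """Map a 0-100 capability score to the band used in the matrix."""
    if score >= 88:
        return "best-in-class"
    if score >= 72:
        return "strong"
    if score >= 55:
        return "capable"
    return "limited"
```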
Which AI model is best for coding?
Claude Sonnet 4.6 leads coding with a score of 95/100, driven by its 79.6% SWE-bench Verified score — the highest of any production model. GPT-5.4 is the runner-up at 88.
Which AI model is best for writing?
Claude Sonnet 4.6 leads writing quality with a score of 96/100. Its tone control, coherence, and long-form output polish are consistently rated best-in-class across independent evaluations.
Which AI models support image generation?
GPT-5.4 (via DALL-E 3) and Gemini 2.0 Pro offer strong image generation. Claude models do not support image generation: they can analyze images but cannot create them.
What is the best budget AI model?
DeepSeek V3 and Mistral Medium lead the budget category with scores above 85, offering near-premium quality at $0.07–$0.40/1M tokens, a fraction of the cost of GPT-5.4 or Claude Sonnet 4.6.
Which AI has the longest context window?
Gemini 2.0 Pro and Claude Sonnet 4.6 both offer 1M token context windows — enough for entire codebases or book-length documents. GPT-5.4 supports 272K tokens.