UseRightAI
Home · Models · Compare · Pricing · What's New
UseRightAI
Cut through AI hype. Pick what works.

Independent AI model tracker. Live pricing, real benchmarks, zero vendor bias.

X (Twitter) · LinkedIn · Updates · Contact

Compare

ChatGPT vs Claude · GPT-4o vs Claude Sonnet · Claude vs Gemini · DeepSeek vs ChatGPT · Mistral vs Claude · Gemini Flash vs GPT-4o Mini · Llama vs ChatGPT · Build your own →

Best For

Coding · Writing · Developers · Product Managers · Designers · Sales · Best Cheap AI · Best Free AI

Pricing & Data

API Token Pricing · Price History · Benchmark Scores · Privacy & Safety · Subscription Plans · Cost Calculator · Which AI is Cheapest?

Company

About UseRightAI · Contact · What Changed · All Models · Disclosures · Privacy Policy · Terms of Service

© 2026 UseRightAI. Independent · Free forever · Not affiliated with any AI provider.

Affiliate links are clearly labeled. See disclosures.

London, UK & Mountain View, CA · Founded 2023 (DeepMind + Google Brain merger)

Google DeepMind

The longest context window in AI, built into everything Google.

Google DeepMind builds the Gemini family. Gemini 3.1 Pro leads on long-context work with a 2M token window — double that of any competitor. Gemini 3.1 Flash is the fastest quality model for high-volume production workloads.

Rankings refresh daily · Scored on 6 criteria · No paid rankings
  • Gemini 3.1 Pro has a 2M token context window — largest in the industry
  • Gemini 3.1 Flash is the fastest quality closed model for production APIs
  • Integrated into Google Workspace, Search, and Android
17 models

All Google DeepMind Models

Every Google DeepMind model in the directory, ranked by overall capability score.

Google · Premium

Gemini 3.1 Pro

Google's flagship with the largest context window of any frontier model at 2M tokens, Deep Think reasoning, and the best price-to-performance among premium models.

Verdict
Best for research and deep document analysis — 2M context at the best premium price.
Quality score
88%
Pricing
$2.00/1M in
$12.00/1M out

Google DeepMind API Pricing

Per 1 million tokens. Updated when providers change prices.

Model (tier) · Input /1M · Output /1M · Context · Speed
Gemini 3.1 Pro (Premium) · $2.00 · $12.00 · 2M · Balanced
Google: Gemini 2.5 Pro (Balanced) · $1.25 · $10.00 · 1M · Balanced
Google: Gemini 2.0 Flash (Budget) · $0.10 · $0.40 · 1M · Very fast
Gemini 3.1 Flash (Budget) · $0.50 · $3.00 · 1M · Very fast
Google: Gemini 2.5 Pro Preview 05-06 (Balanced) · $1.25 · $10.00 · 1M · Deliberate
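Per-token billing makes cost estimation a two-term multiplication. A minimal sketch in Python, using the Gemini 3.1 Pro rates from the table above; the 100K-token example is illustrative:

```python
# Per-call cost at per-token rates (USD per 1M tokens, from the table above).
INPUT_RATE = 2.00    # Gemini 3.1 Pro input
OUTPUT_RATE = 12.00  # Gemini 3.1 Pro output

def call_cost(input_tokens: int, output_tokens: int,
              in_rate: float = INPUT_RATE, out_rate: float = OUTPUT_RATE) -> float:
    """USD cost of one API request."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A 100K-token document summarized into 2K tokens:
print(round(call_cost(100_000, 2_000), 3))  # → 0.224
```

The same function works for any model on this page by swapping in that model's rates.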

Google DeepMind Subscription Plans

Consumer plans for access without the API.

Best value for Google users
Google One AI Premium
$19.99/mo
  • Gemini 3.1 Pro in Gemini app and Google products
  • Gemini in Gmail, Docs, Sheets, Slides
  • NotebookLM Plus (extended features)
  • 2TB Google One storage
Get Google One AI Premium
Compare all subscription plans →
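The flat plan can be framed against per-token billing. A rough sketch using the Gemini 3.1 Pro API rates listed on this page. Note that the plan covers Google's consumer apps rather than API access, so this only illustrates what $19.99 of usage looks like in token terms:

```python
PLAN_PRICE = 19.99               # Google One AI Premium, USD per month
IN_RATE, OUT_RATE = 2.00, 12.00  # Gemini 3.1 Pro API, USD per 1M tokens

def api_monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """What the same monthly volume would cost at per-token API rates."""
    return input_tokens / 1e6 * IN_RATE + output_tokens / 1e6 * OUT_RATE

# Hypothetical monthly profile: 5M tokens in, 1M tokens out.
monthly = api_monthly_cost(5_000_000, 1_000_000)
print(round(monthly, 2))     # → 22.0
print(monthly > PLAN_PRICE)  # → True: this volume already exceeds the flat price
```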

Compare Google DeepMind Models

Head-to-head comparisons for the most-searched questions.

Gemini 3.1 Pro vs Google: Gemini 2.5 Pro
Gemini 3.1 Pro vs Google: Gemini 2.0 Flash
Google: Gemini 2.5 Pro vs Google: Gemini 2.0 Flash
Google: Gemini 2.5 Pro vs Gemini 3.1 Flash
Google: Gemini 2.0 Flash vs Gemini 3.1 Flash
Google: Gemini 2.0 Flash vs Google: Gemini 2.5 Pro Preview 05-06
Gemini 3.1 Flash vs Google: Gemini 2.5 Pro Preview 05-06
Gemini 3.1 Flash vs Google: Gemini 2.5 Flash
Open compare tool →

Newsletter

Get notified when Google DeepMind releases new models

Pricing changes, new releases, and ranking shifts — straight to your inbox.

No spam. Useful updates only. Affiliate disclosures always clearly labeled.

Google DeepMind FAQ

What is Google's best AI model in 2026?

Gemini 3.1 Pro is Google's most capable model — it has the largest context window (2M tokens) and strong performance across research, writing, and coding. Gemini 3.1 Flash is the better pick for speed-sensitive production use cases.

What is Gemini's context window?

Gemini 3.1 Pro has a 2M token context window — the largest of any frontier model. That's roughly 1.5 million words, or an entire large codebase in a single prompt. Gemini 3.1 Flash also supports 1M tokens.
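The word estimate follows from the common rule of thumb that a token is about three-quarters of an English word, a heuristic that varies by tokenizer and text rather than an exact figure:

```python
TOKENS_PER_WORD = 4 / 3  # heuristic: ~0.75 English words per token

def tokens_to_words(tokens: int) -> int:
    """Rough word-count equivalent of a token budget."""
    return round(tokens / TOKENS_PER_WORD)

print(tokens_to_words(2_000_000))  # → 1500000, i.e. ~1.5 million words
```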

How does Gemini compare to Claude and GPT?

Gemini 3.1 Pro leads on context window (2M vs 1M for Claude/GPT). Claude Opus 4.7 leads on coding (SWE-Bench Pro). GPT-5.4 leads on agentic/computer-use workflows. For large document analysis and long-context research, Gemini 3.1 Pro is the strongest choice.

Is Gemini free to use?

Yes — Gemini 3.1 Flash has a free tier in Google AI Studio. Gemini Advanced ($19.99/month) provides access to Gemini 3.1 Pro in Google's consumer apps. The API charges per token for production use.

Explore other providers

OpenAI · Anthropic · xAI · Meta · Mistral · DeepSeek · Browse all models →
Speed
Balanced
Best for research, deep document analysis, and long-context reasoning at competitive pricing.
Context
2M tokens
The 2M context window is a genuine competitive advantage — no other frontier model gets close for document-heavy workflows.
Research leader · 2M context · Best value premium · Deep Think
View model
Google · Balanced

Google: Gemini 2.5 Pro

Gemini 2.5 Pro is Google's flagship reasoning-capable model with a massive 1M token context window, designed for complex analysis, coding, and multimodal tasks. It balances frontier-level intelligence with competitive mid-tier pricing.

Verdict
The best Google model for serious, complex work — especially when you need to fit an entire codebase or document corpus into a single prompt.
Quality score
87%
Pricing
$1.25/1M in
$10.00/1M out
Speed
Balanced
Best for deep reasoning over very long documents, complex codebases, or multimodal inputs where context size is a constraint with other models.
Context
1.0M tokens
Pricing shown is for prompts under 200K tokens; inputs over 200K tokens are billed at $2.50/1M input and $15/1M output. Gemini 2.5 Pro includes built-in 'thinking' (reasoning) mode which can increase latency and cost further.
Flagship · Long Context · Multimodal · Reasoning · Google
View model
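The long-context surcharge noted above makes Gemini 2.5 Pro's cost a step function of prompt size. A sketch, assuming the tier is selected by input length (verify the exact trigger against Google's documentation):

```python
# Gemini 2.5 Pro tiered rates from the card above (USD per 1M tokens).
# Assumption: the tier is chosen by input (prompt) length.
def gemini_25_pro_cost(input_tokens: int, output_tokens: int) -> float:
    if input_tokens <= 200_000:
        in_rate, out_rate = 1.25, 10.00  # standard tier
    else:
        in_rate, out_rate = 2.50, 15.00  # long-context tier
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Crossing the 200K threshold doubles this example's bill:
print(round(gemini_25_pro_cost(200_000, 5_000), 2))  # → 0.3
print(round(gemini_25_pro_cost(210_000, 5_000), 2))  # → 0.6
```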
Google · Budget

Google: Gemini 2.0 Flash

Gemini 2.0 Flash is Google's high-speed, cost-efficient multimodal model built for high-volume production workloads, offering a massive 1M token context window at near-throwaway pricing. It supports text, image, audio, and video inputs with strong instruction-following and tool-use capabilities.

Verdict
The best bang-for-buck multimodal workhorse for developers who need speed, scale, and a massive context window.
Quality score
76%
Pricing
$0.10/1M in
$0.40/1M out
Speed
Very fast
Best for high-throughput pipelines and agentic tasks where speed and cost matter more than peak reasoning quality.
Context
1.0M tokens
Pricing listed is for standard (non-cached) input/output. Context caching is available and can reduce costs significantly for repeated long-context calls. Image and audio inputs are priced separately. Free tier available via Google AI Studio.
Budget · Fast · Long Context · Multimodal · Google
View model
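The context-caching note above can be made concrete. In this sketch the `CACHE_DISCOUNT` value is a hypothetical placeholder, since the actual cached-token rates and storage fees are not listed on this page:

```python
# Gemini 2.0 Flash rates from the card above (USD per 1M tokens).
IN_RATE, OUT_RATE = 0.10, 0.40
CACHE_DISCOUNT = 0.75  # HYPOTHETICAL: cached input billed at 25% of normal

def batch_cost(calls: int, shared_prefix: int, unique_input: int,
               output: int, cached: bool) -> float:
    """Total USD for `calls` requests reusing one long shared prefix."""
    prefix_rate = IN_RATE * (1 - CACHE_DISCOUNT) if cached else IN_RATE
    per_call = (shared_prefix / 1e6 * prefix_rate
                + unique_input / 1e6 * IN_RATE
                + output / 1e6 * OUT_RATE)
    return calls * per_call

# 1,000 calls against the same 500K-token document:
print(round(batch_cost(1_000, 500_000, 1_000, 500, cached=False), 2))  # → 50.3
print(round(batch_cost(1_000, 500_000, 1_000, 500, cached=True), 2))   # → 12.8
```

The savings grow with the ratio of shared prefix to unique tokens, which is why caching matters most for repeated long-context calls.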
Google · Budget

Gemini 3.1 Flash

Fast, low-cost model with a 1M token context window — the best budget default for teams running high prompt volumes.

Verdict
Best cheap AI for broad day-to-day work — now with 1M context.
Quality score
75%
Pricing
$0.50/1M in
$3.00/1M out
Speed
Very fast
Best for high-volume everyday AI usage where speed and cost both matter.
Context
1M tokens
The default budget pick for startups watching cost. The 1M context at this price is unmatched.
Best budget · Fast · 1M context · Scalable
View model
Google · Balanced

Google: Gemini 2.5 Pro Preview 05-06

Gemini 2.5 Pro Preview 05-06 is Google's latest frontier reasoning model featuring a massive 1M token context window and strong multimodal capabilities. It targets developers and researchers needing deep analytical power with competitive pricing relative to its capability tier.

Verdict
The go-to model when you need a frontier brain and a million-token memory, at a price that won't immediately break your budget.
Quality score
86%
Pricing
$1.25/1M in
$10.00/1M out
Speed
Deliberate
Best for complex multi-document analysis, long-context reasoning, and advanced coding tasks where a massive context window is essential.
Context
1.0M tokens
This is a preview model (05-06 date suffix indicates a versioned snapshot); Google may deprecate or change it without long notice. Confirm production readiness before building critical pipelines on this endpoint. The 1M context window applies to text and multimodal inputs combined.
Long Context · Reasoning · Multimodal · Frontier · Preview
View model
Google · Budget

Google: Gemini 2.5 Flash

Gemini 2.5 Flash is Google's fast, cost-efficient multimodal model built for high-throughput tasks requiring a million-token context window at budget pricing. It balances speed and capability across text, code, and vision tasks without the cost of flagship models like Gemini 2.5 Pro.

Verdict
The go-to budget model for long-context and multimodal workloads where speed and scale matter.
Quality score
76%
Pricing
$0.30/1M in
$2.50/1M out
Speed
Very fast
Best for high-volume document processing, summarization, and coding assistance where cost and speed matter more than peak accuracy.
Context
1.0M tokens
Output cost ($2.50/1M) is disproportionately higher than input cost ($0.30/1M), so generation-heavy use cases may see costs add up faster than expected. Thinking/reasoning mode may be available but incurs additional cost.
Budget · Fast · Long Context · Multimodal · Google
View model
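The output-cost caveat above is easy to quantify: at $0.30/$2.50, an output token costs over 8x an input token, so generation-heavy workloads are dominated by the output side. A quick sketch:

```python
# Gemini 2.5 Flash rates from the card above (USD per 1M tokens).
IN_RATE, OUT_RATE = 0.30, 2.50

def cost_split(input_tokens: int, output_tokens: int) -> tuple:
    """Return (input_cost, output_cost) in USD for one request."""
    return input_tokens / 1e6 * IN_RATE, output_tokens / 1e6 * OUT_RATE

# Drafting workload (output-heavy): 2K tokens in, 10K tokens out.
in_cost, out_cost = cost_split(2_000, 10_000)
print(out_cost / in_cost)  # output side is ~42x the input side here
```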
Google · Budget

Google: Gemini 3 Flash Preview

Gemini 3 Flash Preview is Google's budget-tier multimodal model optimized for high-throughput, low-latency tasks at scale. It offers a massive 1M token context window at aggressive pricing, making it a strong contender for cost-sensitive production workloads.

Verdict
A fast, affordable workhorse for long-context and high-volume tasks — just don't build critical systems on a Preview model.
Quality score
74%
Pricing
$0.50/1M in
$3.00/1M out
Speed
Very fast
Best for high-volume document processing, summarization pipelines, and long-context tasks where cost efficiency matters more than frontier-level reasoning.
Context
1.0M tokens
This is a preview model and may have limited availability, unstable rate limits, and pricing that changes before general availability. Output cost at $3/1M is notably higher than input cost, so applications generating long outputs should budget accordingly.
Budget · Long Context · Fast · Multimodal · Preview
View model
Google · Balanced

Google: Gemini 2.5 Pro Preview 06-05

Gemini 2.5 Pro Preview 06-05 is the most capable reasoning-focused model in the Gemini 2.5 line, featuring a massive 1M token context window and strong performance across code, math, and complex analysis tasks. It represents Google's top-tier offering in that generation, optimized for depth over speed.

Verdict
The most capable model of the Gemini 2.5 generation — a top-tier reasoning and coding powerhouse with a huge context window, held back only by its preview status and output cost.
Quality score
83%
Pricing
$1.25/1M in
$10.00/1M out
Speed
Deliberate
Best for complex multi-step reasoning, large codebase analysis, and tasks requiring deep synthesis across very long documents.
Context
1.0M tokens
This is a preview model (06-05 date suffix indicates a versioned snapshot); Google may deprecate or modify it before a stable GA release. Pricing tiers differ based on prompt length — prompts over 200K tokens are charged at $2.50/1M input and $15/1M output, significantly increasing cost for very long-context use cases.
Flagship · Long Context · Reasoning · Coding · Preview
View model
Google · Budget

Gemma 4 31B

Gemma 4 31B is Google's open-weight instruction-tuned model offering a strong balance of capability and cost efficiency at just $0.14/$0.40 per million tokens. It features a 262K context window and is designed for developers who need capable on-premise or API-hosted inference without flagship pricing.

Verdict
A well-priced, long-context open-weight model that's ideal for high-volume developer workloads but won't match frontier models on complex reasoning.
Quality score
66%
Pricing
$0.14/1M in
$0.40/1M out
Speed
Fast
Best for cost-conscious developers needing a capable open-weight model for coding assistance, summarization, and document analysis at scale.
Context
262k tokens
As an open-weight model, Gemma 4 31B can be self-hosted via Ollama or Hugging Face in addition to Google's API. Pricing shown is for hosted inference. No image input capability confirmed at launch.
Open Weight · Budget · Long Context · Coding · Self-Hostable
View model
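Because the weights are open, the hosted rates above can be weighed against running your own hardware. A rough break-even sketch; the GPU price and throughput numbers are illustrative assumptions, not measurements:

```python
# Hosted Gemma 4 31B rates from the card above (USD per 1M tokens).
IN_RATE, OUT_RATE = 0.14, 0.40

# Illustrative self-hosting assumptions (NOT measured figures):
GPU_COST_PER_HOUR = 2.00    # hypothetical rented-GPU price, USD
TOKENS_PER_SECOND = 1_500   # hypothetical batched throughput

def hosted_cost(input_tokens: float, output_tokens: float) -> float:
    return input_tokens / 1e6 * IN_RATE + output_tokens / 1e6 * OUT_RATE

def self_host_cost(total_tokens: float) -> float:
    gpu_hours = total_tokens / TOKENS_PER_SECOND / 3600
    return gpu_hours * GPU_COST_PER_HOUR

# 1B tokens per month, split 80/20 input/output:
print(round(hosted_cost(800e6, 200e6), 2))  # → 192.0
print(round(self_host_cost(1e9), 2))        # → 370.37
```

Under these toy numbers the hosted API still wins; cheaper hardware or higher batch throughput flips the result, which is why the break-even is worth computing for your own workload.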
Google · Balanced

Google: Nano Banana Pro (Gemini 3 Pro Image Preview)

Gemini 3 Pro Image Preview is Google's image-focused multimodal model designed for advanced visual understanding and generation tasks. It sits in the balanced price tier, targeting professional workflows that require strong image comprehension alongside text reasoning.

Verdict
A capable image-first multimodal model held back by a small context window and preview-stage instability.
Quality score
64%
Pricing
$2.00/1M in
$12.00/1M out
Speed
Balanced
Best for teams needing robust image analysis, visual question answering, and multimodal workflows at a mid-range price point.
Context
66k tokens
This is a preview model — API behavior, pricing, and availability may change before general release. The 66K context window is unusually constrained for a Gemini Pro-tier model; double-check if your use case requires longer contexts before committing.
Vision · Multimodal · Google · Preview · Image Analysis
View model
Google · Budget

Google: Gemini 2.5 Flash Lite Preview 09-2025

Gemini 2.5 Flash Lite Preview 09-2025 is Google's most cost-optimized variant of the Gemini 2.5 Flash family, designed for high-throughput, latency-sensitive applications at near-commodity pricing. It offers a massive 1M token context window at just $0.10/1M input tokens, positioning it as one of the cheapest long-context models available.

Verdict
The go-to model for cost-sensitive, high-volume pipelines that need a massive context window without breaking the budget.
Quality score
62%
Pricing
$0.10/1M in
$0.40/1M out
Speed
Very fast
Best for high-volume document processing, classification pipelines, and lightweight coding tasks where cost per token matters more than peak quality.
Context
1.0M tokens
This is a preview model (09-2025 versioned) and may be subject to breaking changes or deprecation. Pricing is approximate based on listed rates. Not recommended for production systems requiring SLA guarantees. Check Google AI Studio or Vertex AI for GA alternatives.
Budget · Long Context · Fast · High Throughput · Preview
View model
Google · Budget

Google: Gemini 2.5 Flash Lite

Gemini 2.5 Flash Lite is Google's lightest and most cost-efficient model in the 2.5 family, designed for high-throughput tasks where speed and price matter more than peak intelligence. It retains the massive 1M token context window from its larger siblings while cutting costs to a fraction of Gemini 2.5 Pro.

Verdict
The best cheap model for long-document pipelines, but don't expect flagship-level reasoning.
Quality score
57%
Pricing
$0.10/1M in
$0.40/1M out
Speed
Very fast
Best for high-volume, latency-sensitive applications like document triage, chatbot pipelines, and content classification at scale.
Context
1.0M tokens
Pricing is approximate based on listed rates. As a 'Lite' model, it may not support all multimodal features available in full Flash or Pro variants. Check Google AI Studio for feature availability and rate limits.
Budget · Fast · Long Context · High Volume · Google
View model
Google · Budget

Google: Gemini 2.0 Flash Lite

Gemini 2.0 Flash Lite is Google's ultra-budget, high-speed model designed for high-volume, cost-sensitive applications. It sits below Gemini 2.0 Flash in capability but offers the lowest price point in the Gemini 2.0 family with a massive 1M token context window.

Verdict
The go-to model when cost and throughput are everything and task complexity is low.
Quality score
57%
Pricing
$0.07/1M in
$0.30/1M out
Speed
Very fast
Best for high-throughput, cost-sensitive pipelines where speed and price matter more than top-tier reasoning quality.
Context
1.0M tokens
Pricing is among the lowest available in any major provider's lineup as of mid-2025. Context window of 1M tokens is a significant differentiator at this price tier. Check Google AI Studio and Vertex AI for rate limits on high-volume usage.
Ultra-budget · High-speed · Long context · High-volume · Google
View model
Google · Budget

Gemma 4 26B A4B

Gemma 4 26B A4B is a sparse mixture-of-experts open model from Google, activating only ~4B parameters per forward pass despite having 26B total parameters. It offers a 262K context window at budget pricing, making it one of the more capable open-weight models for its cost tier.

Verdict
A lean, fast, and surprisingly capable budget model best suited for high-volume text tasks where cost efficiency trumps peak quality.
Quality score
59%
Pricing
$0.13/1M in
$0.40/1M out
Speed
Fast
Best for cost-sensitive applications needing long-context processing with reasonable quality, such as document summarization pipelines or lightweight coding assistants.
Context
262k tokens
As an open-weight model, Gemma 4 26B can also be self-hosted, making API pricing largely irrelevant at scale. The 'A4B' suffix denotes the active parameter count in its MoE configuration. Listed as superseding Gemini 3 Flash Preview, though Gemini 2.0 Flash remains a stronger hosted alternative.
Open-weight · Budget · MoE · Long Context · Google
View model
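The A4B suffix explained above is the core of this model's economics: in a sparse mixture-of-experts, all 26B parameters occupy memory, but each token only incurs compute for the ~4B active ones. A rough sketch of that split (16-bit weights assumed):

```python
# Gemma 4 26B A4B: total vs active parameters (from the card above).
TOTAL_PARAMS = 26e9   # all experts: sets the memory footprint
ACTIVE_PARAMS = 4e9   # used per forward pass: sets per-token compute
BYTES_PER_PARAM = 2   # assuming 16-bit (fp16/bf16) weights

# Memory scales with TOTAL parameters...
weight_memory_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9
# ...but per-token FLOPs scale with ACTIVE parameters (~2 FLOPs/param).
gflops_per_token = 2 * ACTIVE_PARAMS / 1e9

print(weight_memory_gb)   # → 52.0 (GB just for weights)
print(gflops_per_token)   # → 8.0 (GFLOPs per generated token)
```

This is why MoE models can price like small models while needing large-model memory to host.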
Google · Budget

Google: Nano Banana (Gemini 2.5 Flash Image)

A budget-tier image-capable variant of Gemini 2.5 Flash, optimized for cost-effective multimodal tasks involving image understanding. Despite the whimsical internal name, it delivers Gemini 2.5 Flash's vision capabilities at a low price point.

Verdict
A scrappy budget image model that's fast and cheap on ingestion but constrained by a tiny context window.
Quality score
42%
Pricing
$0.30/1M in
$2.50/1M out
Speed
Very fast
Best for budget-conscious teams needing fast image analysis and visual question answering without flagship pricing.
Context
33k tokens
The 32,768 token context window is unusually small even for a budget model — verify this limit hasn't changed before deploying in production. The 'Nano Banana' name appears to be an internal or experimental identifier; confirm model availability and stability via Google AI Studio or Vertex AI before relying on it in critical workflows.
Budget · Image Analysis · Multimodal · Flash · Google
View model
Google · Balanced

Google: Gemma 2 27B

Gemma 2 27B is Google's largest open-weight model in the Gemma 2 family, designed for high-quality text generation, reasoning, and instruction-following at a mid-range price point. It punches above its weight class for an open model, rivaling some proprietary mid-tier offerings.

Verdict
A strong open-weight performer for short-context coding and reasoning, hobbled by an outdated 8K context limit.
Quality score
55%
Pricing
$0.65/1M in
$0.65/1M out
Speed
Fast
Best for teams that need strong open-weight model performance for coding and reasoning tasks without paying flagship prices.
Context
8k tokens
Symmetric input/output pricing at $0.65/1M tokens is straightforward but positions it oddly — it's pricier than GPT-4o Mini while lacking its multimodal features. Available via multiple inference providers including Google Vertex AI and third-party APIs.
Open Weight · Mid-Range · Text Only · Coding · Instruction Following
View model
Google · Budget

Google: Gemma 2 9B

Gemma 2 9B is Google's open-weight 9-billion parameter model designed for efficient on-device and API deployment. It punches above its weight class for instruction-following and general language tasks at an exceptionally low cost.

Verdict
A capable open-weight budget model hamstrung by a frustratingly small context window.
Quality score
45%
Pricing
$0.03/1M in
$0.09/1M out
Speed
Very fast
Best for lightweight text tasks, classification, and summarization where cost matters more than frontier-level quality.
Context
8k tokens
Pricing reflects API access through third-party providers; Google also offers Gemma 2 9B weights for free download and self-hosting. The 8,192 token limit is a hard architectural constraint of this version.
Open Weight · Budget · Small Model · Google · On-Device
View model
Google DeepMind API Pricing (continued)

Model (tier) · Input /1M · Output /1M · Context · Speed
Google: Gemini 2.5 Flash (Budget) · $0.30 · $2.50 · 1M · Very fast
Google: Gemini 3 Flash Preview (Budget) · $0.50 · $3.00 · 1M · Very fast
Google: Gemini 2.5 Pro Preview 06-05 (Balanced) · $1.25 · $10.00 · 1M · Deliberate
Gemma 4 31B (Budget) · $0.14 · $0.40 · 262K · Fast
Google: Nano Banana Pro (Gemini 3 Pro Image Preview) (Balanced) · $2.00 · $12.00 · 66K · Balanced
Google: Gemini 2.5 Flash Lite Preview 09-2025 (Budget) · $0.10 · $0.40 · 1M · Very fast
Google: Gemini 2.5 Flash Lite (Budget) · $0.10 · $0.40 · 1M · Very fast
Google: Gemini 2.0 Flash Lite (Budget) · $0.07 · $0.30 · 1M · Very fast
Gemma 4 26B A4B (Budget) · $0.13 · $0.40 · 262K · Fast
Google: Nano Banana (Gemini 2.5 Flash Image) (Budget) · $0.30 · $2.50 · 33K · Very fast
Google: Gemma 2 27B (Balanced) · $0.65 · $0.65 · 8K · Fast
Google: Gemma 2 9B (Budget) · $0.03 · $0.09 · 8K · Very fast
Compare all providers →