Gemini 3.1 Pro
Google's flagship with the largest context window of any frontier model at 2M tokens, Deep Think reasoning, and the best price-to-performance among premium models.
The longest context window in AI, built into everything Google.
Google DeepMind builds the Gemini family. Gemini 3.1 Pro leads on long-context work with a 2M token window, twice that of any competitor. Gemini 3.1 Flash is the fastest high-quality model for high-volume production workloads.
Every Google DeepMind model in the directory, ranked by overall capability score.
Per 1 million tokens. Updated when providers change prices.
| Model | Tier | Input / 1M | Output / 1M | Context | Speed |
|---|---|---|---|---|---|
| Gemini 3.1 Pro | Premium | $2.00 | $12.00 | 2M | Balanced |
| Gemini 2.5 Pro | Balanced | $1.25 | $10.00 | 1M | Balanced |
| Gemini 2.0 Flash | Budget | $0.10 | $0.40 | 1M | Very fast |
| Gemini 3.1 Flash | Budget | $0.50 | $3.00 | 1M | Very fast |
| Gemini 2.5 Pro Preview 05-06 | Balanced | $1.25 | $10.00 | 1M | Deliberate |
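Per-token billing is plain arithmetic: divide the token count by 1,000,000 and multiply by the rate. A minimal sketch using the Gemini 3.1 Pro rates from the table above (rates are hardcoded here for illustration; check current pricing before relying on them):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in USD for one request, given per-1M-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Gemini 3.1 Pro: $2.00 input / $12.00 output per 1M tokens (from the table above)
cost = request_cost(input_tokens=100_000, output_tokens=5_000,
                    input_rate=2.00, output_rate=12.00)
print(f"${cost:.2f}")  # → $0.26
```

Note the asymmetry: at these rates, output tokens cost 6x input tokens, so long generations dominate the bill even when prompts are large.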
Consumer plans for access without the API.
Head-to-head comparisons for the most-searched questions.
Gemini 3.1 Pro is Google's most capable model — it has the largest context window (2M tokens) and strong performance across research, writing, and coding. Gemini 3.1 Flash is the better pick for speed-sensitive production use cases.
Gemini 3.1 Pro has a 2M token context window, the largest of any frontier model. That's roughly 1.5 million words, or an entire large codebase in a single prompt. Gemini 3.1 Flash supports a 1M token window.
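The "roughly 1.5 million words" figure follows from the common rule of thumb that one token is about 0.75 English words (an approximation; actual ratios vary by tokenizer and text):

```python
WORDS_PER_TOKEN = 0.75  # rough rule of thumb for English prose

def tokens_to_words(tokens: int) -> int:
    """Estimate the English word count that fits in a token budget."""
    return round(tokens * WORDS_PER_TOKEN)

print(tokens_to_words(2_000_000))  # → 1500000
```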
Gemini 3.1 Pro leads on context window (2M vs 1M for Claude/GPT). Claude Opus 4.7 leads on coding (SWE-Bench Pro). GPT-5.4 leads on agentic/computer-use workflows. For large document analysis and long-context research, Gemini 3.1 Pro is the strongest choice.
Yes — Gemini 3.1 Flash has a free tier in Google AI Studio. Gemini Advanced ($19.99/month) provides access to Gemini 3.1 Pro in Google's consumer apps. The API charges per token for production use.
Gemini 2.5 Pro is Google's prior-generation flagship reasoning model with a massive 1M token context window, designed for complex analysis, coding, and multimodal tasks. It balances frontier-level intelligence with competitive mid-tier pricing.
Gemini 2.0 Flash is Google's high-speed, cost-efficient multimodal model built for high-volume production workloads, offering a massive 1M token context window at near-throwaway pricing. It supports text, image, audio, and video inputs with strong instruction-following and tool-use capabilities.
Fast, low-cost model with a 1M token context window — the best budget default for teams running high prompt volumes.
Gemini 2.5 Pro Preview 05-06 is a preview snapshot of Google's frontier reasoning model, featuring a massive 1M token context window and strong multimodal capabilities. It targets developers and researchers needing deep analytical power with competitive pricing relative to its capability tier.
Gemini 2.5 Flash is Google's fast, cost-efficient multimodal model built for high-throughput tasks, offering a 1M token context window at budget pricing. It balances speed and capability across text, code, and vision tasks without the cost of flagship models like Gemini 2.5 Pro.
Gemini 3 Flash Preview is Google's budget-tier multimodal model optimized for high-throughput, low-latency tasks at scale. It offers a massive 1M token context window at aggressive pricing, making it a strong contender for cost-sensitive production workloads.
Gemini 2.5 Pro Preview 06-05 is the most capable reasoning-focused model of the Gemini 2.5 generation, featuring a massive 1M token context window and strong performance across code, math, and complex analysis tasks. It represents Google's top-tier offering in that generation, optimized for depth over speed.
Gemma 4 31B is Google's open-weight instruction-tuned model offering a strong balance of capability and cost efficiency at just $0.14/$0.40 per million tokens. It features a 262K context window and is designed for developers who need capable on-premise or API-hosted inference without flagship pricing.
Gemini 3 Pro Image Preview is Google's image-focused multimodal model designed for advanced visual understanding and generation tasks. It sits in the balanced price tier, targeting professional workflows that require strong image comprehension alongside text reasoning.
Gemini 2.5 Flash Lite Preview 09-2025 is Google's most cost-optimized variant of the Gemini 2.5 Flash family, designed for high-throughput, latency-sensitive applications at near-commodity pricing. It offers a massive 1M token context window at just $0.10/1M input tokens, positioning it as one of the cheapest long-context models available.
Gemini 2.5 Flash Lite is Google's lightest and most cost-efficient model in the 2.5 family, designed for high-throughput tasks where speed and price matter more than peak intelligence. It retains the massive 1M token context window from its larger siblings while cutting costs to a fraction of Gemini 2.5 Pro.
Gemini 2.0 Flash Lite is Google's ultra-budget, high-speed model designed for high-volume, cost-sensitive applications. It sits below Gemini 2.0 Flash in capability but offers the lowest price point in the Gemini 2.0 family with a massive 1M token context window.
Gemma 4 26B A4B is a sparse mixture-of-experts open model from Google, activating only ~4B parameters per forward pass despite having 26B total parameters. It offers a 262K context window at budget pricing, making it one of the more capable open-weight models for its cost tier.
Nano Banana (Gemini 2.5 Flash Image) is a budget-tier, image-capable variant of Gemini 2.5 Flash, optimized for cost-effective multimodal tasks involving image understanding. Despite the whimsical internal name, it delivers Gemini 2.5 Flash's vision capabilities at a low price point.
Gemma 2 27B is Google's largest open-weight model in the Gemma 2 family, designed for high-quality text generation, reasoning, and instruction-following at a mid-range price point. It punches above its weight class for an open model, rivaling some proprietary mid-tier offerings.
Gemma 2 9B is Google's open-weight 9-billion parameter model designed for efficient on-device and API deployment. It punches above its weight class for instruction-following and general language tasks at an exceptionally low cost.
| Model | Tier | Input / 1M | Output / 1M | Context | Speed |
|---|---|---|---|---|---|
| Gemini 2.5 Flash | Budget | $0.30 | $2.50 | 1M | Very fast |
| Gemini 3 Flash Preview | Budget | $0.50 | $3.00 | 1M | Very fast |
| Gemini 2.5 Pro Preview 06-05 | Balanced | $1.25 | $10.00 | 1M | Deliberate |
| Gemma 4 31B | Budget | $0.14 | $0.40 | 262K | Fast |
| Nano Banana Pro (Gemini 3 Pro Image Preview) | Balanced | $2.00 | $12.00 | 66K | Balanced |
| Gemini 2.5 Flash Lite Preview 09-2025 | Budget | $0.10 | $0.40 | 1M | Very fast |
| Gemini 2.5 Flash Lite | Budget | $0.10 | $0.40 | 1M | Very fast |
| Gemini 2.0 Flash Lite | Budget | $0.07 | $0.30 | 1M | Very fast |
| Gemma 4 26B A4B | Budget | $0.13 | $0.40 | 262K | Fast |
| Nano Banana (Gemini 2.5 Flash Image) | Budget | $0.30 | $2.50 | 33K | Very fast |
| Gemma 2 27B | Balanced | $0.65 | $0.65 | 8K | Fast |
| Gemma 2 9B | Budget | $0.03 | $0.09 | 8K | Very fast |
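To turn a table like this into a model-selection rule, a minimal sketch (using a hand-copied subset of the rows above; verify prices and windows against current listings before depending on them) that picks the cheapest model meeting a context-window requirement:

```python
# Subset of the pricing table: model -> (input $/1M tokens, context window in tokens)
MODELS = {
    "Gemini 3.1 Pro":        (2.00, 2_000_000),
    "Gemini 2.0 Flash":      (0.10, 1_000_000),
    "Gemini 2.0 Flash Lite": (0.07, 1_000_000),
    "Gemma 2 9B":            (0.03, 8_192),
}

def cheapest_with_context(min_context: int) -> str:
    """Return the lowest input-price model whose window fits min_context tokens."""
    candidates = {name: price for name, (price, ctx) in MODELS.items()
                  if ctx >= min_context}
    return min(candidates, key=candidates.get)

print(cheapest_with_context(500_000))    # → Gemini 2.0 Flash Lite
print(cheapest_with_context(1_500_000))  # → Gemini 3.1 Pro
```

The pattern generalizes: filter rows by the hard constraint (context), then minimize on the soft one (price).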