UseRightAI
Cut through AI hype. Pick what works.

Independent AI model tracker. Live pricing, real benchmarks, zero vendor bias.


© 2026 UseRightAI. Independent · Free forever · Not affiliated with any AI provider.

Affiliate links are clearly labeled. See disclosures.

Mistral · Budget

Mistral: Voxtral Small 24B 2507

A purpose-built budget audio model that excels at voice tasks but stumbles on context length and general-purpose depth.

Coding: 52
Writing: 63
Research: 58
Images: 0
Value: 88
Long Context: 28
Use this when

Transcribing, analyzing, and responding to audio input cost-effectively without needing a separate speech-to-text pipeline.

Skip this if

You need long-context document analysis, image understanding, or top-tier reasoning performance, as the 32K window and 24B scale will bottleneck complex tasks.

Pricing: $0.10/1M in, $0.30/1M out (0% change since Mar 2026)
Context: 32k tokens
Speed: Fast
How to access: API, $0.10/1M input tokens
Subscription = chat interface. API = build with it.
Switch to instead if...
Best overall: Claude Opus 4.6
Cheaper option: Llama Guard 3 8B
Faster option: Mistral Small 3.1

Strengths

Native audio input support — understands spoken language directly without requiring external STT preprocessing

Extremely low cost at $0.10/$0.30 per 1M tokens, undercutting GPT-4o Audio and Gemini 1.5 Flash significantly

Solid multilingual audio handling, reflecting Mistral's strong European language coverage

Compact 24B size keeps inference fast despite multimodal capabilities

Weaknesses

The 32K context window is restrictive next to competitors (Gemini 3.1 Pro offers 1M+ tokens), which limits long audio or document tasks

No image understanding despite the multimodal framing, narrowing its real-world versatility

Reasoning and complex coding benchmarks lag behind GPT-4o and Claude Sonnet 4.6 at their respective tiers

Monthly cost estimate

See what Mistral: Voxtral Small 24B 2507 actually costs at your usage level

Input tokens / month: 1M (adjustable from 10k to 50M)
Output tokens / month: 500k (adjustable from 10k to 25M)
Input cost: $0.100
Output cost: $0.150
Total / month: $0.250

Based on Mistral: Voxtral Small 24B 2507 API pricing: $0.10/1M input · $0.30/1M output. Real costs vary by provider discounts and caching. Check the provider for exact current rates.
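The estimate above is plain per-million-token arithmetic, and a minimal sketch of it is shown below. The function name `monthly_cost` and its parameters are illustrative, not part of any provider SDK; the default prices are the ones listed on this page and may not match current rates. `Decimal` is used here as a design choice so that prices such as 0.10 do not pick up binary floating-point noise.

```python
from decimal import Decimal

# Prices are quoted per 1M tokens on this page.
TOKENS_PER_UNIT = Decimal(1_000_000)


def monthly_cost(input_tokens, output_tokens,
                 input_price="0.10", output_price="0.30"):
    """Return (input_cost, output_cost, total) in USD as Decimals.

    Prices are passed as strings so Decimal parses them exactly.
    The defaults are the listed Voxtral Small 24B 2507 rates (an
    assumption that they are still current).
    """
    input_cost = Decimal(input_tokens) / TOKENS_PER_UNIT * Decimal(input_price)
    output_cost = Decimal(output_tokens) / TOKENS_PER_UNIT * Decimal(output_price)
    return input_cost, output_cost, input_cost + output_cost


# The default scenario from the estimate: 1M input and 500k output tokens.
input_cost, output_cost, total = monthly_cost(1_000_000, 500_000)
print(f"Input ${input_cost:.3f}  Output ${output_cost:.3f}  Total ${total:.3f}")
```

With the default scenario this reproduces the $0.100, $0.150, and $0.250 figures shown in the estimate. Provider discounts and caching are not modeled.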

Price History

Mistral: Voxtral Small 24B 2507 pricing over time

0% change since Mar 27

[Price chart: y-axis $0.092 to $0.108; x-axis Mar 27 to Mar 28]

2 data points · tracked daily since Mar 27, 2026

Ready to try it?

Start using Mistral: Voxtral Small 24B 2507

Transcribe, analyze, and respond to audio input cost-effectively without needing a separate speech-to-text pipeline. Start free, no card required.


Recommendations are made independently based on real-world use and public benchmarks. See our disclosures for details.

Compare alternatives

Similar models worth checking before you commit.

Mistral · Budget

Mistral: Mistral Medium 3.1

Mistral Medium 3.1 is a multimodal mid-tier model from Mistral that supersedes Mistral Large 2, offering vision capabilities alongside strong text performance at a significantly reduced price point. It targets the sweet spot between budget models and expensive flagships, with a 128K context window and competitive multilingual support.

Verdict: The best Mistral model for budget-conscious builders who still need multimodal capability and solid multilingual output.
Quality score: 70%
Pricing: $0.40/1M in, $2.00/1M out
Speed: Fast
Context: 131k tokens
Best for: Cost-sensitive teams needing solid coding, instruction-following, and basic vision tasks without paying flagship prices.
Officially supersedes Mistral Large 2, representing a generational shift in Mistral's lineup toward multimodal capability at lower cost tiers. Available via Mistral API and select cloud providers. No function calling limitations noted at this tier.
Tags: Budget, Multimodal, Multilingual, Mid-tier, Vision
Meta · Budget

Meta: Llama 3.2 11B Vision Instruct

Llama 3.2 11B Vision Instruct is Meta's open-weight multimodal model capable of understanding both text and images at an extremely low price point. It handles image captioning, visual question answering, and document analysis alongside standard text tasks.

Verdict: The go-to vision model when budget is the top constraint and good-enough accuracy is acceptable.
Quality score: 57%
Pricing: $0.05/1M in, $0.05/1M out
Speed: Fast
Context: 131k tokens
Best for: Budget-conscious developers who need basic vision capabilities without paying premium multimodal prices.
Available via multiple inference providers including Together AI, Fireworks, and OpenRouter. As an open-weight model, it can also be self-hosted for even lower marginal costs at scale. Part of Meta's Llama 3.2 family which also includes a 90B vision variant for heavier workloads.
Tags: Open-weight, Vision, Budget, Multimodal, Meta
Mistral · Budget

Mistral Small 3.1

Mistral's ultra-budget multimodal model: exceptionally cheap with vision support, built for high-volume lightweight tasks where cost is the primary constraint.

Verdict: Ultra-cheap multimodal model for massive-volume, low-complexity pipelines.
Quality score: 57%
Pricing: $0.10/1M in, $0.30/1M out
Speed: Very fast
Context: 128k tokens
Best for: Ultra-high-volume classification, summarisation, and lightweight vision tasks
At $0.10/1M input, the cost question disappears. The only question is whether the task complexity exceeds what Mistral Small can handle.
Tags: Budget, Multimodal, Ultra cheap, Mistral
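To put the alternatives above on the same footing, the sketch below ranks them by monthly spend at a single usage level. The `PRICES` table is copied from the per-1M-token figures listed on this page; the function name `rank_by_cost` is hypothetical, and the comparison covers price only, so it says nothing about audio versus vision support or output quality.

```python
from decimal import Decimal

# (input price, output price) in USD per 1M tokens, as listed on this page.
PRICES = {
    "Voxtral Small 24B 2507": ("0.10", "0.30"),
    "Mistral Medium 3.1": ("0.40", "2.00"),
    "Llama 3.2 11B Vision Instruct": ("0.05", "0.05"),
    "Mistral Small 3.1": ("0.10", "0.30"),
}


def rank_by_cost(input_tokens, output_tokens, prices=PRICES):
    """Return (model, monthly_total) pairs sorted from cheapest to priciest."""
    million = Decimal(1_000_000)
    totals = {}
    for name, (price_in, price_out) in prices.items():
        spend = (Decimal(input_tokens) * Decimal(price_in)
                 + Decimal(output_tokens) * Decimal(price_out))
        totals[name] = spend / million
    # sorted() is stable, so models with equal totals keep their table order.
    return sorted(totals.items(), key=lambda item: item[1])


# Same usage level as the cost estimate: 1M input, 500k output tokens.
for name, total in rank_by_cost(1_000_000, 500_000):
    print(f"{name}: ${total:.3f}/month")
```

At this usage level Voxtral Small and Mistral Small 3.1 tie on price, so the choice between them comes down to capability (audio input versus vision and a larger context window) rather than cost.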

Change history

Pricing moves, ranking shifts, and capability updates.

New Model · Mar 27, 2026

Mistral: Voxtral Small 24B 2507 added to UseRightAI

Mistral: Voxtral Small 24B 2507 (Mistral) is now indexed. It supersedes Mistral Small 3.1. A purpose-built budget audio model that excels at voice tasks but stumbles on context length and general-purpose depth.


FAQ

What is Mistral: Voxtral Small 24B 2507 best for?

Mistral: Voxtral Small 24B 2507 is best for transcribing, analyzing, and responding to audio input cost-effectively without needing a separate speech-to-text pipeline. It is a strong fit when that workflow matters more than the tradeoffs that come with a budget-priced, fast model.

When should I avoid Mistral: Voxtral Small 24B 2507?

Avoid it when you need long-context document analysis, image understanding, or top-tier reasoning performance: the 32K window and 24B scale will bottleneck complex tasks.

What is a cheaper alternative to Mistral: Voxtral Small 24B 2507?

Llama Guard 3 8B is the lower-cost option to compare first when you want a similar workflow fit with less token spend.

What is a faster alternative to Mistral: Voxtral Small 24B 2507?

Mistral Small 3.1 is the better pick when response time matters more than maximum depth or premium quality.
