Menlo Park, CA · Founded 2023 (FAIR reorganised as Meta AI)
Meta AI
Open-weight models used by more developers than any other.
Meta's Llama 4 series are the most widely deployed open-weight AI models in the world. Llama 4 Maverick delivers frontier-class performance for free — self-hosted or via Groq, Together AI, and Fireworks.
Rankings refresh daily · Scored on 6 criteria · No paid rankings
Llama 4 Maverick is free to run under the open-weight Llama community license
Deployed by more developers than any closed-source model via Groq and Together AI
Llama 4 Scout is ultra-fast via Groq's LPU hardware
10 models
All Meta AI Models
Every Meta AI model in the directory, ranked by overall capability score.
Meta · Budget
Llama 4 Scout
Long-window open-weight model that handles large document sets at a low price point.
Verdict
Best open-weight long-context option for self-hosted pipelines.
Quality score
64%
Pricing
$0.10/1M in
$0.30/1M out
Speed
Fast
Best for affordable self-hosted long-context workflows and analysis pipelines
Context
512k tokens
Worth considering for internal search, analysis, and review workflows where data sovereignty matters.
Long context · Cheap · Open weights · Meta
Best for
Affordable self-hosted long-context workflows and analysis pipelines
Llama 3.2 11B Vision Instruct is Meta's open-weight multimodal model capable of understanding both text and images at an extremely low price point. It handles image captioning, visual question answering, and document analysis alongside standard text tasks.
Verdict
The go-to vision model when budget is the top constraint and good-enough accuracy is acceptable.
Quality score
57%
Pricing
$0.34/1M in
$0.34/1M out
Speed
Fast
Best for budget-conscious developers who need basic vision capabilities without paying premium multimodal prices.
Context
131k tokens
Available via multiple inference providers including Together AI, Fireworks, and OpenRouter. As an open-weight model, it can also be self-hosted for even lower marginal costs at scale. Part of Meta's Llama 3.2 family which also includes a 90B vision variant for heavier workloads.
Open-weight · Vision · Budget · Multimodal · Meta
Best for
Budget-conscious developers who need basic vision capabilities without paying premium multimodal prices.
Meta's Llama 3.1 70B Instruct is an open-weight large language model with 70 billion parameters, fine-tuned for instruction following across coding, reasoning, and general-purpose tasks. It offers a strong balance of capability and cost at $0.40/1M tokens for both input and output.
Verdict
The go-to budget open-weight model for teams who need solid LLM capability without frontier model pricing.
Quality score
65%
Pricing
$0.40/1M in
$0.40/1M out
Speed
Fast
Best for teams needing capable open-weight LLM performance at budget pricing for coding assistance, summarization, or RAG pipelines.
Context
131k tokens
Pricing shown is via third-party API providers (e.g., OpenRouter, Together AI) — costs may vary. Meta releases Llama 3.1 weights publicly, enabling self-hosting at even lower cost. Not available directly from Meta as a hosted API.
Meta's Llama 3 70B Instruct is a 70-billion parameter open-weight language model fine-tuned for instruction following, representing Meta's most capable publicly available model at the time of release. It excels at general reasoning, coding assistance, and structured text tasks with strong multilingual support.
Verdict
A capable but now-outdated open-weight model undercut by its tiny context window and newer successors.
Quality score
53%
Pricing
$0.51/1M in
$0.74/1M out
Speed
Balanced
Best for developers and researchers who need a capable open-weight model for coding, analysis, and instruction-following tasks at a mid-range price point.
Context
8k tokens
This is the original Llama 3 70B, not the 3.1 or 3.3 variants. Llama 3.1 70B offers a 128K context window at comparable pricing and is strongly preferred. Consider this model only if you have a specific reason to pin to the original Llama 3 checkpoint.
Open-weight · Instruction-tuned · Mid-range · Meta · Llama 3
Best for
Developers and researchers who need a capable open-weight model for coding, analysis, and instruction-following tasks at a mid-range price point.
Llama 3.1 8B Instruct is Meta's smallest production-ready open-weight model, optimized for fast, low-cost inference on everyday language tasks. It delivers surprisingly capable instruction-following for its size, making it a go-to for high-volume, cost-sensitive deployments.
Verdict
The right tool for cheap, fast, high-volume tasks — not for anything that requires serious thinking.
Quality score
43%
Pricing
$0.02/1M in
$0.03/1M out
Speed
Very fast
Best for high-throughput applications where cost and speed matter more than frontier-level quality, such as chatbots, content classification, and text summarization.
Context
16k tokens
Being open-weight, this model can be run locally or self-hosted, or accessed through providers like Together AI, Fireworks, or Groq, often at even lower costs. The 16K context window is a meaningful limitation compared to other models in this price tier.
Open Weight · Budget · Fast · Self-Hostable · Meta
Best for
High-throughput applications where cost and speed matter more than frontier-level quality, such as chatbots, content classification, and text summarization.
Llama 3 8B Instruct is Meta's compact open-weight instruction-following model, optimized for efficiency and accessibility at extremely low cost. It handles everyday text tasks like summarization, Q&A, and light coding at a fraction of the price of frontier models.
Verdict
A dirt-cheap, fast open model for simple tasks — just don't expect frontier-level quality.
Quality score
39%
Pricing
$0.14/1M in
$0.14/1M out
Speed
Very fast
Best for high-volume, cost-sensitive applications where speed and price matter more than peak accuracy.
Context
8k tokens
As an open-weight model, Llama 3 8B can be run locally with Ollama or accessed through hosted platforms like Replicate or Together AI. The 8,192-token context window is a significant practical limitation. Pricing listed reflects hosted API inference; self-hosted costs vary.
Open-weight · Budget · Fast · Self-hostable · Compact
Best for
High-volume, cost-sensitive applications where speed and price matter more than peak accuracy.
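To see what the 8,192-token limit means in practice, here is a minimal sketch of a pre-flight check. It assumes the prompt and the generated output share one context window, and that token counts come from whatever tokenizer your provider uses. The function names and the default output budget are illustrative, not part of any Meta or provider API.

```python
# Context window listed above for Llama 3 8B Instruct.
CONTEXT_WINDOW = 8_192


def fits_in_context(prompt_tokens: int, max_output_tokens: int,
                    context_window: int = CONTEXT_WINDOW) -> bool:
    """Return True if the prompt plus the reserved output budget fits."""
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_output_tokens <= context_window


def max_prompt_tokens(max_output_tokens: int,
                      context_window: int = CONTEXT_WINDOW) -> int:
    """Largest prompt that still leaves room for the reserved output."""
    return max(context_window - max_output_tokens, 0)


# Hypothetical request: a 7,000-token document with 1,000 tokens reserved
# for the answer fits; an 8,000-token document with 512 reserved does not.
small_request_ok = fits_in_context(7_000, 1_000)
large_request_ok = fits_in_context(8_000, 512)
```

A check like this is a reasonable place to decide whether to chunk a document or route the request to a longer-context model instead.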
Llama 3.2 1B Instruct is Meta's smallest production language model, designed for lightweight text tasks with an extremely low cost footprint. It excels at simple instruction-following, text classification, and on-device or edge deployment scenarios.
Verdict
The go-to model when cost per token matters more than output quality.
Quality score
25%
Pricing
$0.03/1M in
$0.20/1M out
Speed
Very fast
Best for ultra-low-cost text classification, simple Q&A, and high-volume automation pipelines where cost per token is critical.
Context
60k tokens
Output cost of ~$0.20/1M tokens is nearly seven times the $0.03/1M input cost, so factor this in for verbose generation tasks. Best suited for inference pipelines where outputs are short and structured. Available via multiple inference providers due to open-weight licensing.
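The effect of the input/output price gap is easier to judge with a quick calculation. The sketch below uses the prices listed on this card. The two workloads are hypothetical, and real provider pricing may differ.

```python
from decimal import Decimal

# Listed prices for Llama 3.2 1B Instruct, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.03")
OUTPUT_PRICE_PER_M = Decimal("0.20")
TOKENS_PER_M = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request at the listed prices."""
    input_cost = INPUT_PRICE_PER_M * input_tokens / TOKENS_PER_M
    output_cost = OUTPUT_PRICE_PER_M * output_tokens / TOKENS_PER_M
    return input_cost + output_cost


# Hypothetical workloads: the same 1,000-token prompt with a short
# classification label versus a verbose 1,000-token generation.
short_output_cost = request_cost(1_000, 20)
long_output_cost = request_cost(1_000, 1_000)
```

With these numbers the short-output request costs $0.000034 and the verbose one $0.00023, so output tokens dominate the bill once responses get long.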
Llama Guard 4 12B is Meta's specialized safety classification model designed to detect and filter harmful content in LLM inputs and outputs. It's purpose-built for content moderation pipelines, not general-purpose text generation.
Verdict
The go-to cheap, fast content moderation layer for production LLM pipelines.
Quality score
15%
Pricing
$0.18/1M in
$0.18/1M out
Speed
Very fast
Best for automated content safety screening and policy enforcement in LLM-powered applications
Context
164k tokens
Llama Guard 4 supports the MLCommons hazard taxonomy and is designed to be used as a shield model in multi-model architectures. Not suitable as a standalone AI assistant. Available via Meta's open model ecosystem and third-party API providers.
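The shield pattern described above can be sketched as a wrapper that screens both the user input and the model output. This is an illustration under stated assumptions: `is_safe` and `generate` are hypothetical callables standing in for a safety classifier and a generative model, and neither reflects the actual Llama Guard interface or output format.

```python
from typing import Callable

# Hypothetical stand-ins: in a real deployment each would call a hosted
# or self-hosted model. Nothing here is Llama Guard's actual API.
SafetyCheck = Callable[[str], bool]  # returns True when the text is safe
Generator = Callable[[str], str]     # returns the assistant's reply

REFUSAL = "Sorry, I can't help with that."


def guarded_reply(user_input: str, is_safe: SafetyCheck,
                  generate: Generator) -> str:
    """Screen the input, generate a reply, then screen the output."""
    if not is_safe(user_input):
        return REFUSAL
    reply = generate(user_input)
    if not is_safe(reply):
        return REFUSAL
    return reply


# Toy implementations so the sketch runs without any model or network.
def toy_is_safe(text: str) -> bool:
    return "forbidden" not in text.lower()


def toy_generate(prompt: str) -> str:
    return f"Echo: {prompt}"
```

Keeping the classifier and generator as separate callables mirrors the multi-model setup: either one can be swapped for a different provider without touching the screening logic.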
Llama Guard 3 8B is a specialized safety classifier built on Meta's Llama 3 architecture, designed to detect and categorize harmful or policy-violating content in both user inputs and model outputs. It is purpose-built for content moderation pipelines, not general-purpose text generation.
Verdict
A hyper-specialized, ultra-cheap safety classifier — indispensable in the right pipeline, useless outside of it.
Quality score
14%
Pricing
$0.48/1M in
$0.03/1M out
Speed
Very fast
Best for automated content safety screening and moderation for AI application pipelines at minimal cost.
Context
131k tokens
This model is designed exclusively for content moderation and safety classification tasks. It follows the MLCommons AI Safety benchmark taxonomy. It should be deployed as a guardrail layer alongside generative models, not as a replacement for them. Not suitable for end-user-facing conversational applications.
Safety · Content Moderation · Classifier · Budget · Meta
Best for
Automated content safety screening and moderation for AI application pipelines at minimal cost.
Meta AI FAQ
What is Meta's best AI model in 2026?
Llama 4 Maverick is Meta's most capable model — it delivers frontier-class coding and writing performance for free. Llama 4 Scout is Meta's faster, smaller model optimised for high-throughput applications and edge deployments.
Are Meta's AI models free to use?
Yes — Llama 4 Scout and Llama 4 Maverick are open-weight models available under the Llama community license. You can run them locally via Ollama, or use them free via Groq, Together AI, Fireworks, and other hosted providers.
How does Llama 4 compare to Claude and GPT?
Llama 4 Maverick is competitive with GPT-5.4 on most tasks — at zero API cost. For the highest coding quality, Claude Opus 4.7 and Claude Sonnet 4.6 still lead. For a free open-weight model that runs anywhere, Llama 4 Maverick is the strongest option available.