Reference
AI API Pricing.
Per-token pricing for every frontier model worth running an agent on. Filter by vendor or tier, verify against the source, plug into the cost calculator. Backed by an open dataset you can use in your own tools.
Anthropic
Vendor pricing page ↗

| Model | Input per 1M tokens | Output per 1M tokens | Tier |
|---|---|---|---|
| Claude Opus 4.7. Best-in-class reasoning. Anthropic's flagship for the hardest agentic decisions. | $5 | $25 | frontier |
| Claude Sonnet 4.6. The sweet spot. Handles ~90% of agent workflows at a fraction of Opus pricing. | $3 | $15 | balanced |
| Claude Haiku 4.5. Fast and cheap. Best for high-volume classification and short replies. | $1 | $5 | value |
OpenAI
Vendor pricing page ↗

| Model | Input per 1M tokens | Output per 1M tokens | Tier |
|---|---|---|---|
| GPT-5.5. OpenAI's frontier line. Marketed for coding and professional work. | $5 | $30 | frontier |
| GPT-5.4 mini. OpenAI's value tier. Strong on coding, computer use, and subagent workloads. | $0.75 | $4.50 | value |
xAI

| Model | Input per 1M tokens | Output per 1M tokens | Tier |
|---|---|---|---|
| Grok 4.20. xAI's current flagship. 2M context with live web access via X integration. | $2 | $6 | frontier |
| Grok 4. xAI's previous flagship. Still available; most builders should default to Grok 4.20. | $3 | $15 | frontier |
| Grok 4 Fast. xAI's value tier. 10× cheaper than Grok 4.20 with the same 2M context. | $0.20 | $0.50 | value |
Google

| Model | Input per 1M tokens | Output per 1M tokens | Tier |
|---|---|---|---|
| Gemini 2.5 Pro. Largest context window on the list. Pricing shown is for prompts ≤200k tokens; rises to $2.50/$15 above that. | $1.25 | $10 | balanced |
| Gemini 2.5 Flash. Hybrid reasoning model with 1M context and tunable thinking budgets. | $0.30 | $2.50 | value |
| Gemini 3.1 Pro Preview. Multimodal, agentic, strong on coding. Pricing shown is for prompts ≤200k tokens; rises to $4/$18 above that. | $2 | $12 | balanced |
| Gemini 3.1 Flash-Lite Preview. Google's most cost-efficient model. Optimised for high-volume agentic tasks and simple data processing. | $0.25 | $1.50 | value |
Meta / Together AI
Vendor pricing page ↗

| Model | Input per 1M tokens | Output per 1M tokens | Tier |
|---|---|---|---|
| Llama 3.3 70B (Together AI). Open weights. The price shown reflects Together AI; other inference providers host the same weights at different rates. | $0.88 | $0.88 | value |
DeepSeek
Vendor pricing page ↗

| Model | Input per 1M tokens | Output per 1M tokens | Tier |
|---|---|---|---|
| DeepSeek V4 Flash. DeepSeek's value tier. Cache-hit pricing drops input to $0.0028/M on repeat queries. | $0.14 | $0.28 | value |
Moonshot AI
Vendor pricing page ↗

| Model | Input per 1M tokens | Output per 1M tokens | Tier |
|---|---|---|---|
| Kimi K2. Moonshot AI's frontier model. Strong on long-context work. | $0.57 | $2.30 | value |
Z.ai

| Model | Input per 1M tokens | Output per 1M tokens | Tier |
|---|---|---|---|
| GLM-4.6. Formerly Zhipu AI. Competitive Chinese frontier model with strong coding and tool-use benchmarks. | $0.60 | $2.20 | value |
Mistral
Vendor pricing page ↗

| Model | Input per 1M tokens | Output per 1M tokens | Tier |
|---|---|---|---|
| Mistral Large 2.1. European frontier model. Stronger data residency story than US providers — useful for EU compliance-heavy workflows. | $2 | $6 | balanced |
Run your own numbers
Pricing is just the start. The real question is what your workflow actually costs.
The cost calculator combines this pricing data with realistic per-task token estimates across ten builder workflows — from ticket classification to code review — so you can see the monthly bill at your real volume, not just the headline rate.
Open the cost calculator →

Open dataset
Use this pricing data in your own product.
The full dataset is published as pricing.json in a public GitHub repo. CC-BY-4.0. No API key, no rate limit, no auth. Updated automatically when prices change.
GitHub
lucaspowell8020/ai-agent-pricing ↗
README, methodology, JSON Schema, JavaScript and Python examples.
Raw JSON
raw.githubusercontent.com/.../pricing.json ↗
Single file, no auth. Fetch directly from your code or CI.
curl
```shell
curl -O https://raw.githubusercontent.com/lucaspowell8020/ai-agent-pricing/main/pricing.json
```
JavaScript
```javascript
const data = await fetch(
  "https://raw.githubusercontent.com/lucaspowell8020/ai-agent-pricing/main/pricing.json"
).then((r) => r.json());
const opus = data.models.find((m) => m.slug === "claude-opus-4-7");
console.log(opus.inputPricePerMillion, opus.outputPricePerMillion);
```
Python
```python
import json, urllib.request

data = json.loads(urllib.request.urlopen(
    "https://raw.githubusercontent.com/lucaspowell8020/ai-agent-pricing/main/pricing.json"
).read())
opus = next(m for m in data["models"] if m["slug"] == "claude-opus-4-7")
print(opus["inputPricePerMillion"], opus["outputPricePerMillion"])
```

Common questions
What builders ask before they pick a model.
How much does the Claude API cost?
Claude API pricing depends on the model. Claude Opus 4.7 is $5 per million input tokens and $25 per million output tokens. Claude Sonnet 4.6 is $3 input and $15 output. Claude Haiku 4.5 is $1 input and $5 output. Anthropic cut Opus pricing 66% in early 2026, making frontier-tier reasoning much more accessible. Prompt caching can drop input costs another 50–90% on repeat queries.
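To see what that caching discount does to the effective rate, here is a rough sketch. The 90% cache-hit discount and the 80% cached share are illustrative assumptions from the range above, not Anthropic's published mechanics:

```python
# Illustrative sketch of how prompt caching changes the effective input rate.
# The cache discount and cached fraction are assumptions for the example.
def effective_input_price(base_price, cached_fraction, cache_discount):
    """Blended per-1M-token input price when part of each prompt is cached."""
    cached = base_price * (1 - cache_discount)
    return cached_fraction * cached + (1 - cached_fraction) * base_price

# Claude Opus 4.7 at $5/M input, 80% of each prompt cached at a 90% discount:
print(round(effective_input_price(5.00, 0.8, 0.9), 2))  # → 1.4
```

Under those assumptions the effective input rate lands near Gemini 2.5 Pro's headline price, which is why agents with large, stable system prompts benefit most.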
How much does GPT-5 cost?
GPT-5.5 is $5 per million input tokens and $30 per million output tokens. GPT-5.4 mini is $0.75 input and $4.50 output, roughly 7× cheaper on both sides. OpenAI's frontier line was rebranded from GPT-4 to GPT-5 in 2025. Cached input pricing is roughly 50% of standard input.
What is the cheapest LLM API?
DeepSeek V4 Flash is currently the cheapest credible model at $0.14 per million input tokens and $0.28 per million output tokens — roughly 21× cheaper than Claude Sonnet 4.6 for input. Grok 4 Fast, Gemini 3.1 Flash-Lite Preview, and Claude Haiku 4.5 are the cheapest options from major US vendors. For open-weight models, Llama 3.3 70B via Together AI is competitive at the value tier. The right cheap model depends on your task — value-tier models handle classification and short replies well but struggle with complex reasoning.
How do you keep this pricing data accurate?
An automated audit runs daily. Small drift (under 25% on both input and output) is auto-applied and published the same day. Larger changes — re-pricings, model deprecations, tier consolidations — go through human editorial review against the live vendor pricing page before publication. Vendor pages remain the canonical source of truth; this dataset is a clean machine-readable mirror, not a substitute. The exact verification date is shown on this page and embedded in the public dataset's JSON.
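The auto-apply rule can be sketched as a simple threshold check. Only the 25% cutoff comes from the process above; the function shape and argument names are hypothetical:

```python
# Hypothetical sketch of the daily audit's drift rule described above.
# Only the 25% threshold is from the text; the structure is illustrative.
AUTO_APPLY_THRESHOLD = 0.25  # drift under 25% on both input and output

def drift(old, new):
    """Relative change between the stored price and the live vendor price."""
    return abs(new - old) / old

def can_auto_apply(old_in, new_in, old_out, new_out):
    """True when both input and output drift are small enough to auto-publish."""
    return (drift(old_in, new_in) < AUTO_APPLY_THRESHOLD
            and drift(old_out, new_out) < AUTO_APPLY_THRESHOLD)

# A 10% input cut with unchanged output auto-applies; a 66% re-pricing
# (like the Opus cut) goes to human editorial review instead.
print(can_auto_apply(5.00, 4.50, 25.00, 25.00))   # → True
print(can_auto_apply(15.00, 5.00, 75.00, 25.00))  # → False
```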
Can I use this pricing data in my own product?
Yes. The full dataset is published as pricing.json in a public GitHub repo (lucaspowell8020/ai-agent-pricing) under CC-BY-4.0. You can fetch it directly from the raw GitHub URL, no API key required. Use it in commercial products, comparison tools, calculators, or blog posts — attribution to agentshortlist.com is appreciated but not required by the licence.
Why not just check the vendor pricing pages directly?
You can, and you should verify before committing budget. The reason this dataset exists: vendor pricing pages disagree on units (some quote per token, some per thousand, some per million), use struck-through old prices for SEO, and change format frequently. Aggregating into one normalised file makes side-by-side comparison and programmatic use possible. We treat the vendor pages as authoritative — this dataset is a clean, machine-readable mirror.
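The unit normalisation that makes side-by-side comparison possible is mechanical once the quoted basis is known. A sketch, where the basis labels are illustrative rather than the dataset's actual schema:

```python
# Illustrative helper for the unit mismatch described above: convert any
# vendor-quoted price to a per-1M-token basis. The basis labels are
# assumptions for this example, not fields in the published dataset.
FACTORS = {
    "per_token": 1_000_000,
    "per_1k_tokens": 1_000,
    "per_1m_tokens": 1,
}

def to_per_million(price, basis):
    """Normalise a vendor-quoted price to dollars per 1M tokens."""
    return price * FACTORS[basis]

# $0.000005/token, $0.005/1K tokens, and $5/1M tokens are all the same rate:
print(to_per_million(0.000005, "per_token"))
print(to_per_million(0.005, "per_1k_tokens"))
print(to_per_million(5.0, "per_1m_tokens"))
```

All three calls print the same $5/M rate, which is the normalised form the dataset stores.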
Related
Tool
Cost Calculator
Pick a workflow, set the volume, see the monthly bill across every model in this dataset.
Tool
Agent Picker
Five questions, one platform recommendation. Pricing is only part of the choice.
Reviews
The 2026 Shortlist
Platforms tested across real builder workflows. Verdicts, ratings, decision matrix.