Question 1

How much does the Claude API cost?

Accepted Answer

Claude API pricing depends on the model. Claude Opus 4.7 is $5 per million input tokens and $25 per million output tokens. Claude Sonnet 4.6 is $3 input and $15 output. Claude Haiku 4.5 is $1 input and $5 output. Anthropic cut Opus pricing 66% in early 2026, making frontier-tier reasoning much more accessible. Prompt caching can drop input costs another 50–90% on repeat queries.

Question 2

How much does GPT-5 cost?

Accepted Answer

GPT-5 is $1.25 per million input tokens and $10 per million output tokens. GPT-5 mini is $0.25 input and $2 output — about 5× cheaper. OpenAI's frontier line was rebranded from GPT-4 to GPT-5 in 2025. Cached input pricing is roughly 50% of standard input.

Question 3

What is the cheapest LLM API?

Accepted Answer

DeepSeek V4 Flash is currently the cheapest credible model at $0.14 per million input tokens and $0.28 per million output tokens — roughly 35× cheaper than Claude Sonnet 4.6 for input. Gemini 2.5 Flash and Claude Haiku 4.5 are the cheapest options from major US vendors. For open-weight models, Llama 3.3 70B via Together AI is competitive at the value tier. The right cheap model depends on your task — value-tier models handle classification and short replies well but struggle with complex reasoning.

Question 4

How do you keep this pricing data accurate?

Accepted Answer

An automated audit runs daily. Small drift (under 25% on both input and output) is auto-applied and published the same day. Larger changes — re-pricings, model deprecations, tier consolidations — go through human editorial review against the live vendor pricing page before publication. Vendor pages remain the canonical source of truth; this dataset is a clean machine-readable mirror, not a substitute. The exact verification date is shown on this page and embedded in the public dataset's JSON.

Question 5

Can I use this pricing data in my own product?

Accepted Answer

Yes. The full dataset is published as pricing.json in a public GitHub repo (lucaspowell8020/ai-agent-pricing) under CC-BY-4.0. You can fetch it directly from the raw GitHub URL, no API key required. Use it in commercial products, comparison tools, calculators, or blog posts — attribution to agentshortlist.com is appreciated but not required by the licence.

Question 6

Why not just check the vendor pricing pages directly?

Accepted Answer

You can, and you should verify before committing budget. The reason this dataset exists: vendor pricing pages disagree on units (some quote per token, some per thousand, some per million), use struck-through old prices for SEO, and change format frequently. Aggregating into one normalised file makes side-by-side comparison and programmatic use possible. We treat the vendor pages as authoritative — this dataset is a clean, machine-readable mirror.

Model	Inputper 1M tokens	Outputper 1M tokens	Context	Tier
Claude Opus 4.7 Best-in-class reasoning. Anthropic's flagship for the hardest agentic decisions.	$5	$25	1M	frontier
Claude Sonnet 4.6 The sweet spot. Handles ~90% of agent workflows at a fraction of Opus pricing.	$3	$15	1M	balanced
Claude Haiku 4.5 Fast and cheap. Best for high-volume classification and short replies.	$1	$5	200k	value

Model	Inputper 1M tokens	Outputper 1M tokens	Context	Tier
GPT-5.5 OpenAI's frontier line. Marketed for coding and professional work.	$5	$30	400k	frontier
GPT-5.4 mini OpenAI's value tier. Strong on coding, computer use, and subagent workloads.	$0.75	$4.5	128k	value

Model	Inputper 1M tokens	Outputper 1M tokens	Context	Tier
Grok 4.20 xAI's current flagship. 2M context with live web access via X integration.	$2	$6	2M	frontier
Grok 4 xAI's previous flagship. Still available; most builders should default to Grok 4.20.	$3	$15	256k	frontier
Grok 4 Fast xAI's value tier. 10× cheaper than Grok 4.20 with the same 2M context.	$0.20	$0.50	2M	value

Model	Inputper 1M tokens	Outputper 1M tokens	Context	Tier
Gemini 2.5 Pro Largest context window on the list. Pricing shown is for prompts ≤200k tokens; rises to $2.50/$15 above that.	$1.25	$10	2M	balanced
Gemini 2.5 Flash Hybrid reasoning model with 1M context and tunable thinking budgets.	$0.30	$2.5	1M	value
Gemini 3.1 Pro Preview Multimodal, agentic, strong on coding. Pricing shown is for prompts ≤200k tokens; rises to $4/$18 above that.	$2	$12	2M	balanced
Gemini 3.1 Flash-Lite Preview Google's most cost-efficient model. Optimised for high-volume agentic tasks and simple data processing.	$0.25	$1.5	1M	value

AI API Pricing.

Anthropic

OpenAI

xAI

Google

Meta / Together AI

DeepSeek

Moonshot AI

z.ai

Mistral

Pricing is just the start. The real question is what your workflow actually costs.

Use this pricing data in your own product.

What builders ask before they pick a model.

How much does the Claude API cost?

How much does GPT-5 cost?

What is the cheapest LLM API?

How do you keep this pricing data accurate?

Can I use this pricing data in my own product?

Why not just check the vendor pricing pages directly?

Cost Calculator

Agent Picker

The 2026 Shortlist