Agent Shortlist

Reference

AI API Pricing.

Per-token pricing for every frontier model worth running an agent on. Filter by vendor or tier, verify against the source, plug into the cost calculator. Backed by an open dataset you can use in your own tools.

Verified: 2026-04-27 · 17 models · 9 vendors · Open dataset on GitHub ↗
Always verify pricing directly with the vendor before committing budget. Vendors change prices, ship new tiers, and apply per-account discounts that a public dataset can't track. Each vendor block below links to its official pricing page.

Anthropic

Model · Input per 1M tokens · Output per 1M tokens · Tier

Claude Opus 4.7

Best-in-class reasoning. Anthropic's flagship for the hardest agentic decisions.

$5 input · $25 output · frontier

Claude Sonnet 4.6

The sweet spot. Handles ~90% of agent workflows at a fraction of Opus pricing.

$3 input · $15 output · balanced

Claude Haiku 4.5

Fast and cheap. Best for high-volume classification and short replies.

$1 input · $5 output · value
OpenAI

Model · Input per 1M tokens · Output per 1M tokens · Tier

GPT-5.5

OpenAI's frontier line. Marketed for coding and professional work.

$5 input · $30 output · frontier

GPT-5.4 mini

OpenAI's value tier. Strong on coding, computer use, and subagent workloads.

$0.75 input · $4.50 output · value
xAI

Model · Input per 1M tokens · Output per 1M tokens · Tier

Grok 4.20

xAI's current flagship. 2M context with live web access via X integration.

$2 input · $6 output · frontier

Grok 4

xAI's previous flagship. Still available; most builders should default to Grok 4.20.

$3 input · $15 output · frontier

Grok 4 Fast

xAI's value tier. 10× cheaper than Grok 4.20 with the same 2M context.

$0.20 input · $0.50 output · value
Google

Model · Input per 1M tokens · Output per 1M tokens · Tier

Gemini 2.5 Pro

Largest context window on the list. Pricing shown is for prompts ≤200k tokens; rises to $2.50/$15 above that.

$1.25 input · $10 output · balanced

Gemini 2.5 Flash

Hybrid reasoning model with 1M context and tunable thinking budgets.

$0.30 input · $2.50 output · value

Gemini 3.1 Pro Preview

Multimodal, agentic, strong on coding. Pricing shown is for prompts ≤200k tokens; rises to $4/$18 above that.

$2 input · $12 output · balanced

Gemini 3.1 Flash-Lite Preview

Google's most cost-efficient model. Optimised for high-volume agentic tasks and simple data processing.

$0.25 input · $1.50 output · value

Meta / Together AI

Vendor pricing page ↗
Model · Input per 1M tokens · Output per 1M tokens · Tier

Llama 3.3 70B (Together AI)

Open weights. The price shown reflects Together AI; other inference providers host the same weights at different rates.

$0.88 input · $0.88 output · value
DeepSeek

Model · Input per 1M tokens · Output per 1M tokens · Tier

DeepSeek V4 Flash

DeepSeek's value tier. Cache-hit pricing drops input to $0.0028/M on repeat queries.

$0.14 input · $0.28 output · value
Moonshot AI

Model · Input per 1M tokens · Output per 1M tokens · Tier

Kimi K2

Moonshot AI's frontier model. Strong on long-context work.

$0.57 input · $2.30 output · value
Z.ai

Model · Input per 1M tokens · Output per 1M tokens · Tier

GLM-4.6

Formerly Zhipu AI. Competitive Chinese frontier model with strong coding and tool-use benchmarks.

$0.60 input · $2.20 output · value
Mistral

Model · Input per 1M tokens · Output per 1M tokens · Tier

Mistral Large 2.1

European frontier model. Stronger data residency story than US providers, useful for EU compliance-heavy workflows.

$2 input · $6 output · balanced

Run your own numbers

Pricing is just the start. The real question is what your workflow actually costs.

The cost calculator combines this pricing data with realistic per-task token estimates across ten builder workflows — from ticket classification to code review — so you can see the monthly bill at your real volume, not just the headline rate.
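If you want a back-of-the-envelope number before opening the calculator, the arithmetic is a one-liner. A minimal sketch in Python; the volume and per-task token counts below are illustrative assumptions, not the calculator's built-in estimates:

```python
def monthly_cost(tasks_per_month, input_tokens_per_task, output_tokens_per_task,
                 input_price_per_million, output_price_per_million):
    """Estimate the monthly API bill for one workflow at a given volume."""
    per_task = (input_tokens_per_task * input_price_per_million
                + output_tokens_per_task * output_price_per_million) / 1_000_000
    return tasks_per_month * per_task

# Example: 50,000 ticket classifications/month on Claude Haiku 4.5 ($1 in, $5 out),
# assuming ~600 input tokens and ~50 output tokens per ticket (illustrative).
print(f"${monthly_cost(50_000, 600, 50, 1.00, 5.00):.2f}")  # $42.50
```

The headline rate matters far less than the input/output token ratio of your workflow: output tokens are 3–5× the price of input at most vendors.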

Open the cost calculator →

Open dataset

Use this pricing data in your own product.

The full dataset is published as pricing.json in a public GitHub repo. CC-BY-4.0. No API key, no rate limit, no auth. Updated automatically when prices change.

curl

curl -O https://raw.githubusercontent.com/lucaspowell8020/ai-agent-pricing/main/pricing.json

JavaScript

const data = await fetch(
  "https://raw.githubusercontent.com/lucaspowell8020/ai-agent-pricing/main/pricing.json"
).then((r) => r.json());

const opus = data.models.find((m) => m.slug === "claude-opus-4-7");
console.log(opus.inputPricePerMillion, opus.outputPricePerMillion);

Python

import json, urllib.request

data = json.loads(urllib.request.urlopen(
    "https://raw.githubusercontent.com/lucaspowell8020/ai-agent-pricing/main/pricing.json"
).read())

opus = next(m for m in data["models"] if m["slug"] == "claude-opus-4-7")
print(opus["inputPricePerMillion"], opus["outputPricePerMillion"])

Common questions

What builders ask before they pick a model.

How much does the Claude API cost?

Claude API pricing depends on the model. Claude Opus 4.7 is $5 per million input tokens and $25 per million output tokens. Claude Sonnet 4.6 is $3 input and $15 output. Claude Haiku 4.5 is $1 input and $5 output. Anthropic cut Opus pricing 66% in early 2026, making frontier-tier reasoning much more accessible. Prompt caching can drop input costs another 50–90% on repeat queries.
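The caching discount compounds with your cache-hit rate. A quick sketch of the blended input price; the hit rate is something you'd measure for your own workload, and the 50–90% discount range comes from the answer above:

```python
def effective_input_price(base_price_per_million, cache_hit_rate, cache_discount):
    """Blend cached and uncached input pricing into one effective rate.

    cache_discount is the fraction knocked off the input price on a cache
    hit (the 50-90% cited above, i.e. 0.5-0.9)."""
    cached = base_price_per_million * (1 - cache_discount)
    return (cache_hit_rate * cached
            + (1 - cache_hit_rate) * base_price_per_million)

# Claude Opus 4.7 input at $5/M, 70% of tokens hitting a 90% cache discount:
print(f"${effective_input_price(5.00, 0.7, 0.9):.2f}/M")  # $1.85/M
```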

How much does GPT-5 cost?

OpenAI's current frontier model, GPT-5.5, is $5 per million input tokens and $30 per million output tokens. GPT-5.4 mini is $0.75 input and $4.50 output, about 7× cheaper. OpenAI's frontier line was rebranded from GPT-4 to GPT-5 in 2025. Cached input pricing is roughly 50% of standard input.

What is the cheapest LLM API?

DeepSeek V4 Flash is currently the cheapest credible model at $0.14 per million input tokens and $0.28 per million output tokens, roughly 21× cheaper than Claude Sonnet 4.6 for input. Grok 4 Fast and Gemini 2.5 Flash are the cheapest options from major US vendors. For open-weight models, Llama 3.3 70B via Together AI is competitive at the value tier. The right cheap model depends on your task: value-tier models handle classification and short replies well but struggle with complex reasoning.
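That input multiple is straight division over the table rates:

```python
# Input price per 1M tokens, taken from the tables above.
sonnet_input = 3.00    # Claude Sonnet 4.6
deepseek_input = 0.14  # DeepSeek V4 Flash

print(f"{sonnet_input / deepseek_input:.0f}x cheaper on input")  # 21x cheaper on input
```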

How do you keep this pricing data accurate?

An automated audit runs daily. Small drift (under 25% on both input and output) is auto-applied and published the same day. Larger changes — re-pricings, model deprecations, tier consolidations — go through human editorial review against the live vendor pricing page before publication. Vendor pages remain the canonical source of truth; this dataset is a clean machine-readable mirror, not a substitute. The exact verification date is shown on this page and embedded in the public dataset's JSON.
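The drift rule reads as a simple predicate. A sketch under the thresholds described above; the audit's actual implementation isn't public, so the function and argument names here are hypothetical:

```python
def auto_apply(old_input, new_input, old_output, new_output, threshold=0.25):
    """Auto-publish only when both input and output drift stay under 25%."""
    def drift(old, new):
        return abs(new - old) / old
    return (drift(old_input, new_input) < threshold
            and drift(old_output, new_output) < threshold)

# A 10% input bump with unchanged output auto-applies...
print(auto_apply(3.00, 3.30, 15.00, 15.00))   # True
# ...but a re-pricing like the 66% Opus cut goes to human review.
print(auto_apply(15.00, 5.00, 75.00, 25.00))  # False
```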

Can I use this pricing data in my own product?

Yes. The full dataset is published as pricing.json in a public GitHub repo (lucaspowell8020/ai-agent-pricing) under CC-BY-4.0. You can fetch it directly from the raw GitHub URL, no API key required. Use it in commercial products, comparison tools, calculators, or blog posts — attribution to agentshortlist.com is appreciated but not required by the licence.

Why not just check the vendor pricing pages directly?

You can, and you should verify before committing budget. The reason this dataset exists: vendor pricing pages disagree on units (some quote per token, some per thousand, some per million), use struck-through old prices for SEO, and change format frequently. Aggregating into one normalised file makes side-by-side comparison and programmatic use possible. We treat the vendor pages as authoritative — this dataset is a clean, machine-readable mirror.
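Normalising those unit conventions is the unglamorous part of aggregation. A minimal sketch of the conversion; the helper and unit labels are hypothetical, not part of the published dataset tooling:

```python
def to_per_million(price, quoted_unit):
    """Normalise a quoted USD price to USD per 1M tokens."""
    scale = {"per_token": 1_000_000, "per_1k": 1_000, "per_1m": 1}
    return price * scale[quoted_unit]

print(to_per_million(0.000003, "per_token"))  # $0.000003/token -> ~$3/M
print(to_per_million(0.015, "per_1k"))        # $0.015/1K tokens -> ~$15/M
```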

Related