If you're building an AI-powered application in 2026, you have three dominant API providers to choose from: Anthropic (Claude), OpenAI (GPT), and Google (Gemini). Each offers a multi-tier model lineup — from budget-friendly flash models to premium reasoning beasts. The pricing gaps between them are enormous, and picking the wrong tier can 5× your monthly bill.
This article breaks down the pricing structure of all three providers, tier by tier, with real per-token prices as of June 2026. All prices are per 1 million tokens, in USD.
Premium Tier: The Best Models Money Can Buy
The premium tier is where each provider puts their most capable model — the one they recommend for complex reasoning, coding, and analysis tasks. These are the flagships, and they come with flagship pricing.
| Model | Input $/M | Output $/M | Cache Read | Est. Monthly |
|---|---|---|---|---|
| Claude Fable 5 | $10.00 | $50.00 | $1.00 | $150.00 |
| GPT-5.6 Sol | $5.00 | $30.00 | $0.50 | $85.00 |
| Gemini 3.1 Pro | $2.00 | $12.00 | $0.20 | $34.00 |
Estimated monthly cost: 5M input + 2M output tokens, no caching. Your costs will vary.
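The estimate is plain arithmetic: input price times input volume, plus output price times output volume. Here is a minimal sketch of that calculation, using the prices from the table above. The function name `monthly_cost` and its parameters are illustrative only and do not come from any provider SDK.

```python
# Prices in USD per 1M tokens, copied from the premium-tier table above.
PREMIUM_PRICES = {
    "Claude Fable 5": (10.00, 50.00),
    "GPT-5.6 Sol": (5.00, 30.00),
    "Gemini 3.1 Pro": (2.00, 12.00),
}


def monthly_cost(input_price, output_price, input_millions=5.0, output_millions=2.0):
    """Return the uncached monthly cost in USD.

    Prices are per 1M tokens; volumes are in millions of tokens.
    The defaults match the 5M input + 2M output workload used in this article.
    """
    return input_price * input_millions + output_price * output_millions


for model, (inp, out) in PREMIUM_PRICES.items():
    print(f"{model}: ${monthly_cost(inp, out):.2f}")
```

Passing your own `input_millions` and `output_millions` shows how the estimate moves with your actual traffic.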
At the premium tier, Gemini 3.1 Pro is the clear value winner: it costs 77% less than Claude Fable 5 for the same token volume. But raw price isn't everything. Claude Fable 5 and GPT-5.6 Sol tie on SWE-bench at 88.7, well ahead of Gemini 3.1 Pro at 80.6, so if coding accuracy is worth the premium, the two pricier flagships make a strong case. Of the two, GPT-5.6 Sol costs about 43% less than Fable 5 at this workload.
Mid-Tier: The Sweet Spot for Most Apps
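Most production applications don't need a premium model for every request. The mid-tier offers 80–90% of the capability at roughly 30–50% of the flagship cost (on the prices listed here, Claude Sonnet 4.6 is 30% of Fable 5's price and GPT-5.6 Terra is 50% of Sol's). Each provider has 1–2 models in this range.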
| Model | Input $/M | Output $/M | SWE-bench |
|---|---|---|---|
| Claude Sonnet 4.6 | $3.00 | $15.00 | 79.6 |
| GPT-5.6 Terra | $2.50 | $15.00 | 84.2 |
| GPT-5.4 | $2.50 | $15.00 | 78.0 |
Gemini 3.1 Pro (listed in premium above) is price-competitive with this tier at $2.00/$12.00.
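The mid-tier is fiercely competitive. GPT-5.6 Terra and Claude Sonnet 4.6 charge the same for output, and Terra is about 17% cheaper on input, which works out to a gap of roughly 6% at the 5M input + 2M output workload used in the premium table. Terra also edges ahead on SWE-bench (84.2 vs 79.6). The older GPT-5.4 is still widely used: it's priced identically to Terra but trails on SWE-bench (78.0 vs 84.2) and on knowledge cutoff (Dec 2025 vs Apr 2026).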
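How far apart two models are depends on how much of your traffic is input versus output. The sketch below compares Terra and Sonnet at several input shares, using the prices from the table above. The helper names `blended_price` and `terra_discount` are made up for this illustration.

```python
def blended_price(input_price, output_price, input_share):
    """Average USD per 1M tokens when `input_share` of all tokens are input."""
    return input_price * input_share + output_price * (1.0 - input_share)


SONNET = (3.00, 15.00)  # Claude Sonnet 4.6: input, output ($/M) from the table
TERRA = (2.50, 15.00)   # GPT-5.6 Terra: input, output ($/M) from the table


def terra_discount(input_share):
    """Fraction by which Terra undercuts Sonnet at the given input share."""
    sonnet = blended_price(*SONNET, input_share)
    terra = blended_price(*TERRA, input_share)
    return (sonnet - terra) / sonnet


# 5/7 is the input share of a 5M input + 2M output workload.
for share in (0.0, 0.5, 5 / 7, 1.0):
    print(f"input share {share:.2f}: Terra is {terra_discount(share):.1%} cheaper")
```

Because the two models share an output price, the gap disappears for output-only traffic and peaks at about 16.7% for input-only traffic. At the 5M/2M mix it is about 5.6%.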
Budget Tier: Surprisingly Capable
The budget tier has seen the most dramatic improvement in 2026. Models like GPT-5.6 Luna and Gemini 3.5 Flash now score 62–65 on SWE-bench — better than premium models from 2024 — at a fraction of the cost.
| Model | Input $/M | Output $/M | Est. Monthly |
|---|---|---|---|
| Claude Haiku 4.5 | $1.00 | $5.00 | $15.00 |
| GPT-5.6 Luna | $1.00 | $6.00 | $17.00 |
| Gemini 3.5 Flash | $0.25 | $1.50 | $4.25 |
| Gemini Flash Lite | $0.10 | $0.40 | $1.30 |
Google dominates the budget tier on price: Flash Lite at $1.30/month for the 5M input + 2M output workload is essentially free for prototyping. Anthropic and OpenAI budget models cost roughly 11–13× more than Flash Lite (and about 3.5–4× more than Gemini 3.5 Flash), but they offer caching and batch discounts that the Google budget models don't.
Which Provider Wins for Your Use Case?
- Best raw performance: Claude Fable 5 vs GPT-5.6 Sol — both score 88.7 on SWE-bench. Fable 5 leads on HumanEval (96.3 vs 96.0).
- Best value mid-tier: Gemini 3.1 Pro vs Claude Sonnet — Gemini is 33% cheaper on input, 20% cheaper on output.
- Best budget: Gemini Flash Lite for simple tasks, Gemini 3.5 Flash when you need image support. Both are dramatically cheaper than Anthropic/OpenAI equivalents.
- Best caching ecosystem: Anthropic and OpenAI offer cache writes at 1.25× input price and reads at 0.1×. If your app has a fixed system prompt that makes up most of each request, this can cut costs by 50–70%.
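To see where a caching saving of that size can come from, here is a rough sketch that applies the 1.25× write and 0.1× read multipliers to a fixed system prompt. It is a best-case model, not any provider's billing logic: it assumes the prompt is written to the cache once and every later request hits it, whereas real caches expire. The function name and the workload numbers are hypothetical.

```python
def monthly_cost_with_cache(input_price, output_price, requests,
                            prompt_tokens, fresh_tokens, output_tokens,
                            write_mult=1.25, read_mult=0.10):
    """Return (uncached_usd, cached_usd) for a month of same-shaped requests.

    Prices are USD per 1M tokens. Assumption: the shared prompt is cached on
    the first request and read from cache on every later one (best case).
    """
    if requests < 1:
        raise ValueError("requests must be at least 1")

    per_token_in = input_price / 1_000_000
    per_token_out = output_price / 1_000_000

    # Costs that caching does not change.
    output_cost = requests * output_tokens * per_token_out
    fresh_cost = requests * fresh_tokens * per_token_in

    # Without caching, the prompt is billed at full input price every time.
    uncached = requests * prompt_tokens * per_token_in + fresh_cost + output_cost

    # With caching: one write at 1.25x, then reads at 0.1x.
    write_cost = prompt_tokens * per_token_in * write_mult
    read_cost = (requests - 1) * prompt_tokens * per_token_in * read_mult
    cached = write_cost + read_cost + fresh_cost + output_cost
    return uncached, cached


# Hypothetical workload at Claude Haiku 4.5 prices ($1.00 in, $5.00 out):
# 10,000 requests, 4,000-token system prompt, 500 fresh input tokens
# and 300 output tokens per request.
uncached, cached = monthly_cost_with_cache(1.00, 5.00, 10_000, 4_000, 500, 300)
saving = 1 - cached / uncached
print(f"uncached ${uncached:.2f}, cached ${cached:.2f}, saving {saving:.0%}")
```

With these made-up numbers the bill drops from about $60 to about $24, a saving of roughly 60%. Shrink the system prompt relative to the fresh input and output, and the saving shrinks with it.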
The right model depends on your exact token mix. See our calculator to plug in your own numbers — the rankings shift dramatically when you factor in caching, batch, and your specific input/output ratio.