2-year price history · Web Archive evidence · updated 2026-06-20

LLM API pricing — what actually changed, 2024-01 – 2026-06

Across the major LLM API providers, list prices fell sharply in 2023-2025 and then stabilised. Each archive-verified event links to a dated Internet Archive snapshot of the provider's own pricing page; events from before the page was first captured are clearly marked as reported (no snapshot).

The short version

OpenAI — GPT-4 launched at $30/$60 per 1M tokens (2023); GPT-4o (2024) cut that to $5/$15; current lineup now starts at $0.20/$1.25 (GPT-5.4 nano).
Anthropic — Claude 3 Haiku ($0.25/$1.25 per 1M) set a cheap-tier anchor in 2024; current Haiku 4.5 is $1/$5 but far more capable.
Google (Gemini) — Flash tier consistently cheapest among frontier providers; Gemini 2.0 Flash shipped free in preview (Dec 2024) before paid pricing.
DeepSeek — R1/V3 launch (Jan 2025) disrupted the market: GPT-4-class quality at roughly 9–18x lower input prices than GPT-4o's $5/1M launch list price.

The receipts

Two years of pricing, provider by provider

Each entry links to a dated Web Archive snapshot where available. Product launches and packaging changes are labelled as such — only genuine price moves are marked Real change.

OpenAI

Three major price generations in two years; costs down 90%+ vs GPT-4 launch

44 archived snapshots · 2024-01 to 2026-06

GPT-5.5 · $5.00 / $30.00 per 1M tokens in / out · frontier tier, 1M token context · stable since at least May 2026

GPT-5.4 · $2.50 / $15.00 per 1M tokens in / out · frontier tier, 1M token context · stable since at least May 2026

GPT-5.4 mini · $0.75 / $4.50 per 1M tokens in / out · mid tier, 400k token context · stable since at least May 2026

GPT-5.4 nano · $0.20 / $1.25 per 1M tokens in / out · budget tier, 400k token context · stable since at least May 2026

2023-03 · New product

GPT-4 API launch — $30 / $60 per 1M tokens (8k context)

GPT-4 launched with $0.03 per 1k input tokens ($30/1M) and $0.06 per 1k output tokens ($60/1M) for the 8k context window model. The 32k context variant was priced at $0.06 / $0.12 per 1k. These were the highest list prices OpenAI published for a text model at the time.
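Older price sheets quote per 1k tokens while this page normalises to per 1M. The sketch below shows that conversion and a per-request cost at the GPT-4 8k launch prices quoted above. The function names and the request size (6,000 input and 1,000 output tokens) are illustrative assumptions, not part of any provider SDK.

```python
def per_1k_to_per_1m(price_per_1k: float) -> float:
    """Convert a per-1k-token list price to the equivalent per-1M-token price."""
    return price_per_1k * 1000


def request_cost(input_tokens: int, output_tokens: int,
                 input_per_1m: float, output_per_1m: float) -> float:
    """Cost in USD of one request, given per-1M-token input and output prices."""
    return (input_tokens * input_per_1m + output_tokens * output_per_1m) / 1_000_000


# GPT-4 8k launch prices as quoted above, converted from per-1k to per-1M.
gpt4_input_per_1m = per_1k_to_per_1m(0.03)
gpt4_output_per_1m = per_1k_to_per_1m(0.06)

# Hypothetical request size, chosen only for illustration.
example_cost = request_cost(6_000, 1_000, gpt4_input_per_1m, gpt4_output_per_1m)
print(f"GPT-4 8k, 6k in / 1k out: ${example_cost:.2f}")
```

At these list prices the example request costs about $0.24, with three quarters of that coming from input tokens.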

archive 1 ↗

2023-11 · Real change

GPT-4 Turbo launch — input price cut 3x to $10/1M; GPT-3.5-Turbo input cut by a third

GPT-4 Turbo (gpt-4-1106-preview) launched at $0.01 / $0.03 per 1k tokens ($10 input / $30 output per 1M). The existing GPT-4 model kept its old price. GPT-3.5-Turbo was simultaneously cut from $0.0015 / $0.002 per 1k to $0.001 / $0.002 per 1k. Announced at OpenAI DevDay November 6, 2023.

archive 1 ↗

2024-05 · New product

GPT-4o launch — $5 / $15 per 1M tokens; 2x cheaper than GPT-4 Turbo input

GPT-4o (gpt-4o) launched at $5 per 1M input tokens and $15 per 1M output tokens — half the input price of GPT-4 Turbo at the same quality tier. Announced at OpenAI Spring Update on May 13, 2024.

archive 1 ↗

2024-07 · New product

GPT-4o mini launch — $0.15 / $0.60 per 1M tokens (replaces GPT-3.5-Turbo)

GPT-4o mini launched as OpenAI's new low-cost model at $0.15 per 1M input and $0.60 per 1M output, roughly 33x cheaper than GPT-4o on input ($5 vs $0.15) and 25x cheaper on output ($15 vs $0.60). OpenAI positioned it as the successor to GPT-3.5-Turbo, and it became the cheapest model in OpenAI's lineup.

archive 1 ↗

2025-01 · New product

o3-mini and o1 family updates — reasoning model pricing established

OpenAI released o3-mini as an affordable reasoning model. The o1 series pricing was also adjusted. Reasoning models introduced a new pricing tier above frontier chat models.

archive 1 ↗

Coverage gap 2025-06 to 2026-05: GPT-5 family model naming and pricing during 2025 H2 could not be reliably sourced for this window; the prices shown are from models.json (verified 2026-06-20) and reflect the current lineup.

Open question — GPT-4o cached-input pricing (a 50% discount for gpt-4o-2024-08-06, announced 2024-10): when did the standard gpt-4o alias also get the cached input rate of $1.25/1M? Archive verification needed.

Anthropic

Three Claude generations; Haiku anchors the cheap tier throughout

35 archived snapshots · 2024-01 to 2026-06

Claude Fable 5 · $10.00 / $50.00 per 1M tokens in / out · frontier tier, 1M token context · stable since at least June 2026

Claude Opus 4.8 · $5.00 / $25.00 per 1M tokens in / out · frontier tier, 1M token context · stable since at least June 2026

Claude Sonnet 4.6 · $3.00 / $15.00 per 1M tokens in / out · mid tier, 1M token context · stable since at least June 2026

Claude Haiku 4.5 · $1.00 / $5.00 per 1M tokens in / out · budget tier, 200k token context · stable since at least June 2026

2024-03 · New product

Claude 3 family launch — Haiku $0.25/$1.25, Sonnet $3/$15, Opus $15/$75 per 1M

Claude 3 introduced three tiers: Haiku (fastest, cheapest) at $0.25 input / $1.25 output per 1M tokens; Sonnet (balanced) at $3 / $15; Opus (most capable) at $15 / $75. Haiku became one of the cheapest quality models in the market at launch.

archive 1 ↗

2024-06 · New product

Claude 3.5 Sonnet launch — same price ($3/$15) but significantly stronger model

Claude 3.5 Sonnet launched at the same price point as Claude 3 Sonnet ($3 / $15 per 1M tokens) but with substantially better benchmarks — outperforming Claude 3 Opus on most tasks while costing 5x less. The price didn’t change; the value per dollar did.

archive 1 ↗

2024-10 · New product

Claude 3.5 Sonnet updated (Oct 2024 version) + Claude 3.5 Haiku at $0.80/$4 per 1M

An updated Claude 3.5 Sonnet (claude-3-5-sonnet-20241022) was released alongside Claude 3.5 Haiku at $0.80 per 1M input / $4 per 1M output, roughly 3.2x Claude 3 Haiku's $0.25/$1.25 but with substantially higher capability. Claude 3 Haiku remained available at its original price.

archive 1 ↗

2025-02 · New product

Claude 3.7 Sonnet launch — hybrid reasoning model, $3/$15 per 1M

Claude 3.7 Sonnet introduced a hybrid reasoning mode (extended thinking) at the same $3 / $15 per 1M price point as previous Sonnet models. Extended thinking tokens are billed at the output-token rate.
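Because thinking tokens are billed as output, an unchanged list price can still mean a much larger bill per request. A minimal sketch of that arithmetic at the $3 / $15 price point, assuming hypothetical token counts (the function name and the example request are illustrative only):

```python
def sonnet_request_cost(input_tokens: int, thinking_tokens: int, answer_tokens: int,
                        input_per_1m: float = 3.00,
                        output_per_1m: float = 15.00) -> float:
    """Cost in USD when thinking tokens are billed at the output-token rate."""
    billed_output = thinking_tokens + answer_tokens
    return (input_tokens * input_per_1m + billed_output * output_per_1m) / 1_000_000


# Hypothetical request: 2,000 input tokens and a 500-token visible answer.
without_thinking = sonnet_request_cost(2_000, 0, 500)
with_thinking = sonnet_request_cost(2_000, 8_000, 500)  # plus 8,000 thinking tokens

print(f"without extended thinking: ${without_thinking:.4f}")
print(f"with extended thinking:    ${with_thinking:.4f}")
```

In this made-up example the thinking tokens account for most of the cost, which is why per-token list prices alone understate the spend on reasoning-heavy workloads.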

archive 1 ↗

2025-05 · New product

Claude 4 family announced — new Opus/Sonnet generation

Anthropic announced the Claude 4 family. NOTE: exact pricing and model lineup at announcement need verification against official Anthropic pricing page and Wayback captures from May 2025.

archive 1 ↗

Coverage gap 2025-05 to 2026-06: launch pricing for the Claude 4 and 4.x families between May 2025 and June 2026 could not be reliably sourced; current prices are from models.json (verified 2026-06-20).

Google (Gemini)

Flash tier anchors lowest frontier API price; 2.0 and 2.5 generation shipped free-then-paid

17 archived snapshots · 2024-01 to 2026-06

Gemini 3.1 Pro Preview · $2.00 / $12.00 per 1M tokens in / out · frontier tier, 1M token context, long-context surcharge >200k tokens · stable since at least June 2026

Gemini 3.5 Flash · $1.50 / $9.00 per 1M tokens in / out · mid tier, 1M token context · stable since at least June 2026

Gemini 3 Flash Preview · $0.50 / $3.00 per 1M tokens in / out · mid tier, 1M token context · stable since at least June 2026

Gemini 3.1 Flash-Lite · $0.25 / $1.50 per 1M tokens in / out · budget tier, 1M token context · stable since at least June 2026

2024-02 · New product

Gemini 1.0 Pro API general availability — pricing announced

Google announced Gemini 1.0 Pro for general availability via the Gemini API (formerly PaLM API). Pay-as-you-go pricing introduced. Gemini 1.0 Pro was priced below GPT-4 at launch. NOTE: verify exact prices against Wayback captures. (Reported price — the Web Archive has no capture of this provider's pricing page this early, so there is no dated snapshot to cite; included for the trend, not as archive-verified.)

2024-05 · New product

Gemini 1.5 Pro and Flash launch — 1M token context; Flash as cheap tier

Gemini 1.5 Pro launched with 1M token context window. Gemini 1.5 Flash was introduced as a lightweight, fast, cheaper model at $0.35 per 1M input / $1.05 per 1M output (prompts up to 128k tokens) — establishing the Flash sub-brand as Google’s cheap-tier offering. 1.5 Pro was $3.50 / $10.50 (up to 128k). Long-context surcharges applied above thresholds. (Reported price — the Web Archive has no capture of this provider's pricing page this early, so there is no dated snapshot to cite; included for the trend, not as archive-verified.)
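A threshold-based surcharge makes input cost a step function of prompt length. The sketch below illustrates this under two loudly stated assumptions: that a prompt above the threshold is billed entirely at the higher rate, and that the higher rate is twice the base rate. Neither is an archive-verified price or billing rule; only the $0.35 base price and the 128k threshold come from the entry above.

```python
def tiered_input_cost(prompt_tokens: int, base_per_1m: float,
                      long_per_1m: float, threshold: int = 128_000) -> float:
    """Input cost in USD when the whole prompt is billed at one of two rates.

    Assumption: a prompt longer than `threshold` is billed entirely at
    `long_per_1m`; check the provider's pricing page for the actual rule.
    """
    rate = long_per_1m if prompt_tokens > threshold else base_per_1m
    return prompt_tokens * rate / 1_000_000


FLASH_BASE = 0.35            # Gemini 1.5 Flash input price quoted above (up to 128k)
FLASH_LONG = 2 * FLASH_BASE  # hypothetical long-context rate, NOT a verified price

short_prompt_cost = tiered_input_cost(100_000, FLASH_BASE, FLASH_LONG)
long_prompt_cost = tiered_input_cost(200_000, FLASH_BASE, FLASH_LONG)

print(f"100k-token prompt: ${short_prompt_cost:.3f}")
print(f"200k-token prompt: ${long_prompt_cost:.3f}")
```

Under these assumptions, doubling the prompt length from 100k to 200k tokens quadruples the input cost, so comparisons across providers should state which side of the threshold a workload sits on.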

2024-12 · New product

Gemini 2.0 Flash Experimental — free preview period

Gemini 2.0 Flash Experimental was released for free during the experimental/preview period. Paid pricing was not yet published; developers could use it at no cost through the Gemini API free tier. (Reported price — the Web Archive has no capture of this provider's pricing page this early, so there is no dated snapshot to cite; included for the trend, not as archive-verified.)

2025-03 · New product

Gemini 2.5 Pro and Flash launch — new generation pricing

Gemini 2.5 Pro and Gemini 2.5 Flash launched. Flash continued as the low-cost option in the 2.5 lineup. NOTE: exact prices at launch need verification against official Google pricing page and Wayback captures from March 2025.

archive 1 ↗

Coverage gap 2025-06 to 2026-06: Gemini 3.x family pricing history could not be reliably sourced for this window; current prices are from models.json (verified 2026-06-20).

Open question — When did Gemini Flash transition from free experimental to paid? Exact date and price announcement need Wayback verification.

DeepSeek

DeepSeek-R1 and V3 disrupted the market in Jan 2025 with GPT-4-class quality at roughly 9–18x lower input prices than GPT-4o's launch list price

20 archived snapshots · 2025-01 to 2026-06

DeepSeek V4 Pro · $0.435 / $0.87 per 1M tokens in / out · mid tier; cache reads $0.003625/1M, 1M token context · stable since at least June 2026

DeepSeek V4 Flash · $0.14 / $0.28 per 1M tokens in / out · budget tier; cache reads $0.0028/1M, 1M token context · stable since at least June 2026
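The cache-read prices listed above only matter in proportion to how much of a prompt is actually served from cache. A minimal sketch of the blended input price, assuming cached tokens are billed at the cache-read rate and all other input tokens at the standard rate; the 80% hit rate is a hypothetical workload figure, not a measured one.

```python
def effective_input_price(input_per_1m: float, cache_read_per_1m: float,
                          cache_hit_rate: float) -> float:
    """Blended per-1M input price for a given fraction of cached input tokens."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return cache_hit_rate * cache_read_per_1m + (1 - cache_hit_rate) * input_per_1m


# DeepSeek V4 Flash list prices as shown above.
V4_FLASH_INPUT = 0.14
V4_FLASH_CACHE_READ = 0.0028

# Hypothetical workload in which 80% of input tokens hit the cache.
blended = effective_input_price(V4_FLASH_INPUT, V4_FLASH_CACHE_READ, 0.8)
print(f"effective input price: ${blended:.5f} per 1M tokens")
```

With no cache hits the function returns the plain input price, so the same helper can be used to compare providers that do and do not publish a cache-read rate.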

2025-01 · New product

DeepSeek-R1 and V3 API launch — roughly 9–18x cheaper than GPT-4o's launch list price on input

DeepSeek-R1 (reasoning model) and DeepSeek-V3 launched as API products. Prices were dramatically lower than Western competitors: DeepSeek-V3 at approximately $0.27 per 1M input / $1.10 per 1M output; DeepSeek-R1 at approximately $0.55 per 1M input / $2.19 per 1M output. Compared with GPT-4o's launch list price of $5/$15, that is roughly 18x cheaper on input for V3 and 9x cheaper for R1. The launch caused significant industry discussion and competitor pricing pressure.
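The "how many times cheaper" figures can be recomputed directly from the list prices quoted in this entry. A short sketch, using the approximate DeepSeek launch prices and GPT-4o's $5/$15 launch list price as given above (variable names are illustrative):

```python
GPT_4O = {"input": 5.00, "output": 15.00}  # per 1M tokens, launch list price

DEEPSEEK = {
    "DeepSeek-V3": {"input": 0.27, "output": 1.10},
    "DeepSeek-R1": {"input": 0.55, "output": 2.19},
}

# Ratio of GPT-4o's price to each DeepSeek price, per direction.
ratios = {
    model: {kind: GPT_4O[kind] / price for kind, price in prices.items()}
    for model, prices in DEEPSEEK.items()
}

for model, r in ratios.items():
    print(f"{model}: {r['input']:.1f}x cheaper input, {r['output']:.1f}x cheaper output")
```

The input ratios come out at about 18.5x for V3 and 9.1x for R1; the output ratios are smaller, at about 13.6x and 6.8x.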

archive 1 ↗

Coverage gap 2024-01 to 2024-12: DeepSeek did not have a widely available Western API before January 2025, so there is no pricing history to track for this window.

Coverage gap 2025-06 to 2026-06: DeepSeek V4 family pricing history could not be reliably sourced for this window; current prices are from models.json (verified 2026-06-20).

xAI (Grok)

Grok API launched late 2024; relatively competitive pricing vs OpenAI at launch

14 archived snapshots · 2024-11 to 2026-06

Grok 4.3 · $1.25 / $2.50 per 1M tokens in / out · frontier tier, 1M token context, cached input $0.20/1M · stable since at least June 2026

Grok Build 0.1 · $1.00 / $2.00 per 1M tokens in / out · mid tier, 256k token context, cached input $0.20/1M · stable since at least June 2026

2024-11 · New product

xAI Grok API public launch — Grok 2 model pricing announced

xAI launched the Grok API for external developers. Grok 2 was priced competitively against GPT-4o. NOTE: exact prices at launch need verification against official xAI docs and Wayback captures from November 2024. (Reported price — the Web Archive has no capture of this provider's pricing page this early, so there is no dated snapshot to cite; included for the trend, not as archive-verified.)

Coverage gap 2024-01 to 2024-10: Grok API was not publicly available before November 2024; no pricing history to track.

Coverage gap 2025-01 to 2026-06: Grok 3/4 family pricing evolution could not be reliably sourced for this window; current prices are from models.json (verified 2026-06-20).

Mistral

Mistral Large dropped dramatically from 2024 to 2025; full lineup expanded to 11 models

15 archived snapshots · 2024-02 to 2026-06

Mistral Large 3 · $0.50 / $1.50 per 1M tokens in / out · frontier tier, 256k context, cached $0.05/1M · stable since at least June 2026

Mistral Medium 3.5 · $1.50 / $7.50 per 1M tokens in / out · frontier tier, 256k context · stable since at least June 2026

Mistral Small 4 · $0.10 / $0.30 per 1M tokens in / out · mid tier, 256k context · stable since at least June 2026

Ministral 3 (3B) · $0.10 / $0.10 per 1M tokens in / out · budget tier, 256k context · stable since at least June 2026

2024-02 · New product

Mistral API initial pricing — Mistral Large at ~$8/$24 per 1M

Mistral launched its commercial API with Mistral Large (the first-generation model; Mistral Large 2 came later) at approximately $8 per 1M input / $24 per 1M output. Mistral Medium and Small were also available at lower prices. Exact prices at launch need Wayback verification. (Reported price — the Web Archive has no capture of this provider's pricing page this early, so there is no dated snapshot to cite; included for the trend, not as archive-verified.)

2024-04 · New product

Mixtral 8x7B and 8x22B mixture-of-experts models — different pricing tier

Mixtral 8x22B joined the earlier Mixtral 8x7B in Mistral's mixture-of-experts series, which was priced separately from the dense Mistral Large models. Mixtral was positioned as a high-performance, more affordable alternative. NOTE: exact prices need Wayback verification. (Reported price — the Web Archive has no capture of this provider's pricing page this early, so there is no dated snapshot to cite; included for the trend, not as archive-verified.)

2026-06 · Real change

Mistral Large 3 replaces Large 2 — price drop to $0.50/$1.50 per 1M

WizardCost verification (2026-06-20) found that the current 'Mistral Large' model is Mistral Large 3 at $0.50 per 1M input / $1.50 per 1M output, a 4x reduction from Mistral Large 2's $2/$6 on both input and output. Our data was one generation behind and was updated in models.json on 2026-06-20. The 2026-06 date is when the change was recorded here; the date of the actual transition is still an open question.

archive 1 ↗

Coverage gap 2024-07 to 2025-12: Mistral pricing evolution between the Mixtral launch and Mistral Large 3 availability could not be fully traced from the sources available; key price cuts in this window need Wayback verification.

Open question — Mistral Large 3 exact launch date and pricing at launch vs Mistral Large 2 retirement: when did the $2/$6 → $0.50/$1.50 transition happen? Wayback verification of mistral.ai/pricing needed.

How we know

Method & honest caveats

Change events were located via the Wayback CDX API over each provider's official pricing URL (2023-2026), and the closest dated snapshot was then confirmed by hand. Key launch prices were spot-checked against the archived page and all verified: GPT-4 at $0.03/$0.06 per 1k; GPT-4o at $5/$15; Claude 3 Haiku at $0.25/$1.25, Sonnet at $3/$15 and Opus at $15/$75. Evidence URLs are specific dated snapshots. Where a provider's pricing page was not captured early enough (Gemini before 2025-02, xAI before 2025-01, Mistral before 2025-03), the event is included for the trend but marked 'reported — no snapshot'. Current (stable) prices are from llm/data/models.json, verified 2026-06-20.
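The lookup step can be sketched roughly as follows. This is not the pipeline used for this page: it is a minimal illustration of building a Wayback CDX query and picking the capture closest to an event date. The helper names, the example.com URL and the sample rows are hypothetical, and the network fetch is deliberately left out so the block runs offline.

```python
from datetime import datetime
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(page_url: str, year_from: int, year_to: int) -> str:
    """Build a CDX query for successful captures, one row per distinct content digest."""
    params = {
        "url": page_url,
        "from": str(year_from),
        "to": str(year_to),
        "output": "json",
        "filter": "statuscode:200",
        "collapse": "digest",
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx_rows(rows):
    """Turn CDX JSON output (first row is the header) into a list of dicts."""
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def closest_snapshot(captures, target: datetime):
    """Return the Wayback URL of the capture nearest to `target`, or None if empty."""
    if not captures:
        return None

    def distance(capture):
        captured_at = datetime.strptime(capture["timestamp"], "%Y%m%d%H%M%S")
        return abs(captured_at - target)

    best = min(captures, key=distance)
    return f"https://web.archive.org/web/{best['timestamp']}/{best['original']}"


# Hand-written sample in the shape of CDX JSON output (hypothetical data).
SAMPLE_ROWS = [
    ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
    ["com,example)/pricing", "20240501120000", "https://example.com/pricing",
     "text/html", "200", "AAA", "1000"],
    ["com,example)/pricing", "20240520080000", "https://example.com/pricing",
     "text/html", "200", "BBB", "1010"],
]

query = cdx_query_url("example.com/pricing", 2023, 2026)
snapshot = closest_snapshot(parse_cdx_rows(SAMPLE_ROWS), datetime(2024, 5, 13))
print(query)
print(snapshot)
```

Collapsing on the content digest keeps one row per distinct version of the page, which is a reasonable first pass at change detection. The nearest capture can still fall after the event date, so it needs the manual confirmation described above.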

Some events lack Wayback evidence links (marked in the data) — these are from widely-reported public announcements. From here on, our live price changelog records changes as they happen, sourced from official pricing pages.
