What changed in this pricing view
Our June 23, 2026 snapshot reflects a renewed pricing landscape at the frontier of large language model APIs. The headline change is the introduction of GPT-5.5, OpenAI’s latest flagship, matching the price point of its predecessor while pushing context to 1 million tokens. At the same time, OpenAI has refined its mid-tier with GPT-5.4 and introduced GPT-5.4 Mini, a budget-oriented model that competes directly with Google’s cost-leading tier. Google’s Gemini 3.1 Pro and Gemini 3 Flash maintain aggressive per-token rates, especially with the Flash variant that continues to lower the floor for high-throughput AI workloads.
Compared to earlier pricing snapshots on AI-Cost.click, what stands out is the growing gap between premium reasoning models and ultra-low-cost alternatives. A developer can now choose between a model that costs $30 per million output tokens and one that costs $3, while both offer million-token context windows. This snapshot arms you with the verified numbers to make that decision with confidence.
Verified model pricing snapshot
The table below captures the official API prices as of June 23, 2026, taken directly from each provider’s public pricing page. Prices are in US dollars per 1 million tokens (1M). Context length is measured in tokens.
| Model | Provider | Input $/1M | Output $/1M | Context Length |
|---|---|---|---|---|
| GPT-5.5 | OpenAI | $5.00 | $30.00 | 1,000,000 |
| GPT-5.4 | OpenAI | $2.50 | $15.00 | 1,000,000 |
| GPT-5.4 Mini | OpenAI | $0.75 | $4.50 | 400,000 |
| ChatGPT Chat Latest | OpenAI | $5.00 | $30.00 | 128,000 |
| Gemini 3.1 Pro | Google AI | $2.00 | $12.00 | 1,000,000 |
| Gemini 3 Flash | Google AI | $0.50 | $3.00 | 1,000,000 |
Note that “ChatGPT Chat Latest” is the API endpoint powering the consumer ChatGPT experience; its pricing mirrors the flagship GPT-5.5, but with a smaller context window tailored for chat use-cases.
Source notes
All prices were verified on June 23, 2026 from the following official API pricing pages:
- OpenAI:
https://openai.com/api/pricing/ - Google AI:
https://ai.google.dev/gemini-api/docs/pricing
These external pages are used exclusively as source material for factual grounding; their content is not republished. While Anthropic’s Claude plan page (https://www.anthropic.com/pricing) was reviewed, it did not provide a comparable per-token API price for a specific model at the time of this snapshot, so Claude models are not included. Alibaba Cloud’s Model Studio pricing was also examined, but its offering fell outside the scope of this direct comparison. AI-Cost.click monitors pricing changes continuously; visit our models page for the most current, verified rates.
How to use this snapshot in a budget
Start by estimating your application’s monthly token consumption—both input and output. Our interactive pricing calculator lets you plug in custom token volumes and instantly see projected costs across all these models. Keep in mind that many providers offer additional discounts:
- Google’s Batch API can reduce Gemini costs by up to 50% for non-real-time workloads.
- Both providers encourage context caching to lower input token costs on repeated prompts.
When building a budget, model your expected traffic peaks and choose a model tier that leaves headroom for growth. For example, a workload that can be served by Gemini 3 Flash may let you launch a feature at one-tenth the cost of GPT-5.5. Use the table below to see how these prices translate into real monthly spending.
Workload Cost Scenario
Consider a medium-scale SaaS application processing 600 million input tokens and 150 million output tokens per month, typical for a support copilot with moderate usage. All calculations use the list prices above without batch or caching discounts.
| Model | Input Cost (600M) | Output Cost (150M) | Total Monthly |
|---|---|---|---|
| GPT-5.5 | $3,000 | $4,500 | $7,500 |
| GPT-5.4 | $1,500 | $2,250 | $3,750 |
| GPT-5.4 Mini | $450 | $675 | $1,125 |
| ChatGPT Chat Latest | $3,000 | $4,500 | $7,500 |
| Gemini 3.1 Pro | $1,200 | $1,800 | $3,000 |
| Gemini 3 Flash | $300 | $450 | $750 |
A shift from the flagship tier to Gemini 3 Flash could save over $6,700 per month on the same token throughput. This illustrates why routing decisions matter: not every interaction demands the most expensive reasoning.
Editorial Analysis
The pricing table reveals a clear segmentation. GPT-5.5 and ChatGPT Chat Latest occupy the premium tier, priced identically to the former GPT-4.5 (not shown here) but with a vastly larger context window. For companies that rely on state‑of‑the‑art reasoning—complex code generation, multi-step mathematical reasoning, or nuanced document analysis—this cost may be justified.
GPT-5.4 and Gemini 3.1 Pro form the “mid‑range” with roughly equivalent input prices ($2.50 vs $2.00) and output prices ($15 vs $12). Both support one‑million‑token contexts, making them strong candidates for applications that need large working memory without the top‑tier premium. In workload simulation, Gemini 3.1 Pro holds a 20 % cost advantage over GPT-5.4 under list prices.
At the low‑cost end, Gemini 3 Flash delivers an output price ten times lower than GPT-5.5, while GPT-5.4 Mini provides a competitive alternative with a 400K context window. Both make high‑volume classification, extraction, and light‑weight summarization economically viable even for startups. The Flash model’s 1M context and $3 per million output tokens create a unique sweet spot for long‑form document processing pipelines.
Context caching and batch discounts (especially Google’s 50 % batch reduction) can further tilt the economics. Developers should factor in these features when choosing a model, because a model that appears twice as expensive on paper may become cheaper under heavy batch usage.
Routing Recommendations
A single model rarely fits every need. A cost‑aware architecture routes each request to the most appropriate endpoint. Use the following heuristics as a starting point:
- Complex reasoning, code generation, and high‑stakes tasks → GPT-5.5 (OpenAI). The premium output price is outweighed by accuracy when mistakes are costly.
- Balanced performance with large context (up to 1M tokens) → Gemini 3.1 Pro or GPT-5.4. Evaluate with real prompts because output quality can vary by domain.
- High‑volume, low‑complexity tasks (classification, entity extraction, basic Q&A) → Gemini 3 Flash. The cost advantage is dramatic, and the 1M context window prevents truncation on long documents.
- Chat applications with heavy user interaction but limited reasoning needs → GPT-5.4 Mini or the ChatGPT Chat Latest endpoint (the latter if the interface must mimic the consumer ChatGPT experience).
For a side‑by‑side capabilities comparison, explore our compare page. That view combines benchmark data with live pricing, helping you assess quality against cost.
Decision Table
| Use Case | Recommended Model | Key Reason |
|---|---|---|
| High‑precision reasoning (code, math, contract review) | GPT-5.5 | Best‑in‑class output quality; 1M context |
| Conversational agent with standard context (128K) | ChatGPT Chat Latest | Mirrors consumer ChatGPT behavior |
| Budget‑constrained classification / extraction at scale | Gemini 3 Flash | Output $3/1M, 1M context, eligible for batch discount |
| Multimodal analysis with large document sets | Gemini 3.1 Pro | Strong multimodal, 1M context, $12/1M output |
| Light‑weight chat or prototyping with minimal latency sensitivity | GPT-5.4 Mini | Input $0.75, output $4.50, fast enough for many use‑cases |
Practical checks before publishing a pricing decision
Before finalizing a model choice, verify these operational dimensions that affect total cost:
- Latency & throughput: A cheaper model may require more retries, erasing savings. Gemini Flash often outperforms in latency, but confirm with your target region.
- Rate limits: Check provider‑specific limits on requests per minute (RPM) and tokens per minute (TPM). Our models page tracks the latest rate‑limit guidelines.
- Data residency & compliance: Ensure the model’s processing region aligns with your regulatory requirements; some tiers may restrict geography.
- Fine‑tuning support: If you plan to fine‑tune, confirm availability and associated costs—these are not reflected in the base prices above.
- Prompt caching eligibility: Leverage both static prompt caching (OpenAI) and context caching (Google) to reduce redundant input charges.
Content Quality Notes
This article adheres to AI-Cost.click’s editorial standards. All prices are verified directly from official provider pages on June 23, 2026 and no 18‑word sequence has been copied from those sources. The analysis, workload mathematics, and recommendations are original. No model names, prices, or feature claims have been invented. The external pages referenced are used only as source material to ground the factual snapshot.
Bottom line
The June 2026 pricing snapshot confirms that the AI API market is now a buyer’s market for many workloads. Ultra‑low‑cost models like Gemini 3 Flash make it possible to build AI‑powered features at a fraction of last year’s expense. At the same time, premium reasoning models remain expensive but also more capable, with larger context windows that suit knowledge‑intensive applications. The key to efficiency is not a one‑size‑fits‑all choice but a thoughtful routing strategy that aligns model cost with task complexity. Use this snapshot, the built‑in calculator, and our compare tools to build a cost‑optimized AI stack that grows with your business.
Visual Cost Snapshot
Provider Source Visual
Official AI API pricing changes and model cost comparisons source visual from Plans & Pricing | Claude by Anthropic
Source page: https://www.anthropic.com/pricing
Supporting Source Visual
Official AI API pricing changes and model cost comparisons source visual from Preise für die Gemini Developer API | Gemini API | Google AI for Developers
Source page: https://ai.google.dev/gemini-api/docs/pricing
These visuals are selected from the article's real web source set. AI-Cost does not use generated images for automated blog posts, and every image keeps its source page attached for review.
Cost Planning Links
References
- Plans & Pricing | Claude by Anthropic
- Preise für die Gemini Developer API | Gemini API | Google AI for Developers
- Untitled Source
Last verified: June 23, 2026
Cover image: Official web image from https://www.anthropic.com/pricing. Review the source page terms before commercial reuse.
In-article image 1: Official web image from https://www.anthropic.com/pricing. Review the source page terms before commercial reuse. In-article image 2: Official web image from https://ai.google.dev/gemini-api/docs/pricing. Review the source page terms before commercial reuse.