Developers and finance owners need hard numbers, not marketing promises. This pricing snapshot compares the latest official API rates from OpenAI and Google AI, verified on June 23, 2026. We translate raw per-token prices into a realistic workload scenario so you can make a confident, data-driven choice.
Keep in mind that official pricing pages are subject to change. We recommend always cross-referencing the provider’s current page before committing to a model. For a regularly updated list of all models we track, visit our models directory.
What changed in this pricing view
This snapshot captures a pivotal moment: OpenAI has introduced the GPT-5.5 family while Google has upgraded Gemini to the 3.1 generation. The most notable shifts are:
- GPT-5.5 is priced identically to GPT-5.4 across input and output tokens, but likely represents a smarter or more capable architecture at the same price point.
- Gemini 3 Flash continues to undercut nearly every competitor with $0.50/1M input and $3/1M output, making it the cheapest high-context model on the market.
- Context windows are now uniformly large: four of the six verified models offer a full 1 million token context, dramatically expanding document‑processing possibilities.
Below you’ll find only directly verified prices pulled from the official providers’ sites on the snapshot date.
Verified model pricing snapshot
| Provider | Model | Input (per 1M tokens) | Output (per 1M tokens) | Context window |
|---|---|---|---|---|
| OpenAI | GPT-5.5 | $5.00 | $30.00 | 1,000,000 |
| OpenAI | GPT-5.4 | $2.50 | $15.00 | 1,000,000 |
| OpenAI | GPT-5.4 Mini | $0.75 | $4.50 | 400,000 |
| OpenAI | ChatGPT Chat Latest | $5.00 | $30.00 | 128,000 |
| Google AI | Gemini 3.1 Pro | $2.00 | $12.00 | 1,000,000 |
| Google AI | Gemini 3 Flash | $0.50 | $3.00 | 1,000,000 |
All prices in USD and based on pay‑as‑you‑go API access. Source: OpenAI API pricing and Google AI Gemini API pricing, verified 2026‑06‑23.
Source notes
We used only official, publicly available pricing pages to build this snapshot. For OpenAI, we referenced the current API pricing page. For Google, we relied on the Gemini Developer API pricing documentation. An Anthropic pricing page was also consulted, but its API‑level model prices were not captured in this verification – developers should visit Anthropic’s site directly for Claude pricing. Alibaba Cloud’s Model Studio pricing was likewise not included because the snapshot did not contain verified Alibaba API prices. No source content is republished here; we only extract and compare the stated numbers to give you a clean, actionable summary.
How to use this snapshot in a budget
Start by estimating your monthly token volume. Break it into input tokens (prompts, conversation history, documents) and output tokens (model responses). Multiply by the per‑1M‑token rates in the table above, then add any fixed platform fees. Remember:
- Token counts can vary by model tokenizer. Use each provider’s token estimation tool for precision.
- Caching and batching discounts exist: Google offers up to 50 % cost reduction through the Batch API on its paid tier, and OpenAI provides batch pricing for asynchronous jobs. Check the respective pricing pages for the latest discount rates; they are not reflected in the base table.
- Context window does not mean the model always uses the entire window efficiently. A model with 1M context but weaker long‑range attention may underperform on very long documents.
For a quick total cost estimate, try our interactive cost calculator.
Workload Cost Scenario
Imagine a customer‑support bot handling 500,000 interactions per month. Each interaction includes a system prompt, recent conversation history, and a user message averaging 800 input tokens. The assistant’s response averages 300 output tokens.
Total monthly tokens:
- Input: 500,000 × 800 = 400 million tokens
- Output: 500,000 × 300 = 150 million tokens
| Model | Input cost (400M tokens) | Output cost (150M tokens) | Total monthly cost |
|---|---|---|---|
| GPT-5.5 | $2,000 | $4,500 | $6,500 |
| GPT-5.4 | $1,000 | $2,250 | $3,250 |
| GPT-5.4 Mini | $300 | $675 | $975 |
| ChatGPT Chat Latest | $2,000 | $4,500 | $6,500 |
| Gemini 3.1 Pro | $800 | $1,800 | $2,600 |
| Gemini 3 Flash | $200 | $450 | $650 |
For this volume, Gemini 3 Flash is 10× cheaper than GPT‑5.5 and almost half the price of GPT‑5.4 Mini. Even the more capable Gemini 3.1 Pro costs significantly less than the entry‑level GPT‑5.4.
Editorial Analysis
The price gaps in this snapshot are not subtle. A few patterns stand out:
- Gemini 3 Flash is the cost‑performance leader for high‑volume, simpler tasks. Its 1M context window is a bonus, but its real strength is the fraction‑of‑a‑penny per interaction cost. If your use case tolerates occasional nuance gaps, this model will slash your bill dramatically.
- GPT‑5.4 Mini occupies the budget‑conscious OpenAI lane. At $0.75/$4.50 it’s still more expensive than Gemini 3 Flash but appeals to teams that are deeply integrated with the OpenAI ecosystem or need GPT‑specific tooling (function calling, JSON mode, etc.).
- GPT‑5.4 and Gemini 3.1 Pro are the balanced mid‑tier options. They cost roughly the same order of magnitude, but Gemini 3.1 Pro is 20–30 % cheaper. Choose Gemini if you value larger context and integration with Google Cloud; choose GPT‑5.4 if you cannot migrate away from OpenAI’s API format.
- GPT‑5.5 and ChatGPT Chat Latest share the same price as the top‑tier offering. The ChatGPT Chat Latest model has a much smaller context window (128K), making it suitable only for chat‑oriented tasks where long documents are never required.
None of these models exist in a vacuum. A multi‑model routing strategy can capture the best of both worlds – cheap models for simple replies, expensive models for reasoning‑heavy queries.
Routing Recommendations
You can reduce your overall API bill by sending each request to the cheapest model that meets that request’s quality bar. For example:
- High‑volume, low‑complexity chat: Gemini 3 Flash as default, with fallback to GPT‑5.4 Mini if quality degrades.
- Long‑document summarization (up to 1M tokens): Gemini 3.1 Pro or GPT‑5.4, depending on provider preference. Avoid GPT‑5.4 Mini and ChatGPT Chat Latest because of limited context.
- Advanced reasoning, code generation, or sensitive instructions: GPT‑5.5 (or Gemini 3.1 Pro if you need a lower price).
- Consumer‑facing chat products with OpenAI’s guardrails: ChatGPT Chat Latest (or GPT‑5.5) despite the cost, if safety filters and alignment are critical.
Use our comparison tool to stack up these models side‑by‑side on features beyond price.
Decision Table
| Use‑case | Recommended Model(s) | Why |
|---|---|---|
| Mass‑market Q&A bot | Gemini 3 Flash, GPT‑5.4 Mini | Lowest cost, sufficient quality for simple replies. |
| Document analysis (long) | Gemini 3.1 Pro, GPT‑5.4 | 1M context, reasonable price. |
| Code assistant / reasoning | GPT‑5.5, Gemini 3.1 Pro | Top performance; GPT‑5.5 for API‑specific tooling. |
| OpenAI‑dependent pipelines | GPT‑5.4 Mini, GPT‑5.4 | Native compatibility with existing code. |
| Google Cloud native app | Gemini 3.1 Pro, Gemini 3 Flash | Optimised billing and latency on Google infrastructure. |
| Strictly chat (no documents) | ChatGPT Chat Latest, Gemini 3 Flash | Chat‑tuned; choose by budget. |
Practical checks before publishing a pricing decision
Before you lock in a model for production, run through this checklist:
- Rate limits: A cheap model that throttles you under load can cost more in lost revenue. Check the provider’s quota documentation.
- Data retention policies: Google’s free tier uses input/output for product improvement; paid tiers do not. OpenAI may retain data under certain API usage scenarios. Understand what happens to your prompts.
- Accuracy requirements: If a cut‑rate model requires 2× the tokens to get a decent answer, the savings disappear. Run a side‑by‑side evaluation on your own dataset.
- Latency SLAs: Cheaper models may have slower response times. Verify latency under your expected concurrency.
- Model deprecation: Providers occasionally sunset models. Plan for migration paths.
Content Quality Notes
This article presents a point‑in‑time pricing view verified on June 23, 2026. Prices, model names, and capabilities can change. We encourage readers to visit the official provider pages for the most current information. The external pages referenced in the source list are used solely as factual grounding – no content is reproduced verbatim, and all analysis is our own. For a constantly updated list of AI model prices, see the models overview.
Bottom line
The API pricing landscape continues to bifurcate: ultra‑cheap models like Gemini 3 Flash offer near‑zero‑cost inference for simple tasks, while the premium tier (GPT‑5.5) remains relatively stable in pricing. For most builders, a hybrid approach – routing low‑stakes traffic to Flash and reserving expensive models for complex queries – yields the best economics. Use our calculator and comparison tools to build your own cost model before committing.
Visual Cost Snapshot
Provider Source Visual
Official AI API pricing changes and model cost comparisons source visual from Plans & Pricing | Claude by Anthropic
Source page: https://www.anthropic.com/pricing
Supporting Source Visual
Official AI API pricing changes and model cost comparisons source visual from Gemini Developer API pricing | Gemini API | Google AI for Developers
Source page: https://ai.google.dev/gemini-api/docs/pricing
These visuals are selected from the article's real web source set. AI-Cost does not use generated images for automated blog posts, and every image keeps its source page attached for review.
Cost Planning Links
References
- Plans & Pricing | Claude by Anthropic
- Gemini Developer API pricing | Gemini API | Google AI for Developers
- Untitled Source
Last verified: June 23, 2026
Cover image: Official web image from https://www.anthropic.com/pricing. Review the source page terms before commercial reuse.
In-article image 1: Official web image from https://www.anthropic.com/pricing. Review the source page terms before commercial reuse. In-article image 2: Official web image from https://ai.google.dev/gemini-api/docs/pricing. Review the source page terms before commercial reuse.