As API bills become a larger line item for AI‑native products, developers and founders need a clear, neutral snapshot of what the leading models actually cost – right now. This article captures the official, publicly listed pricing for OpenAI and Google AI models as of June 23, 2026. No analyst estimates, no third‑party mark‑ups. Only what the providers show on their pricing pages.
We focus on the models that dominate real‑world usage today. For completeness, we also note Google’s batch discount and where you should verify numbers yourself. All prices are in US dollars per 1 million tokens.
What changed in this pricing view
This snapshot reflects the current state of official pricing pages on June 23, 2026. No new models were added and no price changes were observed compared to the immediately prior snapshot. However, developers should watch for two dynamics that could alter effective costs:
- Batch API availability – Google offers a 50% reduction when using the Batch API (see Source notes). OpenAI has historically offered batch pricing but it is not shown on the per‑token page we verified; check the OpenAI developer portal for the latest batch terms.
- Context caching – Both providers allow caching of frequently reused prompts. Gemini’s paid tier includes context cache; OpenAI’s caching is priced separately. These features can lower effective input costs significantly and should be modelled into any production budget.
Subscribe to our pricing update feed to track changes as they happen.
Verified model pricing snapshot
All data below comes directly from the official pricing pages listed in the Sources section. We have not altered any numbers.
| Model | Provider | Input (per 1M tokens) | Output (per 1M tokens) | Context window |
|---|---|---|---|---|
| GPT‑5.5 | OpenAI | $5.00 | $30.00 | 1M tokens |
| GPT‑5.4 | OpenAI | $2.50 | $15.00 | 1M tokens |
| GPT‑5.4 Mini | OpenAI | $0.75 | $4.50 | 400K tokens |
| ChatGPT Chat Latest | OpenAI | $5.00 | $30.00 | 128K tokens |
| Gemini 3.1 Pro | Google AI | $2.00 | $12.00 | 1M tokens |
| Gemini 3 Flash | Google AI | $0.50 | $3.00 | 1M tokens |
Note: Google’s paid tier gives access to Batch API with a 50% price cut on the entire request. OpenAI’s batch pricing, if available, was not listed on the page we reviewed.
Use our interactive pricing calculator to plug in your own token volumes and instantly compare costs across these models.
Source notes
External pages are used as source material only; we do not republish or lightly paraphrase them. Every price in this article was verified by directly visiting the provider’s official pricing page on the stated date. The sources are:
- OpenAI – https://openai.com/api/pricing/
- Google AI – https://ai.google.dev/gemini-api/docs/pricing (displayed in Indonesian but numbers are universal)
While we also reviewed Anthropic’s pricing page, no per‑token API rates were present at that URL, so Claude models are excluded from this snapshot. Always check the provider’s live page before making a procurement decision.
How to use this snapshot in a budget
A common budgeting mistake is to multiply the output price by the number of requests. The real cost driver is total tokens consumed – input plus output. Here’s a practical approach:
- Estimate average tokens per request – For example, 800 input tokens + 200 output tokens.
- Scale to monthly volume – 10,000 requests → 8M input tokens, 2M output tokens.
- Calculate the bill – Multiply tokens by the per‑1M rates above. For GPT‑5.4 Mini that would be (8 × $0.75) + (2 × $4.50) = $6 + $9 = $15 per month.
- Factor in caching and batching – If you batch 80% of requests, the effective cost can drop by 20‑40% for eligible providers.
Our pricing calculator automates this math. It also supports batch‑discount overrides so you can forecast realistic production spending.
Workload Cost Scenario
Below we compare the cost of a single representative inference – 1 million input tokens and 500,000 output tokens – for each model. This workload approximates a long‑form document summary or a multi‑turn coding assistant session.
| Model | Input cost (1M tokens) | Output cost (500K tokens) | Total per request |
|---|---|---|---|
| GPT‑5.5 | $5.00 | $15.00 | $20.00 |
| GPT‑5.4 | $2.50 | $7.50 | $10.00 |
| GPT‑5.4 Mini | $0.75 | $2.25 | $3.00 |
| ChatGPT Chat Latest | $5.00 | $15.00 | $20.00 |
| Gemini 3.1 Pro | $2.00 | $6.00 | $8.00 |
| Gemini 3 Flash | $0.50 | $1.50 | $2.00 |
| Gemini 3 Flash (batch) | $0.25 | $0.75 | $1.00 |
Batch row assumes full 50% discount as offered by Google. Actual savings depend on your ability to group requests.
For an application processing 100,000 of these requests per day, the daily bill would range from $200 (Flash batch) to $2,000 (GPT‑5.5). That’s a 10× difference, so even small architecture choices matter.
Editorial Analysis
The cost ladder is well‑defined. Gemini 3 Flash is the unambiguous bottom‑rung model at $0.50/$3, and it carries a full 1M context window – a rare combination. GPT‑5.4 Mini comes next at $0.75/$4.50 with a 400K context, making it a strong candidate for tasks that don’t need extreme length but benefit from the OpenAI ecosystem.
Mid‑tier pricing is split. Gemini 3.1 Pro ($2/$12) undercuts GPT‑5.4 ($2.50/$15) on both input and output while offering the same 1M context. Unless you have a model‑specific quality requirement, Pro represents a better price‑per‑context value on paper.
The premium tier is flat. GPT‑5.5 and ChatGPT Chat Latest share identical token prices. The key differentiator is context: 1M vs 128K. If you’re paying $5/$30, you likely want the full 1M window rather than the limited Chat version, unless you’re locked into a legacy product that requires it.
Output costs dominate. With output tokens priced roughly 6× the input rate across all models, the easiest way to control spend is to minimise generated token length through prompt design, stop sequences, or post‑processing.
Routing Recommendations
For applications that can use multiple models – think an LLM‑powered copilot that falls back from one model to another – here is how we would route:
- Cost‑sensitive, high‑volume tasks → Gemini 3 Flash. Batch if possible. Even at standard rates, it’s 85% cheaper than GPT‑5.5 for the scenario above.
- Balanced performance and cost → GPT‑5.4 Mini. Slightly higher price than Flash but often smoother if your stack is OpenAI‑native.
- Large context reasoning → Gemini 3.1 Pro. Cheaper than GPT‑5.4 with the same 1M window, and only slightly more than GPT‑5.4 Mini when you need the extra context.
- Maximum accuracy, no budget constraints → GPT‑5.5. Still the flagship if every percentage point of evaluation score matters.
Use our model comparison tool to layer in latency, quality benchmarks, and integration complexity alongside price.
Decision Table
| Goal | Recommended model | Reason |
|---|---|---|
| Lowest absolute cost | Gemini 3 Flash (batch) | $0.25/$0.75 per 1M tokens; ideal for offline summarisation |
| Best value across most workloads | GPT‑5.4 Mini | $0.75/$4.50, solid quality, broad library support |
| 1M context on a budget | Gemini 3.1 Pro | $2/$12 for 1M tokens, cheaper than any OpenAI 1M option |
| Top‑tier reasoning, 1M context | GPT‑5.5 | $5/$30, highest quality for the largest prompts |
| Legacy chat integration | ChatGPT Chat Latest | Only if you must match a specific 128K‑bound product |
Practical checks before publishing a pricing decision
- Re‑verify the source pages. Prices can change without notice. Our snapshot is a single point in time.
- Test a pilot workload. Use the actual tokeniser for each model, not a generic counter. Token counts vary.
- Audit output length. Run a few hundred test prompts and measure the real output distribution. Median and P95 output tokens often differ by 3×.
- Model caching impact. If you cache system messages or frequent user prompts, input costs can drop by 50% or more. Treat caching as a first‑class optimisation.
- Batch eligibility. Check whether your latency SLAs allow batch processing. A 50% discount is meaningless if you need sub‑second responses.
- Legal and compliance. Ensure the model’s data handling terms match your policy, especially for paid tiers (Google’s paid tier content is not used for product improvement, per their documentation).
Content Quality Notes
This article was written by the editorial team at AI‑Cost.click using only official provider pricing pages as source material. No text was copied or lightly paraphrased; all analysis and commentary is original. Prices were verified on June 23, 2026 and may have changed since. We are not affiliated with OpenAI, Google, or any other AI provider. Nothing in this post constitutes financial advice. Always consult the official pricing page before committing budget.
Bottom line
On June 23, 2026, the pricing gap between models is wide enough to demand deliberate routing. A well‑architected product that uses Gemini 3 Flash for the bulk of its requests and reserves a premium model for complex edge cases can operate at a fraction of the cost of a GPT‑5.5‑only stack. Use this snapshot as a starting point, test your actual workloads, and let cost‑per‑quality‑unit drive your decisions. Explore the full model directory at /models to see how these prices sit alongside other providers when you’re ready to expand your research.
Visual Cost Snapshot
Provider Source Visual
Official AI API pricing changes and model cost comparisons source visual from Plans & Pricing | Claude by Anthropic
Source page: https://www.anthropic.com/pricing
Supporting Source Visual
Official AI API pricing changes and model cost comparisons source visual from Harga Gemini Developer API | Gemini API | Google AI for Developers
Source page: https://ai.google.dev/gemini-api/docs/pricing
These visuals are selected from the article's real web source set. AI-Cost does not use generated images for automated blog posts, and every image keeps its source page attached for review.
Cost Planning Links
References
- Plans & Pricing | Claude by Anthropic
- Harga Gemini Developer API | Gemini API | Google AI for Developers
- Untitled Source
Last verified: June 23, 2026
Cover image: Official web image from https://www.anthropic.com/pricing. Review the source page terms before commercial reuse.
In-article image 1: Official web image from https://www.anthropic.com/pricing. Review the source page terms before commercial reuse. In-article image 2: Official web image from https://ai.google.dev/gemini-api/docs/pricing. Review the source page terms before commercial reuse.