Table of Contents
- Setting Up Criteria for Codex Agent Pricing Comparison
- OpenAI Codex Pricing Structure Analysis
- Claude Code vs Codex Agent Pricing — API Token Rates
- Claude Managed Agents Billing Structure
- Gemini Code Assist Pricing Structure
- Direct Token Price Comparison — The Core of Codex Agent Pricing
- Monthly Cost Simulation — Single Developer Basis
- Codex Agent Pricing Comparison — Selection Criteria by Billing Model and Next Steps
As of April 2026, the AI coding agent market has split into three distinct billing approaches. OpenAI Codex switched from per-message billing to API token-usage-based billing starting April 2, 2026. Claude Managed Agents uses a dual billing structure that adds session runtime fees on top of token costs. Google Gemini Code Assist Enterprise charges a flat $45/user/month. Choosing a tool without comparing AI coding agent pricing can lead to unexpected charges at the end of the month.
This article covers the billing models, token rates, and monthly cost simulations for all three agents. It establishes comparison criteria first, then analyzes each agent’s pricing structure individually, and finally estimates monthly costs under identical conditions.
Setting Up Criteria for Codex Agent Pricing Comparison
Comparing AI coding agent pricing isn’t as simple as looking at the monthly price tag. The billing axes differ across agents. The following four axes serve as the comparison framework.
Billing Model Types
Each agent bills on a different unit.
- Token-based: Charges proportional to input/output token counts. OpenAI Codex and Claude API use this model.
- Session-based: Token costs plus additional charges based on session duration. Claude Managed Agents falls here.
- Per-user flat rate: Fixed monthly amount. Gemini Code Assist Enterprise uses this model.
- Subscription + credit hybrid: Monthly subscription includes a set amount of credits, with overage charges beyond that. Using Codex through ChatGPT Plus/Pro falls into this category.
What to Check When Comparing
| Comparison Axis | What to Verify | Why It Matters |
|---|---|---|
| Token rates | Per-MTok rates for input/output/cache hits | Unit costs differ for the same workload |
| Cache discount rate | Discount percentage on prompt cache hits | Makes a significant cost difference for agents with frequent repeated calls |
| Additional charges | Web search, code execution, session runtime | Hidden costs inflate the monthly bill |
| Monthly fixed costs | Subscription fee or per-user flat rate | Total cost varies with team size |
Claude Managed Agents charges a separate session runtime fee ($0.08/session-hour). Even with lower token rates, longer sessions can flip the total cost. Conversely, Gemini Code Assist is flat-rate, so the effective per-unit cost drops as usage increases.
Using these four axes, each agent is analyzed individually below, followed by a monthly cost simulation under identical scenarios.
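The four billing models above can be made concrete with a small cost sketch. These helper functions are illustrative only, a way to normalize the axes into one monthly figure; they are not any vendor's official billing logic, and all rates are supplied by the caller.

```python
# Illustrative helpers for the four billing models discussed above.
# Rates are parameters; nothing here is an official price calculator.

def token_cost(input_mtok, output_mtok, in_rate, out_rate,
               cache_hit_rate=0.0, cache_discount=0.0):
    """Token-based model: cached input is billed at a discounted rate."""
    cached = input_mtok * cache_hit_rate
    uncached = input_mtok - cached
    return (uncached * in_rate
            + cached * in_rate * (1 - cache_discount)
            + output_mtok * out_rate)

def session_cost(token_total, session_hours, hourly_rate):
    """Session-based model: token cost plus a runtime surcharge."""
    return token_total + session_hours * hourly_rate

def flat_cost(users, per_user_rate):
    """Per-user flat rate: independent of usage."""
    return users * per_user_rate
```

The same usage profile can be pushed through each function, which is essentially what the simulation later in this article does by hand.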
OpenAI Codex Pricing Structure Analysis
OpenAI Codex made a major billing change starting April 2, 2026. It shifted from per-message billing to API token-usage-based billing. Credits are now metered against input tokens, cached input tokens, and output tokens, priced per million tokens (MTok).
codex-mini-latest Token Rates
The API pricing for codex-mini-latest, Codex’s current core model, is as follows.
| Item | Rate |
|---|---|
| Input | $1.50/MTok |
| Output | $6.00/MTok |
| Cache hit | 75% discount applied |
With a 75% discount on cache hits, the effective input rate drops to $0.375/MTok for agent workflows with repetitive prompt patterns. This discount rate is lower than Claude’s cache hit discount (0.1x base price, i.e., 90% discount), but since the absolute rate is lower to begin with, the final cost isn’t necessarily worse.
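As a sanity check on that arithmetic, here is a minimal sketch of the blended input rate under prompt caching. The rates come from the table above; the weighted-average blend is an assumption about how billing composes, not OpenAI's official formula.

```python
# Blended input rate for codex-mini-latest under prompt caching.
# Rates per the table above; the linear blend is an assumption.

CODEX_MINI_INPUT = 1.50   # $/MTok
CACHE_DISCOUNT = 0.75     # 75% off on cache hits

def effective_input_rate(base_rate, cache_hit_rate, cache_discount):
    """Weighted average of cached and uncached input pricing."""
    cached_rate = base_rate * (1 - cache_discount)
    return (1 - cache_hit_rate) * base_rate + cache_hit_rate * cached_rate

# With every input token served from cache, the floor is $0.375/MTok
print(effective_input_rate(CODEX_MINI_INPUT, 1.0, CACHE_DISCOUNT))  # 0.375
```

At more realistic hit rates the effective rate sits between $0.375 and $1.50, which is worth modeling before committing to a budget.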
Using Codex Under ChatGPT Subscription Plans
Codex can also be used through ChatGPT subscription plans rather than the API directly.
| Plan | Monthly Cost | Codex Allowance (vs Plus) | Notes |
|---|---|---|---|
| ChatGPT Plus | $20/mo | 1x (baseline) | Most basic |
| ChatGPT Pro $100 | $100/mo | 5x | Promotional 10x until May 31, 2026 |
| ChatGPT Pro $200 | $200/mo | 20x | For heavy users |
OpenAI’s guidance indicates an average cost of $100–$200 per developer per month. While the $20 Plus plan does include Codex access, the allowance is limited enough that the Pro plan is essentially required for serious production use.
The OpenAI Codex rate card page returned a 403 error, making it impossible to directly verify the original credit-to-token conversion table by model. Exact API token rates (input/output/cache) for GPT-5.4 and GPT-5.3-Codex are also unverifiable from official documentation. The information above is based on accessible documents, and the latest rate card should be confirmed through OpenAI support.
Claude Code vs Codex Agent Pricing — API Token Rates
Claude’s lineup splits into three models, each with significantly different token rates. The following rates are as of April 2026, confirmed from the Claude model pricing page.
Standard Token Rates by Model
| Model | Input ($/MTok) | Output ($/MTok) | Cache Hit Discount |
|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | 0.1x base price (90% discount) |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 0.1x base price |
| Claude Haiku 4.5 | $1.00 | $5.00 | 0.1x base price |
Opus 4.6’s output rate of $25/MTok is over 4x codex-mini-latest’s $6/MTok. However, the 90% cache hit discount is overwhelming — with high cache hit rates, input costs drop dramatically. Opus 4.6’s cache hit input rate is $0.50/MTok, which isn’t far from codex-mini-latest’s cache hit rate of $0.375/MTok.
Special Pricing: Fast Mode and Batch API
Claude offers two special pricing options beyond standard rates.
Fast Mode — Reduces latency at a higher cost. Opus 4.6 Fast Mode runs $30/MTok input and $150/MTok output — 6x the standard rate. Unless latency-sensitive use cases like real-time code completion are involved, there’s no reason to use it.
Batch API — Suited for bulk tasks that don’t need immediate responses. A 50% discount applies to both input and output, bringing Opus 4.6 to $2.50/MTok input and $12.50/MTok output. For asynchronous tasks like automated code review, the Batch API is cost-effective.
| Pricing Option | Opus 4.6 Input | Opus 4.6 Output | Use Case |
|---|---|---|---|
| Standard | $5.00 | $25.00 | General API calls |
| Fast Mode | $30.00 | $150.00 | Latency-critical tasks |
| Batch API | $2.50 | $12.50 | Bulk async processing |
Tasks like PR code review that don’t need instant responses are a fit for the Batch API. Opus 4.6 output rate drops from $25 to $12.50 — a 50% reduction. This can be integrated as an async review step in CI/CD pipelines.
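To illustrate the Batch API saving, here is a sketch comparing a hypothetical nightly review job under both pricing options. The rates are from the table above; the token volumes are assumptions for illustration.

```python
# Standard vs Batch API cost for an assumed nightly code-review job.
# Opus 4.6 rates from the table above; token volumes are hypothetical.

OPUS_STANDARD = (5.00, 25.00)   # ($/MTok input, $/MTok output)
OPUS_BATCH = (2.50, 12.50)      # 50% discount on both

def job_cost(input_mtok, output_mtok, rates):
    in_rate, out_rate = rates
    return input_mtok * in_rate + output_mtok * out_rate

# e.g. 2 MTok in / 0.5 MTok out of review traffic per night
standard = job_cost(2.0, 0.5, OPUS_STANDARD)
batch = job_cost(2.0, 0.5, OPUS_BATCH)
print(standard, batch)  # 22.5 11.25
```

The absolute saving scales linearly with volume, so the Batch API matters most for pipelines that review every PR.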
Claude Managed Agents Billing Structure
Claude Managed Agents has a different billing structure from plain API calls. Session runtime fees are added on top of token costs — a dual billing model. Misunderstanding this structure leads to underestimating Claude’s costs in any Codex agent pricing comparison.
Dual Billing Structure
Claude Managed Agents billing consists of two layers.
- Token cost: Same model-specific input/output rates as the standard API
- Session runtime: Additional $0.08/session-hour charge
Here’s the cost breakdown for a 1-hour session on Opus 4.6 (50k input tokens, 15k output tokens).
Token cost:
Input: 50,000 / 1,000,000 × $5.00 = $0.25
Output: 15,000 / 1,000,000 × $25.00 = $0.375
Session runtime: 1 hour × $0.08 = $0.08
──────────────────────────────────
Total cost: $0.705
With prompt caching applied, this drops to approximately $0.525 (which corresponds to roughly an 80% cache hit rate on input at the 90% discount). Caching alone can reduce costs by over 25%.
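The session breakdown above can be reproduced with a small calculator. The rates and the $0.08/session-hour surcharge are as stated in this article; this is an illustrative sketch, not an official billing formula.

```python
# Reproduces the 1-hour Opus 4.6 session breakdown above.
# Rates and runtime surcharge as quoted in this article.

def managed_agent_session_cost(input_tokens, output_tokens, hours,
                               in_rate=5.00, out_rate=25.00,
                               runtime_rate=0.08):
    input_cost = input_tokens / 1_000_000 * in_rate
    output_cost = output_tokens / 1_000_000 * out_rate
    runtime_cost = hours * runtime_rate
    return input_cost + output_cost + runtime_cost

print(round(managed_agent_session_cost(50_000, 15_000, 1), 3))  # 0.705
```

Note that the runtime term grows with session length even when token volume is flat, which is exactly the trap the dual billing structure sets for long-running sessions.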
Additional Tool Charges
The Claude API has additional billing items beyond tokens.
| Tool | Billing Method |
|---|---|
| Web search | $10 per 1,000 queries |
| Web fetch | No additional cost (token cost only) |
| Code execution tool | 1,550 hours/month free, $0.05/hour after |
Adding web search to an agent drives costs up quickly. If an agent makes 50 web search calls in a single session, that alone adds $0.50. Web fetch is free, so when the URL is already known, using web fetch instead of web search saves money.
1,550 hours per month works out to roughly 51 hours per day, far more than a single serial session could consume, so the free tier is realistically only exhausted by running many agent sessions in parallel. That's generous for a solo developer, but teams running agents at scale could exceed it. The overage rate of $0.05/hour isn't steep, but monitoring is still advisable.
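The add-on charges above reduce to two small formulas. This sketch uses the rates from the table and is illustrative only.

```python
# Add-on tool charges from the table above: web search billed per
# 1,000 queries, code execution billed beyond a free monthly tier.

def web_search_cost(queries, rate_per_1000=10.00):
    return queries / 1000 * rate_per_1000

def code_exec_cost(hours, free_hours=1550, hourly_rate=0.05):
    return max(0, hours - free_hours) * hourly_rate

print(web_search_cost(50))   # 0.5  (the 50-call session from above)
print(code_exec_cost(1600))  # 2.5  (50 hours over the free tier)
```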
Gemini Code Assist Pricing Structure
Google Gemini Code Assist takes a fundamentally different billing approach from the two agents covered above. It’s a per-user flat rate, not a per-token charge.
Enterprise Edition
According to the Gemini Code Assist pricing page, the Enterprise edition costs $45/user/month (with discounts for annual commitments). New customers automatically receive a free credit for up to 50 licenses for the first month.
The advantage of flat-rate pricing is predictability. Ten developers cost $450/month, fifty cost $2,250/month. The amount is fixed regardless of usage, so there’s no end-of-month billing spike like with token-based models.
The downside is that per-token comparisons become impossible. There’s no way to track costs at the granularity of "this task used N tokens and cost $X" the way Codex or Claude can.
The exact monthly rate for Google Gemini Code Assist Standard edition couldn't be confirmed from official documentation (dynamic rendering issue). It is presumably cheaper than Enterprise, but the exact price requires contacting Google Cloud sales.
Flat Rate vs Token Billing — Which Teams Benefit
| Usage Pattern | Favorable Billing Model | Reason |
|---|---|---|
| Few developers, heavy usage | Flat rate (Gemini) | Fixed cost with no usage cap |
| Many developers, light usage | Token-based (Codex/Claude) | No usage means no charges |
| Teams with volatile usage | Token-based | Cost savings during low periods |
| Teams needing budget predictability | Flat rate | Fixed monthly cost |
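One way to apply the table is to compute the break-even usage where the $45 flat rate matches a token-billed model. This sketch assumes Claude Haiku 4.5 rates from the earlier table and a 10:3 input-to-output token mix borrowed from the simulation later in this article; both are assumptions, not measurements.

```python
# Break-even: monthly input volume where Haiku 4.5 token billing
# reaches Gemini's $45 flat rate. Assumes a 10:3 input:output mix.

FLAT = 45.00
HAIKU_IN, HAIKU_OUT = 1.00, 5.00        # $/MTok

cost_per_input_mtok = HAIKU_IN + 0.3 * HAIKU_OUT   # $2.50 per input MTok
break_even = FLAT / cost_per_input_mtok
print(break_even)  # 18.0
```

At the simulation's 5.5 MTok of monthly input, token billing wins comfortably; a developer would need to push roughly 18 MTok of input per month (without caching) before the flat rate breaks even.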
Direct Token Price Comparison — The Core of Codex Agent Pricing

This section puts the token rates of all three agents side by side. Since Gemini Code Assist is flat-rate and has no per-token pricing, only OpenAI Codex and Claude are included in the token comparison.
Lightweight Model Comparison: codex-mini-latest vs Claude Haiku 4.5
A comparison of the lightweight models primarily used for everyday code generation and autocomplete in coding agents.
| Item | codex-mini-latest | Claude Haiku 4.5 |
|---|---|---|
| Input rate | $1.50/MTok | $1.00/MTok |
| Output rate | $6.00/MTok | $5.00/MTok |
| Cache hit input | $0.375/MTok (75% discount) | $0.10/MTok (90% discount) |
| Input/output rate ratio | Baseline (1.0x) | Input 0.67x, Output 0.83x |
Haiku 4.5 is cheaper on both input and output. The difference becomes dramatic with cache hits: Haiku 4.5's cache hit input rate of $0.10/MTok is 3.75x lower than codex-mini-latest's $0.375/MTok. In agent workflows with high prompt caching hit rates, this gap compounds.
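The compounding effect can be seen by blending the cached and uncached rates at different hit rates. Rates are from the table above; the linear blend is an assumption about how the billing composes, not an official formula.

```python
# Blended input rate at various cache hit rates, using the
# lightweight-model rates from the table above (assumed linear blend).

def blended_input(base_rate, cache_hit_price, hit_rate):
    """Average $/MTok for input when hit_rate of tokens come from cache."""
    return (1 - hit_rate) * base_rate + hit_rate * cache_hit_price

for hit in (0.0, 0.5, 0.9):
    codex = blended_input(1.50, 0.375, hit)   # codex-mini-latest
    haiku = blended_input(1.00, 0.10, hit)    # Claude Haiku 4.5
    print(f"hit rate {hit:.0%}: codex-mini ${codex:.4f}/MTok, Haiku ${haiku:.4f}/MTok")
```

At a 90% hit rate the gap between the two models' blended input rates is wider in relative terms than at 0%, which is the compounding the text describes.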
High-End Model Comparison: Claude Opus 4.6 vs Codex (GPT Family)
Claude Opus 4.6 Standard:
Input $5.00/MTok | Output $25.00/MTok
Claude Opus 4.6 Batch API:
Input $2.50/MTok | Output $12.50/MTok
codex-mini-latest:
Input $1.50/MTok | Output $6.00/MTok
Opus 4.6 is 3.3x more expensive on input and 4.2x on output compared to codex-mini-latest. However, these models sit at different performance tiers. Opus 4.6 is a flagship model while codex-mini-latest is a lightweight model. A fair comparison would require GPT-5.4 or GPT-5.3-Codex pricing, which isn’t publicly available in official documentation.
Rather than using a single model for all tasks, routing by task type is more cost-effective. Simple tasks like code autocomplete and lint fixes go to Haiku 4.5 or codex-mini-latest, while architecture design and complex refactoring get routed to Opus 4.6.
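A minimal sketch of such routing follows. The task labels and the task-to-model table are illustrative assumptions, not a published routing scheme.

```python
# Illustrative per-task model routing: cheap tasks go to lightweight
# models, complex ones to the flagship. The table is an assumption.

ROUTES = {
    "autocomplete": "claude-haiku-4.5",
    "lint-fix": "codex-mini-latest",
    "refactor": "claude-opus-4.6",
    "architecture": "claude-opus-4.6",
}

def route_model(task_type, default="claude-haiku-4.5"):
    """Look up the model for a task; fall back to a cheap default."""
    return ROUTES.get(task_type, default)

print(route_model("autocomplete"))  # claude-haiku-4.5
print(route_model("architecture"))  # claude-opus-4.6
```

Defaulting unknown tasks to the cheap model keeps the cost floor low; escalation to the flagship then becomes an explicit, auditable decision.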
Monthly Cost Simulation — Single Developer Basis

Monthly costs are estimated under identical usage assumptions. No official benchmark data exists for actual task-to-cost ratios per agent, so this is a simulation based on assumed token usage.
Assumptions
Monthly usage assumptions (single developer):
Daily agent sessions: 5
Average input tokens per session: 50,000
Average output tokens per session: 15,000
Working days per month: 22
─────────────────────────
Monthly total input: 5 × 50,000 × 22 = 5,500,000 tokens (5.5 MTok)
Monthly total output: 5 × 15,000 × 22 = 1,650,000 tokens (1.65 MTok)
Monthly total session hours: 5 × 1 hour × 22 = 110 hours
Monthly Cost Comparison by Scenario
| Agent/Model | Input Cost | Output Cost | Additional Cost | Monthly Total |
|---|---|---|---|---|
| codex-mini-latest (API) | $8.25 | $9.90 | – | $18.15 |
| codex-mini-latest (50% cache) | $5.16 | $9.90 | – | $15.06 |
| Claude Haiku 4.5 | $5.50 | $8.25 | – | $13.75 |
| Claude Haiku 4.5 (50% cache) | $3.03 | $8.25 | – | $11.28 |
| Claude Opus 4.6 | $27.50 | $41.25 | – | $68.75 |
| Claude Opus 4.6 Batch | $13.75 | $20.63 | – | $34.38 |
| Claude Managed Agents (Opus) | $27.50 | $41.25 | $8.80 (runtime) | $77.55 |
| Gemini Code Assist Enterprise | – | – | – | $45.00 (flat) |
| ChatGPT Pro $100 (incl. Codex) | – | – | – | $100.00 (flat) |
For the 50% cache hit rate scenarios, cache hit input uses the discounted rate while the remaining 50% uses the standard rate. The $8.80 additional cost for Claude Managed Agents is calculated as 110 hours × $0.08/session-hour.
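To make the arithmetic reproducible, this sketch recomputes a few rows of the table from the stated assumptions. It is an illustrative helper, not a billing API; small rounding differences against the table come from rounding at different steps.

```python
# Recompute table rows from the simulation's monthly totals:
# 5.5 MTok input, 1.65 MTok output, 110 session-hours.

IN_MTOK, OUT_MTOK = 5.5, 1.65
SESSION_HOURS = 110

def monthly(in_rate, out_rate, cache_hit=0.0, cache_rate=0.0, extra=0.0):
    """Monthly bill; the cached share of input is billed at cache_rate."""
    cached = IN_MTOK * cache_hit
    uncached = IN_MTOK - cached
    return uncached * in_rate + cached * cache_rate + OUT_MTOK * out_rate + extra

print(round(monthly(1.50, 6.00), 2))                               # 18.15
print(round(monthly(1.00, 5.00, 0.5, 0.10), 2))
print(round(monthly(5.00, 25.00, extra=SESSION_HOURS * 0.08), 2))  # 77.55
```

Swapping in a team's own logged token totals for `IN_MTOK` and `OUT_MTOK` turns this into a first-pass budget forecast.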
Interpreting the Simulation Results
On pure API token costs alone, Claude Haiku 4.5 comes out cheapest. With caching applied, it’s $11.28/month — roughly 25% cheaper than codex-mini-latest with caching ($15.06). But this comparison has caveats.
First, model performance isn’t equal. Haiku 4.5 and codex-mini-latest may differ in code generation quality, and quality gaps translate to rework costs. No official benchmark data comparing cost per identical task exists yet.
Second, the ChatGPT Pro $100 plan includes all ChatGPT features beyond just Codex. While it looks expensive as a pure coding agent cost, the perceived value changes when factoring in GPT-5 access, web browsing, and image generation.
Third, Gemini Code Assist at $45/user has no usage cap. For developers who use more than the "5 sessions/day" assumed in this simulation, the flat rate becomes more favorable.
Codex Agent Pricing Comparison — Selection Criteria by Billing Model and Next Steps
The key takeaway from this AI coding agent pricing comparison isn’t "how much" but rather "which billing model fits the team’s usage patterns."
Token-based billing (Codex API, Claude API) scales with usage, so costs drop during slow periods. The flip side is budget overrun risk during usage spikes. Flat-rate billing (Gemini Code Assist) is predictable but charges the same amount even in light months. The subscription + credit hybrid (ChatGPT Pro) guarantees a certain usage level within a fixed monthly fee, but the overage structure needs to be clearly understood.
Decision tree:
1. Team of 5 or fewer, direct API calls → Claude Haiku 4.5 / codex-mini-latest
2. Team of 10+, fixed budget required → Gemini Code Assist Enterprise
3. Solo developer, uses GPT features beyond coding → ChatGPT Pro $100
4. Async bulk processing (code review, migration) → Claude Batch API
5. Real-time agent automation → Claude Managed Agents (factor in session costs)
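The decision tree above can be expressed as a small sketch function. The thresholds and labels mirror the list; the ordering of the checks is an assumption, since the list itself doesn't specify precedence.

```python
# Sketch of the decision tree above; check order is an assumption.

def recommend(team_size, fixed_budget, uses_gpt_features,
              async_bulk, realtime_agents):
    if async_bulk:
        return "Claude Batch API"
    if realtime_agents:
        return "Claude Managed Agents"
    if team_size >= 10 and fixed_budget:
        return "Gemini Code Assist Enterprise"
    if team_size == 1 and uses_gpt_features:
        return "ChatGPT Pro $100"
    return "Claude Haiku 4.5 / codex-mini-latest"

print(recommend(12, True, False, False, False))  # Gemini Code Assist Enterprise
```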
The detailed credit policy for Claude Code Max subscriptions ($100/$200) hasn’t been confirmed in official documentation yet. It may follow a subscription + credit model similar to ChatGPT Pro, and this comparison table could change once finalized.
AI coding agent costs aren’t determined by unit rates alone. The token volume per agent session, cache hit rate, session length, and frequency of additional tool calls all affect total cost. Before adopting an agent, logging API usage for about two weeks and plugging actual numbers into the simulation above is the most accurate way to forecast costs.
Once the OpenAI Codex pricing structure settles, the next topic to examine is agent orchestration. Building a routing structure that assigns different models to different task types — rather than using a single model for everything — can significantly change costs. Diving deeper into Claude Managed Agents’ cost structure also makes session design optimization an important topic. And since vendors may release official Codex agent pricing benchmark data in the second half of 2026, this comparison will need an update at that point.