AI Coding Agent Pricing Comparison: Codex vs Claude Code vs Gemini 2026

As of April 2026, the AI coding agent market has split into three distinct billing approaches. OpenAI Codex switched from per-message billing to API token-usage-based billing starting April 2, 2026. Claude Managed Agents uses a dual billing structure that adds session runtime fees on top of token costs. Google Gemini Code Assist Enterprise charges a flat $45/user/month. Choosing a tool without comparing AI coding agent pricing can lead to unexpected charges at the end of the month.

This article covers the billing models, token rates, and monthly cost simulations for all three agents. It establishes comparison criteria first, then analyzes each agent’s pricing structure individually, and finally estimates monthly costs under identical conditions.

Setting Up Criteria for Codex Agent Pricing Comparison

Comparing AI coding agent pricing isn’t as simple as looking at the monthly price tag. The billing axes differ across agents. The following four axes serve as the comparison framework.

Billing Model Types

Each agent bills on a different unit.

  • Token-based: Charges proportional to input/output token counts. OpenAI Codex and Claude API use this model.
  • Session-based: Token costs plus additional charges based on session duration. Claude Managed Agents falls here.
  • Per-user flat rate: Fixed monthly amount. Gemini Code Assist Enterprise uses this model.
  • Subscription + credit hybrid: Monthly subscription includes a set amount of credits, with overage charges beyond that. Using Codex through ChatGPT Plus/Pro falls into this category.

What to Check When Comparing

| Comparison Axis | What to Verify | Why It Matters |
|---|---|---|
| Token rates | Per-MTok rates for input/output/cache hits | Unit costs differ for the same workload |
| Cache discount rate | Discount percentage on prompt cache hits | Makes a significant cost difference for agents with frequent repeated calls |
| Additional charges | Web search, code execution, session runtime | Hidden costs inflate the monthly bill |
| Monthly fixed costs | Subscription fee or per-user flat rate | Total cost varies with team size |

Why token rates alone aren't enough for comparison
Claude Managed Agents charges a separate session runtime fee ($0.08/session-hour). Even with lower token rates, longer sessions can flip the total cost. Conversely, Gemini Code Assist is flat-rate, so the effective per-unit cost drops as usage increases.

Using these four axes, each agent is analyzed individually below, followed by a monthly cost simulation under identical scenarios.

OpenAI Codex Pricing Structure Analysis

OpenAI Codex made a major billing change starting April 2, 2026, shifting from per-message billing to API token-usage-based billing. Credits convert into input tokens, cached input tokens, and output tokens, each metered per million tokens (MTok).

codex-mini-latest Token Rates

The API pricing for codex-mini-latest, Codex’s current core model, is as follows.

| Item | Rate |
|---|---|
| Input | $1.50/MTok |
| Output | $6.00/MTok |
| Cache hit | 75% discount applied |

With a 75% discount on cache hits, the effective input rate drops to $0.375/MTok for agent workflows with repetitive prompt patterns. This discount rate is lower than Claude’s cache hit discount (0.1x base price, i.e., 90% discount), but since the absolute rate is lower to begin with, the final cost isn’t necessarily worse.
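The interplay between a lower base rate and a weaker cache discount is easiest to see as a blended rate. A minimal sketch using the rates above; the 80% cache hit rate is an illustrative assumption, not a vendor figure:

```python
def effective_input_rate(base: float, cached: float, hit_rate: float) -> float:
    """Blend standard and cache-hit input rates by the cache hit ratio."""
    return hit_rate * cached + (1.0 - hit_rate) * base

# codex-mini-latest: $1.50 base, $0.375 on cache hits (75% discount)
# Claude Opus 4.6:   $5.00 base, $0.50  on cache hits (90% discount)
# At an assumed 80% cache hit rate:
codex_mini = effective_input_rate(1.50, 0.375, 0.80)  # ~$0.60/MTok
opus = effective_input_rate(5.00, 0.50, 0.80)         # ~$1.40/MTok
```

At that hit rate the gap between the two effective input rates narrows to a little over 2x, even though the base rates differ by 3.3x.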

Using Codex Under ChatGPT Subscription Plans

Codex can also be used through ChatGPT subscription plans rather than the API directly.

| Plan | Monthly Cost | Codex Allowance (vs Plus) | Notes |
|---|---|---|---|
| ChatGPT Plus | $20/mo | 1x (baseline) | Most basic |
| ChatGPT Pro $100 | $100/mo | 5x | Promotional 10x until May 31, 2026 |
| ChatGPT Pro $200 | $200/mo | 20x | For heavy users |

OpenAI’s guidance indicates an average cost of $100–$200 per developer per month. While the $20 Plus plan does include Codex access, the allowance is limited enough that the Pro plan is essentially required for serious production use.

Codex rate card source unavailable
The OpenAI Codex rate card page returned a 403 error, making it impossible to directly verify the original credit-to-token conversion table by model. Exact API token rates (input/output/cache) for GPT-5.4 and GPT-5.3-Codex are also unverifiable from official documentation. The information above is based on accessible documents, and the latest rate card should be confirmed through OpenAI support.

Claude Code vs Codex Agent Pricing — API Token Rates

Claude’s lineup splits into three models, each with significantly different token rates. The following rates are as of April 2026, confirmed from the Claude model pricing page.

Standard Token Rates by Model

| Model | Input ($/MTok) | Output ($/MTok) | Cache Hit Discount |
|---|---|---|---|
| Claude Opus 4.6 | $5.00 | $25.00 | 0.1x base price (90% discount) |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 0.1x base price |
| Claude Haiku 4.5 | $1.00 | $5.00 | 0.1x base price |

Opus 4.6’s output rate of $25/MTok is over 4x codex-mini-latest’s $6/MTok. However, the 90% cache hit discount is substantial: with high cache hit rates, input costs drop dramatically. Opus 4.6’s cache hit input rate is $0.50/MTok, only about 1.3x codex-mini-latest’s cache hit rate of $0.375/MTok.

Special Pricing: Fast Mode and Batch API

Claude offers two special pricing options beyond standard rates.

Fast Mode — Reduces latency at a higher cost. Opus 4.6 Fast Mode runs $30/MTok input and $150/MTok output — 6x the standard rate. Unless latency-sensitive use cases like real-time code completion are involved, there’s no reason to use it.

Batch API — Suited for bulk tasks that don’t need immediate responses. A 50% discount applies to both input and output, bringing Opus 4.6 to $2.50/MTok input and $12.50/MTok output. For asynchronous tasks like automated code review, the Batch API is cost-effective.

| Pricing Option | Opus 4.6 Input | Opus 4.6 Output | Use Case |
|---|---|---|---|
| Standard | $5.00 | $25.00 | General API calls |
| Fast Mode | $30.00 | $150.00 | Latency-critical tasks |
| Batch API | $2.50 | $12.50 | Bulk async processing |

Cut code review costs in half with Batch API
Tasks like PR code review that don't need instant responses are a fit for the Batch API. Opus 4.6 output rate drops from $25 to $12.50, a 50% reduction. This can be integrated as an async review step in CI/CD pipelines.

Claude Managed Agents Billing Structure

Claude Managed Agents has a different billing structure from plain API calls. Session runtime fees are added on top of token costs — a dual billing model. Misunderstanding this structure leads to underestimating Claude’s costs in any Codex agent pricing comparison.

Dual Billing Structure

Claude Managed Agents billing consists of two layers.

  1. Token cost: Same model-specific input/output rates as the standard API
  2. Session runtime: Additional $0.08/session-hour charge

Here’s the cost breakdown for a 1-hour session on Opus 4.6 (50k input tokens, 15k output tokens).

Token cost:
  Input: 50,000 / 1,000,000 × $5.00 = $0.25
  Output: 15,000 / 1,000,000 × $25.00 = $0.375
Session runtime: 1 hour × $0.08 = $0.08
──────────────────────────────────
Total cost: $0.705

With prompt caching applied, this drops to approximately $0.53 (the exact figure depends on the cache hit rate), so caching alone can reduce the session cost by over 25%.
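The dual billing above can be expressed as a small helper. A sketch only; the rates follow this article's tables, and the function name is illustrative:

```python
def managed_session_cost(input_tokens: int, output_tokens: int,
                         input_rate: float, output_rate: float,
                         session_hours: float, runtime_rate: float = 0.08) -> float:
    """Token cost (per-MTok rates) plus the $0.08/session-hour runtime fee."""
    token_cost = (input_tokens / 1_000_000 * input_rate
                  + output_tokens / 1_000_000 * output_rate)
    return token_cost + session_hours * runtime_rate

# The 1-hour Opus 4.6 session from the breakdown above
cost = managed_session_cost(50_000, 15_000, 5.00, 25.00, session_hours=1)
print(f"${cost:.3f}")  # $0.705
```

Because the runtime fee scales with hours rather than tokens, long idle-heavy sessions inflate the second term even when token usage stays flat.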

Additional Tool Charges

The Claude API has additional billing items beyond tokens.

| Tool | Billing Method |
|---|---|
| Web search | $10 per 1,000 queries |
| Web fetch | No additional cost (token cost only) |
| Code execution tool | 1,550 hours/month free, $0.05/hour after |

Adding web search to an agent drives costs up quickly. If an agent makes 50 web search calls in a single session, that alone adds $0.50. Web fetch is free, so when the URL is already known, using web fetch instead of web search saves money.

Code execution tool free tier
1,550 hours per month works out to roughly 51 hours per day, a budget only reachable when multiple agent sessions run in parallel. That's generous for a solo developer, but teams running agents at scale could exceed it. The overage rate of $0.05/hour isn't steep, but monitoring is still advisable.

Gemini Code Assist Pricing Structure

Google Gemini Code Assist takes a fundamentally different billing approach from the two agents covered above. It’s a per-user flat rate, not a per-token charge.

Enterprise Edition

According to the Gemini Code Assist pricing page, the Enterprise edition costs $45/user/month (with discounts for annual commitments). New customers automatically receive a free credit for up to 50 licenses for the first month.

The advantage of flat-rate pricing is predictability. Ten developers cost $450/month, fifty cost $2,250/month. The amount is fixed regardless of usage, so there’s no end-of-month billing spike like with token-based models.

The downside is that per-token comparisons become impossible. There’s no way to track costs at the granularity of "this task used N tokens and cost $X" the way Codex or Claude can.

Standard edition pricing unconfirmed
The exact monthly rate for Google Gemini Code Assist Standard edition couldn't be confirmed from official documentation (dynamic rendering issue). It is presumably cheaper than Enterprise, but confirming the exact price requires contacting Google Cloud sales.

Flat Rate vs Token Billing — Which Teams Benefit

| Usage Pattern | Favorable Billing Model | Reason |
|---|---|---|
| Few developers, heavy usage | Flat rate (Gemini) | Fixed cost with no usage cap |
| Many developers, light usage | Token-based (Codex/Claude) | No usage means no charges |
| Teams with volatile usage | Token-based | Cost savings during low periods |
| Teams needing budget predictability | Flat rate | Fixed monthly cost |
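The flat-vs-token trade-off can be made concrete with a break-even sketch. This assumes the per-session token profile used in the simulation later in this article (50k input / 15k output per session, 22 working days) and Haiku 4.5 standard rates; the helper name is illustrative:

```python
def haiku_monthly_cost(sessions_per_day: int, in_tokens: int = 50_000,
                       out_tokens: int = 15_000, days: int = 22) -> float:
    """Monthly token cost at Haiku 4.5 standard rates ($1/$5 per MTok)."""
    total_in_mtok = sessions_per_day * in_tokens * days / 1_000_000
    total_out_mtok = sessions_per_day * out_tokens * days / 1_000_000
    return total_in_mtok * 1.00 + total_out_mtok * 5.00

GEMINI_FLAT = 45.00  # Gemini Code Assist Enterprise, per user per month

# First sessions/day count at which token billing overtakes the flat rate
breakeven = next(n for n in range(1, 100) if haiku_monthly_cost(n) > GEMINI_FLAT)
print(breakeven)  # 17 sessions/day under these assumptions
```

Under these assumptions a developer would need about 17 agent sessions per day before Haiku token billing exceeds Gemini's flat $45, which is why light and moderate users tend to favor token billing.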

Direct Token Price Comparison — The Core of Codex Agent Pricing

![token-price-direct-comparison]({{image:token-price-direct-comparison|Minimalist line art of two side-by-side weighing scales on a clean white background, each scale pan holding stacked coin towers of different heights, thin precision grid lines connecting the scales to a central bar chart, single teal accent color on the taller bar, landscape orientation, no text, no letters, no words, no characters}})

The token rates of the three agents are set side by side below. Since Gemini Code Assist is flat-rate and has no per-token pricing, only OpenAI Codex and Claude appear in the token comparison.

Lightweight Model Comparison: codex-mini-latest vs Claude Haiku 4.5

A comparison of the lightweight models primarily used for everyday code generation and autocomplete in coding agents.

| Item | codex-mini-latest | Claude Haiku 4.5 |
|---|---|---|
| Input rate | $1.50/MTok | $1.00/MTok |
| Output rate | $6.00/MTok | $5.00/MTok |
| Cache hit input | $0.375/MTok (75% discount) | $0.10/MTok (90% discount) |
| Input/output rate ratio | Baseline (1.0x) | Input 0.67x, Output 0.83x |

Haiku 4.5 is cheaper on both input and output. The difference becomes dramatic with cache hits: Haiku 4.5’s cache hit input rate of $0.10/MTok is roughly a quarter of codex-mini-latest’s $0.375/MTok, a 3.75x gap. In agent workflows with high prompt caching hit rates, this gap compounds.

High-End Model Comparison: Claude Opus 4.6 vs Codex (GPT Family)

Claude Opus 4.6 Standard:
  Input $5.00/MTok  |  Output $25.00/MTok

Claude Opus 4.6 Batch API:
  Input $2.50/MTok  |  Output $12.50/MTok

codex-mini-latest:
  Input $1.50/MTok  |  Output $6.00/MTok

Opus 4.6 is 3.3x more expensive on input and 4.2x on output compared to codex-mini-latest. However, these models sit at different performance tiers. Opus 4.6 is a flagship model while codex-mini-latest is a lightweight model. A fair comparison would require GPT-5.4 or GPT-5.3-Codex pricing, which isn’t publicly available in official documentation.

Rather than using a single model for all tasks, routing by task type is more cost-effective. Simple tasks like code autocomplete and lint fixes go to Haiku 4.5 or codex-mini-latest, while architecture design and complex refactoring get routed to Opus 4.6.
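A routing layer like this can be a few lines. The sketch below assumes task types are already classified upstream; the category sets and the fallback choice are illustrative assumptions, and the model identifiers follow this article's tables rather than any official API:

```python
# Illustrative task buckets -- the categories themselves are assumptions
LIGHTWEIGHT_TASKS = {"autocomplete", "lint_fix", "docstring", "rename"}
FLAGSHIP_TASKS = {"architecture_design", "complex_refactor", "cross_file_debug"}

def route_model(task_type: str) -> str:
    """Send cheap, frequent tasks to a lightweight model, hard ones to a flagship."""
    if task_type in LIGHTWEIGHT_TASKS:
        return "claude-haiku-4.5"    # $1.00 in / $5.00 out per MTok
    if task_type in FLAGSHIP_TASKS:
        return "claude-opus-4.6"     # $5.00 in / $25.00 out per MTok
    return "claude-sonnet-4.6"       # mid-tier fallback ($3.00 / $15.00)
```

Since lightweight tasks typically dominate call volume, even a coarse router like this shifts most tokens onto the cheapest tier while reserving flagship rates for the few tasks that justify them.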

Monthly Cost Simulation — Single Developer Basis

![monthly-cost-simulation-solo-developer]({{image:monthly-cost-simulation-solo-developer|Isometric 3D illustration of a lone developer silhouette seated at a floating desk, three transparent cost meter columns rising beside the desk at different heights with coin stack icons inside each column, calendar grid beneath showing monthly billing cycle, cool slate and amber palette, landscape orientation, no text, no letters, no words, no characters}})

Monthly costs are estimated under identical usage assumptions. No official benchmark data exists for actual task-to-cost ratios per agent, so this is a simulation based on assumed token usage.

Assumptions

Monthly usage assumptions (single developer):
  Daily agent sessions: 5
  Average input tokens per session: 50,000
  Average output tokens per session: 15,000
  Working days per month: 22
  ─────────────────────────
  Monthly total input: 5 × 50,000 × 22 = 5,500,000 tokens (5.5 MTok)
  Monthly total output: 5 × 15,000 × 22 = 1,650,000 tokens (1.65 MTok)
  Monthly total session hours: 5 × 1 hour × 22 = 110 hours

Monthly Cost Comparison by Scenario

| Agent/Model | Input Cost | Output Cost | Additional Cost | Monthly Total |
|---|---|---|---|---|
| codex-mini-latest (API) | $8.25 | $9.90 | | $18.15 |
| codex-mini-latest (50% cache) | $5.16 | $9.90 | | $15.06 |
| Claude Haiku 4.5 | $5.50 | $8.25 | | $13.75 |
| Claude Haiku 4.5 (50% cache) | $3.03 | $8.25 | | $11.28 |
| Claude Opus 4.6 | $27.50 | $41.25 | | $68.75 |
| Claude Opus 4.6 Batch | $13.75 | $20.63 | | $34.38 |
| Claude Managed Agents (Opus) | $27.50 | $41.25 | $8.80 (runtime) | $77.55 |
| Gemini Code Assist Enterprise | | | | $45.00 (flat) |
| ChatGPT Pro $100 (incl. Codex) | | | | $100.00 (flat) |

For the 50% cache hit rate scenarios, cache hit input uses the discounted rate while the remaining 50% uses the standard rate. The $8.80 additional cost for Claude Managed Agents is calculated as 110 hours × $0.08/session-hour.
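The token-billed rows can be reproduced from the assumptions with one helper. A sketch only; the 50%-cache rows blend cached and uncached input exactly as described above, and the helper name is illustrative:

```python
def monthly_cost(in_mtok, out_mtok, in_rate, out_rate,
                 cache_rate=None, cache_hit=0.0, extra=0.0):
    """Monthly cost with optional cache-rate blending and a runtime surcharge."""
    if cache_rate is not None:
        in_rate = cache_hit * cache_rate + (1 - cache_hit) * in_rate
    return in_mtok * in_rate + out_mtok * out_rate + extra

IN, OUT = 5.5, 1.65  # MTok per month, from the assumptions above

rows = {
    "codex-mini-latest":       monthly_cost(IN, OUT, 1.50, 6.00),
    "codex-mini (50% cache)":  monthly_cost(IN, OUT, 1.50, 6.00, 0.375, 0.5),
    "haiku-4.5 (50% cache)":   monthly_cost(IN, OUT, 1.00, 5.00, 0.10, 0.5),
    "managed-opus":            monthly_cost(IN, OUT, 5.00, 25.00, extra=110 * 0.08),
}
for name, cost in rows.items():
    print(f"{name}: ${cost:.2f}")
```

Swapping in logged token counts for `IN` and `OUT` turns this into a forecast for an actual team rather than the assumed profile.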

Interpreting the Simulation Results

On pure API token costs alone, Claude Haiku 4.5 comes out cheapest. With caching applied, it’s $11.28/month — roughly 25% cheaper than codex-mini-latest with caching ($15.06). But this comparison has caveats.

First, model performance isn’t equal. Haiku 4.5 and codex-mini-latest may differ in code generation quality, and quality gaps translate to rework costs. No official benchmark data comparing cost per identical task exists yet.

Second, the ChatGPT Pro $100 plan includes all ChatGPT features beyond just Codex. While it looks expensive as a pure coding agent cost, the perceived value changes when factoring in GPT-5 access, web browsing, and image generation.

Third, Gemini Code Assist at $45/user has no usage cap. For developers who use more than the "5 sessions/day" assumed in this simulation, the flat rate becomes more favorable.

Codex Agent Pricing Comparison — Selection Criteria by Billing Model and Next Steps

The key takeaway from this AI coding agent pricing comparison isn’t "how much" but rather "which billing model fits the team’s usage patterns."

Token-based billing (Codex API, Claude API) scales with usage, so costs drop during slow periods. The flip side is budget overrun risk during usage spikes. Flat-rate billing (Gemini Code Assist) is predictable but charges the same amount even in light months. The subscription + credit hybrid (ChatGPT Pro) guarantees a certain usage level within a fixed monthly fee, but the overage structure needs to be clearly understood.

Decision tree:

1. Team of 5 or fewer, direct API calls → Claude Haiku 4.5 / codex-mini-latest
2. Team of 10+, fixed budget required → Gemini Code Assist Enterprise
3. Solo developer, uses GPT features beyond coding → ChatGPT Pro $100
4. Async bulk processing (code review, migration) → Claude Batch API
5. Real-time agent automation → Claude Managed Agents (factor in session costs)

The detailed credit policy for Claude Code Max subscriptions ($100/$200) hasn’t been confirmed in official documentation yet. It may follow a subscription + credit model similar to ChatGPT Pro, and this comparison table could change once finalized.

AI coding agent costs aren’t determined by unit rates alone. The token volume per agent session, cache hit rate, session length, and frequency of additional tool calls all affect total cost. Before adopting an agent, logging API usage for about two weeks and plugging actual numbers into the simulation above is the most accurate way to forecast costs.

Once the OpenAI Codex pricing structure settles, the next topic to examine is agent orchestration. Building a routing structure that assigns different models to different task types — rather than using a single model for everything — can significantly change costs. Diving deeper into Claude Managed Agents’ cost structure also makes session design optimization an important topic. And since vendors may release official Codex agent pricing benchmark data in the second half of 2026, this comparison will need an update at that point.
