GPT-5 Codex CLI Guide in 7 Steps — Install, Auth, API Calls, Agent Tasks

The GPT-5-based Codex model family splits into two axes: the terminal agent (Codex CLI) and the server-side API (Responses API). This article walks through the core path of GPT-5 Codex CLI usage in order — installation, authentication, API calls, and agent task configuration.

With OpenAI releasing the GPT-5-based Codex model family, two axes opened simultaneously: the terminal agent and server-side API calls. Codex CLI is an open-source project (Apache-2.0) written largely in Rust (95.2% of the codebase), while the GPT-5-Codex models are exclusive to the Responses API. Confusing the two leads to both API call errors and cost-estimation mistakes. Let’s break them down one by one.

GPT-5 Codex CLI — Where to Start with Installation

The most common first question is "How do I install Codex CLI?" The answer is straightforward.

```shell
npm install -g @openai/codex    # npm global install
brew install --cask codex       # Homebrew cask (macOS)
```

There are two installation paths: npm global install and Homebrew cask install. If a Node.js environment is already in place, npm is faster. For those on macOS who want the GUI desktop app bundled in, brew cask is more convenient. As of 2026-04-15, the latest stable version is 0.121.0.

Codex CLI can be used in four ways: terminal (CLI), VS Code/Cursor/Windsurf IDE integration, desktop app (codex app), and web version (chatgpt.com/codex). Pick the one that fits the team’s situation and task type.

After installation, running the codex command moves to the authentication step. The full list of installation options is available in the Codex CLI GitHub repository README.

Version Check and Updates

Quite a few features were added in 0.121.0. Notable changes include marketplace installation (GitHub URL, local directory, direct URL support), Ctrl+R reverse search in TUI prompt history, opt-in parallel MCP and plugin calls, Realtime API output modality support, and bubblewrap sandbox support.

| Feature | Description | When to Use |
|---|---|---|
| Marketplace install | Install plugins from a GitHub URL, git URL, or local directory | Deploying custom tools |
| TUI history | Ctrl+R reverse search through previous prompts | Repetitive tasks |
| MCP parallel calls | Call multiple MCP server tools simultaneously | Multi-tool pipelines |
| bubblewrap sandbox | Isolated file system execution | Running untrusted code |

Marketplace installation is particularly useful. Previously, plugins had to be placed manually. Starting from 0.121.0, specifying a GitHub URL automatically downloads and registers them. Version-by-version changelogs are available in the Codex CLI release notes.

How to Set Up Codex CLI Authentication

The next question after installation is authentication. Codex CLI supports two authentication methods.

First, the ‘Sign in with ChatGPT’ option. Running the codex command opens a browser-based login flow. This method requires one of the ChatGPT Plus, Pro, Business, Edu, or Enterprise plans. The free plan doesn’t support CLI authentication.

Second, API key method. Set the OPENAI_API_KEY environment variable or add it directly to the config. For server-side automation and CI/CD pipelines, the API key method is effectively the only option.

Choosing an Authentication Method
For personal development environments, ChatGPT login is simpler. For team servers, GitHub Actions, and deployment pipelines, inject the API key as an environment variable. If both methods are configured simultaneously, the API key takes priority.
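As a minimal sketch of the CI/CD path (the key value below is illustrative — pull it from the pipeline’s secret store, never hardcode it):

```shell
# Inject the API key as an environment variable before invoking codex.
export OPENAI_API_KEY="sk-proj-example"

# Fail fast if the key did not make it into the environment.
[ -n "$OPENAI_API_KEY" ] || { echo "OPENAI_API_KEY is not set" >&2; exit 1; }
echo "key configured"
```

In GitHub Actions, the same injection is typically done through an `env:` block backed by a repository secret.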

A common follow-up question here is "Does having ChatGPT Plus mean unlimited API calls?" — it doesn’t. ChatGPT plan authentication grants access to use the CLI, while server-side calls through the Responses API require separate API credits. Failing to establish this distinction early leads to confusion in cost estimation.

What Makes the GPT-5-Codex Model Family Different

"Can’t I just use GPT-5? Why is there a separate Codex model?" — another frequent question.

GPT-5-Codex is a model optimized from GPT-5 for agentic coding tasks. Unlike general GPT-5, it’s only available through the Responses API, and the underlying model snapshot receives periodic updates. This model cannot be called from the Chat Completions API.

GPT-5.2-Codex also exists. This model supports reasoning effort at four levels — low, medium, high, and xhigh — and supports function calling, structured output, streaming, and prompt caching.

| Item | GPT-5-Codex | GPT-5.2-Codex |
|---|---|---|
| API | Responses API only | Responses API only |
| Reasoning effort | Supported | low · medium · high · xhigh |
| Function calling | Supported | Supported |
| Structured output | Supported | Supported |
| Streaming | Supported | Supported |
| Snapshot updates | Periodic | Periodic |
Context window, pricing, and max output tokens unconfirmed
These aren’t specified even in official documentation. The model spec page on platform.openai.com returns 403 and is inaccessible directly (as of 2026-04-20), and GPT-5.3-Codex model specs haven’t been officially confirmed either. For cost estimation, checking real-time usage in the OpenAI dashboard is the safer approach.

The key point is that the Codex model family is specialized for "code agent" scenarios. Compared to general conversational GPT-5, it produces more stable responses in agentic workflows like multi-step code modifications, file exploration, and test execution. However, being Responses API-exclusive means it’s incompatible with existing Chat Completions API codebases, so migration costs must be factored in.

How to Call GPT-5-Codex via the Responses API

"Can I just use my existing openai.ChatCompletion code as is?" — no. GPT-5-Codex is Responses API-exclusive. client.responses.create() must be used. This API call pattern is the most practical part of GPT-5 Codex CLI usage.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = "Describe the change you want here"  # placeholder

response = client.responses.create(
    model="gpt-5-codex",
    reasoning={"effort": "low"},
    input=[{"role": "user", "content": prompt}],
)
```

Three things to note in the code above.

First, the client.responses.create() endpoint. This is an entirely different path from Chat Completions’ client.chat.completions.create(). The SDK version must also support the Responses API.

Second, the reasoning parameter. Reasoning intensity is specified like {"effort": "low"}. reasoning.effort is a parameter that controls the number of reasoning tokens the model generates before responding. In Codex variant models, the none value gets automatically converted — to low in Codex/Codex Max and to medium in Codex Mini. In other words, reasoning cannot be fully disabled in Codex models.

Third, the input field. It’s input, not messages from Chat Completions. The structure is similar but the field name differs, so simple find-replace isn’t sufficient when migrating existing code.
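Because the rename is structural rather than a pure string swap, a small adapter is safer than find-replace when migrating. A minimal sketch (the function name is hypothetical):

```python
# Convert a Chat Completions-style payload to the Responses API shape:
# the role/content message structure carries over, but the top-level
# field is "input" instead of "messages".
def to_responses_payload(chat_payload: dict) -> dict:
    payload = dict(chat_payload)
    payload["input"] = payload.pop("messages")
    return payload

legacy = {
    "model": "gpt-5-codex",
    "messages": [{"role": "user", "content": "Fix the failing lint checks"}],
}
migrated = to_responses_payload(legacy)
# migrated now carries "input" and no "messages" key; pass it to
# client.responses.create(**migrated).
```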

Choosing the Right Reasoning Effort Level

Which value to set for reasoning effort depends on the task complexity and cost trade-off.

| Effort | Reasoning Tokens | Suitable Tasks | Cost Impact |
|---|---|---|---|
| low | Minimal | Simple code formatting, lint fixes | Low |
| medium | Moderate | General code refactoring, bug fixes | Medium |
| high | High | Architecture design, complex algorithms | High |
| xhigh | Maximum | Large codebase analysis, multi-file agent tasks | Maximum |

Starting with low and stepping up when result quality falls short is the practical approach. xhigh consumes significant tokens, so setting up cost alerts when embedding it in CI/CD automation is advisable. The OpenAI Reasoning guide covers the effort parameter’s behavior in detail.
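The “start with low and step up” approach can be written as a small control loop. A sketch, assuming `run_task` wraps a `client.responses.create()` call at the given effort and `quality_ok` is a project-specific acceptance check (e.g. the test suite passing on the result):

```python
# Escalate reasoning effort only when the cheaper level falls short.
EFFORT_LEVELS = ["low", "medium", "high", "xhigh"]

def run_with_escalation(run_task, quality_ok):
    result = None
    for effort in EFFORT_LEVELS:
        result = run_task(effort)
        if quality_ok(result):
            return effort, result
    # Nothing passed the check; return the best (maximum-effort) attempt.
    return EFFORT_LEVELS[-1], result
```

Pairing a loop like this with a per-run token budget keeps xhigh from silently dominating CI costs.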

Full Responses API code example unverified
The full code example for calling GPT-5-Codex via the Responses API hasn’t been directly verified against official documentation. The code snippet above is based on the official guide’s reasoning example, but the API reference should be checked before production deployment.

Controlling Agent Tasks with AGENTS.md

Once Codex CLI is installed and authenticated, the next question is "How do I give different instructions per project?" The answer is AGENTS.md.

AGENTS.md is a markdown file that provides project-specific custom instructions to Codex. Place it at the project root and Codex CLI reads it automatically. The file search order is defined:

AGENTS.override.md → AGENTS.md → TEAM_GUIDE.md → .agents.md

AGENTS.override.md takes the highest priority, followed by AGENTS.md, then TEAM_GUIDE.md, and finally .agents.md. A clean pattern is putting shared team guides in TEAM_GUIDE.md and personal overrides in AGENTS.override.md.

AGENTS.md Read Behavior Control

Two config keys control AGENTS.md read behavior.

  • project_doc_max_bytes — Maximum bytes when reading the AGENTS.md file. If the file is too large, it gets truncated.
  • project_doc_fallback_filenames — Customizes the default filename list. File names can be changed to match internal organizational conventions.
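In `config.toml`, the two keys might look like this (values are illustrative):

```toml
# Cap how much of the project doc Codex reads; content beyond this is truncated.
project_doc_max_bytes = 65536

# Replace the default lookup names with internal conventions.
project_doc_fallback_filenames = ["TEAM_GUIDE.md", ".agents.md"]
```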

The content inside AGENTS.md is free-form, but four patterns have proven effective in practice:

  • Description of the project’s tech stack and key directory structure
  • Coding conventions (naming, import order, test rules)
  • Prohibitions (banning specific libraries, blocking certain patterns)
  • Task scope restrictions (files Codex should never touch in this project)
```text
project/
├── AGENTS.md           ← Codex default instructions
├── AGENTS.override.md  ← Personal override (recommend gitignore)
├── TEAM_GUIDE.md       ← Team shared guide
├── src/
│   ├── api/
│   └── db/
└── tests/
```

The more specific the project context in AGENTS.md, the higher Codex’s code modification accuracy and the lower the rate of unintended file changes. Conversely, running Codex without AGENTS.md means it modifies code without project context, resulting in inconsistent output quality.
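Applying the four patterns above, a starter AGENTS.md might look like this (every name and rule below is illustrative — substitute your project’s own):

```markdown
# AGENTS.md

## Stack
Python backend; API handlers in `src/api/`, database layer in `src/db/`.

## Conventions
- Absolute imports only; keep import order stable.
- Every bug fix ships with a regression test under `tests/`.

## Prohibitions
- Do not add new runtime dependencies without prior approval.

## Scope
- Never modify generated files or anything under `src/db/`.
```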

config.toml and MCP Server Integration

"How do I connect an MCP server to Codex?" — the answer is config.toml.

The Codex CLI config file is located at ~/.codex/config.toml. To use different settings per project, create .codex/config.toml at the project root. Project settings override the global settings.

Here’s an MCP server integration configuration example.

```toml
[mcp_servers.{name}]
command = "server-executable"
supports_parallel_tool_calls = true
default_tools_approval_mode = "approve"
```

Breaking down each field:

  • command — MCP server executable binary path. Specify npx, uvx, or a custom-built binary.
  • supports_parallel_tool_calls — Setting this to true enables parallel MCP calls added in 0.121.0. The speed benefit is significant in pipelines that need to call multiple tools simultaneously.
  • default_tools_approval_mode — "approve" requires user approval before tool calls. It can be changed to "prompt" in automation pipelines, but this decision requires careful consideration from a security standpoint.

Environment Variables and Additional Config Keys

Behavior can also be controlled via environment variables in addition to config.toml.

| Environment Variable | Role |
|---|---|
| CODEX_SQLITE_HOME | Storage path for Codex’s internal SQLite database |
| CODEX_HOME | Codex home directory path (default: ~/.codex) |
| CODEX_CA_CERTIFICATE | Custom CA certificate path (for corporate proxy environments) |

CODEX_CA_CERTIFICATE is essential when behind a proxy on a corporate network. Omitting it causes SSL certificate errors that make all API calls fail. The full list of config keys is available in the Codex CLI configuration documentation.
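A typical corporate-proxy setup (paths illustrative), placed in the shell profile or the CI job environment:

```shell
# Point Codex at the corporate root CA so TLS interception doesn't
# break API calls, and pin the home directory explicitly.
export CODEX_CA_CERTIFICATE="/etc/ssl/certs/corp-root-ca.pem"
export CODEX_HOME="$HOME/.codex"
echo "proxy environment configured"
```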

Per-project config.toml Usage Pattern
Set authentication and default MCP servers in the global `~/.codex/config.toml`, and configure project-specific MCP servers and approval_mode in the per-project `.codex/config.toml`. Config keys like `plan_mode_reasoning_effort` allow tuning reasoning intensity to match each project’s characteristics.
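Putting that split into concrete files, a per-project `.codex/config.toml` might look like this (server name and values are illustrative; auth and shared defaults stay in the global file):

```toml
# Project-local overrides; global ~/.codex/config.toml keeps auth and defaults.
plan_mode_reasoning_effort = "medium"

[mcp_servers.project_db]
command = "uvx"
supports_parallel_tool_calls = true
default_tools_approval_mode = "approve"
```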

Codex CLI Practical Workflow Setup

"Installation, auth, and config are all done — what does the actual workflow look like?" — this section covers GPT-5 Codex CLI usage from a practical workflow perspective.

The basic execution of Codex CLI starts with a single codex command. Running it in the terminal launches the TUI (Terminal User Interface), and entering a prompt triggers the agent to perform the task. Starting from 0.121.0, Ctrl+R enables reverse search through previous prompts, making repetitive tasks significantly more convenient.

IDE Integration vs Direct Terminal Usage

Codex can be used integrated into IDEs like VS Code, Cursor, and Windsurf, or run directly from the terminal. Which is better depends on the type of work.

Direct terminal usage is better for:

  • Batch operations that traverse multiple repositories for bulk modifications
  • Running automated tasks in CI/CD pipelines
  • Working on remote servers accessed via SSH

IDE integration is better for:

  • Immediately reviewing code modification results as diffs
  • Interactive work alternating between file exploration and code modification
  • Running debugging and Codex tasks in parallel

The web version (chatgpt.com/codex) has the advantage of being usable directly in a browser without installation. However, local file system access is limited, so CLI or IDE integration is more suitable for handing actual project code to the agent.

Using the bubblewrap Sandbox

The bubblewrap sandbox added in 0.121.0 is useful when running code with unverified trust levels. It isolates the file system so Codex can’t touch files outside the designated directory. It serves as a security layer when applying Codex to open-source projects or when reviewing external contributors’ PRs with Codex.

Sandbox Environment Limitations
Inside the bubblewrap sandbox, network access and certain system calls may be restricted. When having the Codex agent run integration tests that require database connections, either adjust the sandbox settings or run those specific tasks outside the sandbox.

Expanding GPT-5 Codex CLI Usage — Additional Configuration Areas

With the basic flow of GPT-5 Codex CLI usage established, it’s time to move into expansion areas.

Connecting multiple MCP servers to Codex CLI enables bundling database queries, external API calls, and file system operations into a single agent task — going beyond simple code modifications. Combined with the supports_parallel_tool_calls = true setting, multi-tool calls execute in parallel, significantly improving pipeline speed.

To go deeper into OpenAI Codex usage, refining AGENTS.md writing patterns is key. Moving beyond simply stating "this project uses Python" and embedding test coverage thresholds, prohibited patterns, and review checklists into AGENTS.md brings the Codex agent’s output quality close to manual code review levels. Adjusting plan_mode_reasoning_effort in the Codex config.toml per project also helps balance cost and quality.

Comparing Codex vs Claude Code is another topic that invariably comes up in technology selection. Both tools share the commonality of being terminal agents, but they differ in model backend, sandbox implementation, and MCP integration approach. When factoring in Responses API-based server-side automation, the selection criterion becomes which one aligns with the team’s tech stack. Measuring the performance differences across GPT-5.2 Codex reasoning effort levels on actual tasks helps identify the optimal cost-to-quality sweet spot.
