How to Use Any Model with Codex CLI: Custom OAI-Compatible Provider Setup
TL;DR
Codex CLI ships with one built-in endpoint — OpenAI’s. Everything past that is your problem. The [model_providers.<id>] block in ~/.codex/config.toml is the official lever: declare a provider once, attach the right wire_api, and you can switch between GPT-5.3 Codex, Claude Sonnet 4.6, and DeepSeek V3.2 from a single terminal — no shell-rc gymnastics.
This guide skips the OPENAI_BASE_URL shortcut covered in our Codex CLI API configuration guide and walks the config-file path: one file, multiple providers, profile-based switching, and the pitfalls that bite real setups.
Why the env-var trick stops working
The two-line shell-rc trick (OPENAI_API_KEY + OPENAI_BASE_URL) is fine for a single endpoint. It breaks the moment you want to:
- Keep OpenAI direct and an OpenAI-compatible gateway active in the same terminal
- Run one project against GPT-5.3 Codex and another against Claude Sonnet 4.6 without re-sourcing your rc
- Inject a non-standard auth header (some gateways require
X-Project-Idor rotate Bearer tokens) - Tune
request_max_retriesper provider so a flaky upstream doesn’t poison your default
Each of those needs a real config file. Codex CLI reads ~/.codex/config.toml on every invocation, and the [model_providers.<id>] table is where it expects custom endpoints to live (OpenAI Codex config reference).
Anatomy of a model_providers block
The full table accepts roughly a dozen keys. The five that actually matter most days:
[model_providers.ofox]
name = "ofox.ai gateway"
base_url = "https://api.ofox.ai/v1"
env_key = "OFOX_API_KEY"
wire_api = "chat"
request_max_retries = 4
base_url— points at the API root, ending in/v1for OpenAI-compat gateways. No trailing slash. The path Codex appends depends onwire_api.env_key— the environment variable name Codex reads at runtime for the Bearer token. Do not hard-code keys in TOML.wire_api—"responses"makes Codex POST to/responses(OpenAI’s newer endpoint)."chat"makes it POST to/chat/completions. Third-party OpenAI-compatible gateways universally implement the latter, so for ofox.ai, OpenRouter, DeepSeek direct, etc., use"chat".http_headers— static headers merged into every request. Useful for organization scoping or regional routing.env_http_headers— same, but the value is read from an env var at request time. Use this for tokens that rotate.
Two corollaries that catch people:
- Reserved IDs
openai,ollama, andlmstudioare taken — your custom provider needs a different name. requires_openai_auth = falseremoves Codex’s assumption that the key prefix issk-. Most gateways need this set explicitly.
A working ofox.ai setup (copy this)
Create or edit ~/.codex/config.toml:
model = "openai/gpt-5.3-codex"
model_provider = "ofox"
[model_providers.ofox]
name = "ofox.ai"
base_url = "https://api.ofox.ai/v1"
env_key = "OFOX_API_KEY"
wire_api = "chat"
requires_openai_auth = false
Export your key once:
export OFOX_API_KEY=<your-ofox-key>
Verify with a trivial run:
codex "list every TODO in src/ and group them by file"
If you see a model response, you’re done. If you see 404 Not Found, your wire_api is wrong (you sent /responses to a gateway that only serves /chat/completions, or vice versa).
Swapping the model per command
The model key in the top of config.toml is the default. Override per call:
codex --model anthropic/claude-sonnet-4.6 "review this PR for race conditions"
codex --model deepseek/deepseek-v3.2 "translate this Bash script to Python"
codex --model openai/gpt-5.4-pro "design a Postgres schema for an audit log"
This works because ofox.ai routes by model string inside one OpenAI-compatible endpoint — Codex doesn’t know it’s talking to three different vendors. The model IDs above are the ones published in ofox’s catalog; verify the exact string before pasting (vendors rename, the catalog updates).
Profiles: the cleanest multi-stack pattern
Switching --model is fine for one-offs. For frequently-used combinations, define profiles. Each profile bundles model + provider + reasoning effort + sandbox policy under one name:
[profiles.codex-fast]
model = "openai/gpt-5.3-codex"
model_provider = "ofox"
model_reasoning_effort = "low"
[profiles.review]
model = "anthropic/claude-sonnet-4.6"
model_provider = "ofox"
model_reasoning_effort = "high"
[profiles.bulk]
model = "deepseek/deepseek-v3.2"
model_provider = "ofox"
Then:
codex --profile codex-fast "generate unit tests for utils/parse_url.go"
codex --profile review "audit src/auth/ for token leakage"
codex --profile bulk "rewrite README in plain English"
The mental model: profile = “stack.” --model = “tweak one knob.” Stop typing five flags every invocation.
Multiple providers in one config
There’s no rule against declaring several. A practical setup keeps OpenAI direct for sensitive prompts and ofox.ai for everything else:
[model_providers.ofox]
name = "ofox.ai"
base_url = "https://api.ofox.ai/v1"
env_key = "OFOX_API_KEY"
wire_api = "chat"
requires_openai_auth = false
[model_providers.openai-direct]
name = "OpenAI direct"
base_url = "https://api.openai.com/v1"
env_key = "OPENAI_API_KEY"
wire_api = "responses"
Switch with --config:
codex --config model_provider=openai-direct --model gpt-5.4 "..."
codex --config model_provider=ofox --model deepseek/deepseek-v3.2 "..."
The same trick covers a self-hosted vLLM rig at http://10.0.0.5:8000/v1 (wire_api = "chat", no auth) for local-only work.
Auth that isn’t a static Bearer
If your gateway hands out short-lived tokens, the static env_key model is wrong. Codex supports an auth sub-table that shells out to a token-fetcher command on a refresh interval:
[model_providers.corp]
name = "Internal proxy"
base_url = "https://llm.corp.internal/v1"
wire_api = "chat"
[model_providers.corp.auth]
command = "/usr/local/bin/corp-token"
args = ["--audience", "codex"]
timeout_ms = 5000
refresh_interval_ms = 300000
Codex re-invokes the command every five minutes and uses the stdout as the Bearer token. This is the right shape for AWS SigV4, Azure managed identity, or any OIDC bridge. Don’t hack it with env_http_headers and a cron job — there’s a sanctioned hook.
The five mistakes I keep seeing
- Trailing slash on
base_url.https://api.ofox.ai/v1/will work sometimes depending on the gateway; the spec is no trailing slash. Match the docs exactly. wire_api = "responses"against a Chat-only gateway. You get a404 /responses not found. Set it to"chat".- Forgetting
requires_openai_auth = false. Codex tries to validate the key prefix and refuses your gateway’sofox-oror-prefix. Disable the check. - Reusing the
openaiprovider ID. Reserved. Pick a different name. - Hard-coding the key in TOML. Don’t.
env_keyexists for a reason — secrets in a checked-in dotfile is a recurring incident.
Where this fits in the bigger Codex picture
The custom-provider config is one of three lifts you’ll typically need:
- Install (see the complete official Codex CLI install guide)
- Routing (this guide)
- Day-to-day usage patterns (see the real-world Codex CLI workflow)
If you’re choosing between Codex CLI and other terminal agents first, the Claude Code vs Codex CLI vs Cursor vs DeepSeek TUI comparison is the right starting point. And if the gateway question itself is unsettled, why use an LLM API gateway covers the rationale before the config.
Closing
Codex CLI’s custom-provider story used to be undocumented folklore. In 2026 it’s a first-class config block, and that’s the version worth learning — because the moment you have two API keys to juggle, OPENAI_BASE_URL becomes the thing standing between you and a clean setup. Forty lines of TOML buys you a coding stack where the model is a flag, not a tax.


