Codex Weekly Limit Drained: 5 Fixes and a Drop-In API That Caps Spend (2026)

On May 17, 2026, a Plus user watched their Codex weekly meter drop from 96% remaining to 0% in a single day, and the OpenAI rep who acknowledged the incident still could not promise the counter would reset before the natural weekly window. If your weekly cap dies twice a month, the right move in 2026 is not to keep refreshing /status. It is to wire Codex CLI to a pay-per-token endpoint and cap spend at the wallet, not at the calendar.

This article walks the specific fix path for the drained weekly cap: a drop-in OpenAI Responses-compatible API, configured in one block of ~/.codex/config.toml, with three patterns for keeping the monthly bill bounded. For the full config reference (custom providers, headers, model identifiers), see Configure Codex CLI with a Custom API Endpoint.

Is Your Codex Quota Actually Out? The 30-Second Diagnosis

Before you change anything, confirm the meter is truly the problem and not a connection or model error masquerading as a quota error.

| Symptom | What /status shows | What it actually means | First move |
|---|---|---|---|
| Banner: “You’ve hit your weekly limit” | weekly: 0% remaining | Weekly cap exhausted, 5-hour may still have headroom | Try a non-weekly-metered route (drop-in API) or spend a banked reset |
| Banner: “5-hour limit reached” | 5h: 0% remaining / weekly > 0% | Short-window throttle only | Wait, switch task to non-CLI work, or run the same prompt against drop-in API |
| Error: usage_limit_reached | weekly + 5h both > 0% | Out-of-sync counter bug (May 2026 known issue) | Restart CLI; if persists, raise with OpenAI status and use drop-in fallback |
| Error: Unsupported wire_api | provider mismatch | Custom provider does not speak Responses API | Switch model or add a Responses translator |

Start the Codex REPL with codex and type /status first. If it says the weekly meter is non-zero but you still cannot start a session, you are looking at the out-of-sync bug pattern that OpenAI’s Tibo publicly acknowledged on X in May 2026—and reaching for a drop-in API is faster than waiting on a counter reconciliation.
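The first three rows of the diagnosis table can be sketched as a small shell helper. This is an illustration only: `codex_first_move` is a hypothetical name, and the percentages are typed in by hand after reading /status, since nothing in this article documents a scriptable way to read the meters.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Codex CLI): map hand-entered /status
# numbers to the "first move" column of the diagnosis table.
# Usage: codex_first_move <weekly_pct_remaining> <five_hour_pct_remaining> <blocked: yes|no>
codex_first_move() {
  local weekly="$1" five_hour="$2" blocked="${3:-no}"
  if [ "$weekly" -le 0 ]; then
    echo "weekly cap exhausted: use a drop-in API or spend a banked reset"
  elif [ "$five_hour" -le 0 ]; then
    echo "5-hour throttle only: wait, or run the prompt against a drop-in API"
  elif [ "$blocked" = "yes" ]; then
    # Both meters show headroom but sessions still fail: the desync pattern.
    echo "possible counter desync: restart the CLI, then fall back to a drop-in API"
  else
    echo "no quota problem visible: check connection and model errors"
  fi
}
```

For example, `codex_first_move 35 60 yes` reports the desync pattern, because both meters show headroom while the session is still blocked.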

When to Apply These Fixes (and When to Just Wait)

Not every drain warrants a config change. Use this gate before touching config.toml.

When to fix now (configure a drop-in API):

  • Your weekly meter dies more than once per calendar month and you are mid-shipping.
  • You hit the cap on a Friday afternoon and the natural reset lands after your sprint ends.
  • You are a Plus user staring at a 5h: 0% remaining immediately after a fresh reset—the May 2026 desync pattern.

When to wait:

  • You are within 24 hours of the natural weekly reset and your work is not urgent.
  • You have a banked reset available (per third-party reports, eligible accounts since around June 12, 2026) and the remaining work fits in one window.
  • You are on Pro and the cap drop is below 25%—the throughput headroom on Pro generally absorbs single-sprint surges.

Stop rule: if your monthly Codex billable equivalent (subscription + overflow) exceeds the cost of two Pro seats, you should be on a metered API permanently, not bouncing between subscription tiers. Read the pricing-math section first; if the math is decisive, skip the rest.
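The stop rule is plain arithmetic, so it can be checked with a one-function sketch. The function name `stop_rule` is hypothetical, and the Pro seat price is passed in as an argument rather than hard-coded, because this article does not state it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply the stop rule to your own monthly numbers.
# Usage: stop_rule <subscription_usd> <overflow_usd> <pro_seat_usd>
stop_rule() {
  # awk handles the floating-point comparison; "sub" is an awk builtin,
  # so the variable is named sub_usd instead.
  awk -v sub_usd="$1" -v over="$2" -v seat="$3" 'BEGIN {
    if ((sub_usd + over) > 2 * seat) print "metered"
    else print "subscription"
  }'
}
```

Run it with your real invoice figures; a result of `metered` means the subscription plus overflow already costs more than two Pro seats.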

Understanding Codex Usage Limits: 5-Hour, Weekly, and Credits

Codex stacks four meters, and the failure mode depends on which one fires.

| Meter | Scope | Reset cadence | What drains it |
|---|---|---|---|
| 5-hour window | CLI + cloud task messages | Rolling 5 hours | Bursty active sessions, multi-turn refactors |
| Weekly cap | Same pool, broader window | Rolling 7 days | Sustained daily work, long autonomous runs |
| Credits | Plan-eligible, account-side | Refilled per plan terms | Extends weekly when supported by your plan |
| Banked rate-limit reset | Account-side token | Once spent, refills via referral or plan grant | One-shot counter clear |

A few details the official docs only hint at:

If you want the full meter breakdown, see How Codex Usage Limits Work.

Why the Weekly Cap Drains Faster Than the 5-Hour

The single most counterintuitive thing about Codex’s meter shape in 2026 is that the weekly cap is the one that surprises you, not the 5-hour. Three structural reasons:

Cloud tasks count differently. A Codex CLI session sending a quick edit to a local file consumes a small slice. The same prompt routed as a cloud task with multi-step planning and tool use can multiply the weekly billable equivalent without triggering the 5-hour ceiling, because the cloud task’s work happens outside the rolling local window but still lands on the weekly accumulator.

Reasoning-heavy variants compound. The Codex-tuned variants that score highest on refactor benchmarks—the ones you reach for on Friday afternoon when the work matters—are also the ones that burn the most per call. A single autonomous-run prompt of 30+ minutes on the highest-tier variant can equate to dozens of conventional Plus messages in weekly-budget terms.

Meters reconcile asynchronously. The 5-hour reflects local intent. The weekly reflects server-side reconciliation after credits, cloud tasks, and any plan-side adjustments. The May 2026 desync incident OpenAI publicly acknowledged is the visible failure of this reconciliation—but normal day-to-day usage also shows minor sync gaps that occasionally compound into a “where did 30% of my week go?” moment.

The practical takeaway: do not budget by 5-hour observation. Read the weekly meter as the actual ceiling, and the 5-hour as a per-burst rate-limit guardrail. If you treat the weekly meter as the planning unit from day one, you stop being surprised on Wednesday.
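One way to treat the weekly meter as the planning unit is to split what remains evenly across the days left before the reset. The sketch below assumes you read both numbers yourself; `daily_allowance` is a hypothetical helper, not a Codex CLI feature.

```shell
#!/usr/bin/env bash
# Hypothetical helper: even daily share of the remaining weekly meter.
# Usage: daily_allowance <weekly_pct_remaining> <days_until_reset>
daily_allowance() {
  awk -v r="$1" -v d="$2" 'BEGIN {
    if (d <= 0) { print "0.0"; exit }   # reset is due: nothing left to plan
    printf "%.1f\n", r / d              # percentage points per day
  }'
}
```

With 70% remaining and 5 days to go, the helper prints `14.0`, meaning any single day that burns much more than 14 points is spending tomorrow's share.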

How to Recover When the Weekly Limit Drains (By Tier)

Free / Go Tier

flowchart LR
    A[Weekly drain] --> B{Banked reset available?}
    B -->|Yes| C[Spend it now]
    B -->|No| D{In referral window<br/>June 11-24, 2026?}
    D -->|Yes| E[Invite up to 3 friends]
    D -->|No| F[Switch CLI to drop-in API]
    C --> G[Resume work]
    E --> G
    F --> G

The Go tier has the smallest weekly cap. The free banked reset that ships with eligible accounts is your single highest-leverage move; spend it on a session you cannot defer.

Plus Tier

| Move | When to pick it | Effort |
|---|---|---|
| Spend banked reset | You have one and the remaining work fits in one weekly window | 5 seconds |
| Activate eligible credits | Your account shows credits on the dashboard | 30 seconds |
| Switch to drop-in API | Drains happening twice or more per month | 3 minutes one-time setup |
| Upgrade to Pro | You consistently exhaust weekly within 3 days of reset | Per OpenAI plan page |

Plus is the plan most exposed to the May 2026 desync incident: the thread behind the 96%→0%-in-a-day report is dominated by Plus users. If you see this pattern recur, treat the drop-in API not as a fallback but as the primary route.

Pro Tier

Pro carries the highest weekly headroom but is not immune. In the same May 2026 incident, a Pro user reported their weekly limit dropping from 100% to 60% in one hour without heavy work. The recovery path is identical to Plus, except the Pro 20x tier holds out longer against single-prompt damage.

For account-side state checking, run codex then type /status to see the in-session 5h / weekly counters, and compare against the OpenAI usage dashboard at platform.openai.com/usage. The in-session meter is the client-side view; the dashboard is the server-side reconciliation. When they diverge by more than 10 points, you are looking at the desync pattern OpenAI’s Tibo acknowledged in May 2026 — restart the CLI, and if the gap persists, switch to the drop-in API path below rather than waiting on the counter.
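The 10-point divergence rule can be expressed as a tiny comparison. This is a sketch under the assumption that you copy both percentages by hand (the in-session value from /status, the server-side value from the dashboard); `desync_check` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: flag a gap of more than 10 points between the
# client-side meter (/status) and the server-side meter (dashboard).
# Usage: desync_check <status_pct> <dashboard_pct>
desync_check() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    gap = a - b
    if (gap < 0) gap = -gap              # absolute difference
    if (gap > 10) print "desync"; else print "ok"
  }'
}
```

A gap of exactly 10 points still prints `ok`, matching the "more than 10 points" wording above.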

Codex Weekly Limit Incidents in 2026: What Actually Drained Faster

| Date | Plan | What happened | Source |
|---|---|---|---|
| April 28, 2026 | All paid plans | Account-wide rate-limit reset event (planned) | OpenAI community announcement |
| May 17, 2026 | Plus | User afaqak: weekly 96%→0% in a single day with minimal usage | Community thread #1381172 |
| May 18, 2026 | Pro | User 3rtech: weekly 100%→60% in one hour without heavy work | Same thread |
| May 18, 2026 | Plus | User Brian_Henderson: 0% remaining in 5-hour window post-restart | Same thread |
| May 20, 2026 | Plus | User minifi: drain isolated to one Codex-tuned model variant | Same thread |
| ~June 12, 2026 (third-party reports) | Go / Plus / Pro / Business | Banked-reset feature begins rolling out; each account reportedly gets 1 free reset; no first-party announcement page found | Pasquale Pillitteri news brief |

The pattern: the meters are reconciled server-side and counter desync is the failure mode you will hit most often. The drop-in API removes that failure mode entirely because the meter lives on your wallet.

The Drop-In API Fix: Configure Codex CLI in 5 Lines

The February 2026 wire-protocol change matters here. Codex CLI dropped Chat Completions support and now speaks only the OpenAI Responses API. Any provider you point it at must expose /v1/responses. ofox.ai’s documented Codex integration sets wire_api = "responses" exactly because of this constraint.

Step 1: Get an API key

Sign in to your provider, create a key, and export it.

export OPENAI_API_KEY="ofx_live_..."

Step 2: Edit ~/.codex/config.toml

Add a [model_providers.<id>] block. The provider ids openai, ollama, and lmstudio are reserved, so use any other label.

model = "openai/gpt-5.4-mini"
model_provider = "ofox"

[model_providers.ofox]
name = "ofox.ai Responses API"
base_url = "https://api.ofox.ai/v1"
wire_api = "responses"
env_key = "OPENAI_API_KEY"

Step 3: Verify with /status and a single call

Start the REPL and check the active model and meters, then send a one-shot in non-interactive exec mode:

codex            # opens REPL, then type /status at the prompt
codex exec "print hello in python"

A successful one-shot means the wire protocol matched. If you see Unsupported wire_api, the model you picked does not support Responses. Choose a Codex-tuned variant from the ofox model marketplace where the Responses tag is shown, or read the deeper walk-through in How to Use Any Model with Codex CLI.

Step 4: Set the per-session model override (optional)

For one-off complex refactors, swap models without editing config (Codex CLI accepts -m / --model; pair it with exec for a single non-interactive run):

codex exec -m openai/gpt-5.5 "refactor the auth middleware"

If you need every config-block field explained, the Codex CLI config.toml deep dive is the canonical reference, and the Codex CLI API configuration guide covers the env-var-only path if you prefer to skip the TOML.

Common Errors During Setup

| Error | Cause | Fix |
|---|---|---|
| Unsupported wire_api | Model does not implement /v1/responses | Pick a Codex-tuned variant; the marketplace tags supported models |
| Authentication failed | Trailing slash in base_url or wrong env var | Use exactly https://api.ofox.ai/v1; confirm env_key matches the exported var |
| Provider id reserved | Used openai, ollama, or lmstudio as the block label | Rename the block, e.g. [model_providers.ofox] |
| Model not found | Provider prefix wrong | Always include the provider prefix in the model id, e.g. openai/gpt-5.4-mini not gpt-5.4-mini |
| Connection reset | Network/proxy between CLI and endpoint | Drop corporate CA proxies for the test; retry without VPN |
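Three of the errors in the table (reserved provider id, trailing slash, missing provider prefix) can be caught before the first call with a small pre-flight check. The sketch below is a hypothetical helper that takes the three values as arguments rather than parsing config.toml, so it makes no assumption about how Codex CLI itself reads the file.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight lint for the values you are about to put in
# config.toml. Usage: lint_codex_provider <provider_id> <base_url> <model_id>
lint_codex_provider() {
  local id="$1" base_url="$2" model="$3" problems=0
  case "$id" in
    openai|ollama|lmstudio) echo "provider id '$id' is reserved"; problems=1 ;;
  esac
  case "$base_url" in
    */) echo "base_url has a trailing slash"; problems=1 ;;
  esac
  case "$model" in
    */*) ;;                                # has a provider prefix
    *) echo "model id '$model' lacks a provider prefix"; problems=1 ;;
  esac
  [ "$problems" -eq 0 ] && echo "ok"
  return "$problems"
}
```

Calling `lint_codex_provider ofox https://api.ofox.ai/v1 openai/gpt-5.4-mini` prints `ok`; each failing check prints one line and the function returns a non-zero status.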

Gotchas When Toggling Between Subscription and Drop-In

A few subtle behaviors only show up the first time you bounce between modes:

  • /status still reports subscription meters even when model_provider points at a drop-in. The CLI does not surface drop-in account balance in /status—check your provider dashboard for that. If the dashboard shows healthy balance but /status shows weekly 0%, you are reading the subscription meter, not your drop-in.
  • Auth tokens are shared via the same env var. If you keep an OpenAI key in OPENAI_API_KEY and then swap to a drop-in key under the same var, every subsequent CLI invocation goes to the drop-in until you swap back. There is no warning. Use distinct shell aliases (use-plus, use-ofox) if you toggle daily.
  • Per-project config beats global. A .codex/config.toml in the project root (loaded only for projects you have marked as trusted) overrides ~/.codex/config.toml. Commit the team drop-in config at the project level so engineers cannot accidentally route through their personal Plus while debugging shared code.
  • Model swap mid-session does not refresh tools. If you type /model mid-session and pick a different variant (e.g. openai/gpt-5.5) from the popup, the new model picks up where the previous one left off but cached tool descriptions are not regenerated. Restart the session after a flagship swap if you see tool-use regressions.
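The key-toggling advice in the list above could look like the following pair of shell functions. This is a sketch with assumed names: `CODEX_PLUS_KEY`, `CODEX_OFOX_KEY`, and `CODEX_ROUTE` are hypothetical variables you would define yourself, and the functions use underscores (`use_plus`, `use_ofox`) because hyphenated function names are not portable across shells.

```shell
#!/usr/bin/env bash
# Hypothetical toggles. They only swap which key sits in OPENAI_API_KEY and
# record the choice, so a stale key never goes unnoticed.
# Assumes CODEX_PLUS_KEY and CODEX_OFOX_KEY are set in your shell rc.
use_plus() {
  export OPENAI_API_KEY="${CODEX_PLUS_KEY:?set CODEX_PLUS_KEY first}"
  export CODEX_ROUTE="subscription"
  echo "Codex key: $CODEX_ROUTE"
}
use_ofox() {
  export OPENAI_API_KEY="${CODEX_OFOX_KEY:?set CODEX_OFOX_KEY first}"
  export CODEX_ROUTE="drop-in"
  echo "Codex key: $CODEX_ROUTE"
}
# Print which key was selected last, or "unknown" if neither toggle ran.
codex_route() { echo "${CODEX_ROUTE:-unknown}"; }
```

Putting `$(codex_route)` in your shell prompt is one way to make the silent swap visible before each invocation.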

Capping Spend on a Drop-In API: 3 Patterns

A pay-per-token route only beats a subscription if you actually keep the bill bounded. Three patterns, ordered by enforcement strength.

Pattern 1 — Prepaid wallet ceiling (hardest stop)

Top up the account with a fixed amount (say $20). When the wallet hits zero, the API refuses calls. This is the only fix that survives operator error because the stop is enforced upstream, not in your local config.

Sanity-check the remaining balance from the provider dashboard before each top-up rather than from a local script—dashboard numbers are authoritative and you avoid drift between local cache and account state.

Pattern 2 — Tier downgrade per task

Use the cheapest viable model per task. The Codex CLI model flag is per-invocation, so a wrapper script that picks tier by command intent keeps cost-per-call honest.

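# Usage: codex_tiered <intent> <prompt...>
# The first argument picks the tier; the remaining arguments are passed to codex exec.
codex_tiered() {
  case "$1" in
    # Refactors and migrations go to the flagship variant.
    refactor|migrate) codex exec -m openai/gpt-5.5 "${@:2}" ;;
    # Everything else uses the cheaper variant.
    *) codex exec -m openai/gpt-5.4-mini "${@:2}" ;;
  esac
}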

The downgrade is invisible to the codebase—it lives entirely in shell.

Pattern 3 — Daily budget cron

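Cap daily spend with a local accumulator. The script below keys the accumulator file by date, so the count starts from zero each day without a separate reset job (a cron entry is only needed if you want to prune old files), and the wrapper aborts the call once the per-day ceiling is reached.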

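#!/usr/bin/env bash
# ~/.codex/budget.sh: refuse to start Codex once today's logged spend reaches the cap.
TODAY=$(date +%F)
SPENT_FILE="$HOME/.codex/spent.$TODAY"   # date-keyed, so each day starts at zero
DAY_CAP_USD="${DAY_CAP_USD:-3.00}"
spent=$(cat "$SPENT_FILE" 2>/dev/null || echo 0)
# awk does the floating-point comparison; exit status 0 means spent < cap.
awk -v s="$spent" -v c="$DAY_CAP_USD" 'BEGIN{exit !(s<c)}' || {
  echo "Codex daily cap $DAY_CAP_USD reached. Wait or raise DAY_CAP_USD." >&2
  exit 1
}
codex "$@"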

The math is honest only if you actually log per-call cost back into $SPENT_FILE after each invocation—wire a post-call hook to do so.
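A minimal sketch of that logging step follows. It assumes you already know the cost of each call in USD (for example from your provider's usage data, which this article does not specify); `log_spend` and the `CODEX_BUDGET_DIR` override are hypothetical names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical post-call logger: add one call's cost to today's accumulator,
# using the same spent.<date> file name that budget.sh reads.
# Usage: log_spend <usd>
log_spend() {
  local dir="${CODEX_BUDGET_DIR:-$HOME/.codex}"   # override is for testing only
  local file
  file="$dir/spent.$(date +%F)"
  local spent
  spent=$(cat "$file" 2>/dev/null || echo 0)
  mkdir -p "$dir"
  # Write to a temp file first so a crash cannot leave a half-written total.
  awk -v s="$spent" -v a="$1" 'BEGIN { printf "%.4f\n", s + a }' > "$file.tmp"
  mv "$file.tmp" "$file"
}
```

Calling `log_spend 0.25` after each invocation keeps the file current, so the next run of the wrapper compares against real spend rather than zero.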

| Pattern | Where stop is enforced | Risk of overrun | Best for |
|---|---|---|---|
| Prepaid wallet | Upstream account | None (hard stop) | Solo dev, fixed monthly budget |
| Tier downgrade | Per-invocation model flag | Medium (no aggregate ceiling) | Mixed-task workloads |
| Daily budget cron | Local shell wrapper | High (local-only, bypassable) | Team shared shells with accountability |

Picking Between the Three

If you only adopt one, pick prepaid. It is the only mechanism that survives bad days—Slack tabs open, terminal forgotten, a runaway loop on a Friday before vacation. The wallet ceiling is not your future self’s discipline; it is upstream enforcement that your future self cannot override under stress.

Layer the others on top:

  • Prepaid wallet alone catches the “I forgot the meter” failure mode.
  • Prepaid + tier downgrade catches the “this task did not need the flagship variant” failure mode and stretches the wallet by 3-5x in practice.
  • All three together give you a per-day soft cap that flags surprises in the same day they happen, while still trusting the wallet as the hard backstop. This is the recommended stack for any team where Codex is mission-critical.

A note on tier downgrade specifically: do not chase the cheapest model for refactor tasks. The cost gap between the smallest Codex-tuned variant and the flagship is often less than the cost of one debugging round caused by a degraded response. Use the flagship for refactors and migrations; reserve the smaller variant for boilerplate generation, formatting, and one-line edits where the gap rarely shows.

Team / Multi-Developer Configuration

The patterns above scale to teams by hoisting the cap to the provider account rather than each engineer’s shell. Three habits worth committing:

  1. One shared ofox account per team, distinct API keys per engineer—lets you revoke individuals without rotating everyone, and the wallet ceiling applies to the whole pool.
  2. Pin the team config.toml in dotfiles—commit a sanitized version to the team dotfiles repo so every engineer’s Codex CLI lands on the same provider/wire/model defaults; only the API key stays per-engineer in ~/.config/credentials (not the repo).
  3. Weekly spend digest—the provider’s usage export can feed a Slack digest each Monday so spikes surface within days instead of at month-end. Wire this once and you never debug a phantom $400 bill again.
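The weekly digest in the third habit could start as a per-engineer total over a usage export. The sketch below assumes a CSV with a header row and the columns `date,key_label,usd`; that layout is hypothetical, so adjust the field numbers to whatever your provider actually exports.

```shell
#!/usr/bin/env bash
# Hypothetical digest: total spend per API key label from a usage CSV.
# Assumed columns (not a documented export format): date,key_label,usd
# Usage: spend_digest <export.csv>
spend_digest() {
  awk -F, 'NR > 1 { total[$2] += $3 }        # skip the header row
           END { for (k in total) printf "%s %.2f\n", k, total[k] }' "$1" | sort
}
```

The sorted output is one line per key label, which is short enough to paste into a Monday channel message as-is.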

If your team has 5+ engineers in the shared Codex pool, How to Configure Codex CLI with a Custom API Endpoint covers the env-var-only setup that is easier to deploy via Ansible than the TOML.

Migrating a Mid-Project Codebase Without a Stall

The realistic team-side situation is this: half your engineers are mid-sprint with Codex CLI configured against their personal Plus subscription, and the team account is on Pro and has just hit the weekly cap. You cannot ask everyone to stop and reconfigure. Three moves keep the sprint alive:

  1. Promote the team API key to a personal env-var override—engineers add export OPENAI_API_KEY=$TEAM_OFOX_KEY and export OPENAI_BASE_URL=https://api.ofox.ai/v1 to their shell rc, no config.toml change needed. Codex CLI’s env-var path overrides personal subscriptions for the session.
  2. Reserve the flagship model for the engineer with the longest-running refactor—the rest of the team uses the smaller Codex-tuned variant. This is the cheapest possible recovery posture for a single sprint day.
  3. Audit which sessions actually need overflow tomorrow morning—if the team weekly drain was a one-off (a single engineer’s autonomous run), revert the env vars and stay on the subscription. If it is the pattern, commit the team config.toml to dotfiles in the next standup and stop bouncing.

The mistake most teams make is treating the drop-in API as a panic move and reverting the moment the weekly resets. The right framing is the opposite: subscription is the default for predictable solo work; the metered API is the default for sustained team work. The weekly drain is just a signal that you are on the wrong default.

When the Drop-In API Is Down: Best Alternatives That Work Now

| Alternative | Wire protocol | Codex CLI ready | When to pick it |
|---|---|---|---|
| ofox.ai | Responses + Chat Completions | Yes, marketplace tags Responses-capable models | Mixed coding workloads, pay-per-token with per-model price visibility (Codex integration docs) |
| OpenRouter | Responses surface via router | Yes | You want one bill across many providers, accept the router markup |
| Direct OpenAI API | Native Responses | Yes | You want the exact same model identities as ChatGPT, accept full retail pricing |
| Self-hosted (LiteLLM gateway) | Translate Chat→Responses | With translator | You already run a gateway and need to route from arbitrary Chat-only backends |
| Wait for natural reset | n/a | n/a | Your work fits in the next weekly window with banked-reset headroom |

For the routing-aware deep dive on cross-provider failover, see Configure Codex CLI with a Custom API Endpoint and How to Use Any Model with Codex CLI.

How to Monitor Codex Status and Get Alerts

Three layers, increasing fidelity.

  1. Official status page: bookmark status.openai.com—the meter desync incidents land there within hours of community reports.
  2. /status in-session: type it at the Codex REPL prompt (after launching codex) to print active model, 5h, and weekly remaining. Cheap to call; run it before every long task.
  3. Account dashboard polling: a 5-minute cron hitting your usage endpoint catches drift between /status and the server-side counter (the May 2026 desync pattern). Pipe to Slack on weekly < 10%.

For deeper polling, the OpenAI usage endpoint exposes per-account aggregate consumption that you can scrape on a cron — see platform.openai.com/usage for the dashboard view and the OpenAI Help Center note on Codex plan limits for what each meter actually represents. Cross-checking the dashboard against in-session /status is the single most useful signal for catching the May 2026 reconciliation-gap pattern early.
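The "alert on weekly < 10%" step reduces to a threshold check that emits a message only when the meter is low. In this sketch the percentage is supplied by the caller, because how you obtain it (manual reading, dashboard scrape) is not specified here; `weekly_alert` and `SLACK_WEBHOOK_URL` are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical threshold check: print an alert line when the weekly meter
# falls below the threshold (default 10), print nothing otherwise.
# Usage: weekly_alert <weekly_pct_remaining> [threshold_pct]
weekly_alert() {
  local pct="$1" threshold="${2:-10}"
  if awk -v p="$pct" -v t="$threshold" 'BEGIN { exit !(p < t) }'; then
    echo "Codex weekly meter at ${pct}% remaining (alert threshold ${threshold}%)"
  fi
}

# Example wiring from cron, assuming a Slack incoming webhook URL in
# SLACK_WEBHOOK_URL (an assumed variable name):
#   msg=$(weekly_alert "$pct")
#   [ -n "$msg" ] && curl -s -X POST -H 'Content-type: application/json' \
#     --data "{\"text\": \"$msg\"}" "$SLACK_WEBHOOK_URL"
```

Keeping the check separate from the delivery step means the same function can feed email, a terminal bell, or a log file instead of Slack.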

What This Article Is Not About

A few scope boundaries so you can route to the right reference:

If your weekly Codex cap dies twice a month, switching to a metered API and capping the prepaid wallet is not a downgrade—it is switching from a buffet to a takeout menu where you only pay for the dishes you actually order, and the meter only ticks on the work that ships.

Sources Checked for This Refresh