What is Codex Goal Mode?

Goal Mode is a persistent objective layer on top of Codex. You set a single `/goal` directive (with a desired end state, verification surface, and constraints) and Codex keeps pulling toward that goal across turns, budget resets, and pauses — until it succeeds, hits a blocker, or you stop it. It reached general availability on May 21, 2026 in Codex CLI 0.133.0, the IDE extension, and the Codex app.

What is Locked Computer Use (Remote Computer Use)?

Locked Computer Use lets the Codex desktop agent keep driving macOS apps after your Mac locks — including when you trigger it remotely from Codex Mobile. It installs an Apple authorization plugin, temporarily unlocks the Mac while keeping the displays covered, scopes the unlock to the active task, and relocks the moment it detects local keyboard or trackpad input. It is unavailable in the EEA, UK, and Switzerland at launch.

Which Codex model should I run on a multi-day goal?

Use `gpt-5.3-codex` for the actual agent loop — it is OpenAI's latest agentic coding model with a 400K context window and pricing of $1.75 per million input tokens and $14 per million output tokens (as of May 2026). For cheap reasoning side tasks, route to `gpt-5.4-mini` or `gpt-5.3-chat`. Through ofox.ai you can use all three behind a single OpenAI-compatible endpoint.

Can Codex really code unattended for days?

Practically: hours of attended Goal Mode work is reliable today; days of unattended work requires (a) a verifiable goal — `all tests pass`, not `make the code elegant` — and (b) hard budget caps. Codex stops automatically on success, pause, clear, interruption, or budget exhaustion. The 'days' framing comes from goals that span multiple Codex sessions and token budget resets, not from a single uninterrupted run.

Codex Goal Mode & Remote Computer Use: How OpenAI's Agent Can Code for Days

TL;DR. On May 21, 2026 OpenAI moved two Codex features to GA: Goal Mode (a persistent /goal directive that survives session breaks and budget resets) and Locked Computer Use (the desktop agent keeps driving Mac apps after your screen locks, even when triggered from Codex Mobile). Pair them with gpt-5.3-codex and a verifiable success criterion, and you can hand Codex a real engineering objective like “ship the v2 checkout endpoint with the benchmark green” and walk away. The breakthrough isn’t longer prompts. It is that a coding agent can now treat time as a resource you budget, instead of a session you have to babysit.

OpenAI shipped both features the same day, in Codex CLI 0.133.0 and the matching IDE and desktop builds (changelog). I have been running Goal Mode against real repos for the past week, and the gap between “demo-friendly” and useful turns out to be how you write the goal, not how patient you are.

What Goal Mode actually changes about your prompt

Goal Mode replaces the per-turn instruction with a persistent objective Codex re-reads every cycle. The slash-command surface is tiny:

# Set or replace the active goal
/goal Reduce p95 checkout latency below 120 ms on the checkout
      benchmark while keeping the correctness suite green

/goal           # view current goal
/goal pause     # stop the loop, keep the state
/goal resume    # pick back up where it stopped
/goal clear     # discard the goal entirely

The shape of the goal matters more than the wording. The OpenAI cookbook recommends <desired end state> verified by <specific evidence> while preserving <constraints> — three slots in that order (Using Goals in Codex). Drop any one of them and the agent either drifts (no end state), reward-hacks (no verification), or breaks something it should not have touched (no constraints).
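
One way to keep all three slots present is to build the goal text from named parts and refuse to render it when a slot is empty. The sketch below is a hypothetical helper, not part of Codex: the GoalSpec class and its field names are assumptions made for illustration, and it only formats a string you would then type into the TUI yourself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoalSpec:
    """The three template slots (class and field names are hypothetical)."""
    end_state: str     # desired end state
    evidence: str      # specific evidence that proves the end state
    constraints: str   # what must be preserved along the way

    def render(self) -> str:
        # Each blank slot maps to a failure mode described above
        # (drift, reward hacking, collateral damage), so fail loudly.
        for name in ("end_state", "evidence", "constraints"):
            if not getattr(self, name).strip():
                raise ValueError(f"goal is missing its {name!r} slot")
        return (
            f"/goal {self.end_state.strip()}, "
            f"verified by {self.evidence.strip()}, "
            f"while preserving {self.constraints.strip()}"
        )


goal = GoalSpec(
    end_state="Reduce p95 checkout latency below 120 ms",
    evidence="the checkout benchmark",
    constraints="a green correctness suite",
)
print(goal.render())
```

The point of the design is the ValueError: a goal with a missing slot never reaches the agent in the first place.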

Concretely, this fails:

/goal Make the code more elegant

This works:

/goal Migrate this codebase from Pydantic v1 to v2, verified by
      `pytest -q` exiting 0 and `mypy --strict src/` exiting 0,
      while preserving all public API signatures listed in
      docs/public_api.md

The second version gives Codex something to measure. The agent will write, run the suite, read the diff between expected and actual, write again, and stop when both commands exit zero — or stop and surface the blocker if it cannot reach that state.
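
If you want the evidence to be a single command rather than two, a small wrapper can run both checks and collapse them into one exit status. This is a minimal sketch, assuming pytest and mypy are installed in the project environment; the function names are hypothetical, and nothing here is something Codex requires or ships.

```python
import subprocess
import sys
from typing import Sequence


def run_checks(commands: Sequence[Sequence[str]]) -> bool:
    """Run each command in order; True only if every one exits 0."""
    for cmd in commands:
        try:
            code = subprocess.run(list(cmd)).returncode
        except FileNotFoundError:
            # A missing tool counts as a failed check, never as a pass.
            print(f"{cmd[0]}: not found", file=sys.stderr)
            return False
        print(f"{' '.join(cmd)} -> exit {code}")
        if code != 0:
            return False  # stop at the first failing check
    return True


# The two evidence commands named in the goal above.
CHECKS = [
    ["pytest", "-q"],
    ["mypy", "--strict", "src/"],
]


def main() -> int:
    """Exit status for the shell: 0 when all checks pass, 1 otherwise."""
    return 0 if run_checks(CHECKS) else 1
```

Ending a script with `sys.exit(main())` turns this into one verification command you can name in the goal, which keeps the success criterion in a single place.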

Stopping conditions are explicit and small: success, /goal pause, /goal clear, user interruption, a repeated blocker the agent cannot get past, or your plan’s usage limit. Nothing else stops the loop, which is why a verifiable success criterion matters more than it used to — without one, the loop only ever stops on the cost side. (For the full setting reference, see the Codex CLI config.toml deep dive.)

“Code for days” means something specific

The phrase “code for days” gets misread as “one giant uninterrupted session.” It is not that. Goal Mode persists the objective across:

Session breaks. Close the terminal, come back tomorrow, run /goal resume, and the agent picks up from the last verified state.
Token budget resets. When the rolling budget resets (daily for most plans), the active goal survives and the agent continues.
Interruptions. Ctrl-C, app crashes, Mac restarts. The goal is journaled to disk; Codex 0.133+ rehydrates it on next launch.

What you get is a multi-session objective layer. A migration that would have eaten three afternoons of one-shot prompts now runs as one coherent thread. The cost model is unchanged: every reasoning turn still costs the same per-token rate against gpt-5.3-codex. The coordination cost drops to roughly zero, which is where most of the wall-clock savings live.

I tested this against a real repo migration (Pydantic v1 → v2 on a 14k-line internal service). Total wall time: about 31 hours across four sessions. Total Codex token spend at gpt-5.3-codex rates: roughly $44. The same migration done by hand-prompting the agent would have taken me two full focused days of supervision. Here I checked in three times.

Locked Computer Use: the actually-controversial half

Computer Use itself shipped earlier in 2026 — Codex could already operate GUI apps when the Mac was unlocked and you were watching. The May 21 update added two things (Computer Use docs):

Continued operation after the screen locks, so a Goal Mode loop that needs to drive a desktop app doesn’t stall the moment your screensaver kicks in.
Triggering from Codex Mobile, so you can hand the agent a task from your phone and let it drive the Mac you left at your desk.

The safety model is more interesting than the feature. When you enable Locked Use, Codex installs an Apple authorization plugin that participates in the macOS unlock flow (MacRumors writeup). During a task:

The Mac is temporarily unlocked, but every display stays covered. The lock screen stays painted while Codex drives in the background.
The authorization window is short-lived and scoped to the current unlock attempt. There is no standing grant.
If you touch the keyboard, trackpad, or mouse, the Mac immediately relocks and disables auto-unlock until you manually unlock again.
Codex asks before operating each new app. You can mark frequently used apps “Always allow.”
It cannot drive Terminal apps, Codex itself, or system-level admin prompts. These are hard-coded exclusions to prevent privilege escalation through GUI automation.
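
The rules above read more easily as a small state machine. The following is a toy model of the behavior as described in this article, not Codex's implementation: the class, its method names, and the blocked-category labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LockedSession:
    """Toy model of the listed rules; every name here is hypothetical."""
    unlocked: bool = False            # temporarily unlocked for a task
    displays_covered: bool = True     # lock screen stays painted throughout
    auto_unlock_enabled: bool = True  # cleared by any local input
    always_allow: set[str] = field(default_factory=set)

    # Stand-ins for the exclusions the article describes as hard-coded.
    BLOCKED = frozenset({"terminal", "codex", "admin-prompt"})

    def start_task(self) -> bool:
        """Unlock for one task, unless local input disabled auto-unlock."""
        if not self.auto_unlock_enabled:
            return False
        self.unlocked = True
        return True

    def local_input(self) -> None:
        """Keyboard, trackpad, or mouse activity: relock immediately."""
        self.unlocked = False
        self.auto_unlock_enabled = False

    def manual_unlock(self) -> None:
        """Unlocking by hand re-enables auto-unlock for later tasks."""
        self.auto_unlock_enabled = True

    def may_drive(self, app: str, user_approves: bool = False) -> bool:
        """Exclusions win first, then the allow list, then a per-app prompt."""
        if not self.unlocked or app in self.BLOCKED:
            return False
        return app in self.always_allow or user_approves
```

Note the ordering in may_drive: an excluded category is refused even if the user would approve it, which mirrors the claim that those exclusions cannot be overridden from settings.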

The launch carve-outs: the feature is unavailable in the EEA, UK, and Switzerland at launch, and a few categories of apps are blocked outright by Apple’s automation policy regardless of your settings.

If you have not already enabled regular Computer Use, you need to grant Screen Recording and Accessibility permissions to Codex through System Settings first. The plugin install only adds the locked-screen layer on top.

A real Goal Mode loop, end to end

Here is what the loop looks like in practice for the migration example above. Start in your project root:

$ cd ~/work/orders-service
$ codex
# Inside the TUI:
> /goal Migrate this codebase from Pydantic v1 to v2, verified by
        `pytest -q` exiting 0 and `mypy --strict src/` exiting 0,
        while preserving all public API signatures in docs/public_api.md

Codex acknowledges the goal, runs an initial scan, and proposes a plan. From this point you can:

Walk away. The loop will run until success, blocker, or budget exhaustion.
Hand off to Locked Computer Use for any GUI step (running the migration tool’s optional desktop wizard, screenshotting a failing CI dashboard, etc.) and lock your Mac.
Trigger a status check from Codex Mobile while you are away from the laptop.

When you come back, /goal shows you the current state: what has been verified, what is still pending, what the last blocker was. /goal pause lets you intervene without clearing context.

A reasonable starter config in ~/.codex/config.toml for goal-driven work:

model = "gpt-5.3-codex"
model_provider = "ofox"      # or "openai" if going direct

[model_providers.ofox]
name = "ofox.ai"
base_url = "https://api.ofox.ai/v1"
env_key = "OFOX_API_KEY"
wire_api = "responses"

Goal Mode itself has no per-session token or iteration cap exposed in config.toml — the documented stopping levers are slash commands (/goal pause, /goal clear), a detected repeated blocker, and your plan’s usage limit. The practical knob, then, is the usage cap on whichever provider you point Codex at. At gpt-5.3-codex rates of $1.75 input / $14 output per million tokens (confirmed via the OpenRouter listing), a single mostly-output multi-hour session can easily run $30-80, so the cap you set on your OpenAI or ofox account is the actual budget guardrail, not a TOML key.
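
To pick a sensible cap, it helps to run the arithmetic once. The sketch below uses only the two rates quoted above; the token counts in the example are hypothetical, chosen to show how heavily output tokens dominate the bill.

```python
INPUT_USD_PER_M = 1.75    # gpt-5.3-codex input rate quoted above
OUTPUT_USD_PER_M = 14.00  # gpt-5.3-codex output rate quoted above


def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a session at the per-million-token rates above."""
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000


def output_tokens_within(budget_usd: float, input_tokens: int) -> int:
    """Output tokens that fit under a cap once input tokens are paid for."""
    remaining = budget_usd - input_tokens * INPUT_USD_PER_M / 1_000_000
    return max(0, int(remaining * 1_000_000 / OUTPUT_USD_PER_M))


# Hypothetical session: 4M input tokens and 3M output tokens.
print(f"${session_cost(4_000_000, 3_000_000):.2f}")  # prints $49.00
```

In that example the input side contributes $7 and the output side $42, so a provider-level cap is effectively a cap on output tokens.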

Why route Codex through ofox.ai

Goal Mode hammers the model. A multi-day objective routinely makes hundreds of reasoning turns, and the bill is dominated by gpt-5.3-codex output tokens at $14/M. Three reasons to pipe the requests through a unified gateway instead of straight to OpenAI:

Single key for the side models. Goal Mode loops typically delegate cheap sub-tasks (summarization, classification, regex generation) to a smaller model. With one ofox.ai key you can route the hot path to gpt-5.3-codex and the cold path to gpt-5.4-mini or deepseek-v4-flash without juggling provider credentials. Same pattern Codex CLI already supports via model_provider — just point it at ofox.
Spend visibility per goal. Tag your sessions with a custom header and the dashboard shows per-goal cost, not per-day cost. Useful when you want to know whether the Pydantic migration really was worth $44.
Failover on gpt-5.3-codex outages. Long-horizon goals are exactly the workloads that get burned by a 20-minute provider blip. ofox falls back automatically; a direct OpenAI key just errors out and forces you to /goal pause until things recover.
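
The hot-path and cold-path split from the first point can be sketched as a plain routing function. This illustrates the idea only and is not how Codex delegates internally: the function names and task labels are hypothetical, the `model` and `input` fields follow the OpenAI Responses API request shape, and whether a given gateway accepts that payload unchanged is an assumption.

```python
HOT_PATH_MODEL = "gpt-5.3-codex"   # the agent loop itself
COLD_PATH_MODEL = "gpt-5.4-mini"   # cheap side tasks

# Labels for the cheap sub-tasks named above (hypothetical identifiers).
COLD_TASKS = frozenset({"summarization", "classification", "regex"})


def pick_model(task_kind: str) -> str:
    """Send known-cheap task kinds to the small model, the rest to Codex."""
    if task_kind.strip().lower() in COLD_TASKS:
        return COLD_PATH_MODEL
    return HOT_PATH_MODEL


def build_request(task_kind: str, prompt: str) -> dict:
    """Request body with the model chosen by task kind; nothing is sent."""
    return {"model": pick_model(task_kind), "input": prompt}
```

Defaulting unknown task kinds to the larger model is deliberate: misrouting real agent work to the small model costs more in failed turns than the token savings are worth.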

If you are still on a single-vendor setup, the Codex CLI configuration guide walks through the gateway switch, and How to Use Any Model with Codex CLI covers the model_providers block in detail.

When not to use Goal Mode

Three disqualifiers worth being honest about:

You cannot write a verification command. If success means “the design feels right” or “make the code more elegant,” Goal Mode will either declare premature victory or churn forever. Use one-shot prompts instead.
The work needs human judgment every few turns. Goals are designed for autonomy. If you need to approve every change, you are paying for context Codex never gets to use. Run claude --permission-mode plan or a one-shot Codex session — cheaper, faster.
You are doing something destructive at scale. Database migrations, mass git push --force, anything that touches production. Goal Mode is great at unattended convergence. It is not great at unattended judgment about when not to act. Keep the agent sandboxed to a worktree, set approval_policy to require approval on shell commands, and prefer goals whose verification surface is a dry-run rather than a live mutation.

For the broader picture of where Codex fits among coding agents, the Claude Code vs Codex CLI vs Cursor vs DeepSeek TUI comparison walks through the trade-offs, and the agentic coding overview frames the category.

The shape of the next year

Goal Mode plus Locked Computer Use is the first credible “set a goal, lock your laptop, check tomorrow” coding loop I have used in production. The agent isn’t smarter than it was last month. The friction is just gone, and that changes which engineering tasks are worth handing to a model at all. A coding agent that survives screen locks, budget resets, and your dinner break is a different kind of tool from one that needs you in the chair.

The caveat that matters: hours of attended Goal Mode work is reliable today, but days of fully unattended work still depends on how verifiable your goal is. The discipline of writing a goal with a real evidence surface is the skill now, not the prompt-craft of any single turn.
