Codex Goal Mode & Remote Computer Use: How OpenAI's Agent Can Code for Days
TL;DR. On May 21, 2026 OpenAI moved two Codex features to GA: Goal Mode (a persistent /goal directive that survives session breaks and budget resets) and Locked Computer Use (the desktop agent keeps driving Mac apps after your screen locks, even when triggered from Codex Mobile). Pair them with gpt-5.3-codex and a verifiable success criterion, and you can hand Codex a real engineering objective like “ship the v2 checkout endpoint with the benchmark green” and walk away. The unlock isn’t longer prompts. It is that a coding agent now treats time as a resource you can budget, instead of a stretch you have to babysit.
OpenAI shipped both features the same day, in Codex CLI 0.133.0 and the matching IDE and desktop builds (changelog). I have been running Goal Mode against real repos for the past week, and the gap between “demo-friendly” and useful turns out to be how you write the goal, not how patient you are.
What Goal Mode actually changes about your prompt
Goal Mode replaces the per-turn instruction with a persistent objective Codex re-reads every cycle. The slash-command surface is tiny:
# Set or replace the active goal
/goal Reduce p95 checkout latency below 120 ms on the checkout
benchmark while keeping the correctness suite green
/goal # view current goal
/goal pause # stop the loop, keep the state
/goal resume # pick back up where it stopped
/goal clear # discard the goal entirely
The shape of the goal matters more than the wording. The OpenAI cookbook recommends <desired end state> verified by <specific evidence> while preserving <constraints> — three slots in that order (Using Goals in Codex). Drop any one of them and the agent either drifts (no end state), reward-hacks (no verification), or breaks something it should not have touched (no constraints).
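One way to keep yourself honest about the three slots is to build the goal string from them and refuse to emit it when one is blank. The sketch below is an illustration only: `GoalSpec` and `render` are hypothetical names, not part of Codex, and the exact connective wording ("verified by", "while preserving") simply mirrors the cookbook template quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoalSpec:
    """Hypothetical holder for the three slots of a goal (not a Codex API)."""

    end_state: str
    evidence: str
    constraints: str

    def render(self) -> str:
        # Each blank slot corresponds to a failure mode from the text:
        # drift (no end state), reward hacking (no evidence),
        # collateral damage (no constraints).
        for slot in ("end_state", "evidence", "constraints"):
            if not getattr(self, slot).strip():
                raise ValueError(f"goal is missing its {slot} slot")
        return (
            f"/goal {self.end_state.strip()}, "
            f"verified by {self.evidence.strip()}, "
            f"while preserving {self.constraints.strip()}"
        )


spec = GoalSpec(
    end_state="Migrate this codebase from Pydantic v1 to v2",
    evidence="`pytest -q` exiting 0 and `mypy --strict src/` exiting 0",
    constraints="all public API signatures listed in docs/public_api.md",
)
print(spec.render())
```

Paste the printed line into the TUI. The point is only that an empty slot fails loudly before the agent ever sees the goal.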
Concretely, this fails:
/goal Make the code more elegant
This works:
/goal Migrate this codebase from Pydantic v1 to v2, verified by
`pytest -q` exiting 0 and `mypy --strict src/` exiting 0,
while preserving all public API signatures listed in
docs/public_api.md
The second version gives Codex something to measure. The agent will write, run the suite, read the diff between expected and actual, write again, and stop when both commands exit zero — or stop and surface the blocker if it cannot reach that state.
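If your evidence is more than one command, it can help to wrap the checks in a single gate that exits zero only when all of them pass, so the goal can name one command. The following is a minimal sketch under that assumption; `run_checks`, `goal_met`, and `main` are names made up for this example, and the command list is just the pair from the goal above.

```python
import subprocess
import sys
from typing import Sequence

# The two evidence commands from the example goal; replace with your own.
CHECKS: list[list[str]] = [
    ["pytest", "-q"],
    ["mypy", "--strict", "src/"],
]


def run_checks(checks: Sequence[Sequence[str]]) -> list[tuple[str, int]]:
    """Run every check (no short-circuit) and return (command, exit code) pairs."""
    results: list[tuple[str, int]] = []
    for cmd in checks:
        try:
            code = subprocess.run(list(cmd)).returncode
        except FileNotFoundError:
            # Tool not installed: report the conventional "command not found" code.
            code = 127
        results.append((" ".join(cmd), code))
    return results


def goal_met(results: Sequence[tuple[str, int]]) -> bool:
    """The goal counts as verified only if every check exited 0."""
    return all(code == 0 for _, code in results)


def main() -> int:
    results = run_checks(CHECKS)
    for cmd, code in results:
        if code != 0:
            print(f"FAILED ({code}): {cmd}", file=sys.stderr)
    return 0 if goal_met(results) else 1
```

Saved as a script that ends with `sys.exit(main())`, this gives the goal a single exit code to converge on. Running every check rather than stopping at the first failure is a deliberate choice here: the agent sees all the failing evidence in one pass.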
Stopping conditions are explicit and small: success, /goal pause, /goal clear, user interruption, a repeated blocker the agent cannot get past, or your plan’s usage limit. Nothing else stops the loop, which is why a verifiable success criterion matters more than it used to — without one, the loop only ever stops on the cost side. (For the full setting reference, see the Codex CLI config.toml deep dive.)
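To see why a missing verifier leaves only the cost-side stop, here is a toy model of the loop. It is not how Codex is implemented: `run_goal_loop`, `StopReason`, the turn budget standing in for the usage limit, and the three-repeat blocker threshold are all assumptions for illustration, and the user-driven stops (pause, clear, interruption) are left out.

```python
from enum import Enum
from typing import Callable, Optional


class StopReason(Enum):
    SUCCESS = "success"
    REPEATED_BLOCKER = "repeated blocker"
    USAGE_LIMIT = "usage limit"


def run_goal_loop(
    work_turn: Callable[[], Optional[str]],
    verify: Optional[Callable[[], bool]],
    turn_budget: int,
    blocker_limit: int = 3,
) -> StopReason:
    """Toy loop: work_turn returns a blocker description, or None if unblocked."""
    last_blocker: Optional[str] = None
    streak = 0
    for _ in range(turn_budget):
        blocker = work_turn()
        # Success is only reachable when a verifier exists.
        if verify is not None and verify():
            return StopReason.SUCCESS
        # Count consecutive turns that hit the same blocker.
        if blocker is not None and blocker == last_blocker:
            streak += 1
        else:
            streak = 1 if blocker is not None else 0
        last_blocker = blocker
        if streak >= blocker_limit:
            return StopReason.REPEATED_BLOCKER
    # Budget spent without success or a repeated blocker.
    return StopReason.USAGE_LIMIT


# With no verifier, the loop can only end by exhausting the budget.
print(run_goal_loop(lambda: None, verify=None, turn_budget=5))
```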
“Code for days” means something specific
The phrase “code for days” gets misread as “one giant uninterrupted session.” It is not that. Goal Mode persists the objective across:
- Session breaks. Close the terminal, come back tomorrow, run /goal resume, and the agent picks up from the last verified state.
- Token budget resets. When the rolling budget rolls over (daily for most plans), the active goal survives and the agent continues.
- Interruptions. Ctrl-C, app crashes, Mac restarts. The goal is journaled to disk; Codex 0.133+ rehydrates it on next launch.
What you get is a multi-session objective layer. A migration that would have eaten three afternoons of one-shot prompts now runs as one coherent thread. The cost model is unchanged: every reasoning turn still costs the same per-token rate against gpt-5.3-codex. The coordination cost drops to roughly zero, which is where most of the wall-clock savings live.
I tested this against a real repo migration (Pydantic v1 → v2 on a 14k-line internal service). Total wall time: about 31 hours across four sessions. Total Codex token spend at gpt-5.3-codex rates: roughly $44. The same migration done by hand-prompting the agent would have taken me two full focused days of supervision. Here I checked in three times.
Locked Computer Use: the actually-controversial half
Computer Use itself shipped earlier in 2026 — Codex could already operate GUI apps when the Mac was unlocked and you were watching. The May 21 update added two things (Computer Use docs):
- Continued operation after the screen locks, so a Goal Mode loop that needs to drive a desktop app doesn’t stall the moment your screensaver kicks in.
- Triggering from Codex Mobile, so you can hand the agent a task from your phone and let it drive the Mac you left at your desk.
The safety model is more interesting than the feature. When you enable Locked Use, Codex installs an Apple authorization plugin that participates in the macOS unlock flow (MacRumors writeup). During a task:
- The Mac is temporarily unlocked, but every display stays covered. The lock screen stays painted while Codex drives in the background.
- The authorization window is short-lived and scoped to the current unlock attempt. There is no standing grant.
- If you touch the keyboard, trackpad, or mouse, the Mac immediately relocks and disables auto-unlock until you manually unlock again.
- Codex asks before operating each new app. You can mark frequently used apps “Always allow.”
- It cannot drive Terminal apps, Codex itself, or system-level admin prompts. These are hard-coded exclusions to prevent privilege escalation through GUI automation.
The launch carve-outs: the feature is unavailable in the EEA, UK, and Switzerland at launch, and a few categories of apps are blocked outright by Apple’s automation policy regardless of your settings.
If you have not already enabled regular Computer Use, you need to grant Screen Recording and Accessibility permissions to Codex through System Settings first. The plugin install only adds the locked-screen layer on top.
A real Goal Mode loop, end to end
Here is what the loop looks like in practice for the migration example above. Start in your project root:
$ cd ~/work/orders-service
$ codex
# Inside the TUI:
> /goal Migrate this codebase from Pydantic v1 to v2, verified by
`pytest -q` exiting 0 and `mypy --strict src/` exiting 0,
while preserving all public API signatures in docs/public_api.md
Codex acknowledges the goal, runs an initial scan, and proposes a plan. From this point you can:
- Walk away. The loop will run until success, blocker, or budget exhaustion.
- Hand off to Locked Computer Use for any GUI step (running the migration tool’s optional desktop wizard, screenshotting a failing CI dashboard, etc.) and lock your Mac.
- Trigger a status check from Codex Mobile while you are away from the laptop.
When you come back, /goal shows you the current state: what has been verified, what is still pending, what the last blocker was. /goal pause lets you intervene without clearing context.
A reasonable starter config in ~/.codex/config.toml for goal-driven work:
model = "gpt-5.3-codex"
model_provider = "ofox" # or "openai" if going direct
[model_providers.ofox]
name = "ofox.ai"
base_url = "https://api.ofox.ai/v1"
env_key = "OFOX_API_KEY"
wire_api = "responses"
Goal Mode itself has no per-session token or iteration cap exposed in config.toml — the documented stopping levers are slash commands (/goal pause, /goal clear), a detected repeated blocker, and your plan’s usage limit. The practical knob, then, is the usage cap on whichever provider you point Codex at. At gpt-5.3-codex rates of $1.75 input / $14 output per million tokens (confirmed via the OpenRouter listing), a single mostly-output multi-hour session can easily run $30-80, so the cap you set on your OpenAI or ofox account is the actual budget guardrail, not a TOML key.
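The arithmetic behind that range is easy to sanity-check before you pick a cap. This sketch uses only the two per-million rates quoted above; the function names and the token counts in the example are made up for illustration, not measurements.

```python
# Rates quoted in the text for gpt-5.3-codex, in USD per million tokens.
INPUT_USD_PER_M = 1.75
OUTPUT_USD_PER_M = 14.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated spend for a session with the given token counts."""
    return (
        input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    ) / 1_000_000


def max_output_tokens(cap_usd: float, input_tokens: int) -> int:
    """Output tokens that still fit under a spend cap after paying for input."""
    remaining = cap_usd * 1_000_000 - input_tokens * INPUT_USD_PER_M
    return max(0, int(remaining / OUTPUT_USD_PER_M))


# Hypothetical session: 4M input tokens, 3M output tokens.
print(estimate_cost_usd(4_000_000, 3_000_000))  # 49.0
print(max_output_tokens(80.0, 4_000_000))
```

Output tokens cost eight times as much as input tokens at these rates, so the output count is the number to watch when sizing the cap.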
Why route Codex through ofox.ai
Goal Mode hammers the model. A multi-day objective routinely makes hundreds of reasoning turns, and the bill is dominated by gpt-5.3-codex output tokens at $14/M. Three reasons to pipe the requests through a unified gateway instead of straight to OpenAI:
- Single key for the side models. Goal Mode loops typically delegate cheap sub-tasks (summarization, classification, regex generation) to a smaller model. With one ofox.ai key you can route the hot path to gpt-5.3-codex and the cold path to gpt-5.4-mini or deepseek-v4-flash without juggling provider credentials. This is the same pattern Codex CLI already supports via model_provider; just point it at ofox.
- Spend visibility per goal. Tag your sessions with a custom header and the dashboard shows per-goal cost, not per-day cost. Useful when you want to know whether the Pydantic migration really was worth $44.
- Failover on gpt-5.3-codex outages. Long-horizon goals are exactly the workloads that get burned by a 20-minute provider blip. ofox falls back automatically; a direct OpenAI key just errors out and forces you to /goal pause until things recover.
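The hot-path and cold-path split from the first bullet amounts to a lookup on the kind of sub-task. The sketch below is a client-side illustration only: `pick_model` and the task labels are hypothetical, and the source does not document how Codex itself decides what to delegate. The model names are the ones mentioned above.

```python
HOT_MODEL = "gpt-5.3-codex"   # reasoning-heavy turns
COLD_MODEL = "gpt-5.4-mini"   # or "deepseek-v4-flash"

# Hypothetical labels for the cheap sub-tasks named in the text.
COLD_TASKS = frozenset({"summarization", "classification", "regex_generation"})


def pick_model(task_kind: str) -> str:
    """Send known-cheap sub-tasks to the small model, everything else to the hot path."""
    return COLD_MODEL if task_kind in COLD_TASKS else HOT_MODEL


for kind in ("summarization", "refactor"):
    print(kind, "->", pick_model(kind))
```

Defaulting unknown task kinds to the hot path is the conservative choice: a misrouted hard task costs correctness, while a misrouted easy one only costs money.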
If you are still on a single-vendor setup, the Codex CLI configuration guide walks through the gateway switch, and How to Use Any Model with Codex CLI covers the model_providers block in detail.
When not to use Goal Mode
Three disqualifiers worth being honest about:
- You cannot write a verification command. If success means “the design feels right” or “make the code more elegant,” Goal Mode will either declare premature victory or churn forever. Use one-shot prompts instead.
- The work needs human judgment every few turns. Goals are designed for autonomy. If you need to approve every change, you are paying for context Codex never gets to use. Run claude --permission-mode plan or a one-shot Codex session instead; both are cheaper and faster.
- You are doing something destructive at scale. Database migrations, mass git push --force, anything that touches production. Goal Mode is great at unattended convergence. It is not great at unattended judgment about when not to act. Keep the agent sandboxed to a worktree, set approval_policy to require approval on shell commands, and prefer goals whose verification surface is a dry-run rather than a live mutation.
For the broader picture of where Codex fits among coding agents, the Claude Code vs Codex CLI vs Cursor vs DeepSeek TUI comparison walks through the trade-offs, and the agentic coding overview frames the category.
The shape of the next year
Goal Mode plus Locked Computer Use is the first credible “set a goal, lock your laptop, check tomorrow” coding loop I have used in production. The agent isn’t smarter than it was last month. The friction is just gone, and that changes which engineering tasks are worth handing to a model at all. A coding agent that survives screen locks, budget resets, and your dinner break is a different kind of tool from one that needs you in the chair.
The caveat that matters: attended Goal Mode runs lasting hours are reliable today, but fully unattended runs lasting days still depend on how verifiable your goal is. Writing a goal with a real evidence surface is the skill now, not the prompt-craft of any single turn.
Sources & further reading
- Codex Changelog — May 2026 — official release notes for Goal Mode GA and Locked Computer Use.
- Using Goals in Codex — cookbook with goal syntax and worked examples.
- Computer Use — Codex App — official safety model and platform constraints.
- MacRumors: Codex Can Use Your Mac When Locked — independent writeup of the unlock flow.
- GPT-5.3-Codex on OpenRouter — pricing and context window reference.


