Written by AI agents, curated and verified by me.

Grok Build /goal: self-verification is built in, the proof is not

  • xAI
  • Coding Agents
  • Automation
  • Verification

On 22 June, xAI introduced /goal in Grok Build, a new mode for long-running, autonomous execution. You hand the agent a goal, and it keeps working until the task is completed and verified. The notable part is not the feature itself but what it makes visible: the execute-and-verify loop you had to wire together yourself a year ago now ships as a command inside the tool. That is useful, and like every automation before it, it moves the question of who checks the result rather than settling it.

What is /goal?

/goal is a mode in Grok Build, xAI’s command-line agent, meant for long-running, autonomous execution. You give it an objective in one line; the agent plans an approach, breaks the work into a progress checklist, and starts executing. While it runs, you can keep sharing further instructions. According to xAI, it continues until the task is completed and verified, whether that means reviewing code, inspecting webpages, or executing scripts. For long runs there are additional commands to monitor and steer the work. When the goal is reached, the panel flips to “Complete,” with every checklist item checked.

Why the loop now lives in the tool

This is exactly the shape I described in loop engineering: a loop is, at its core, a recursive goal. You define the purpose, the AI iterates until it is met. What used to be a pile of your own scripts, xAI now ships as /goal. That lowers the barrier sharply. Whoever once had to build the cadence, the progress list, and the checking-off by hand now gets it as a single command.
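
To make that shape concrete, here is a minimal Python sketch of a goal loop with a progress checklist. It is an illustration under my own assumptions, not a description of how xAI implements /goal: the names Step and run_loop, the round budget, and the toy counter are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    """One item on the progress checklist (hypothetical structure)."""
    name: str
    action: Callable[[], None]  # does the work for this item
    check: Callable[[], bool]   # decides whether the item counts as done
    done: bool = False


def run_loop(steps: list[Step], max_rounds: int = 5) -> bool:
    """Repeat act-then-check until every step passes or the budget runs out."""
    for _ in range(max_rounds):
        pending = [s for s in steps if not s.done]
        if not pending:
            return True
        for step in pending:
            step.action()
            step.done = step.check()
    return all(s.done for s in steps)


# Toy usage: the "work" is incrementing a counter, the check is a threshold.
state = {"n": 0}
steps = [
    Step(
        "count to three",
        lambda: state.update(n=state["n"] + 1),
        lambda: state["n"] >= 3,
    )
]
print(run_loop(steps))  # True, reached after three rounds
```

The part worth noticing is the check field: whoever writes it decides what "done" means.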

The gain is real. The price is that the loop gets less visible. As long as I wrote the scripts myself, I knew when the agent counted as done and on what grounds. Once that logic sits inside the product, I inherit a definition of “done” I did not set. All the more reason to look closely at what the agent means by verified.

Verified is a claim, not a proof

The announcement says the agent keeps going until the task is verified. It does not describe an independent party doing the checking. If the same agent that builds also signs off on its result, “Complete” is its own verdict on its own work. Models tend to grade their own work too kindly. A checked-off list is a report about intent, not evidence that the result is right. This is exactly where verification stays with you, and a green check must not be the point where you stop looking.

The longer the run, the wider the gap between what got built and what you actually read. A mode that works unattended for an hour also works unattended on the wrong things for an hour if something goes wrong. The convenience of setting a goal and closing the laptop is real. It does not replace looking at what comes back.

Where /goal earns its place

I would use /goal for the long, tedious stretches where the goal is easy to state and the result is easy to check: a migration with tests, a refactoring with a green build, a dull change across many files. Set the goal so that “done” hangs on a condition you can verify yourself, and read what the agent built at the end. The mode takes the typing off your hands, not the responsibility. That division is the real gain.
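
One way to make "a condition you can verify yourself" concrete is to accept a run only when a command you chose exits cleanly, whatever the agent reports. The Python sketch below assumes exactly that. The function name independently_verified is hypothetical, nothing in it calls Grok Build, and the check command is a stand-in for something like your own test suite.

```python
import subprocess
import sys


def independently_verified(agent_says_complete: bool, check_cmd: list[str]) -> bool:
    """Accept a run only if a check command you picked exits with status 0."""
    if not agent_says_complete:
        return False
    # The verdict comes from the exit code of your command, not from the agent.
    result = subprocess.run(check_cmd, capture_output=True, text=True)
    return result.returncode == 0


if __name__ == "__main__":
    # Stand-in for a real check; replace it with the command you trust.
    passing_check = [sys.executable, "-c", "raise SystemExit(0)"]
    print(independently_verified(True, passing_check))
```

The agent's "Complete" is only a precondition here. The exit code of a check it did not write is what decides.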

Sources