Reliable Agent Systems
Deterministic Gates Beat Prompt-Only Control
The more important a rule is, the less it should depend on prompt wording alone.
Failure Mode
One of the clearest lessons from building agent systems is that language is a weak control surface.
You can write careful instructions:
• “Ask discovery questions before writing the spec.”
• “Verify outputs after every stage.”
• “Do not skip build verification.”
• “Never claim completion without runtime testing.”
And still watch the model skip half of it.
Not because it is malicious, but because this is how these systems behave. They compress intent. They pattern-match. They optimize for plausible continuation. They do not naturally respect process boundaries just because the wording was strong.
This is why I have become skeptical of prompt-only control for serious agent systems.
At some point, if you want reliability, the important constraints have to leave the prompt and enter the environment.
That is what deterministic gates do.
Instead of trusting the agent’s own description of whether it completed a stage correctly, the system checks the artifact against a concrete rule. A spec is not accepted because the architect says it is done. It is accepted because a verification step confirms the required sections exist. A build stage is not accepted because the implementor says the code should compile. It is accepted because the build command exits successfully. A review is not accepted because the evaluator sounded thorough. It is accepted because the review contains a status, traceable findings, and evidence.
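A gate like that can be a few lines of ordinary code rather than prompt text. A minimal sketch, with hypothetical section names standing in for whatever a real spec template actually requires:

```python
# Sketch of a deterministic artifact gate. The section names are
# hypothetical placeholders; a real harness would use its own template.
REQUIRED_SECTIONS = ["## Goals", "## Non-Goals", "## Interfaces", "## Test Plan"]

def spec_gate(spec_text: str) -> tuple[bool, list[str]]:
    """Return (passed, missing_sections) for a spec artifact.

    The spec is accepted only if every required section heading is
    present, regardless of what the authoring agent claims.
    """
    missing = [s for s in REQUIRED_SECTIONS if s not in spec_text]
    return (not missing, missing)

passed, missing = spec_gate("## Goals\n...\n## Test Plan\n...")
# missing == ["## Non-Goals", "## Interfaces"] -- the gate reports
# exactly which sections block acceptance, so failure is actionable.
```

The point is not the specific check; it is that acceptance no longer depends on the agent's self-report.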
Control Surface
Diagram: the reliable agent loop.
The system only advances when artifacts pass an external check. Failed work routes into diagnosis, not more guessing.
This sounds obvious in hindsight, but it changes the character of the system.
Without deterministic gates, the pipeline is mostly social. Each stage makes claims, and the next stage informally trusts them. With deterministic gates, the pipeline becomes operational. The artifact either passes or it does not.
That distinction matters because models are often too willing to declare progress. They are good at producing the appearance of completion. They are much less reliable at recognizing that a missing section, failed command, or untested runtime path should block forward motion.
Once you move those checks into scripts, an entire class of failure disappears.
It also simplifies the prompts.
You no longer need giant paragraphs insisting that the model please remember to do the right thing. You can let it try, then force the output through a rule that is external and unambiguous.
More generally, I think this points to a broader design principle:
The more important a constraint is, the less it should rely on wording alone.
If violating it would break trust, break correctness, or create hidden failure, that rule probably belongs in the harness, not just in the prompt.
What Ships
task -> spec -> implement -> verify -> ship
                               |
                               +--> diagnose -> retry
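The gated loop itself can be sketched directly: advance only when the gate passes, and route failures through diagnosis with a bounded retry budget. The stage, gate, and diagnose callables here are placeholders for whatever a real harness wires in:

```python
# Sketch of a gated stage loop (stage, gate, and diagnose are placeholders).
# Work only advances on a passing gate; failures go to diagnosis, not guessing.
def run_gated(stage, gate, diagnose, max_retries=3):
    feedback = None
    findings = []
    for _ in range(max_retries):
        artifact = stage(feedback)      # let the agent try
        ok, findings = gate(artifact)   # external, deterministic check
        if ok:
            return artifact             # advance to the next stage
        feedback = diagnose(findings)   # route failure into diagnosis
    raise RuntimeError(f"gate failed after {max_retries} attempts: {findings}")
```

Bounding the retries matters: a gate that can loop forever just converts a quality failure into a liveness failure.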
This is especially true for coding systems because software has a natural ground truth. The code builds or it does not. The test passes or it does not. The app runs or it does not. The UI behaves or it does not. When that reality exists, it is a mistake not to use it.
Prompts still matter. They shape intent, decomposition, local judgment, and how the system handles ambiguity. But prompts should guide. Gates should enforce.
That is a cleaner division of labor.
I think a lot of agent systems become fragile because they ask prompts to do too much. They ask language to carry not just the task, but the policy, the safety boundaries, the structure, the quality bar, and the verification logic. That is too much weight on a medium that is inherently soft.
A better design is usually some mix of:
• prompt for direction
• environment for enforcement
• review against reality
That may sound less elegant than “the agent understands.”
But it works better.
And for systems that are supposed to do real work, that trade is worth making.