You gave your AI agent real tools. Here's the 4-part control layer it's missing + the Judge Layer implementation guide
Your agent has opinions. The question is which ones get to leave the building.
Nate
May 11
The next serious agent failure won’t look like a jailbreak. It’ll look like an email sent because the thread seemed to imply approval, a customer record updated because the old value looked stale, a pull request opened because the tests passed and the change looked done. None of that requires the model to misbehave, which is what makes it hard. The risk starts where the product gets useful: when language turns into action.
A chat demo lives in suggestion space. The model drafts, summarizes, answers, proposes, and if it’s wrong, the user rejects it. The cost is local. A production agent lives closer to consequence: it can notify someone, expose private information, change a shared record, trigger a workflow, or spend money. That puts a question demos never had to answer at the center of the product: who decides whether the agent should be allowed to act?
A better prompt doesn’t answer it. Telling the model to “be careful” doesn’t either. Approval modals technically reduce risk but ruin the workflow: users either click through out of habit or stop using the system. The answer that’s actually working is architectural: a separate judge wrapped around the actor, deciding whether each proposed action should move forward. If you’re building agents that act, this is the layer of the product you cannot bolt on later.
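To make the shape concrete, here is a minimal Python sketch of a judge wrapped around an actor. It is an illustration under stated assumptions, not the implementation from the full post: the names `Proposal`, `Verdict`, `judge`, and `act`, the action names, and the evidence rule are all hypothetical. The point it shows is structural: the actor only proposes, and a separate function decides whether anything executes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Verdict(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"  # hand the decision to a human
    DENY = "deny"


@dataclass(frozen=True)
class Proposal:
    """What the actor wants to do. Nothing has happened yet."""
    action: str                     # hypothetical action name, e.g. "send_email"
    target: str
    reason: str                     # the actor's stated justification
    evidence: tuple[str, ...] = ()  # explicit support, e.g. a quoted approval


# Hypothetical policy tables; a real product would define its own.
LOW_RISK = {"draft_reply", "summarize_thread"}
CONSEQUENTIAL = {"send_email", "update_record", "open_pull_request", "spend_money"}


def judge(p: Proposal) -> Verdict:
    """Decide separately from the actor.

    Assumption: this toy judge is rule-based. It could equally be a
    second model call; what matters is that it is not the actor.
    """
    if p.action in LOW_RISK:
        return Verdict.ALLOW
    if p.action not in CONSEQUENTIAL:
        return Verdict.DENY         # unknown actions never run
    if not p.evidence:
        return Verdict.ESCALATE     # "seemed to imply approval" is not evidence
    return Verdict.ALLOW


def act(p: Proposal, execute: Callable[[Proposal], None]) -> Verdict:
    """Run the side effect only if the judge allows it."""
    verdict = judge(p)
    if verdict is Verdict.ALLOW:
        execute(p)
    return verdict


if __name__ == "__main__":
    sent: list[Proposal] = []
    implied = Proposal("send_email", "customer@example.com",
                       "thread seemed to imply approval")
    print(act(implied, sent.append), len(sent))
```

In this sketch, the email that was “sent because the thread seemed to imply approval” never leaves: the proposal carries a reason but no evidence, so the judge escalates instead of executing.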
Here’s what’s inside:
The Lindy example. How a multi-channel agent product hit the failure mode every production system eventually faces, and the architectural fix that worked.
Why prompting and approval modals both fail. The structural reasons a single prompt can’t pursue a task and police it at the same time.
Orchestration is not judgment. Why coordinating agents and judging their actions are different problems with different homes in the stack.
The builder toolkit. Action classification, proposals, specialist judges, eval, memory governance, and what to build first.
The OpenBrain Judge Extender guide + the prompt kit that builds your first judge. Five prompts that take you from “my agent acts” to a working judge at your highest-risk boundary, plus the full implementation spec for wiring that judge to durable memory, provenance, and structured write-back so it doesn’t start every session from zero.
Start with the team that hit this wall publicly and figured out what to do about it.
© 2026 Nate