Most teams that say they have a content agent have a prompt: a long, carefully tuned instruction that produces a decent first draft of a blog post, an email, or a landing page. That is useful. It is not an agent, and the gap is where most of the value sits.
The line is not fuzzy. Anthropic, in its widely cited guide to building effective agents, draws it cleanly: a workflow runs the model and its tools through paths a developer wrote in advance, while an agent lets the model direct its own process and decide how to accomplish the task. A prompt produces output. An agent pursues a goal, takes steps, checks its own work, and corrects course. What turns a prompt into a content agent is a loop with four stages: plan, draft, critique, publish. This piece is about how that loop is built and why each stage carries weight.
Where the loop comes from
The pattern was not invented for content. It is borrowed from a few years of work on making language models reliable at multi-step tasks.
In October 2022, a team led by Shunyu Yao published ReAct, which showed a model does better when it interleaves reasoning with action: think a step, take a step, observe the result, think again. That interleaving is the spine of every agent loop since. In early 2024, Andrew Ng named four design patterns he expected to drive agent progress: reflection, tool use, planning, and multi-agent collaboration. Reflection matters most here. It is where a system critiques and revises its own output instead of handing over the first thing it generates.
Anthropic's guide formalizes a related pattern it calls evaluator-optimizer: one model call generates a response, a separate call evaluates it against criteria, and the feedback goes back to the generator in a loop. Anthropic is specific about when this is worth doing. It works when you have clear evaluation criteria and when iteration measurably improves the result. Content writing fits both. You can usually say what a good draft looks like, and a second honest pass almost always beats the first attempt.
So the loop is not novel. It is the standard agent loop, with the stages named for what they do in a content workflow. What matters is why all four are load-bearing.
The four stages, and what each one is for
Plan
The plan stage takes a goal, say "we need a post on generative engine optimization," and turns it into a structured brief before any prose gets written. The plan decides the angle, the working title, the section structure, the key claims the piece must support, the search intent it serves, and the sources it should draw on.
This stage exists for a reason that becomes obvious once you skip it. A model asked to write a 2,000-word article in one shot produces something fluent and shapeless. It hedges, repeats itself, and buries the point. Planning first forces the hard decisions, what the piece argues and how it is organized, into a small, reviewable artifact. Rewriting a one-paragraph brief costs nothing. Rewriting a finished draft costs a whole loop.
The plan is also where the agent's first tool use happens. A planning step worth its name does not invent an angle from the model's memory. In a real content agent it pulls signal: what already ranks, what the brand has published, what the audience is asking. Wiring those tools in is the subject of Part 2. For Part 1, the point is structural. The plan is an explicit, inspectable output, not a hidden internal monologue. Anthropic's guide names transparency, meaning the agent shows its planning steps, as a core principle. You should be able to read the plan and disagree with it before a word is drafted.
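One way to make the plan an inspectable artifact is to give it a fixed shape. The sketch below is illustrative only: the Brief class, its field names, and the render method are assumptions chosen to mirror the decisions listed above (angle, working title, sections, key claims, search intent, sources), not a format taken from any guide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Brief:
    """Reviewable output of the plan stage (hypothetical field names)."""
    goal: str
    angle: str
    working_title: str
    sections: List[str]
    key_claims: List[str]
    search_intent: str
    sources: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Return the brief as plain text a reviewer can read and challenge."""
        lines = [
            f"Goal: {self.goal}",
            f"Angle: {self.angle}",
            f"Working title: {self.working_title}",
            f"Search intent: {self.search_intent}",
            "Sections:",
            *[f"  {i}. {s}" for i, s in enumerate(self.sections, start=1)],
            "Key claims:",
            *[f"  - {c}" for c in self.key_claims],
            "Sources:",
            *[f"  - {s}" for s in self.sources],
        ]
        return "\n".join(lines)
```

Because the brief is a small structured object, a reviewer can reject the angle or reorder the sections before the draft stage has spent a single token.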
Draft
The draft stage writes the piece against the brief. This looks most like the single-prompt approach everyone knows. The difference is everything around it: the draft is not the final output, it is an input to the next stage.
Two things make the draft stage work as part of a loop. First, it writes against the plan, so the structure is decided and the model spends its effort on sentences, not scaffolding. Second, it is allowed to be imperfect. When the draft is the end of the line, every weakness ships. When it feeds a critique stage, it can be a solid 80 percent effort, because the loop will find and fix the missing 20.
Critique
The critique stage is the one that separates an agent from a generator, and it is the stage teams most often cut. It should not be.
Here the system evaluates the draft against explicit criteria and produces specific, actionable feedback. Does the piece deliver on the brief? Is every claim supported? Does it match the brand voice? Is the structure tight, or does section three sag? Are there statements that read as confident but are unverified? The output is not a score. It is a list of concrete problems with locations.
The critical design choice is that the critique should be a separate model call with a separate instruction, not the draft-writing call asked to "review your work." Anthropic's evaluator-optimizer pattern and Ng's reflection pattern both stress the separation: a generator and an evaluator with explicitly different roles beat single-LLM self-review. A model that just wrote a paragraph is primed to defend it. The same paragraph handed to a fresh call, with a checklist and no authorship attachment, gets a harsher and more useful read. The critique call should get the criteria, the brief, and the draft, and nothing else. It should not see the reasoning that produced the draft, because that biases it toward approval.
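The separation can be enforced in code rather than by convention. In this minimal sketch, the Finding record and the build_critique_input function are hypothetical names, and the instruction wording is a placeholder. The point it illustrates is that the function accepts the criteria, the brief, and the draft and has no parameter through which the drafter's reasoning could arrive.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Finding:
    """One concrete problem, tied to a place in the draft."""
    location: str          # e.g. "section 3, paragraph 2"
    criterion: str         # which checklist item the draft failed
    problem: str           # what is wrong, specifically
    material: bool = True  # False for nitpicks that should not force a revision


def build_critique_input(criteria: List[str], brief: str, draft: str) -> str:
    """Assemble the only text the critique call sees.

    The signature takes criteria, brief, and draft and nothing else, so
    the reasoning behind the draft cannot leak into the evaluator.
    """
    checklist = "\n".join(f"- {c}" for c in criteria)
    return (
        "You are reviewing a draft you did not write.\n"
        "List concrete problems with their locations. Do not give a score.\n\n"
        f"CRITERIA:\n{checklist}\n\n"
        f"BRIEF:\n{brief}\n\n"
        f"DRAFT:\n{draft}"
    )
```

The string returned here would be sent as a fresh model call, and the reply parsed into Finding records. How that parsing works depends on the model API in use and is left out.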
The critique then feeds back to the draft stage, which revises. That return path is the loop. It runs until the critique finds nothing material, or until a fixed iteration cap is hit. The cap matters, and we will come back to it.
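The return path and its cap fit in a few lines. This is a sketch under assumptions: draft_fn and critique_fn are hypothetical stand-ins for the two separate model calls, critique_fn is assumed to return only material findings, and the default of three rounds is one common choice rather than a rule.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoopResult:
    draft: str
    rounds: int               # critique passes actually run
    escalated: bool           # True if the cap was hit with problems outstanding
    open_findings: List[str]  # what the last critique still objected to


def run_critique_loop(
    brief: str,
    draft_fn: Callable[[str, List[str]], str],
    critique_fn: Callable[[str, str], List[str]],
    max_rounds: int = 3,
) -> LoopResult:
    """Draft, critique, and revise until the critique goes quiet or the cap is hit."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    draft = draft_fn(brief, [])  # first draft, no feedback yet
    findings: List[str] = []
    for round_number in range(1, max_rounds + 1):
        findings = critique_fn(brief, draft)
        if not findings:
            return LoopResult(draft, round_number, False, [])
        if round_number < max_rounds:
            # Revise only if another critique pass will check the revision.
            draft = draft_fn(brief, findings)
    # Cap reached: stop and hand the draft plus open findings to a person.
    return LoopResult(draft, max_rounds, True, findings)
```

On the final round the loop does not revise again, so the escalated draft is exactly the one the open findings describe, which is what a human reviewer needs to see.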
Publish
The publish stage takes an approved draft and ships it: formats it, adds metadata, places images, schedules or posts it. In a fully wired content agent this stage calls real publishing tools, which Part 2 covers in depth. The important thing about publish is what sits in front of it: it is the one stage that should never run without a human gate immediately before it.
Where the human gates sit
A content agent is not an autonomous content factory, and the teams running these systems well do not treat it as one. LangChain's 2025 State of Agent Engineering report, which surveyed 1,340 practitioners in late 2025, found that very few teams let an agent read, write, and delete freely. Most allow read-only tool access or require human approval before any action that writes or deletes. Publishing content is a write action with your brand's name on it. It gets a gate.
The strong default is one mandatory gate, placed between critique and publish. The agent plans, drafts, and runs the critique loop on its own, then presents a finished, self-reviewed draft to a person, along with its own critique notes and the brief it worked from. The human approves, edits, or sends it back. Nothing reaches publish without that approval. The agent removes the production work; the human keeps control over what goes live.
Heavier workflows add a second, lighter gate after the plan stage. For a sensitive topic, a regulated claim, or a high-stakes campaign, approving the brief before any drafting happens is cheap insurance: it catches a wrong angle when fixing it still costs nothing. Tier the gates by risk, so a routine post gets one gate before publish and a piece making a competitive or compliance-relevant claim gets a gate after planning too.
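The gate policy described above is small enough to write down directly. The names here (Risk, Gate, required_gates, publish) are illustrative assumptions, and the two-tier risk split is a simplification of whatever tiers a real team would define.

```python
from enum import Enum
from typing import List, Optional


class Risk(Enum):
    ROUTINE = "routine"
    SENSITIVE = "sensitive"  # regulated claim, competitive claim, high-stakes campaign


class Gate(Enum):
    AFTER_PLAN = "after_plan"
    BEFORE_PUBLISH = "before_publish"


def required_gates(risk: Risk) -> List[Gate]:
    """Every piece gets the publish gate; sensitive pieces also get a brief gate."""
    gates = [Gate.BEFORE_PUBLISH]
    if risk is Risk.SENSITIVE:
        gates.insert(0, Gate.AFTER_PLAN)
    return gates


def publish(draft: str, approved_by: Optional[str]) -> str:
    """Refuse to ship anything a person has not signed off on."""
    if not approved_by:
        raise PermissionError("publish requires human approval")
    # A real implementation would call a publishing tool here.
    return f"published, approved by {approved_by}"
```

Putting the approval check inside publish itself, rather than in the code that calls it, means no alternate path through the system can skip the gate.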
What you should resist is the instinct to gate every stage. A checkpoint at every step turns the agent back into a slow assistant and gives away the reason to build it. One 2026 study of content teams found the manual content cycle runs a median of about 4.7 days, with only about one of those days spent on actual editing and QA. The rest is status chasing and routing. An agent earns its place by collapsing that overhead. Place gates where a mistake is expensive and irreversible. Publishing qualifies. A first draft does not.
Context and brand voice: the part that is mostly plumbing
A content agent that produces generic copy has a context problem, not a creativity problem, and teams keep trying to fix it with better adjectives instead.
The complaint is well documented. eMarketer's 2026 content marketing coverage reports that AI content flooding channels is now marketers' single biggest industry concern, because it makes brand differentiation harder. MarTech's analysis of why AI content feels generic lands on the mechanism: a model with no specific context defaults to a neutral, averaged voice that sounds like nobody in particular, because the average of all writing is what it was trained to produce. The instruction "write in our brand voice" does not fix this. It is too loose to constrain anything.
The architectural fix is to treat brand voice as a supplied artifact, not a request. Somewhere in your system there should be a concrete voice definition: rules, do and do not lists, real examples of on-brand and off-brand sentences, the things the brand never says. That artifact is injected into the draft stage as context, and the checkable part of it becomes a criterion in the critique stage. Voice gets enforced twice, going in and coming out.
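A voice definition treated as an artifact might look like the sketch below. VoiceGuide and its methods are hypothetical, and the only check shown is the simplest mechanical one, a banned-phrase scan. Judging tone against the rules and examples would still fall to the critique model call.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VoiceGuide:
    """Brand voice as data: injected into drafting, checked in critique."""
    rules: List[str]
    on_brand_examples: List[str]
    off_brand_examples: List[str]
    never_say: List[str] = field(default_factory=list)

    def as_draft_context(self) -> str:
        """Text block supplied to the draft stage (enforcement going in)."""
        parts = ["VOICE RULES:"]
        parts += [f"- {r}" for r in self.rules]
        parts.append("WRITE LIKE THIS:")
        parts += [f"- {e}" for e in self.on_brand_examples]
        parts.append("NOT LIKE THIS:")
        parts += [f"- {e}" for e in self.off_brand_examples]
        parts.append("NEVER SAY:")
        parts += [f"- {p}" for p in self.never_say]
        return "\n".join(parts)

    def violations(self, draft: str) -> List[str]:
        """Banned phrases found in the draft (enforcement coming out)."""
        lowered = draft.lower()
        return [p for p in self.never_say if p.lower() in lowered]
```

The same object feeds both stages, so the rules the drafter was given and the rules the critique checks cannot drift apart.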
This connects to a discipline the agent-building field calls context engineering. Anthropic's guide to effective context engineering defines it as curating the smallest set of high-signal tokens that get the job done. More context is not better. Models suffer what the guide calls context rot, a measurable decline in output quality as the token count climbs, because attention is a finite budget that thins across a longer input. So you do not dump your entire brand bible, every past post, and forty source documents into one prompt. Each stage gets only the brief, artifacts, and source material it needs, and that scoping is what keeps the output sharp.
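Per-stage scoping can be expressed as an allowlist. In this sketch the stage names come from the loop above, but the context keys and the STAGE_CONTEXT mapping are assumptions made for illustration. A real system would choose its own keys and would probably enforce a size budget as well.

```python
from typing import Dict

# Hypothetical allowlist: which pieces of context each stage may receive.
STAGE_CONTEXT = {
    "plan": {"goal", "ranking_signal", "published_titles"},
    "draft": {"brief", "voice_guide", "sources"},
    "critique": {"criteria", "brief", "draft"},
    "publish": {"approved_draft", "metadata"},
}


def scoped_context(stage: str, available: Dict[str, str]) -> Dict[str, str]:
    """Return only the context items the given stage is allowed to see."""
    if stage not in STAGE_CONTEXT:
        raise ValueError(f"unknown stage: {stage}")
    allowed = STAGE_CONTEXT[stage]
    return {key: value for key, value in available.items() if key in allowed}
```

With this in place, handing the critique stage the whole working state is harmless: anything outside its allowlist, such as the drafter's notes or the full brand bible, is dropped before the call is built.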
How the loop avoids the common failure modes
A loop with a feedback path can fail in ways a straight-line prompt cannot, and a good architecture anticipates the specific failures. Agent failure modes are well catalogued, and the same few recur.
The infinite loop. A critique stage that always finds something to fix and a draft stage that keeps revising can spin without ever finishing. The defense is a hard iteration cap, commonly two or three rounds. If the loop hits the cap without the critique going quiet, it stops and escalates to a human rather than churning. The cap is not a fallback. It is required.
Error compounding. An agent that makes a small mistake early, a misread brief or a wrong fact pulled in planning, can carry that error through every later step as if it were sound. The critique stage is the structural defense against this, and the brief gate adds a second catch upstream.
Context poisoning. A hallucinated fact that enters early influences everything downstream. The mitigation is the scoped-context discipline above, plus making factual accuracy a named critique item.
Evaluation gaming. When you optimize a draft against a critique, you risk copy that satisfies the checklist while missing the actual goal, the content equivalent of writing for the rubric instead of the reader. Anthropic's alignment team published work in November 2025 showing reward hacking, models exploiting the grader rather than doing the task, can spread into broader misaligned behavior. The practical guard is to keep the critique criteria about genuine quality, and to keep a human as the final evaluator. The gate before publish is the backstop against a draft that gamed its own critique.
What actually makes this an agent
The plan, draft, critique, publish loop is an agent, and a long clever prompt is not, for three concrete reasons. It pursues a goal across multiple steps instead of producing one output. It uses tools to act in the world, in the planning and publishing stages, rather than only generating text. And it corrects its own work through the critique loop. Multi-step goal pursuit, tool use, and self-correction are the lines the agent-design literature draws, and a content agent built this way has all three.
That does not mean every job needs one. Anthropic's guide says to start simple and add agentic structure only when a simpler approach has fallen short, because agents trade latency and cost for capability. For a one-off paragraph, a single prompt is still the right tool. The loop earns its complexity when content production is continuous, when brand voice matters across high volume, and when workflow overhead has become the real cost. Gartner expects 60 percent of brands to use agentic AI for one-to-one interactions by 2028, so the direction is clear.
Part 2 wires the loop into a real stack: connecting the planning stage to analytics and SERP data so briefs are grounded in what audiences actually search, and connecting the publish stage to live publishing APIs. A content agent wired to powerful tools but missing its critique stage is just a faster way to publish work nobody reviewed.
Council summary
This post argues that a content agent is a four-stage loop, plan, draft, critique, publish, and that a single tuned prompt, however good, is not an agent because it lacks the loop. The load-bearing claims are structural: the critique must be a separate model call, the iteration count must be capped, and a human gate belongs before publish. The council verified every cited source, including the ReAct paper (October 2022), Anthropic's reward hacking research (November 2025), the LangChain survey of 1,340 respondents, and Gartner's 60 percent by 2028 figure. We fixed a misattribution that credited eMarketer with the 4.7-day content cycle stat, which comes from a separate 2026 content operations study, and softened an overstated line about marketers finding AI content generic. The takeaway: build the loop and its gates before wiring in tools, because a content agent missing its critique stage just publishes unreviewed work faster.