AI Prompt Hackers

The Hidden Problem With Agentic AI Workflows

The Death of the "Artisanal" Prompt

Mar 21, 2026

There’s a version of AI use that looks like craft. You sit down with a blank chat window, think carefully about what you need, write a prompt that’s been refined over weeks, paste it in, review the output, nudge the follow-up, and get something good. It takes time but you know exactly what happened. You were there for the whole thing.

That is becoming a liability.

Not because careful prompting was ever wrong. It wasn’t. But the way most serious AI users are working is shifting fast, and the habits that made you good at manual prompting are exactly the habits that slow you down when the system runs without you.


What “Agentic” Actually Means in Practice

The word gets thrown around a lot, so a quick grounding.

An agentic workflow is one where AI takes sequences of actions, makes decisions along the way, and produces an output without a human approving every step. You set it going. It runs. You come back to results.

Picture a researcher who sets up a workflow that pulls a list of companies, checks each one for funding news, drafts a personalised outreach email, and drops it in a folder for review. Nobody sat there prompting through each step. The system handled it.

The prompts inside that workflow weren’t written in the moment. They were written once, tested, locked in. The system uses them every time it runs.
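A workflow like that can be sketched in a few lines. Everything below is a hypothetical stand-in — the function names, the prompt wording, the canned data are all assumptions, meant only to show the shape of a locked-in prompt running inside a pipeline:

```python
# Sketch of an agentic research workflow. llm() and fetch_funding_news()
# are stubs standing in for a real model API and a real news source.

# The prompt is written once and locked in. The system fills the blanks
# on every run, with nobody there to adjust the wording.
OUTREACH_PROMPT = (
    "Write a short, professional outreach email to {company}. "
    "Mention this recent funding news: {news}. "
    "Keep it under 120 words and end with a meeting request."
)

def llm(prompt: str) -> str:
    """Stub model call; a real system would hit an LLM API here."""
    return f"[draft generated from prompt: {prompt[:40]}...]"

def fetch_funding_news(company: str) -> str:
    """Stub lookup; a real system would query a news source."""
    return f"{company} raised a new round this quarter."

def run_pipeline(companies: list[str]) -> dict[str, str]:
    drafts = {}
    for company in companies:
        news = fetch_funding_news(company)
        prompt = OUTREACH_PROMPT.format(company=company, news=news)
        drafts[company] = llm(prompt)  # dropped in a folder for review
    return drafts

drafts = run_pipeline(["Acme Corp", "Initech"])
```

The point of the sketch is the template: it runs unattended, every time, with whatever the earlier steps hand it.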

That’s a fundamentally different relationship with a prompt than what most people have.


The Artisanal Model

When you write prompts manually, for a single task, in real time, you’re compensating constantly. You read the output, notice it went sideways, adjust, try again. The feedback loop is tight and you’re in it.

This works well. It’s also exhausting if you’re doing it at volume, and it doesn’t scale at all.

More importantly, it creates a specific kind of prompt that’s optimised for one person, one session, one moment. Those prompts are full of implicit context that lives in your head. They depend on you being present to course-correct. Pull the human out of the loop and they often fall apart.

I’ve seen this play out with people who’ve built real AI fluency over the past two years. They’re good at prompting. Better than most. And then they try to hand a workflow off to an automated system and discover that their best prompts don’t work when nobody’s watching.


Why Manual Prompts Break in Automation

A few specific failure modes.

They assume repair. Artisanal prompts are written with the assumption that if something’s off, you’ll notice and fix it. Automated workflows have no such fallback. If the prompt produces garbage on step two, step three runs on that garbage, and by step five the output is unrecoverable.

They’re underspecified at the edges. When you’re in the session, you handle edge cases on instinct. The prompt doesn’t need to account for every scenario because you’ll deal with outliers as they come. An automated system hits an outlier and either fails or, worse, produces something plausible-looking but wrong.

They rely on conversational context. A lot of effective manual prompting happens across a multi-turn conversation. The second and third messages are doing as much work as the first. Automated workflows often run single-turn, or in sequences where each step starts fresh. What worked across six messages in a chat window doesn’t translate to a single system prompt.

They contain the prompter’s personality. Not a bug when you’re the one reading the output. But when the output is feeding into another system, or going to a client, or triggering a downstream action, idiosyncratic style becomes unpredictable variation.
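The first failure mode — garbage on step two feeding step three — has a common defence: validate each step's output before the next step consumes it, and halt and flag rather than pass garbage along. This is a generic sketch with hypothetical checks, not any particular framework:

```python
# Guarded pipeline step: validate output before it moves downstream.
# The step function and the specific checks are illustrative assumptions.

class StepFailure(Exception):
    """Raised when a step's output fails validation; the run halts and
    the item is flagged for human review instead of propagating."""

def validate_summary(text: str) -> None:
    # Minimal sanity checks; a real pipeline would also check schema,
    # required fields, language, and so on.
    if not text or not text.strip():
        raise StepFailure("empty output")
    if len(text) > 2000:
        raise StepFailure("output suspiciously long")

def step_summarise(raw: str) -> str:
    return raw.strip()[:200]  # stand-in for a model call

def run_guarded(raw: str) -> str:
    summary = step_summarise(raw)
    validate_summary(summary)  # fail loudly here, not at step five
    return summary

ok = run_guarded("Quarterly feedback: support response times improved.")
try:
    run_guarded("   ")
    failed = False
except StepFailure:
    failed = True
```

The guard is what replaces the human who would have noticed the garbage in a live session.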


The Prompt as Infrastructure

The mental shift that matters is this: in an agentic workflow, a prompt is infrastructure. Not a message. Not a creative act. Infrastructure.

Infrastructure needs to be robust, not optimised for the best case. It needs to handle bad inputs and produce predictable outputs. It needs to be readable and maintainable by someone other than the person who wrote it. It needs to degrade gracefully when something goes wrong.

None of those things are what people optimise for when they’re writing prompts in the moment. They optimise for quality in the current session. That’s the right instinct for manual work. It’s the wrong instinct for systems.

A prompt that’s infrastructure looks different. It has explicit handling for unexpected inputs. The instructions assume no ambient context. It prioritises consistency over peak quality. And it’s usually simpler than you’d expect, because complexity in infrastructure is risk.
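Graceful degradation can be as simple as enforcing the output contract in code rather than trusting the model. The category set and the fallback value below are assumptions, chosen to illustrate the pattern:

```python
# Enforce an output contract on a model response instead of trusting it.
# ALLOWED and FALLBACK are illustrative assumptions.

ALLOWED = {"positive", "neutral", "negative"}
FALLBACK = "needs_review"  # degrade gracefully; never guess

def parse_label(model_output: str) -> str:
    """Normalise a model response to one allowed label, or flag it."""
    label = model_output.strip().lower().rstrip(".")
    return label if label in ALLOWED else FALLBACK

clean = parse_label("Negative.")
messy = parse_label("I'd say this one is mostly positive overall!")
```

Anything outside the contract lands in a review queue instead of triggering the next step — predictable output, even from an unpredictable model.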

What’s Actually Changing

Two things are happening simultaneously and they’re pushing in the same direction.

The models are getting better at following instructions with less hand-holding. The things you needed to spell out explicitly a year ago, the model now infers correctly most of the time. This is reducing the value of elaborate manual prompts even in single-session use.

And the tooling for agentic workflows is maturing fast: n8n, Make, Zapier, Claude's Projects and tool-use features, OpenAI's Assistants API. The infrastructure to build automated AI pipelines is no longer the hard part. Writing prompts that actually work inside those pipelines is where people are getting stuck.

The bottleneck has moved. It used to be “can I get a good output from this model.” It’s now “can I write a prompt that gets a good output reliably, without me there.”


The Skills That Transfer (and the Ones That Don’t)

Most of what you know about prompting is still useful. Understanding how models interpret instructions, knowing how to specify an output format, being able to decompose a complex task into steps. All of that carries over.

What doesn’t transfer is the intuitive, in-session adjustment. The craft of reading an output and knowing exactly what one word to change in the follow-up. That skill is real but it’s a manual skill. It has no place in a workflow that runs at 2am while you’re asleep.

The new skill is writing prompts that are defensive rather than optimistic. Prompts that assume something will go wrong and build in ways to handle it. Prompts that are explicit about failure modes. Prompts that you’d be comfortable with a colleague reading, editing and maintaining, because in a real automated system that’s eventually what happens.

Optimistic vs. Defensive: A Quick Example

Say you’re automating a workflow that reads customer feedback and categorises each response as positive, neutral or negative.

The optimistic prompt looks something like this:
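As an illustrative sketch — both prompt strings here are assumed wording, not a canonical pattern — the contrast between the two styles tends to look like this:

```python
# Optimistic vs defensive prompts for the same categorisation task.
# Both strings are illustrative assumptions.

# Optimistic: assumes clean input and a cooperative model.
OPTIMISTIC = (
    "Categorise this customer feedback as positive, "
    "neutral or negative: {feedback}"
)

# Defensive: pins the output format, handles empty or off-topic input,
# and gives the model an explicit escape hatch instead of letting it guess.
DEFENSIVE = (
    "You will receive one piece of customer feedback between "
    "<feedback> tags. Respond with exactly one word: positive, neutral "
    "or negative. If the text is empty, not in English, or not customer "
    "feedback, respond with exactly: unknown. Do not add any other "
    "text.\n<feedback>{feedback}</feedback>"
)

prompt = DEFENSIVE.format(feedback="The new dashboard is great.")
```

The optimistic version works fine when you are reading the output. The defensive version is the one you can leave running overnight.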

© 2026 Andy Wood