
After the Prompt

The Case for Ambient Intelligence and the Organizations That Will Get There First

Jean-Philippe LeBlanc · 2026


Chapter 1

The Confession Hidden in Every Chat Window

Why the Interface Is the Argument

You Won and It Wasn't Enough

You did everything right.

You championed the pilot program. You sat through the vendor demos and asked good questions. You built the business case with projected time savings and ran the numbers past finance twice. You wrote the internal memo. You got executive sponsorship. You trained the team, or at least you forwarded the training videos and blocked time on the calendar for people to watch them. You picked the right use cases, the low-hanging fruit, the things everyone agreed were tedious and ripe for automation. You launched. You measured. You reported back.

The numbers were good. Not spectacular, but good. Fifteen percent faster on first-draft content. Twenty percent fewer hours on the weekly data summary. A handful of developers said they were writing code faster, though they couldn't quite agree on how much faster. The CFO nodded. Your VP said "great work" in a Slack thread. Someone on your team posted a screenshot of a particularly impressive output, and it got thirty-two emoji reactions.

You won.

And it wasn't enough.

Not in any way you can point to in a meeting. Not in any way that shows up in a dashboard. The deployment is a success by every measure you set for it, and yet there's a feeling you carry that you haven't said out loud to anyone. Not to your boss, not to your team, not even to the friend at the other company who's running a similar program and would probably understand.

The feeling is this: you expected transformation and got a tool.

A good tool. A useful tool. A tool that does what it says on the box. But a tool that requires you to pick it up every single time, to think carefully about how to hold it, to phrase your intent just right so it doesn't give you something slightly wrong, and then to check the output because you can't quite trust it without checking. You are, in some hard-to-articulate way, doing more thinking about the work in order to do less of the work itself. And the math on that exchange feels off, even though the spreadsheet says it shouldn't.

Here is what nobody told you: this is the normal outcome.

Not normal as in "expected by the vendors," who told you this would be transformative. Not normal as in "predicted by the analysts," who are busy writing reports about trillion-dollar productivity gains. Normal as in: this is what almost everyone who successfully adopts AI through the current interaction model actually experiences, and almost nobody talks about it because the success metrics are real and the dissatisfaction is hard to name.

You are not failing. You are succeeding at a version of AI adoption that has a ceiling built into its design. And the ceiling is not where you think it is.

It is not in the model. The model is good and getting better every quarter. It is not in your prompts. Your prompts are fine. Some of them are quite clever. It is not in your team's willingness to adopt. They adopted. They use the tools. They've built workflows.

The ceiling is in the window itself. The chat box. The prompt field. The copilot sidebar. The thing you type into and wait for a response from. That interaction model, the one that feels so natural you've probably never questioned it, is doing something to your work that the efficiency metrics can't see.

It is making you the bottleneck.

Not the old kind of bottleneck, where you didn't have enough information or enough time. A new kind. A bottleneck of attention, of translation, of cognitive effort spent converting what you actually need into language a machine can act on. You are the interpreter standing between your intent and the machine's capability, and the interpreter is tired.

This book is about removing the interpreter.

The Interface Is an Argument

Every interface you've ever used made an argument about who you are.

The spreadsheet argued that you think in grids. The filing cabinet argued that you think in categories. The search engine argued that you know the right words for what you're looking for. The smartphone home screen argued that you want to pick an application before you pick a task. These are arguments, not facts. They are design decisions dressed up so well that they feel like nature.

The prompt-and-response interface makes an argument too. It argues that the right way for a human to work with AI is to formulate a request in natural language, submit it, receive a response, evaluate that response, and then decide whether to refine, accept, or start over. This feels reasonable. It feels like conversation. It feels like the most intuitive thing in the world, because humans are built for conversation.

But look closer at what the design actually requires of you.

It requires you to know what to ask for. Before you type anything, you need a goal. You need that goal to be specific enough to produce a useful output. If the goal is vague, the output will be vague, and you will be the one who pays for that vagueness with another round of prompting.

It requires you to translate your goal into effective language. Knowing what you want and knowing how to ask for it are two different skills. The gap between them has a name in some circles: prompt engineering. The fact that this phrase exists, that we now have a discipline devoted to the art of asking a machine for things correctly, should tell you something about how much work the interface is pushing onto the user.

It requires you to evaluate the output. You read what comes back. You check it against your own knowledge. You catch the hallucinations if you can. You decide if the tone is right, the facts are right, the structure is right. This evaluation requires the very expertise the tool was supposed to spare you from needing.

It requires you to iterate. If the output isn't right, you go again. You refine. You add context. You say "no, more like this" or "shorter" or "make it sound less like a robot wrote it." Three rounds of prompting can feel more exhausting than writing the thing yourself, even when the clock says it was faster.

You are the translation layer.

That is the argument the chat window is making about you. It is arguing that you should be the active component in the system. The machine waits. You initiate. The machine responds. You evaluate. The machine sits idle until you return. You are always on.

This is not the only possible argument. It was a choice.

When the telephone was invented, it argued that communication should be synchronous: both people on the line at the same time. Email came along and argued the opposite. Neither was wrong. Both were arguments. And the argument shaped what people did with the technology, which shaped what people expected from it, which shaped what got built next, in a self-reinforcing loop that made the original design choice feel like the only possible one.

The chat window is in that loop right now. We build AI tools with a prompt box. People learn to use the prompt box. They get good at it. They build workflows around it. Vendors see what people are doing and build more features around the prompt box. The prompt box becomes the thing. Any other design starts to feel alien, experimental, impractical.

But the argument the prompt box is making, that you should be the active translator of your own intent every single time, is not an argument that was chosen because it was best for the work. It was chosen because it was available. And the story of how it became available is worth telling, because it reveals just how accidental the whole thing was.

A Brief Genealogy of the Prompt

The command-line prompt is older than most of the people using ChatGPT.

1960s. Teletype terminals connected to mainframes. You typed a command. The machine executed it. You read the output on paper. The interaction was serial, one instruction at a time, because the hardware couldn't support anything else. The machine had no screen. There was no mouse. There was no way to point at a thing and say "that one." You typed. It responded.

This was not a philosophy. It was a constraint.

The people building these systems did not sit in a room and decide that the optimal relationship between a human and a computer is one where the human issues textual commands. They were working with what they had: a keyboard, a wire, and a machine that could process one request at a time. The prompt was the cheapest possible interface for the most expensive possible computer.

Then a funny thing happened. The constraint became a culture.

Unix in the 1970s formalized the command line into something beautiful. Pipes, scripts, composable programs, the whole elegant architecture of small tools that do one thing well. Programmers fell in love with it. For good reason: it was powerful, it was flexible, and it rewarded expertise. The better you knew the commands, the more you could do with less. The prompt became a mark of fluency. People who were good at it were proud of being good at it. A cultural identity formed around the terminal.

Graphical interfaces in the 1980s tried to kill the prompt. The Macintosh, Windows, the graphical paradigm: click, drag, point, select. The argument shifted. Now the interface argued that you should see your options and choose from them, rather than recall commands from memory. This was a different theory of the user: someone who recognizes rather than recalls, someone who wants to act on visible objects rather than type invisible commands.

The prompt didn't die. It retreated to the provinces of system administration, software development, and power-user culture. For forty years, most people interacted with computers through graphical interfaces. They clicked. They dragged. They tapped. The prompt was a specialist tool for specialist people.

Then, in late 2022, ChatGPT put a prompt box in front of three hundred million people.

Think about this. The interaction model that OpenAI chose for the most widely adopted AI product in history was the interaction model of the 1960s teletype terminal. Type a command. Get a response. Type another command. The container was a modern web browser on a modern smartphone. The interaction pattern inside it was sixty years old.

Why?

Not because someone at OpenAI decided that the command-line paradigm was the best way for humans to interact with large language models. Because it was the fastest way to ship. The research lab had been using prompts to test the model. Prompts worked for researchers, who are, by definition, people with high technical fluency and clear experimental goals. The demo became the product. The research interface became the consumer interface. The constraint of the lab became the design of the market.

Every transformative technology goes through this phase. When electric motors arrived in factories at the turn of the twentieth century, engineers bolted them onto existing machines and declared victory. The looms still looked like looms. The lathes still looked like lathes. The factory floor still looked like a factory floor designed for a steam engine parked in the middle of it. It took thirty years, and a full generational turnover of factory managers, before anyone thought to rearrange the machines themselves.

We are in the bolted-on phase. We took AI, which is a genuinely new kind of capability, and bolted it onto the oldest interaction pattern in computing. Then we shipped it. Then we built businesses around it. Then we trained millions of people to use it. And now the pattern feels inevitable, like the only way this could have been done, because it is the way it was done.

But there is nothing about a large language model that requires a chat window. There is nothing about a diffusion model that requires a text box. There is nothing about an AI agent that requires you to type instructions before it acts. These are design choices. They were made quickly, under commercial pressure, by teams that needed to ship something people could understand. The prompt was familiar. The prompt was fast. And that was enough to ship it to three hundred million people.

The question is what that familiarity costs, and who pays for it.

The Cognitive Tax and Who Pays It

Think about what happens every time you start a prompt interaction.

You recall the goal. You formulate the prompt. You decide what context to include and what to leave out. You submit and wait. You read and evaluate the response. You decide what to do next: accept, refine, start over, or abandon the tool and do it by hand.

Six steps. Each one draws on working memory, attention, domain knowledge, and what psychologists call metacognition: thinking about your own thinking. Each one is a small tax on your finite cognitive budget for the day.

One interaction is cheap. The tax is barely noticeable.

Twenty interactions is a workday. And twenty interactions, six steps each, is a hundred and twenty moments where you had to be the smartest part of the system. A hundred and twenty micro-decisions about how to talk to a machine. This is the cognitive exit cost of the prompt paradigm: the price you pay, in mental energy, to leave your own head and enter the machine's frame of reference every time you want help.

Who thrives in this system? People with high metacognition.

If you are good at decomposing your own goals into clear sub-tasks, good at anticipating what a language model will interpret well, good at spotting errors in output quickly, good at adjusting your prompt strategy on the fly, you will get a lot out of AI tools. You were probably already good at your job. The tool made you faster at being good. Your fifteen percent productivity gain is real, and it feels like magic, and you genuinely cannot understand why your colleague down the hall says AI "doesn't really help that much."

Your colleague down the hall is not stupid. Your colleague down the hall has a different metacognitive profile. They are good at their job in a way that is procedural, experiential, intuitive. They know what a good marketing email looks like when they see one, but they can't decompose the qualities of a good marketing email into a prompt that reliably produces one. They know when a financial forecast feels off, but they can't tell a language model which assumptions to stress-test. The gap between their expertise and their ability to verbalize that expertise in a machine-readable form is the gap where the cognitive tax hits hardest.

This is an asymmetric tax. It falls lightest on the people who need the least help and heaviest on the people who need the most.

The executive who thinks in bullet points and clear directives gets a faster assistant. The mid-level analyst who thinks by staring at a screen and moving numbers around until something clicks gets a tool that interrupts the very process by which they do their best work. The senior engineer who can specify a function in plain English gets working code in seconds. The junior engineer who is still learning what to specify gets plausible-looking code that may or may not work, and has to burn cognitive energy they don't have to figure out which one it is.

The distribution of gains from AI adoption, in most organizations, follows a power law. A small number of high-metacognitive users capture an outsized share of the value. The median user gets moderate benefit at moderate cost. And a meaningful percentage of users quietly abandon the tools or use them only for the simplest tasks, like rewriting an email or generating a list of ideas, because the cognitive cost of the more complex interactions isn't worth the return.

Nobody puts this in the quarterly report.

The quarterly report says "83% adoption rate" because 83% of people logged in at least once last month. It says "4.2 satisfaction score" because people were asked if the tool was useful and they said yes, because saying no feels like admitting you're bad at your job. The report says "estimated 12,000 hours saved" because someone multiplied the number of interactions by an average time saving per interaction and got a big number.

But the report doesn't capture the analyst who spends forty minutes trying to get the tool to produce the chart she wants, gives up, and makes it in Excel in twenty minutes. It doesn't capture the product manager who uses AI to write a first draft, then rewrites 70% of it, then wonders whether the AI step added anything other than a slightly different starting point. It doesn't capture the support rep who was told to use the AI to draft responses to tickets and now spends more time editing AI drafts than she used to spend writing her own, because the AI's tone is slightly but consistently wrong and she can't figure out how to fix it without rewriting the whole thing.

These are not failure cases. These are the normal experience of people with moderate metacognitive ability using a system designed for people with high metacognitive ability. The system works. It just works unevenly, and the unevenness is structural, not accidental.

The interface cannot help you until you help it first. That is what the design says, whether the designers meant it or not. It confesses that the burden of translation falls on you. It confesses that the better you are at thinking about thinking, the more it will do for you; the less you need it, the more you will get from it. It is a tool that rewards the already-rewarded and taxes the already-taxed, and it does this not because the AI inside it is biased or broken but because the interaction paradigm through which the AI reaches you was designed around the assumption that you, the user, would handle the hard part.

The hard part is not writing the email or building the spreadsheet or fixing the code.

The hard part is knowing what to ask for.

The Paradigm Is Not the Technology

Here is the sentence this whole chapter has been building toward:

The AI is not the problem. The paradigm is the problem.

These are two different things, and the failure to separate them is the source of almost every frustrating conversation about AI in organizations right now. When people say "AI didn't deliver what we expected," they are almost always talking about the paradigm, not the technology. When vendors say "you just need better prompts," they are defending the paradigm, not the technology. When consultants say "you need an AI strategy," they usually mean a strategy for wringing more value out of the current paradigm. More prompt training. Better tooling. Faster iteration on the same loop.

The technology is large language models, diffusion models, reinforcement learning systems, and the architectures that make them work. The technology has advanced at a pace that genuinely surprised even the people building it. GPT-3 to GPT-4 was a single generation of improvement that turned a party trick into a reasoning engine. The technology is not done improving. It will keep getting better.

The paradigm is the chat window.

The paradigm is the prompt box, the copilot sidebar, the "ask AI" button, the text field where you type your request and wait. The paradigm is the interaction model that places you, the human, in the role of active initiator and the machine in the role of passive responder. The paradigm has not changed in any meaningful structural way since November 2022. The box has gotten prettier. The responses have gotten better. You can now attach files, share links, reference previous conversations. But the fundamental relationship, you ask, it answers, has not moved.

Think about what this means.

The engine has gone from a four-cylinder to a turbocharged V8, and we are still driving on the same one-lane dirt road. The gains you've captured so far, the fifteen percent here and twenty percent there, are what you can get by driving a much more powerful engine on the same road. There is a ceiling on how fast you can go, and the ceiling is the road, not the engine.

This is why you feel the way you feel. This is the name for the dissatisfaction that showed up in section one. You adopted AI successfully, and you hit a ceiling, and the ceiling felt like your fault because everyone around you was talking about how powerful the technology is, and if the technology is so powerful and you're not getting transformative results, then maybe you're doing it wrong. Maybe your prompts are bad. Maybe your team needs more training. Maybe you should hire a prompt engineer, or take a course, or read a book about prompt frameworks.

You don't need better prompts. You need a different paradigm.

What does a different paradigm look like? A system where the burden of initiation doesn't sit with you. Instead of you asking the AI for help, the AI notices when help is relevant. Instead of you translating your intent into a prompt, the AI infers your intent from your context: what you're working on, what you've done before, what your goals are. Instead of you evaluating the output, the AI presents options ranked by its confidence and your history of preferences. Your role shifts from initiator to editor. From translator to decision-maker.

Picture your nine o'clock meeting tomorrow. Before you walk in, the system has already pulled the three contracts relevant to the discussion, flagged the clause that changed since the last version, and drafted the question you didn't know you needed to ask. You arrive informed rather than prepared. That is the difference.

Most of the components for this exist today. Your calendar already knows what meetings you have tomorrow. Your email client already knows who you've been corresponding with. Your project management tool already knows which tasks are overdue. The AI model already has the reasoning capability to connect these data points. What's missing is not capability. What's missing is a design philosophy that says the human should not have to be the one who connects these dots every single time.

That gap, between what the technology can do and what the paradigm allows you to access, is where the value is. Not the fifteen percent you've already captured. The rest of it. The part the current paradigm can't reach.

You were right to feel unsatisfied. The ceiling you hit is not yours. It belongs to the paradigm, and paradigms can be changed.

Chapter 2

Cognitive Exit

The Hidden Cost That Makes Every Efficiency Gain a Partial Lie

What Happens When You Stop to Ask

You are writing a strategy memo.

Not the kind of memo that follows a template. The kind that requires you to hold six competing priorities in your head at the same time, weigh trade-offs between time-to-market and margin pressure, and arrive at a recommendation your SVP will actually trust. You have been working on it for forty minutes. You are in that state where the structure has finally clicked, where the argument has taken shape and you can feel the logic connecting section to section, where your fingers are moving and the words are coming and the whole thing is starting to cohere.

Then you realize you need a number.

Specifically, you need the retention rate for a customer segment from Q3, and you need to know how it compares to the previous year. You could dig through a dashboard. You could email the analytics team and wait. Or you could ask the AI assistant your company deployed three months ago, the one that sits in a sidebar and is supposed to make work like this faster.

So you stop writing.

That is the first thing that happens, and it is the thing nobody measures. You stop. The memo is mid-sentence. The argument is mid-arc. The mental model you spent forty minutes constructing, the one holding all six priorities in productive tension, is now suspended. Not saved. Not paused with a bookmark you can return to. Suspended in the biological wetware of your working memory, which has a carrying capacity roughly equivalent to a post-it note.

Now you move into the tool. You click the sidebar. You think about how to phrase what you need. This takes longer than it sounds. "What was Q3 retention for enterprise customers compared to the prior year?" Is that specific enough? Should you mention which product line? Should you specify that you mean net retention, not gross? You know what you mean. You are not sure the model will know what you mean. So you add context. You refine. You type.

You wait.

The response comes back. It gives you a number, but it is formatted in a way that suggests it is pulling from a different data source than the one your team uses. The percentage looks close but not quite right. Is it right? You are now doing the thing you were trying to avoid: verifying data. You open the dashboard anyway. You check. The number is close but off by 1.3 percentage points. Close enough? For a strategy memo that will go to the SVP? You decide it is not close enough. You use the dashboard number.

Elapsed time in the AI tool: maybe three minutes.

Elapsed time away from the memo: eleven minutes, counting the three in the tool plus the decision-making about how to phrase the prompt, the wait, the evaluation, the verification, and the moment of deciding which number to trust.

Now you go back to the memo.

Where were you?

This is the question that costs the most. It does not appear in any efficiency metric anywhere. You were mid-argument. You were connecting the retention data to the pricing strategy to the competitive pressure from a new entrant. You had a thread. The thread is gone. Not entirely — frayed. You re-read the last two paragraphs. Re-read the outline. You try to find the next move. The shape of the argument is there, like a word on the tip of your tongue. The momentum isn't.

It takes you six minutes to get back to where you were.

Three minutes in the tool. Eleven minutes in the full interruption cycle. Six minutes rebuilding the mental state the interruption destroyed. Seventeen minutes total cost for a task the AI “helped with” in three.

If your organization's efficiency metrics captured this event at all, they captured three minutes. Maybe less. They captured the time inside the tool, the time between prompt and response. They recorded one successful AI interaction. If the system tracks such things, it counted one query resolved. One more data point in the "12,000 hours saved" figure that will appear in the next quarterly report.

The other fourteen minutes do not exist in any system. They are invisible. They happened inside your head, in the space between one tool and another, in the biological reality of a brain that cannot context-switch for free.

Cognitive exit. The moment you leave the mental state you were in to enter the mental state the tool requires, and then the cost you pay to return. It is not unique to AI tools. Every tool switch carries some version of this cost. But the prompt-and-response paradigm makes cognitive exit structural. It is not an occasional interruption. It is the interaction model itself. Every single time you use the tool, you exit your current cognitive state, construct a new one inside the tool's frame of reference, and then attempt to re-enter where you left off. That "attempt" is where the cost lives, because the re-entry is never complete. You come back slightly degraded. The thread is slightly thinner. The argument is slightly less sharp.

And you do this twenty times a day.

The Science of Interrupted Thinking

The experience I just described is not anecdotal. It has been studied for decades under different names, in different labs, by researchers who were not thinking about AI at all but whose findings now land with uncomfortable precision.

Start with a number. Gloria Mark, a researcher at the University of California, Irvine, published work showing that after an interruption, it takes an average of twenty-three minutes and fifteen seconds to return to the original task. Not twenty-three minutes to start working again. Twenty-three minutes to get back to the same depth of engagement. You start working again almost immediately. You look productive almost immediately. But the quality of the cognitive state, the ability to hold complex relationships in mind and reason about them, takes twenty-three minutes to rebuild.

Twenty-three minutes.

If that number seems high, consider what it includes. It includes the partial resumption where you are technically back on the task but your mind is still processing residual fragments of the interruption. It includes the false starts where you pick up the wrong thread and have to back up. It includes the metacognitive overhead of re-orienting: Where was I? What was I doing? What was the next step?

Now apply this to the strategy memo scenario. If you prompt the AI assistant five times during a two-hour writing session, and each prompt triggers even a fraction of that resumption cost, you have not saved time. You have borrowed time from the depth of the work and spent it on the surface of the interaction. The memo gets finished. It might even get finished sooner. But the quality of the thinking embedded in that memo has been taxed at a rate that no timestamp can reveal.

The cognitive science term for the capacity that gets taxed is working memory. Working memory is the system that holds information in active, manipulable form while you reason about it. It is small. Depending on whose model you use, it holds somewhere between four and seven chunks of information at a time. Not four to seven facts. Four to seven organized groups of related information, where the definition of "organized" depends on your expertise and the nature of the task.

When you are deep in a strategy memo, those four to seven slots are full. One holds the competitive landscape. One holds the pricing model. One holds the executive audience and what they care about. One holds the structure of the argument itself, which section leads to which. One holds the specific sentence you are constructing right now. They are all active. They are all interacting.

That is what deep work feels like from the inside: full slots, all talking to each other.

Cognitive exit empties some of those slots. It has to. Your working memory cannot simultaneously hold the strategy memo's argument and the AI interaction's requirements. Something gets displaced. Usually the most fragile thing, which is usually the thing that was most recently constructed, which is usually the specific thread of reasoning you were in the middle of developing. The competitive landscape and the pricing model might survive, because those have been rehearsed enough to be partially stored in long-term memory. The specific argumentative move you were making at the moment of interruption? First casualty.

This is not a flaw in your discipline. This is how human memory works.

In a series of controlled experiments published in 2014, Cyrus Foroughi and colleagues at George Mason University asked participants to write practice essays for a graduate school entrance exam. Participants were interrupted at random intervals during the writing task. The interruptions were brief. The resumption was immediate. Independent raters, blind to the experimental condition, scored the interrupted essays lower on overall quality, idea development, and coherence. The writers did not notice.

People systematically underestimate the cost of cognitive exit. They report that switching tasks is easy, that they pick up right where they left off, that the interruption was no big deal. The empirical record says otherwise. Every interruption creates a small permanent loss, a thread that was dropped and never fully recovered, a connection that was present and is now gone. The writer doesn't feel the loss because they never see the counterfactual: the version of the memo they would have written without the interruptions. They only see the version they did write, and it seems fine.

There is a second-order effect that is worse than the direct cost.

Repeated interruptions change the way people work. Sophie Leroy at the University of Washington introduced the concept of "attention residue," the finding that when you switch from Task A to Task B, part of your attention stays stuck on Task A. It doesn't fully transfer. You carry a residue. If you switch back to Task A, you now carry residue from Task B. Each switch layers another residue. By the end of a day like this, you are operating on fractional attention — spread across half a dozen partial engagements, never fully in any of them.

This is the workday that the prompt paradigm produces. Not one big interruption, but a steady drip of small ones, each individually manageable, collectively devastating. The person at the end of this day is not exhausted from working hard. They are exhausted from working fractured. The tiredness isn't the good kind — the kind that follows deep concentration. It's the scattered kind. The kind that comes from a day of never quite being fully in any one thing.

Multiply this across an organization.

If twenty knowledge workers each experience five AI-related cognitive exits per day, and each exit costs an average of six minutes in total (the interruption plus the resumption penalty, a conservative estimate compared to Mark's twenty-three minutes), that is six hundred minutes per day. Ten hours. Across a five-day week, fifty hours. Across a month, two hundred hours of degraded cognitive performance that shows up nowhere in any system.
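The arithmetic above is simple enough to put in a few lines. A sketch using the chapter's illustrative figures; none of these are measured values:

```python
# Napkin math: the organization-level cost of cognitive exits.
# All inputs are the book's illustrative numbers, not measured data.
workers = 20          # knowledge workers
exits_per_day = 5     # AI-related cognitive exits per worker per day
cost_minutes = 6      # average total cost per exit (interruption + resumption)

minutes_per_day = workers * exits_per_day * cost_minutes   # 600 minutes
hours_per_day = minutes_per_day / 60                       # 10 hours
hours_per_week = hours_per_day * 5                         # 50 hours
hours_per_month = hours_per_week * 4                       # 200 hours

print(hours_per_day, hours_per_week, hours_per_month)
```

The point of writing it out is not precision. It is that the calculation is trivial, and still nobody runs it.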

Against this, the organization reports "12,000 hours saved this quarter." Those hours are real. The cognitive exit costs are also real. Nobody is subtracting one from the other. Nobody has the instrumentation to subtract one from the other. The gains happen inside the tool, where they can be measured. The losses happen inside the person, where they cannot.

And the efficiency report says they were 15% more productive.

Why High Performers Masked the Problem

Here is a pattern I have seen in every organization that adopted AI tools through the prompt paradigm.

A small group of users become internal champions. They produce impressive outputs. They share tips in Slack channels. They build personal prompt libraries. They give informal demos. They become the visible face of AI adoption in the organization, the proof that the investment is working, the people the quarterly report is really about.

These users have something in common, and it is not that they got better training or spent more time with the tools. It is that they have unusually high metacognitive ability. They are people who naturally think about their own thinking. They plan before they act. They can decompose a complex goal into a sequence of specific requests without much effort. They anticipate how the model will interpret a prompt because they are already in the habit of examining their own assumptions before they communicate them.

These people experience lower cognitive exit costs. Not zero. Lower.

Why? Because they pre-plan. Before opening the AI tool, they have already decided what they need, how to ask for it, and what the success criteria are. The transition from their current task to the AI interaction is smoother because they have essentially pre-loaded the new cognitive state before entering it. The interruption is less disruptive because it is less of a surprise to their working memory.

They also post-process efficiently. When the output comes back, they know what to look for. They can scan for the relevant parts, ignore the filler, and extract what they need in seconds rather than minutes. They have a mental model of what good output looks like for their specific purpose, and they match against it almost automatically.

They are operating the prompt paradigm the way it was designed to be operated. They are doing the cognitive work that the paradigm demands. And they are doing it so smoothly that it looks effortless, which creates the impression that the paradigm itself is effortless, which makes leadership believe the tool is working well for everyone, which ensures that nobody looks too hard at what is happening for the other 70% of users.

I want to be specific about the mechanism here, because it matters. When the VP of Operations sees the internal champion produce a polished competitive analysis in thirty minutes using AI, the VP does not see the metacognitive pre-planning that preceded the first prompt. The VP does not see the years of practice at self-monitoring and goal decomposition that made that pre-planning possible. The VP sees the tool. The VP sees the output. The VP draws the reasonable conclusion: the tool works. If other people are not getting these results, they need more training.

More training does help. A little. Training can teach someone specific prompt patterns. It can show them how to structure a request or how to use system-level instructions to shape the output. What a two-hour workshop cannot do is build the metacognitive habits that high performers spent years developing. You cannot train someone to be a natural planner. You cannot train someone to intuitively model how a language system will interpret their words. These are deep cognitive habits, and they transfer from the AI context to every other part of the person's professional life. They are not prompt skills. They are thinking skills. The high performers had them before AI existed.

Plot the returns to AI adoption across an organization's users and you get a skewed distribution. A small group, the high performers, captures outsized gains at low cognitive cost. The rest get moderate returns at moderate cost. That shape has a cause, and it is not what most organizations assume: it is metacognitive ability, not tool skill. And the organization's reporting infrastructure, which measures adoption rates and average time savings, is structurally incapable of showing the shape of the distribution. The average hides the asymmetry.

What makes this worse is that the high performers often occupy the roles where AI assistance is least transformative. The person who can rapidly decompose a complex goal into clear sub-tasks is also the person who was already fast at the work the AI is doing. Their fifteen percent gain is real. It is also fifteen percent of output they were already producing at a high level. The largest potential gains, the ones that would actually change organizational outcomes, live with the people who struggle most with the paradigm: the mid-level knowledge workers who do the bulk of the organization's thinking work but whose cognitive style does not match the tool's demands.

Think about the analyst from Chapter 1. Forty minutes trying to get the AI to produce a chart, then twenty minutes making it in Excel. That analyst is not bad at their job. They may be excellent at their job. They are bad at the specific cognitive task that the prompt paradigm requires: translating a visual intuition about data into a verbal specification precise enough for a language model to act on. That translation task has nothing to do with their actual job. It is overhead imposed by the interface.

The high performers don't struggle with this translation task, so they don't see it as a burden, so they don't report it as a problem, so leadership never learns that it exists.

Three months into the deployment, the organization has a confident story. The tools are working. Adoption is high. The champions are producing great results. The few people who aren't getting as much value probably just need more practice.

The story is wrong. The paradigm is selecting for a specific cognitive profile and rewarding it, while imposing a tax on everyone else that looks, from the outside, like insufficient skill. And the high performers, through no fault of their own, are providing the cover that keeps the real problem hidden.

The Measurement Blind Spot

How did your organization decide that AI adoption was succeeding?

I can guess the answer, because almost every organization used the same playbook. Someone pulled numbers from the tool's admin dashboard: logins, queries, active users. Someone surveyed the team: "How useful do you find the AI tools?" on a 1-to-5 scale. Someone estimated time saved: the number of AI-generated drafts multiplied by an assumed per-draft time savings. Someone compiled a set of anecdotes, the impressive outputs, the before-and-after comparisons, the testimonial from the sales rep who used the AI to prepare for a client meeting in half the time.

These are all measurements of what happened inside the tool. How many people used it. What it produced. How they felt about it. Not bad measurements. Just measurements of the wrong surface.

What they do not measure is what happened around the tool. The cognitive exit costs. The resumption penalties. The fractured attention. The degraded depth of the surrounding work. The time spent translating intent into prompts. The time spent verifying outputs. The invisible decision-making about when to use the tool and when to give up and do it by hand.

These are the costs of the workflow, and the workflow is where the actual work happens.

There is a classic name for this kind of error: the streetlight effect. A man loses his keys in a dark parking lot and searches for them under the streetlight, not because that is where the keys are but because that is where the light is. The tool's metrics are the streetlight. The cognitive exit costs are the dark parking lot. The accurate picture of AI's net value is somewhere out there in the dark, and nobody is walking away from the light to look for it.

Let me make this concrete.

Your organization reports 12,000 hours saved this quarter. Let's accept that number at face value. Now let's ask: what is the offsetting cost? If the average knowledge worker experiences ten cognitive exits per day related to AI tool interactions, and each exit carries a total cost (interruption plus resumption) of five minutes, that is fifty minutes per worker per day. For an organization with two hundred knowledge workers, that is approximately 166 hours per day, or, across the sixty-five or so working days in a quarter, roughly 10,800 hours.

I am making up these numbers. I do not have your organization's data. But the structure of the math is real, and the point is this: nobody is doing this math. Nobody is even attempting it. The 12,000 hours saved is a gross number, not a net number, and the gross number is the one that goes in the executive summary, the one that gets celebrated in the all-hands meeting, the one that justifies the renewal of the contract.
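For readers who want to rerun the comparison with their own figures, here is the structure of that math as a sketch. The working-days-per-quarter figure is my assumption (thirteen weeks of five days); every other input is the chapter's illustrative number, and all of them should be swapped for your organization's data:

```python
# The gross-vs-net structure of the thought experiment.
# Every input is illustrative; substitute your organization's data.
workers = 200            # knowledge workers
exits_per_day = 10       # AI-related cognitive exits per worker per day
cost_minutes = 5         # total cost per exit (interruption + resumption)
working_days = 65        # assumption: ~13 weeks x 5 days per quarter

hours_per_day = workers * exits_per_day * cost_minutes / 60
hours_per_quarter = hours_per_day * working_days

reported_savings = 12_000                    # gross hours "saved" per quarter
net = reported_savings - hours_per_quarter   # the subtraction nobody performs

print(round(hours_per_day), round(hours_per_quarter), round(net))
```

Whatever numbers you plug in, the structural point holds: the executive summary reports `reported_savings`, and `net` is never computed.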

The absence of the offsetting calculation is not a minor oversight. It is the reason organizations are surprised when AI adoption feels less transformative than expected. They are measuring half the equation and treating it as the whole.

The time cost is actually the easier half to think about. The harder half is quality.

A strategy memo written with five AI-assisted interruptions is a different memo than one written without them. Not necessarily worse by any obvious metric. A reader might not notice. The author might not notice. But the thinking embedded in the memo is different. The connections are slightly less tight. The argument is slightly less integrated. The places where the author's own expertise would have surfaced a non-obvious insight are slightly more likely to contain a conventional observation, because the cognitive state that produces non-obvious insight, the state where all the working memory slots are full and interacting, was disrupted five times during the writing.

A lawyer drafting a brief. She is three pages into the argument, holding the opposing counsel's likely objections in her head alongside the case law she plans to cite and the specific framing her client needs. She opens the AI sidebar to check a citation. Four minutes later she is back in the brief. The objections are still there. The case law is still there. But the connection she was about to draw between the two, the specific angle that would have made her argument harder to counter, is gone. She writes something serviceable in its place. The brief is fine. It is not the brief it would have been.

Nobody will ever know the difference. The brief that would have been does not exist. Only the brief that was.

This quality cost does not appear in any metric. The organization sees the memo that was written, the brief that was filed. They look professional. They were produced faster. By every available measure, the AI-assisted process was better. But the strategic thinking may be fractionally thinner. And if the organization's actual competitive advantage comes from the quality of its strategic thinking, which for most knowledge-work organizations it does, then the AI adoption program may be making the things that matter most slightly worse while making the things that matter less slightly faster.

Nobody knows, because nobody is looking.

What You Were Actually Building

Take a step back. Look at what your organization constructed over the past twelve to eighteen months.

You deployed AI tools. You trained people to use them. You built internal knowledge bases of prompts and best practices. You designated champions and power users. You created workflows that route work through AI-assisted steps. You measured adoption and celebrated success.

What did you actually build?

You built a cognitive-exit infrastructure.

I don't mean that pejoratively. You built it because there was no alternative architecture available, because the tools arrived in the form they arrived in, because the prompt paradigm was the only paradigm on offer, and because doing something with AI was better than doing nothing with AI. The decision to adopt was correct. The gains you captured are real. The 15% here and 20% there, those matter, and they compound.

But the infrastructure you built is one that requires human intelligence to route work into and out of every AI interaction. Every prompt requires a person to stop what they are doing, construct a request, submit it, evaluate the response, and then figure out how to reintegrate the result into the work they paused. The system runs on human cognitive labor at every junction point. The AI does the middle part. The human does the beginning and the end.

This is the opposite of what most people imagined when they heard the word "automation."

Automation, in its ordinary meaning, suggests that a process runs without continuous human input. The assembly line automates manufacturing because a person does not need to pick up each part and decide where it goes. The thermostat automates climate control because a person does not need to check the temperature and adjust the furnace. The dishwasher automates cleaning because a person does not need to stand at the sink.

Using a prompt-based AI to write a section of a report is not like using a dishwasher. It is like having a sous-chef who won't start cooking until you write out every ingredient, quantity, and technique in precise written instructions, then insists you taste every dish before it goes to the table, then asks you what to cook next. You are freed from the chopping. You are not freed from the thinking about the chopping. And the thinking was always the harder part.

The vendors call this augmentation, a word they prefer because it sounds collaborative rather than threatening. But augmentation, as implemented through the prompt paradigm, means something specific: it means the human does more cognitive work per unit of output, not less. The total time per task may decrease. The cognitive load per task increases. You traded physical minutes for mental cycles, and nobody told you that mental cycles are the scarcer resource.

This is what I mean by calling the efficiency gains a partial lie. They are not a full lie. The gains are real. But they are reported without their offsetting costs, and the offsetting costs are denominated in a currency the organization does not track: cognitive depth. The thirty minutes saved on the report shows up in the metrics. The fractured attention over the rest of the afternoon does not.

Here is the question I want to leave you with for the next chapter.

What if the system did not require you to exit? What if you never had to stop what you were doing to ask for help, because the help arrived in the context of the work you were already doing, at the moment you needed it, without a prompt? What if the AI could see what you were working on, infer what you needed, and deliver it, not as a response to a query but as a feature of the environment?

There is a version of this that already has a name.

That is the ambient model. That is where this book is going. But before we can build toward it, you needed to see, in full clinical detail, what the current model is actually costing you. Not the cost you can see in the dashboard. The cost you carry in your head.

The cost of stopping to ask.

Chapter 3

What Ambient Means

A Definition Precise Enough to Be Useful

Ambient Is a Spectrum, Not a Switch

Here is a question worth sitting with. When you adjust the temperature in your house, how much of your attention does it take?

If you have a manual thermostat, you notice you are cold, get up, walk to the wall, make a decision about what temperature to set, push buttons, and return to whatever you were doing. Six steps. Conscious involvement at every one.

If you have a programmable thermostat, you did that work once, maybe twice, during initial setup. Now the system runs a schedule. You still intervene when the schedule is wrong, when guests are coming, when the season shifts in a way the programming did not anticipate. But most of the time, you do nothing. The house is warm when you wake up. It cools when you leave. Your attention is almost entirely elsewhere.

If you have a learning thermostat, the kind that watches when you come and go and adjusts its own model of your preferences over time, you did less work during setup and almost none after. The system observes. It infers. It acts. When it gets it wrong, you correct it, and the correction feeds back into the model. Over months, your interventions approach zero.

Three thermostats. Same function. Wildly different demands on you.

That difference is the ambient spectrum.

I am going to give this spectrum a formal shape, because the rest of this book depends on it. At one end is fully explicit interaction, where nothing happens until you construct a complete instruction and deliver it. At the other end is fully autonomous ambient operation, where the system perceives your situation, infers your need, and acts on it without any prompting from you at all. In between is a long gradient, and almost everything interesting is happening on that gradient right now.

Let me lay out five points along the spectrum. These are not rigid categories. They are landmarks you can use to locate any system, any deployment, any product, and say: it is about here.

Point 1: Fully explicit. The user initiates every interaction. The user frames every request. The system does nothing until asked and returns nothing beyond what was requested. Think of a command-line terminal in 1974, or a ChatGPT session today where you open a blank window and type a prompt from scratch. The cognitive burden sits entirely on the human side. This is where Chapter 1 started, and it is where the prompt-and-response paradigm lives.

Point 2: Template-assisted explicit. The user still initiates, but the system provides structured starting points. Pre-built prompt templates in a copilot sidebar. Dropdown menus that constrain the interaction to common tasks. "Summarize this document" buttons that eliminate the need to compose the request from scratch. The burden of initiation is reduced. The burden of evaluation is not. You still have to decide when to invoke the tool, and you still have to judge what comes back.

Point 3: Context-triggered suggestion. The system watches what you are doing and offers help before you ask. An email tool that drafts a reply based on the content of the message you received. A code editor that suggests the next line based on the function you are writing. A document tool that notices you are building a comparison table and offers to pull data from a connected source. The system initiates. You ratify. This is where the action starts to shift from user to system, but the human remains the final decision-maker for every individual action.

Point 4: Proactive with bounded autonomy. The system infers your intent from context and history, takes action within defined boundaries, and reports what it did. A calendar assistant that reschedules a meeting when it detects a conflict, sends the invites, and notifies you after the fact. A procurement system that reorders supplies when inventory drops below a threshold you set six months ago. The human sets the boundaries. The system operates within them. Human attention is required only at the edges, when something falls outside the defined boundaries, or when the system's confidence in its inference drops below a threshold.

Point 5: Fully ambient. The system maintains a persistent model of your intent, infers appropriate action across contexts and time horizons, executes without requiring ratification, and learns from outcomes to refine its model. No prompting. No approving. The human's role is to periodically review outcomes and adjust the intent model when their goals change. This point on the spectrum is largely theoretical for complex knowledge work. It exists in narrow domains. A pacemaker that adjusts heart rhythm in real time is fully ambient. A spam filter that routes messages without your involvement is fully ambient within its domain. For the kinds of work most of this book is about, nobody is here yet.

Five points. One spectrum.

The distance between Point 1 and Point 5 is not a technology gap. That is the claim I want to establish in this chapter, and it may be the most important claim in this book. The distance is an organizational gap. The technology to operate at Point 4 exists today for a large number of knowledge-work tasks. The reason most organizations are stuck between Points 1 and 2 has almost nothing to do with what the models can do and almost everything to do with how organizations have structured the relationship between human intent and system action.

But I am getting ahead of myself.

The Three Prerequisites

If ambient intelligence is a spectrum, what determines where a given system sits on it? What makes the difference between a tool that waits for your prompt and a system that acts before you even know you need it?

Three things. I am going to state them plainly, then spend the rest of this section making each one concrete.

The first is persistent intent context. The second is proactive inference. The third is closed-loop action authority.

All three must be present for a system to operate above Point 3 on the spectrum. Remove any one and the system collapses back toward explicit interaction, no matter how sophisticated the underlying model is. You can have the most powerful language model ever built, and if it does not have access to persistent intent context, it will sit there waiting for a prompt like every other chatbot.

Persistent intent context means the system knows what you are trying to accomplish, not just right now but across time. It knows your goals, your constraints, your preferences, and your priorities, and that knowledge persists between sessions. It does not reset when you close the browser tab.

Think about what happens when you open ChatGPT today. Every conversation starts from zero. The system has no memory of the strategy memo you worked on yesterday, no knowledge of the quarterly priorities you care about, no model of the trade-offs you are willing to make and the ones you are not. You carry all of that context in your head, and you re-inject the relevant pieces into every new prompt.

This is the translation labor that Chapter 1 described. You are the memory. You are the bridge between sessions. You are the continuity that the system lacks.

Persistent intent context would change this. Imagine a system that knows you are leading a product launch scheduled for September, that your three biggest concerns are channel readiness, pricing approval, and a competitor's announcement expected in August, that you prefer concise summaries with data over narrative reports, and that your SVP cares most about margin impact. That context does not change between Tuesday's session and Thursday's session. It changes when the launch date moves, or when a new concern surfaces, or when the SVP's priorities shift.

A system with persistent intent context does not need you to explain yourself every time. It already knows what you are working toward. It can interpret a partial, ambiguous, half-formed request in the light of goals it already understands.

The word "persistent" is doing real work here. Lots of systems have session-level context. They remember what you said five messages ago. Some have longer memory windows. But persistent intent context is something different. It is a structured representation of what you are trying to do at the level of your job, your projects, your role, maintained and updated over weeks and months, not minutes. It is more like a briefing document than a chat history. And it does not exist in any widely deployed consumer or enterprise AI product today.
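One way to make "briefing document, not chat history" concrete is to imagine the data structure such a context might be. Everything below, the field names included, is a hypothetical sketch drawn from the launch-lead example, not a description of any shipping product:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class IntentContext:
    """A persistent, structured model of what a person is working toward.

    Unlike a chat transcript, this survives between sessions and is
    updated when goals change, not when a browser tab closes.
    """
    role: str
    goals: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    preferences: list[str] = field(default_factory=list)
    stakeholders: dict[str, str] = field(default_factory=dict)
    last_reviewed: date = date(2026, 1, 1)  # placeholder review date

# The launch lead from the text, expressed as data rather than chat history:
ctx = IntentContext(
    role="Product launch lead",
    goals=["Ship the September launch"],
    constraints=["Pricing approval pending",
                 "Competitor announcement expected in August"],
    preferences=["Concise summaries with data over narrative reports"],
    stakeholders={"SVP": "cares most about margin impact"},
)
```

The interesting property is what is absent: there is no message log here. The record changes when the launch date moves or the SVP's priorities shift, not every time a session ends.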

Why not?

Not because it is technically impossible. Because nobody has done the organizational work of defining it. This is the gap I will come back to in the last two sections of this chapter. For now, just note that the first prerequisite for ambient operation is something the technology cannot generate on its own. The system needs to know your intent, and you, or your organization, have to articulate it.

Proactive inference means the system anticipates what you need rather than waiting for you to ask.

This sounds like magic. It is not. It is pattern matching combined with context. When your email client suggests "Sounds good, thanks!" as a reply to a scheduling confirmation, that is proactive inference at the most basic level. The system recognized a pattern (a confirmation email), inferred a likely response, and offered it before you typed anything. The inference is simple. The proactiveness is real.

Scale it up. A system that knows you are preparing for a board meeting next week (persistent intent context) and notices that the financial data you rely on was updated this morning could proactively generate the three slides you will need, formatted the way you prefer, with the specific comparisons the board has asked about in the last two meetings. It does this not because you asked for it, but because the combination of context and timing made the inference reliable.

The key word is "reliable." Proactive inference that is wrong is worse than no inference at all. If the system generates the wrong slides, you now have a new task: figuring out why the output is wrong and either correcting it or starting over. You've gained nothing and lost time. The cognitive exit cost from Chapter 2 applies with double force, because you weren't even expecting the interruption.

This is why proactive inference depends on persistent intent context. Without deep, accurate knowledge of what you are trying to accomplish, the system's guesses will be wrong often enough to destroy trust. And trust, once lost in a proactive system, is almost impossible to rebuild. A tool you invoke is a tool you chose to use. A system that acts on your behalf without being asked is one you have to trust. If it gets things wrong twice in a row, you will disable it. If it gets things wrong three times, you will never turn it on again.

So proactive inference is not about the model's raw capability. It is about the system's confidence in its understanding of your situation. That confidence comes from context, and context comes from organizational infrastructure, not from the model itself.

Closed-loop action authority means the system can act on its inferences without requiring you to approve every step.

This is the prerequisite that makes organizations most uncomfortable, and for good reason. Giving a system permission to act on your behalf, to send an email, to reschedule a meeting, to place an order, to update a document, without your explicit approval of each action, feels risky. It should feel risky. Mistakes at this level are visible. They affect other people. They carry consequences that a bad paragraph in a draft document does not.

And yet.

Without closed-loop action authority, the system cannot be ambient. It can be proactive. It can generate suggestions. It can queue up recommendations. But if every suggestion requires you to stop, evaluate, and approve, you are back in the prompt paradigm with a different coat of paint. The cognitive exit cost returns. The interruption to your working state returns. You are still the bottleneck.

The word "closed-loop" is important. It means the system acts, observes the result, and adjusts. If the calendar assistant reschedules a meeting and two attendees decline the new time, the system does not stop and ask you what to do. It finds the next available slot, sends new invitations, and updates your calendar. It loops. The loop closes without your involvement.

The boundaries of that loop matter enormously. No sensible person would give a system unlimited authority to act in their name. Closed-loop action authority is always bounded. The boundaries define what the system can do autonomously, what requires notification but not approval, and what requires your explicit decision before proceeding. Setting those boundaries is a design task, and it is an organizational task, and in most organizations nobody has done it yet.
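The boundary-setting task at the end of that paragraph can be made concrete. A minimal sketch, assuming a three-tier policy (act, notify, ask) plus a confidence floor; the tiers, the action names, and the calendar-assistant framing are mine, not a standard:

```python
from enum import Enum

class Authority(Enum):
    AUTONOMOUS = "act and close the loop without involving the human"
    NOTIFY = "act, then tell the human what was done"
    ASK = "propose, and wait for an explicit decision"

# A hypothetical boundary table for a calendar assistant.
POLICY = {
    "reschedule_internal_meeting": Authority.AUTONOMOUS,
    "reschedule_external_meeting": Authority.NOTIFY,
    "cancel_meeting": Authority.ASK,
}

CONFIDENCE_FLOOR = 0.85  # below this, escalate regardless of the table

def authorize(action: str, confidence: float) -> Authority:
    """Map an inferred action to a tier; low confidence always escalates."""
    if confidence < CONFIDENCE_FLOOR:
        return Authority.ASK
    return POLICY.get(action, Authority.ASK)  # unknown actions escalate too
```

Note where the design work lives: not in the model, but in the table and the threshold. Deciding which rows go in which tier, and where the confidence floor sits, is exactly the organizational task that, in most organizations, nobody has done yet.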

So there they are. Three prerequisites. Persistent intent context: the system knows what you want over time. Proactive inference: the system anticipates what you need before you ask. Closed-loop action authority: the system can act within defined boundaries without waiting for your permission at every step.

Each one depends on the one before it. You cannot have useful proactive inference without persistent intent context, because the system has no basis for its inferences. You cannot have safe closed-loop action authority without reliable proactive inference, because the system will take the wrong actions. The three prerequisites are sequential, and they are cumulative.

And here is the uncomfortable part: the bottleneck for all three is not the AI model. GPT-4 is smart enough to infer your needs from context if it had the context. It is capable enough to take action through tool-calling APIs if it had the authority. The bottleneck is the organizational infrastructure that would provide the context and define the authority.

The model is waiting for you. You are the missing piece.

Where the Best Deployments Actually Sit

Let me be honest with you about where the industry is right now.

I spend a lot of time talking to organizations about their AI deployments. The conversations follow a pattern. The team shows me what they have built. They are proud of it, and they should be. They have connected a language model to their internal documents. They have built a chatbot that can answer questions about company policy or product specifications. They have created workflow automations that draft emails, summarize meeting notes, generate reports from structured data. Some of them have gone further: custom-trained models on domain-specific data, agent-based systems that can execute multi-step tasks, integrations with CRM and ERP systems that allow the AI to pull and push live data.

This work is real. It took effort, money, and talent. It is generating real value.

It is also sitting between Point 1 and Point 2 on the ambient spectrum.

Sometimes barely reaching Point 2.

The chatbot that answers questions about company policy is a fully explicit system wearing a friendly face. The user has to know they have a question. They have to formulate the question. They have to type the question into the chatbot. They have to evaluate the answer. There is no persistent intent context. There is no proactive inference. There is no closed-loop action authority. The chatbot is a search engine with better manners.

The workflow automation that drafts emails is slightly further along. If it is triggered by an incoming email and generates a draft reply, it has a simple form of proactive inference: it detected a situation and proposed an action. But the user still reviews and approves every draft. There is no closed loop. And the "context" the system uses is the content of the single email thread, not a persistent model of the user's goals, communication preferences, or relationship dynamics.

Even the most sophisticated deployments I have seen, the agent-based systems that can execute multi-step tasks, operate with human approval gates at every significant junction. The agent researches, plans, and proposes. The human reviews, edits, and approves. Then the agent executes the approved plan. This is Point 2.5 at best. The system is doing more of the work, but the human is still the checkpoint at every stage.

I want to be careful here, because I am not saying these deployments are bad. They are appropriate for the current level of organizational trust, the current state of intent infrastructure, and the current risk profile that most enterprises can tolerate. Pushing further without the prerequisites in place would be irresponsible.

What I am saying is that the way these deployments are described, both internally and by the vendors who sell them, often implies a position on the ambient spectrum that they have not reached. The marketing materials say "autonomous agent." The reality is an agent that proposes actions and waits for a human to click "approve." The pitch deck says "proactive intelligence." The reality is a notification system that pushes alerts based on rules someone wrote three months ago.

The gap between description and reality is not deception. It is aspiration that has outrun the organizational infrastructure needed to support it.

Consider one of the most cited examples of advanced AI deployment: GitHub Copilot, the code-completion tool built on OpenAI models. Copilot watches what you type and suggests the next line of code. Sometimes it suggests the next five lines. Sometimes it fills in an entire function. It is context-triggered. It is proactive in a narrow sense: it offers code before you request it. It is, by many accounts, genuinely useful.

Where does it sit on the spectrum?

Point 3. Maybe 3.2.

It uses local context (the file you are editing, the function signature you just wrote) to infer what comes next. It does not maintain a persistent model of the project's goals, the team's coding conventions, the architectural decisions that shape what good code looks like in this specific codebase. It does not know that the function you are writing is part of a migration away from an old API, that the new API requires a specific error-handling pattern, that the team decided last sprint to deprecate the library Copilot is about to import. It has no closed-loop action authority. It suggests. You accept, reject, or modify. Every suggestion is a micro-interruption, a small cognitive exit, a moment where you shift from writing code to evaluating code someone else (something else) proposed.

And Copilot is one of the best in class.

The honest map of the industry looks like this: most production deployments are between Points 1 and 2. The best are at Point 3. Nobody is reliably at Point 4 for complex knowledge work. Point 5 remains a research aspiration for anything more complicated than spam filtering.

This is not a criticism of the technology companies. The models have improved at an extraordinary rate. Every quarter brings capabilities that would have seemed fictional two years ago. GPT-3 to GPT-4 was a leap that changed what was possible. The next generation will likely change it again.

But the bottleneck is not the model.

I keep saying this because it keeps being true, and because the industry keeps acting as if the next model release will solve it.

The bottleneck is the organizational infrastructure that sits between the model and the work. The persistent intent context that nobody has built. The proactive inference that nobody has validated. The closed-loop action authority that nobody has defined. These are not engineering problems. They are management problems. Leadership problems. Organizational design problems. They cannot be shipped in a software update.

The Organizational Capability That Makes It Possible

Why hasn't anyone built the intent infrastructure?

It is tempting to say the technology is not ready. And for some narrow technical requirements, that is true. But the bigger answer is simpler and more frustrating: nobody owns it.

Think about what persistent intent context would actually require. Someone in the organization would have to define, in structured form, what each team, each function, each role is trying to accomplish, at a level of specificity that a machine can act on. Not a mission statement. Not OKRs at the quarterly level. Something much more granular: the specific outcomes each person is working toward this week, the trade-offs they are willing to make, the constraints they are operating under, the standards they apply to their work, the relationships between their work and the work of the people around them.

Who does this?

Not IT. IT manages the tools. Not HR. HR manages the people. Not the AI vendor. The vendor provides the model. Not the individual user. The individual user has the knowledge but not the framework or the incentive to structure it.

This is an organizational capability that does not exist in most organizations because it has never been needed before. Before AI, no system was sophisticated enough to act on structured human intent. The intent lived in people's heads, expressed through conversations and emails and meetings and the accumulated culture of the team. It did not need to be machine-readable. It just needed to be human-readable, and most of the time it did not even need to be that, because people could infer from context and shared experience.

Now the context has changed. You have a system that could act on your behalf if it knew what you wanted. And it does not know what you want, because your organization has no mechanism for telling it.

Recall the factory analogy from Chapter 1. When electric motors arrived, factory managers bolted them onto existing machines. The layout did not change. The workflow did not change. It took thirty years and a generational turnover before someone realized the real opportunity was not faster machines but a different arrangement of machines. The motor was not the point. The reorganization was the point.

The AI model is not the point. The intent infrastructure is the point.

What would it look like, concretely? I can sketch the outlines, though the detailed architecture will come in later chapters.

At the individual level, it would be a structured profile of what you are working on, maintained and updated through a combination of your own input and system observation. The system watches what you do. It sees you editing a pricing proposal. It sees you scheduling calls with the channel team. It sees you pulling competitive data. From these signals, combined with information you provide (your project timeline, your goals, your constraints), it builds and maintains a model of your intent. Not a static document. A living model that updates as your work changes.

At the team level, it would be a structured map of how individual intents interact. My pricing proposal affects your channel readiness plan, which affects the launch timeline, which affects the SVP's board presentation. The system sees these connections because the intent data is structured, not because it read between the lines of your Slack messages.

At the organizational level, it would be a priority structure that tells the system how to make trade-offs when individual intents conflict. When the pricing proposal timeline and the channel readiness timeline cannot both be met, the system needs to know which one gives. It needs to know who decides. It needs to know how much latitude it has to propose alternatives.
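The three levels can be made concrete with a sketch. This is illustrative only: the field names, the rank-based conflict rule, and the example intents are my assumptions, not a real schema any organization uses.

```python
from dataclasses import dataclass, field

# An illustrative sketch of "machine-readable intent" at the three levels
# described above. Every field name here is an assumption, not a standard.

@dataclass
class IndividualIntent:
    owner: str
    outcome: str                                          # what this person is driving toward this week
    constraints: list[str] = field(default_factory=list)  # e.g. "legal review required"
    tradeoffs: list[str] = field(default_factory=list)    # e.g. "speed over polish"
    standards: list[str] = field(default_factory=list)    # quality bars the output must meet

@dataclass
class TeamIntentMap:
    # Directed dependencies between named intents:
    # ("pricing-proposal", "channel-readiness") means the first feeds the second.
    dependencies: list[tuple[str, str]] = field(default_factory=list)

@dataclass
class OrgPriorities:
    rank: dict[str, int] = field(default_factory=dict)    # lower rank = higher priority
    decider: dict[str, str] = field(default_factory=dict) # who resolves each conflict

    def which_gives(self, a: str, b: str) -> str:
        """Return the intent that yields when a and b cannot both be met."""
        # Unranked intents default to 0, i.e. highest priority; a real system
        # would treat a missing rank as an error to surface, not a default.
        return a if self.rank.get(a, 0) >= self.rank.get(b, 0) else b

priorities = OrgPriorities(rank={"launch-timeline": 1, "channel-readiness": 2})
print(priorities.which_gives("launch-timeline", "channel-readiness"))  # channel-readiness
```

The point of the sketch is not the data structure. It is that someone has to decide what goes in the `rank` table, and that decision is management work, not engineering work.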

This is hard work. It is the kind of work that organizations resist because it requires making implicit knowledge explicit, turning the unwritten rules and unspoken priorities and culturally understood trade-offs into something structured enough for a machine to process.

And here is the thing that makes it doubly hard: the output of this work is not visible. If you spend six months building a beautiful intent infrastructure, nobody sees it. There is no demo. There is no screenshot for the all-hands meeting. There is no "12,000 hours saved this quarter." The payoff comes later, when the system starts operating at Point 4 on the ambient spectrum, when people stop being interrupted by their tools, when the cognitive exit costs from Chapter 2 start to shrink, when the organization's AI investment stops hitting the ceiling described in Chapter 1.

The payoff is real but delayed. The cost is immediate and visible. This is the worst possible profile for getting organizational investment, which is why most organizations have not started.

But some have. I will describe what they are doing in the proving grounds chapters later in this book. For now, the point is this: the gap between where most AI deployments sit (Points 1-2 on the ambient spectrum) and where the technology could support them (Points 3-4) is not a technology gap. It is a management gap. It is the gap between having a powerful model and having a structured way to tell it what you want.

The organizations that close this gap first will not have better AI. They will have better-defined intent. And that will make all the difference.

The Leader's New Question

So here you are. You have deployed AI tools. You have seen the gains. You have felt the ceiling. You have read two chapters of this book explaining why the ceiling exists, and now you have a spectrum and three prerequisites that explain what sits on the other side of it.

What do you do?

The default instinct is to ask a familiar question: how do I use this tool better? How do I get my team to use this tool better? What training, what incentives, what workflows will squeeze more value out of the current paradigm?

That is the wrong question.

It is not wrong because the answer would be unhelpful. Better training produces better results. Better prompts produce better outputs. These are real gains and they are worth capturing. But they are gains within a paradigm that has a structural ceiling, the ceiling Chapter 1 described, and no amount of optimization within the paradigm will raise it.

The right question is different, and it is harder. It is this:

How do I define my intent with enough precision that a system can act on it without interrupting me?

Read that again. It is the question this book is built around.

The question has a strange shape. It sounds like it is about the system, but it is actually about you. About your organization. About the degree to which you can articulate what you are trying to accomplish, the constraints you are operating under, the trade-offs you are willing to make, and the standards you apply to the work that comes out the other end.

Most leaders cannot answer this question today. Not because they are not smart. Because they have never had to. Their intent has always been expressed through human channels: conversations with direct reports, feedback on deliverables, comments in meetings, the accumulated pattern of what they approve and what they send back. These are rich signals, but they are unstructured. They live in the minds of the people who work with the leader, not in any system.

A system cannot act on signals it cannot read. And the signals that leaders produce are, for the most part, illegible to machines.

This is the intent articulation challenge. It is going to run through the rest of this book. In each of the four proving grounds I will describe in later chapters, the difference between organizations that reach Points 3-4 on the ambient spectrum and organizations that remain stuck at Points 1-2 will come down to this: how well they defined their intent.

Not how good their models were. Not how much they spent on infrastructure. Not how many data scientists they hired. How well they articulated what they wanted.

Consider a practical example. You run a customer success team. Your goal is to reduce churn among mid-market accounts. You have deployed an AI system that monitors customer health signals: usage data, support ticket history, NPS scores. The system can identify at-risk accounts and generate recommended actions.

At Point 1 on the ambient spectrum, your team queries the system each morning: "Show me the accounts most at risk of churning this month." They review the list, decide which accounts to prioritize, and figure out what to do about each one.

At Point 3, the system proactively pushes alerts when an account's health score drops below a threshold, with a suggested intervention attached.

At Point 4, the system detects the drop, selects an intervention from a playbook you approved, and executes it: sends a personalized check-in email from the account manager, schedules a QBR, escalates to the VP of Customer Success if the account's ARR exceeds a threshold. The account manager is notified after the fact. They review what was done. They override if needed. The loop closes.

The difference between Point 1 and Point 4 is not model capability. The model is the same. The difference is that at Point 4, someone defined the intent infrastructure: which signals matter, what thresholds trigger action, which interventions are appropriate for which situations, who gets notified, who gets escalated, what the boundaries of autonomous action are. Someone did the hard, unglamorous, invisible work of making the organization's intent machine-readable.
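That playbook definition is the kind of thing that can be written down precisely. Here is a minimal sketch of the Point-4 logic, with the caveat that the threshold values, intervention names, and escalation rule are all hypothetical; a real deployment would derive them from its own segment strategy.

```python
# A sketch of the Point-4 closed-loop playbook described above.
# All numbers and action names are illustrative assumptions.

HEALTH_ALERT_THRESHOLD = 60   # health score below this triggers action
ESCALATION_ARR = 250_000      # accounts above this ARR escalate to the VP

def select_actions(health_score: int, arr: int) -> list[str]:
    """Map a detected health drop to pre-approved interventions."""
    if health_score >= HEALTH_ALERT_THRESHOLD:
        return []                                 # healthy account: do nothing
    actions = ["send_checkin_email", "schedule_qbr"]
    if arr > ESCALATION_ARR:
        actions.append("escalate_to_vp")
    actions.append("notify_account_manager")      # after-the-fact review, per Point 4
    return actions

print(select_actions(health_score=48, arr=300_000))
# ['send_checkin_email', 'schedule_qbr', 'escalate_to_vp', 'notify_account_manager']
```

Notice what the code does not contain: any AI. The model detects the drop; the playbook is pure organizational intent, and writing it is the leader's job.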

That someone is you. Or it should be.

You expected your AI vendor to solve this. They cannot. They can give you the model, the APIs, the agent framework. They cannot tell you what your customer success team should do when an account's health score drops, because that depends on your segment strategy, your relationship history, your pricing flexibility, your competitive situation. That is your intent. Only you can define it.

This is a new kind of leadership work. It does not look like traditional strategic planning, which operates at a level of abstraction too high for a system to act on. ("Reduce churn by 15% this year" is a goal, not intent. It tells the system what to measure but not what to do.) It does not look like traditional operational management, which provides instructions to people who can fill in the gaps with judgment and experience. It looks like something in between: detailed enough for a machine to execute, flexible enough to accommodate the real-world variability that any business encounters.

I sometimes call this machine-readable intent, and I know the phrase sounds cold. It is cold. But it is precise, and precision is what this chapter promised.

Here is what I want you to carry forward. The question is no longer "how do I use this tool better." The question is "how do I define what I want with enough clarity that the tool stops being a tool and starts being an environment."

The distinction between a tool and an environment is the whole game. A tool waits. An environment acts. A tool requires your attention. An environment preserves it. A tool sits at Point 1 on the spectrum. An environment operates at Point 4.

You are not missing the right technology. You are missing the right description of your own intent.

That is what ambient intelligence actually is. It is not a product. It is not a feature. It is not a thing you buy. It is the degree to which your organization has done the work of making its intent legible to systems that are already capable of acting on it. It is a measure of organizational self-knowledge made operational.

Every chapter after this one will be about building that self-knowledge in specific domains, with specific methods, against specific resistance. The proving grounds are where the theory meets the org chart.

But the starting point is here. The ambient spectrum. The three prerequisites. The honest assessment of where you are today. And the question that will not go away no matter how much you invest in better models:

What do you actually want?

And can you say it clearly enough that a machine would understand?

Software development answered that question first. Faster, and more completely, than anyone expected.

Chapter 4

Proving Ground One: Software Development

The Domain Where the Ambient Shift Arrived First and What It Reveals

From Autocomplete to Autonomous: The Arc in Eighteen Months

Here is a number worth remembering: eighteen months.

In January 2023, the most advanced AI-assisted coding tool in wide use was a glorified autocomplete. It watched you type, guessed the next line, and offered it in gray text. You pressed Tab or you didn't. That was the interaction. It was useful the way spellcheck is useful — catching what your fingers already knew but your brain hadn't bothered to finalize.

By the middle of 2024, engineering teams at a handful of companies were using agentic coding systems that could take a task description, read the existing codebase, plan an implementation across multiple files, write the code, run the tests, interpret the failures, fix the errors, and submit a working pull request. The engineer typed a paragraph. The system did the rest.

Eighteen months from autocomplete to autonomous implementation.

No other knowledge-work domain has moved this fast. And because software development moved first, it shows us things about the ambient shift that other domains have not yet had to confront. This chapter is about what it shows us.

Let me trace the arc in concrete steps, because each step represents a specific reduction in what Chapter 2 called cognitive exit cost, and the pattern of those reductions reveals the mechanics of the ambient shift more clearly than any theory could.

Step one: line completion. This is where GitHub Copilot started when it launched publicly in mid-2022. You wrote a function signature. The tool suggested the body. You wrote a comment describing what you wanted. The tool guessed the code that matched the comment. The interaction sat firmly at Point 3 on the ambient spectrum from Chapter 3: context-triggered suggestion. The system watched, inferred, and offered. You accepted or rejected.

The cognitive exit cost was low but constant. Every suggestion required a micro-evaluation. Is this line correct? Does it match my intent? Does it handle edge cases? Does it use the right library? Each evaluation pulled you out of the generative flow of writing code and into the analytical mode of reviewing someone else's code. This is a real cost. Ask any programmer about the difference between writing code and reviewing code. They are different mental states, and switching between them fifty times an hour has a price.

Step two arrived fast. Multi-line, then multi-block, then function-level generation. By late 2023, the tools were not suggesting the next line. They were suggesting the next thirty lines. You described a function in natural language, and the system wrote it. You described a class, and the system structured the whole thing. The granularity of the suggestion went up, which meant the frequency of micro-evaluations went down. Instead of evaluating fifty one-line suggestions per hour, you were evaluating five thirty-line suggestions per hour.

This seems like a simple quantitative change. It is not. It changed the nature of the engineer's attention. At one line per suggestion, you are still in the code. You are thinking about syntax, variable names, logic flow. You are a writer with an aggressive spellchecker. At thirty lines per suggestion, you are no longer in the code. You are above it. You are reading a proposed implementation against your mental model of what the implementation should do. You have shifted from author to editor.

That shift is the seed of everything that follows.

Step three: codebase-aware generation. Early tools had a narrow context window. They could see the file you had open, maybe a few related files. They did not know the rest of your project. They did not know your team's conventions. They did not know that the function they were suggesting duplicated one that already existed in a utility module three directories away.

By early 2024, several tools had expanded their context to include large portions of the codebase. Some ingested the whole repository. Some used retrieval mechanisms to pull in the most relevant existing code. The quality of suggestions jumped, not because the underlying model got smarter, but because the context got richer. The system stopped suggesting generic code and started suggesting code that fit the specific project.

This is persistent intent context at the project level. Not the full organizational intent infrastructure described in the previous chapter, but a meaningful version of it. The system knows what this codebase looks like. It knows the patterns already established. It generates code that is consistent with what exists. The engineer spends less time correcting stylistic mismatches and architectural violations, which means fewer evaluations, which means lower cognitive exit cost.

Step four is where it gets strange.

Step four: agentic implementation. Starting in mid-2024 and accelerating through early 2025, a new category of tool appeared. These were not suggestion engines. They were agents. You gave them a task. "Add a rate limiter to the API gateway. It should use a sliding window algorithm, respect the per-customer limits in the config table, and return a 429 status with a Retry-After header when the limit is exceeded." The agent read the codebase, identified the relevant files, planned the implementation, wrote the code across multiple files, ran the existing test suite, wrote new tests for the new behavior, identified a failing test caused by an unanticipated interaction with the caching layer, fixed the interaction, ran the tests again, and opened a pull request with a description of what it did and why.
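The rate-limiting task in that description is concrete enough to sketch, which is exactly why an agent can implement it. Here is a minimal sliding-window limiter of the kind the task asks for; the per-customer limits dict and the tuple return stand in for the config table and the HTTP 429 handling, which are simplified assumptions on my part.

```python
import time
from collections import defaultdict, deque

# A minimal sketch of the sliding-window behavior the task description
# names. The limits dict stands in for the config table; the caller is
# assumed to translate (False, retry_after) into a 429 with Retry-After.

class SlidingWindowLimiter:
    def __init__(self, limits: dict[str, int], window_seconds: float = 60.0):
        self.limits = limits                       # per-customer request limits
        self.window = window_seconds
        self.hits: dict[str, deque] = defaultdict(deque)

    def check(self, customer: str, now=None):
        """Return (allowed, retry_after_seconds)."""
        now = time.monotonic() if now is None else now
        q = self.hits[customer]
        while q and q[0] <= now - self.window:     # evict hits outside the window
            q.popleft()
        if len(q) >= self.limits.get(customer, 0):
            retry_after = q[0] + self.window - now # when the oldest hit expires
            return False, retry_after
        q.append(now)
        return True, 0.0

limiter = SlidingWindowLimiter({"acme": 2}, window_seconds=60.0)
print(limiter.check("acme", now=0.0))   # (True, 0.0)
print(limiter.check("acme", now=1.0))   # (True, 0.0)
print(limiter.check("acme", now=2.0))   # (False, 58.0)
```

Thirty lines, one algorithm, one data structure. The hard part of the task was never the code. It was the sentence that specified the behavior.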

The engineer's role in this interaction was: write the task description. Then wait. Then review the pull request.

You can feel the gap between this and line-completion. It is not a gap in degree. It is a gap in kind. The engineer did not write code. Did not read code while it was being written. Did not make implementation decisions. Did not debug. The engineer specified what should happen, and the system made it happen.

On the ambient spectrum, this is somewhere between Point 3 and Point 4. The system inferred the implementation approach from context. It took action (writing code, running tests) without asking for approval at each step. It closed the loop when tests failed by fixing and re-running. The human set the boundaries (the task description) and reviewed the outcome (the pull request). The cognitive exit cost during implementation was zero, because the engineer was not involved during implementation.

That last sentence deserves to sit by itself for a moment.

The cognitive exit cost during implementation was zero.

This is the thing the ambient spectrum predicts and that software development proved first: when you move up the spectrum, the human's attention is no longer consumed by the execution of the work. It is consumed by something else. What that something else is, and whether organizations know how to value it, is the subject of the rest of this chapter.

The eighteen-month arc from autocomplete to agentic implementation happened in public. Anyone paying attention to the software development tooling space watched it unfold in real time. But most organizations did not update their model of what engineering work is, what engineers spend their day doing, or what they should be evaluated on. The tools changed. The job descriptions did not. The tools changed. The hiring criteria did not. The tools changed. The management assumptions did not.

When factories first electrified, most just bolted motors onto the existing steam-driven machines and kept the old floor plan. They got some efficiency gain and missed the real transformation. That is where most engineering organizations are today. The technology arrived. The reorganization has not.

The Team That Stopped Writing Code

I want to tell you about a team. I am going to change some details to protect identities, but the operating facts are real.

This team builds and maintains a set of backend services for a mid-size SaaS company. Eight engineers. One engineering manager. The services handle user authentication, billing, and data export for the company's enterprise customers. It is not glamorous work, but it is critical. When billing breaks, revenue stops.

In early 2024, the team adopted an agentic coding tool for a subset of their work. They started with bug fixes and small feature additions, tasks where the scope was well-defined and the risk of a bad implementation was limited. The engineering manager imposed a rule: every agent-generated pull request had to be reviewed by a human engineer with the same rigor as a human-written pull request. Same code review standards. Same test coverage requirements. Same approval process.

For the first month, the results were mixed. The agent produced code that worked but did not match the team's internal conventions. Variable naming was off. Error handling patterns were inconsistent with the rest of the codebase. The agent-generated pull requests needed almost as much revision as they would have taken to write from scratch. Some engineers started calling the agent "the intern."

Then something shifted.

The team started writing better task descriptions. Not because management told them to. Because they noticed that the quality of the agent's output was directly proportional to the precision of the input. Vague task descriptions produced vague implementations. Specific task descriptions, ones that named the exact files to modify, referenced the existing patterns to follow, specified the error-handling approach, and defined the test cases, produced implementations that needed minimal revision.

By the third month, two things had changed. First, the team had developed a shared format for task specifications. It was not formally documented at that point, just a pattern that emerged from the engineers copying the task descriptions that had worked well. Second, the time engineers spent writing code had dropped by roughly 40 percent. The time they spent writing task specifications and reviewing agent output had increased by roughly the same amount.

The total hours worked did not change much. The composition of those hours did.

Here is what a typical day looked like for an engineer on this team by month four. Morning: review pull requests from the agent's overnight work. (The team had configured the agent to pick up tasks from a queue and work asynchronously.) This took about two hours. The engineer was reading code, running through test results, checking edge cases, verifying that the implementation matched the specification. This is code review, the same activity engineers have always done, but now it constituted the largest block of the day.

Mid-morning to early afternoon: write task specifications for the next batch of work. The engineer would take a feature request or bug report, analyze the existing codebase to understand the scope of the change, identify the files and patterns involved, and produce a detailed specification that the agent could implement. This was the intellectually demanding part of the day. It required deep understanding of the system architecture, awareness of dependencies, and the ability to anticipate edge cases that the agent might miss.

Late afternoon: handle the tasks that the agent could not. Some work was too ambiguous, too architecturally significant, or too entangled with other systems for the agent to handle reliably. These tasks still required human implementation. But they were a shrinking share of the total.

The engineering manager told me something I have been thinking about ever since. She said: "My best engineers were already the ones who were best at this. They were the ones who could look at a bug report and immediately see the shape of the fix, who could describe what needed to happen before they started typing. The agent just made that skill the whole job."

The team's internal hierarchy shifted. The most valued engineers were no longer the fastest coders. They were the clearest thinkers. The engineer who could produce a task specification that the agent executed perfectly on the first try became more valuable than the engineer who could write elegant code by hand but produced vague specifications that the agent botched.

One engineer, a mid-level developer who had been considered a solid-but-not-spectacular performer, turned out to have an unusual gift for writing specifications. She was precise, systematic, and had an instinct for the edge cases that would trip up the agent. Her productivity, measured in completed tasks, roughly tripled. She did not write a single line of code in her last two months on the tools. She specified.

Another engineer, a senior developer who had been the team's best pure coder, struggled with the transition. He found writing specifications tedious and imprecise compared to the direct control of writing code himself. His review of agent-generated pull requests was often a rewrite rather than a review. He was, functionally, still writing code, just doing it in the most indirect and inefficient way possible.

The engineering manager had a problem she had never faced before. Her performance evaluation criteria measured code quality, code volume, test coverage, and architectural contributions. The engineer who was now most productive had zero code commits. The engineer who was struggling had the most commits, because he kept rewriting the agent's output. The metrics said the wrong person was winning.

She rewrote the criteria. Not because a management consultant told her to. Because the old criteria had become fictions.

What does the engineering manager herself do now? Before the agent, she spent most of her time on project planning, sprint management, code review of the hardest pull requests, and the usual management tasks: one-on-ones, hiring, cross-team coordination. Her technical contribution was concentrated in architecture decisions and the review of complex changes.

Now she spends most of her technical time on something she calls "system intent documentation." She writes and maintains a set of documents that describe, at a high level, how each service should behave, what its boundaries are, what patterns new code should follow, and what trade-offs are acceptable. These documents are not for humans, or not only for humans. They are reference material for the agent. When an engineer writes a task specification, they point the agent at the relevant intent document for context.

She is, without using the phrase, building standing context at the team level: a shared, documented description of what the system is meant to do, which tells an agent how new work should fit in. She is doing it because the tool forced her hand.

The team's output, measured in features shipped and bugs resolved, increased by what she estimated to be 60 to 70 percent over six months. The team did not grow. The hours did not increase. The composition of the work changed, and the change unlocked capacity that had been locked inside implementation labor.

Was it smooth? No. Two engineers left, both voluntarily, because they preferred writing code to specifying and reviewing it. They went to companies where hand-written code was still the norm. The engineering manager spent uncomfortable weeks rethinking what "senior" meant on her team. And there was a production incident in month five caused by an agent-generated change that passed all tests but introduced a subtle data-consistency bug that only manifested under high concurrency. The postmortem found that the task specification had not mentioned concurrency requirements, because the engineer writing it had not thought about them.

That incident taught the team more about the ambient shift than any success did. The bug was not in the code. The bug was in the specification. The system did exactly what it was told. What it was told was incomplete.

Intent as the Artifact

In traditional software engineering, the primary artifact is code. The thing the engineer produces, the thing that gets reviewed, tested, deployed, and maintained, is the implementation. Design documents exist. Architecture diagrams exist. But they are secondary. They describe intentions. The code is the execution. When the design and the code disagree, the code is right, because the code is what runs.

This ordering is so deeply embedded in engineering culture that it feels like a law of nature. It is not. It is an artifact of a world where the gap between intent and implementation could only be crossed by a human writing code. The design document could not compile itself. The architecture diagram could not deploy. A person had to sit down and translate the intent into instructions the machine could execute. The implementation was the artifact because there was no other way to get from intent to execution.

Now there is another way.

When an agentic system can take a sufficiently precise description of desired behavior and produce a working implementation, the locus of value shifts. The implementation is still real. It still runs. It still matters. But it is no longer the thing that required the human's most sophisticated thinking. The human's most sophisticated thinking went into the specification. The specification is the artifact.

I want to be precise about what I mean by "specification" here, because the word carries baggage from an older era of software development. I do not mean a 200-page requirements document in formal notation. I do not mean UML diagrams. I do not mean the waterfall-era fantasy of fully specified requirements handed over a wall to implementers.

I mean something more like this: a clear, complete description of what the system should do, under what conditions, with what constraints, following what patterns, handling what edge cases, at a level of precision that an agentic system can implement it without asking clarifying questions.

That last clause is where the skill lives. "Without asking clarifying questions."

Consider the difference between two task specifications:

First version: "Add rate limiting to the API."

Second version: "Add rate limiting to the /api/v2/export endpoint. Use a sliding window algorithm with a 60-second window. Per-customer limits are stored in the customer_config table in the rate_limit_per_minute column. When the limit is exceeded, return HTTP 429 with a Retry-After header set to the number of seconds until the window resets. Log rate-limit events to the structured log with customer_id, endpoint, current_count, and limit. Do not rate-limit requests from internal service accounts, which are identified by the X-Internal-Service header. Follow the error-handling pattern established in /api/v2/billing/handler.go."

The first version is a prompt. The second version is an intent artifact. The difference between them is the same as the difference between telling a contractor "make the kitchen bigger" and giving them architectural drawings with dimensions, materials, electrical layouts, and plumbing specifications.

The first version will produce a result. It might even produce a good result, if the agent can infer enough from the codebase context. But it will probably require clarification, revision, back-and-forth. It is a starting point for a conversation, not a specification for execution.

The second version is something an agentic system can act on. It specifies behavior, constraints, patterns, and references. It tells the system not just what to do but how it should fit into the existing system. It is a shared, documented description of what the system is meant to do, the kind of standing context that tells an agent how new work should fit in.
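To make the contrast concrete, here is a minimal sketch of the sliding-window behavior the second specification pins down. It is illustrative Python rather than the Go of the referenced codebase, and every name in it (SlidingWindowLimiter, the check method) is invented for this example, not drawn from any real system.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Illustrative sliding-window limiter matching the example spec:
    60-second window, per-customer limits, deny with a retry delay."""

    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self.events = defaultdict(deque)  # customer_id -> request timestamps

    def check(self, customer_id, limit, now=None, internal=False):
        """Return (allowed, retry_after_seconds)."""
        if internal:  # internal service accounts are never rate-limited
            return True, 0
        now = time.monotonic() if now is None else now
        q = self.events[customer_id]
        # Drop timestamps that have slid out of the 60-second window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= limit:
            # Seconds until the oldest request leaves the window.
            retry_after = int(self.window - (now - q[0])) + 1
            return False, retry_after  # caller responds HTTP 429
        q.append(now)
        return True, 0
```

A caller that receives (False, retry_after) would return HTTP 429 with the Retry-After header set accordingly; the structured logging fields and the per-customer limit lookup from the specification would wrap this core. The point is that every branch here was decided by the specification, not inferred by the agent.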

Writing the second version requires more engineering skill than writing the first. It requires the engineer to have already thought through the edge cases, the architectural implications, the interaction with existing systems. It requires the kind of deep system understanding that senior engineers have and junior engineers are still developing.

This is the inversion that the ambient shift produces in software development. Under the old model, deep system understanding was valuable because it enabled the engineer to write better code. Under the new model, deep system understanding is valuable because it enables the engineer to write better specifications. The skill is the same. The output is different. And the output is what gets valued, measured, and rewarded.

Is a specification code? In some meaningful sense, yes. It is an instruction set for a machine. But it operates at a higher level of abstraction. It describes what, not how. It specifies behavior, not mechanism. The agentic system handles the how. The human handles the what and the why.

This has implications that go far beyond tooling. If intent is the artifact, then the quality of the artifact depends on the clarity of the intent. And the clarity of the intent depends on how well the engineer understands the system they are specifying changes to. Which means the ambient shift does not reduce the need for engineering expertise. It concentrates that expertise on a different output.

The engineers on the team I described in the previous section figured this out through practice. The ones who thrived were the ones who could hold the whole system in their heads and express their understanding with precision. The ones who struggled were the ones whose understanding was implicit, embedded in their coding habits rather than available for explicit articulation.

There is a phrase in cognitive science for the difference: tacit knowledge versus explicit knowledge. A skilled carpenter has tacit knowledge of wood grain, tool angles, and joint strength that they cannot easily put into words but that shows up in the quality of their work. If you gave that carpenter an apprentice who could execute any instruction perfectly but could not infer anything unstated, the carpenter would have to make their tacit knowledge explicit. That is exactly what the ambient shift demands of software engineers.

Some will find this freeing. They always wanted to work at the level of system design, and the obligation to also implement their designs was the tedious part. For these engineers, the agentic system is what they were waiting for.

Some will find it threatening, because their value was in their implementation skill, in their ability to write tight, efficient, elegant code. That value does not disappear, but it migrates. It shows up in the ability to evaluate agent-generated code, to catch the subtle bugs that a less skilled reviewer would miss, to know when the implementation is correct even though it is not the implementation they would have written.

And some will find it disorienting, because the craft they spent years developing, the feel for code, the rhythm of debugging, the satisfaction of a working function that they wrote themselves, is no longer the center of the work. It has been moved to the periphery. The center is now a document. A specification. A description of intent.

That is an emotional transition, not just a professional one. And organizations that pretend it is purely professional will lose people they did not need to lose.

What Changes About Leadership When Code Writes Itself

If you manage a team of engineers, here is your new problem: the thing you used to manage is no longer the thing that matters most.

You used to manage implementation. Your job was to allocate engineering time against a backlog of features and fixes, make sure the code was good, keep the architecture from degrading, and ship on schedule. The scarce resource was implementation capacity. Your team had X engineers, each engineer could write Y amount of code per sprint, and your job was to maximize the quality and alignment of that code with business needs.

When agentic systems handle implementation, the scarce resource changes. Implementation capacity becomes elastic. The constraint is no longer how many engineers you have. The constraint is how well your team can specify what needs to be built, how accurately your intent documentation describes the system, and how reliably your review process catches the things the agent gets wrong.

These are different management problems.

Consider architectural intent. In a traditional engineering team, the architecture is maintained through a combination of formal decisions (architecture review boards, design documents, ADRs) and informal culture (code review feedback, pair programming, mentorship). A senior engineer who consistently names variables well, structures error handling consistently, and writes clean abstractions teaches the team's architecture by example. The codebase itself is a teaching artifact.

An agentic system does not learn from example the way a junior engineer does. It does not absorb the team's culture through osmosis. It reads the codebase, and it can match existing patterns if they are consistent enough, but it does not understand why those patterns exist. It will cheerfully introduce a new pattern that contradicts an architectural decision made six months ago if that decision is not documented in a place the agent can access.

The engineering leader's job, then, is to make the architecture legible. To write down the things that used to be transmitted through culture. To produce documents that say: here is why we use this pattern, here is when it applies, here is what the alternatives are and why we rejected them. Not for the human engineers, who already know. For the agent, which does not.

This is the "system intent documentation" that the engineering manager in my earlier example was producing. She stumbled into it. Most engineering leaders have not.

Then there is quality signal design. When a human writes code, code review catches the problems. A reviewer reads the code, understands the intent, and flags discrepancies. This works because the reviewer is a human with the same tacit knowledge as the author. They know what "good" looks like in this codebase without having to define it formally.

When an agent writes code, the reviewer is still human. But the volume of code to review has increased dramatically, because the agent produces code much faster than a human does. And the types of problems are different. Agent-generated code tends to be syntactically correct, well-structured, and logically coherent at the local level. The bugs are at the integration level, the system level, the "this works perfectly in isolation but creates a subtle problem when combined with that other service" level.

Catching these bugs requires a review process designed for them. It requires automated quality signals that go beyond unit tests: integration tests, contract tests, performance benchmarks, data-consistency checks. The engineering leader has to design these signals, because the standard testing infrastructure that was sufficient for human-authored code is not sufficient for agent-authored code at scale.

Why not? Because human authors were slower. A human engineer writing code for two days and submitting one pull request generates a manageable review load. An agentic system processing a task queue overnight and submitting fifteen pull requests generates a review load that exceeds what the human team can absorb by morning. If the quality signals are not automated, the review becomes the bottleneck, and the entire productivity gain of the agentic system is consumed by the review process.

This is the operational reality that most discussions of AI coding tools skip.

The third change is the hardest to describe but may be the most important. It is the responsibility for system coherence.

A codebase written entirely by humans develops coherence organically. The humans who write it talk to each other. They share mental models. They argue about design decisions in pull request comments. The codebase, over time, reflects a shared understanding, imperfect, sometimes contradictory, but real. The architecture is a social artifact as much as a technical one.

A codebase partially written by an agent does not develop coherence organically. The agent does not have a mental model. It does not carry context from one task to the next unless that context is explicitly provided. If Engineer A specifies a task that introduces pattern X, and Engineer B specifies a task the next day that introduces pattern Y, and X and Y are architecturally incompatible, the agent will implement both without complaint.

The codebase starts to drift. After six months, you have three different error-handling patterns, two competing approaches to database access, and an authentication module that makes assumptions the billing module contradicts.

After a year, an engineer opens a file and cannot tell which pattern is canonical. A new feature interacts with two modules built on incompatible assumptions, and the integration test passes because each module's tests were written against its own logic. The bug surfaces in production on a Friday afternoon, in a payment flow, when a retry mechanism built under pattern X sends a duplicate charge because the idempotency guard was built under pattern Y. The postmortem takes a week. The root cause is not any single specification. The root cause is that nobody maintained the whole.

That maintenance used to be distributed, emerging from the arguments in pull request comments, the shared habits in the codebase, the senior engineer's instinct for what fit. Agents don't inherit any of it. Now it has to be written down, and someone has to own the writing.

That someone is the engineering leader.

The job title stays the same. The job does not.

You are no longer primarily a manager of people producing code. You are the maintainer of a system's intent. You are the person who knows what the system should be, who can articulate that vision with enough precision for both humans and agents to act on it, and who catches the drift when individual specifications start pulling the architecture in incompatible directions.

This is harder than managing implementation. Implementation management is a logistics problem. Intent management is a clarity problem.

The Hiring Question Nobody Is Asking Yet

In 2024, the standard job posting for a mid-level software engineer looked identical to one from 2021. Required: 3-5 years of experience. Proficiency in Python, JavaScript, or Go. Experience with cloud infrastructure. Familiarity with CI/CD pipelines. Strong problem-solving skills. Good communication.

Nothing about specification writing. Nothing about intent articulation. Nothing about the ability to review agent-generated code at scale. Nothing about architectural thinking at the level required when you are directing an agent rather than typing the code yourself.

These postings describe a job that is disappearing. They do not describe the job replacing it.

I want to be careful here. I am not saying implementation skill is worthless. It is not. An engineer who cannot read code, cannot understand algorithms, cannot reason about system behavior at the implementation level, cannot write good specifications. You have to understand the machine before you can direct it. A film director who has never held a camera can still make a great film, but a film director who does not understand light, composition, and editing will make a bad one.

The question is not whether implementation skill matters. It is whether implementation skill is the primary thing you should be selecting for when you hire.

If the daily work of an engineer is shifting from writing code to specifying behavior and reviewing agent output, then the primary hiring criteria should test for the skills that work requires. Can this person decompose a complex business requirement into precise, testable behavioral specifications? Can this person read a body of unfamiliar code and identify architectural patterns, inconsistencies, and risks? Can this person communicate system constraints in a way that is both technically accurate and actionable by an agentic system?

These are related to traditional engineering skills. They are not identical to them. And the standard technical interview, which tests whether a candidate can write a sorting algorithm on a whiteboard or invert a binary tree in forty-five minutes, tells you almost nothing about them.

Here is the uncomfortable question that most engineering organizations have not yet confronted: what do you do with the engineers you already have?

Not the ones who are thriving in the transition. They are fine. I mean the ones whose primary value has been their ability to write excellent code, and whose ability to articulate intent at the specification level is average or below.

Some of them will adapt. Given the right support, the right training, and enough time, they will develop the specification skills that the new work requires. Some of them are already good at this and just never had reason to demonstrate it, because the job never asked for it.

Some will not adapt, not because they lack intelligence but because the skill of explicit articulation is genuinely different from the skill of implicit execution. There are brilliant musicians who cannot teach. There are brilliant athletes who cannot coach. There are brilliant coders who cannot specify. The knowledge is real, but it is locked in a form that does not transfer to the new output format.

Organizations owe these people honesty. Not the comfortable lie that "nothing is really changing" and not the brutal announcement that "your skills are obsolete." Something harder. Something like: the center of this work has moved, and we are going to invest in helping you move with it, and we are going to be honest about what "moved with it" looks like.

Most organizations will not do this. Most organizations will keep hiring for the old criteria, keep promoting for the old criteria, keep evaluating for the old criteria, until the gap between what the team does and what the evaluation system measures becomes so wide that something breaks. A top performer who writes zero code will be ranked in the bottom half by the performance system. A struggling engineer who rewrites every agent pull request by hand will look productive because the commit counts are high. The metrics will lie, and the lies will compound, and eventually the best specification writers will leave for organizations that value what they actually do.

This is not a theoretical risk. The team I described earlier already saw it.

The promotion question is just as thorny. In most engineering organizations, the path to senior engineer and beyond runs through technical contributions measured in code. Design documents help. Architecture decisions help. But the backbone of the promotion packet is the code you shipped, the systems you built, the technical problems you solved through implementation.

If the primary artifact is specification, what fills the promotion packet? A collection of task descriptions, no matter how precise and well-crafted, does not look like a senior engineering contribution in the current evaluation culture. It looks like project management. The engineer who specified a complex feature that the agent implemented flawlessly will have a hard time demonstrating the depth of their contribution, because the visible output, the code, has someone else's name on it. Something else's name.

This is a measurement problem masquerading as a talent problem. The talent is there. The measurement system cannot see it.

And then there is the most uncomfortable question of all. If the ambient shift makes specification the primary skill and the primary artifact, and if specification is a skill that can be tested and developed, then the pool of people who can do engineering work expands dramatically. A product manager who can write a precise behavioral specification is doing engineering work, even if they have never written a line of code. A QA engineer who can describe test scenarios with enough specificity for an agent to implement them is doing engineering work. A domain expert who can articulate business rules in structured form is doing engineering work.

The boundary of "who is an engineer" gets blurry. The boundary moves when organizations decide to measure the output, working software that meets behavioral specs, rather than the process, lines of code written by credentialed engineers. That measurement shift is further away than the tooling shift. But it follows the same logic. And the people inside the current boundary have every reason to be nervous about it, because their professional identity, their compensation, their organizational status, all depend on the boundary being where it has always been.

Everything I have described in this chapter, the shift from autocomplete to agentic implementation, the team that stopped writing code, intent as the primary artifact, the transformation of engineering leadership, the hiring question nobody is answering, is happening first in software development. It will not stay there. The same pattern, the same shift, the same uncomfortable questions, will arrive in every knowledge-work domain where the work can be decomposed into intent and execution.

Software development is the canary. The canary is singing. The question is whether the rest of the organization is listening.

Chapter 5

Proving Ground Two: Growth

When Ambient Systems Enter the Revenue Function

The Campaign That Ran Itself

In March 2025, a B2B software company with about 400 enterprise customers shipped a product update to its growth system and then, for eleven days, nobody in marketing touched it.

Not because they were on vacation. Not because they forgot. Because the system did not need them.

Here is what happened during those eleven days. The system identified that a segment of 38 accounts in the financial services vertical had increased their usage of one product feature by roughly 60 percent over the prior two weeks. It cross-referenced that usage data with the renewal dates in the CRM, flagged twelve accounts as likely expansion candidates, and generated a sequence of three email variations for each account. The variations differed in subject line framing, call-to-action phrasing, and the specific feature benefit highlighted, and the system selected which variation to send based on each account's historical engagement patterns with prior emails.

For accounts that opened but did not click, it adjusted the follow-up. For accounts that clicked through to the pricing page, it routed a notification to the account executive with a summary of the usage data and the suggested expansion offer. For accounts that did not open any of the three emails, it shifted the contact to a different channel, triggering a personalized in-app message the next time the primary user logged in.

The system also noticed something the marketing team had not. A cluster of seven accounts in the healthcare vertical, accounts that were not flagged as expansion targets, had shown a sharp decline in login frequency over the same period. It generated a customer health alert, drafted a re-engagement sequence, and queued it for review. But the re-engagement sequence was configured for a lower autonomy threshold, so it sat in the queue rather than sending automatically.

Eleven days. Thirty-eight expansion targets contacted. Seven at-risk accounts flagged. Revenue influenced: roughly $180,000 in pipeline, according to the attribution model the system maintained. The marketing team reviewed the activity log on day twelve, approved the queued re-engagement sequence, and made two minor copy adjustments to the next round of expansion emails.

The VP of Marketing told me she felt two things simultaneously: excitement that the system worked, and a discomfort she could not immediately name. The excitement was obvious. The discomfort took her a few weeks to articulate.

What bothered her was this: the system had made commercial decisions. Real ones. It decided which accounts to pursue. It decided what message to send. It decided when to escalate to a human and when not to. It decided that the healthcare accounts were at risk before anyone on her team had noticed. Each individual decision was small. The cumulative effect was that the system had executed a growth strategy, and the strategy it executed was reasonable, probably better-targeted than what her team would have produced manually, and she had not approved it in advance.

She had approved the rules. She had approved the thresholds. She had approved the message templates months ago. But the specific application of those rules to those accounts at that moment was the system's judgment, not hers.

That is what an ambient growth system looks like from the outside. A campaign that ran itself. Revenue that appeared in the pipeline without anyone launching anything. Commercial decisions made continuously, at a pace and specificity that no human team could match, by a system operating on inferred intent rather than direct instruction.

And here is the thing about it: the technology was not exotic. The components were a CRM, a product analytics tool, a marketing automation platform, and a language model connected through an integration layer. What was unusual was not the tools but the degree of autonomy the tools had been granted and the quality of the intent infrastructure they were operating on. The system knew what "expansion-ready" meant because someone had defined it precisely enough for the system to act on. The system knew what "at-risk" meant because someone had quantified the behavioral signals that predicted churn. These definitions were the real work. The system just ran on them.

The VP of Marketing's discomfort was not about the technology. It was about what the technology had exposed. She realized that the system was executing against a set of commercial priorities that nobody had formally written down. The definitions of "expansion-ready" and "at-risk" had been specified for the system by an operations analyst who had inferred them from past behavior. The message templates had been written by a content marketer following general brand guidelines. The channel-switching rules had been configured by a marketing ops manager based on industry benchmarks.

None of these people had sat in a room and asked: what is our commercial intent? What are we trying to do with these accounts, and why, and how does that connect to the company's growth strategy for this quarter?

The system had been given operational instructions. It had not been given strategic direction.

For eleven days, that gap did not matter. The system's operational instructions were good enough that the outputs looked right. But the VP of Marketing understood, maybe before anyone else at her company, that "looked right for eleven days" is not the same as "aligned with where we want to be in eighteen months."

She had stumbled into the defining problem of ambient growth systems. Signal debt.

Signal Debt: What Happens When Systems Outrun Strategy

The term is mine, so let me define it precisely.

Signal debt is the accumulating cost of operating growth systems whose inferential capability outstrips the intentional direction the organization has given them. It is the gap between what the system can figure out on its own and what the organization has told it to care about. And like financial debt, it compounds.

Here is how it works in practice.

An ambient growth system continuously ingests data: product usage, CRM activity, engagement metrics, support tickets, competitive signals, market data. It infers patterns from that data. It makes predictions: this account is likely to expand, this account is likely to churn, this segment is underserved, this message variant outperforms that one. Then it acts on those predictions, within whatever autonomy boundaries it has been given.

The inferences are often correct. The patterns are real. The system is good at finding them. That is the whole point.

The problem is that correct inferences are not the same as strategically aligned inferences.

A system optimizing for engagement will find the messages that get opened and clicked. A system optimizing for pipeline will find the accounts most likely to enter a sales conversation. A system optimizing for retention will find the at-risk accounts and intervene. All of these are good. None of them is a strategy. Strategy is the decision about which of these things matters most, for which accounts, at which point in the customer lifecycle, in service of what commercial outcome over what time horizon.

If nobody tells the system what the strategy is, the system will infer one. It will optimize for whatever the available metrics reward. And the metrics most available to most growth systems are short-term engagement metrics: opens, clicks, replies, meetings booked. So the system will relentlessly optimize for short-term engagement, because that is what it can see and measure.

This is where signal debt begins to accumulate.

Imagine a system that identifies a segment of mid-market accounts showing increased product usage. The system infers expansion potential and begins an outreach sequence. The outreach works: meetings are booked, conversations happen, some accounts expand. The metrics look great. The system learns that this pattern works and does more of it.

Six months later, the head of customer success notices that the accounts that expanded most aggressively are also churning at a higher rate. Why? Because the expansion was real, but it was premature. The accounts were not actually ready for the expanded product. They had increased usage of one feature, which the system correctly identified, but their overall adoption maturity was low. They expanded, struggled with the new capabilities, and eventually contracted or left.

The system did what it was told. It optimized for expansion signals. The problem is that nobody told it the difference between expansion signals and expansion readiness. That difference is a strategic judgment. It lives in a person's head, or in a quarterly planning document, or in a set of implicit assumptions that the customer success team carries but has never made explicit.

The cost of that unexpressed judgment is signal debt. And six months of autonomous operation at scale can produce a lot of it.

Signal debt is sneaky because it looks like success while it is accumulating. The dashboards are green. Pipeline is up. Meetings are booked. The system is working. The debt shows up later, in churn, in customer dissatisfaction, in sales cycles that close fast but produce low-quality revenue, in brand dilution from messages that were technically accurate but tonally wrong for the moment.

I think the analogy to technical debt is useful here but incomplete. Technical debt is the cost of shortcuts in code that you will have to pay back later. You know you are taking the shortcut when you take it. Signal debt is different. You often do not know you are accumulating it, because the system's behavior looks correct at each individual decision point. It is only in the aggregate, over time, that the drift becomes visible.

And the compounding mechanism is pernicious. The system learns from its own outputs. If it sends expansion emails and some of them convert, it learns that expansion emails work. It sends more. If the metric it is optimizing for is pipeline generated, and premature expansions generate pipeline, the system gets rewarded for the wrong behavior. Each cycle reinforces the drift. The system gets better and better at doing the thing nobody actually wanted it to do, and worse and worse at doing the thing nobody told it to do because nobody had articulated what that thing was.

Three conditions make signal debt compound faster:

The first is high system autonomy. The more decisions the system makes without human review, the faster debt accumulates when those decisions are misaligned. The B2B company I described earlier had given its system moderate autonomy: expansion emails sent automatically, re-engagement held for review. If the re-engagement sequence had also been fully autonomous, the system might have made retention decisions that conflicted with the expansion decisions, because neither had been grounded in a unified commercial intent.

The second is metric richness without strategic hierarchy. The system that expanded the mid-market accounts had high pipeline numbers and a worsening retention curve at the same time. Both metrics were accurate. They told opposite stories. Without a hierarchy, the system favored the one with the faster feedback loop. Pipeline numbers updated weekly. Retention data took months to materialize. The system did not wait.

The third is organizational silence about trade-offs. Every growth strategy involves trade-offs. Pursue mid-market or enterprise? Prioritize new logo acquisition or existing account expansion? Invest in product-led growth or sales-led motion? These trade-offs are the substance of commercial strategy. They are also the things that executives least like to make explicit, because explicit trade-offs create losers, and losers push back. So the trade-offs stay implicit. Everybody sort of knows the direction but nobody has written it down with enough precision for a system to act on.
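The second condition can be made concrete in a few lines. An explicit hierarchy tells the system which metric wins a conflict, so the tiebreaker is strategy rather than feedback speed. This is a hypothetical sketch; the metric names, the ordering, and the actions are invented for illustration, not drawn from any real platform:

```python
# Hypothetical sketch of a strategic metric hierarchy: when two accurate
# metrics tell opposite stories, an explicit ordering decides which wins,
# not the metric with the fastest-refreshing dashboard.

HIERARCHY = ["retention_health", "expansion_pipeline", "engagement"]

def resolve(signals):
    """Return the action attached to the highest-ranked firing metric."""
    for metric in HIERARCHY:
        if metric in signals:
            return signals[metric]
    return "no_action"

# Pipeline says expand; retention says hold. The hierarchy breaks the tie.
signals = {
    "expansion_pipeline": "send_expansion_offer",
    "retention_health": "hold_and_review",
}
action = resolve(signals)   # "hold_and_review"
```

Without the `HIERARCHY` list, the loop has no order to walk, and the choice defaults to whichever signal arrived most recently, which is exactly the drift described above.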

Under the old model, implicit trade-offs were survivable. The humans in the system applied judgment. The marketing manager who "just knew" that the healthcare vertical was not ready for aggressive expansion this quarter made that judgment call invisibly, in the hundreds of small decisions she made about campaign targeting and message tone. The judgment was embedded in the execution. The strategy did not need to be explicit because the people executing it carried the strategy in their heads.

An ambient system does not carry strategy in its head. It carries whatever you put in its configuration. If the configuration is operational instructions without strategic intent, the system will execute operationally without strategic direction, and it will do so at a speed and scale that makes the resulting drift very expensive.

This is why signal debt is a leadership problem, not a technology problem. The system is working. The leadership has not specified what "working" should mean.

And those mid-market accounts that expanded and then churned? They are not coming back.

The Commercial Intent Document Nobody Has Written

If I asked you to show me your company's commercial intent in a form precise enough for a machine to act on, what would you hand me?

Most commercial leaders I have posed this question to go quiet for a moment and then start listing things that are adjacent to the answer but are not the answer.

The annual plan. The board deck. The OKRs. The revenue targets broken down by segment. The competitive positioning document. The ideal customer profile. The brand guidelines.

All of these are real. All of them contain commercial intent. None of them is structured at a level of precision that an ambient growth system can operationalize. They are documents written for humans, full of qualitative judgments, unstated assumptions, and context that only makes sense if you were in the room when they were written.

"Focus on enterprise accounts in financial services with strong product-market fit" is a human-readable strategic directive. It is not a machine-readable commercial intent. An ambient system encountering that statement would need to know: What defines "enterprise"? Revenue above what threshold? Employee count above what number? Is it one or both? What counts as "financial services"? Banks? Insurance? Fintech? Payment processors? All of them? What does "strong product-market fit" mean in terms of measurable signals? Usage of which features? At what frequency? With what breadth of user adoption? Over what time period?

The human reading that directive fills in the blanks from experience, context, and judgment. The machine cannot.

Why has this gap been invisible? Because under the prompt-based paradigm, it did not matter. When a marketer writes a prompt asking an AI tool to draft an email for enterprise financial services accounts, the marketer is the translation layer. The marketer holds the commercial intent in their head and encodes it into each prompt. The imprecision of the strategic directive is compensated for by the precision of the human applying it.

This is exactly the dynamic Chapter 1 described: the cognitive tax of the prompt paradigm falls on the human, and the human absorbs it. The human doing the absorbing is also doing the strategic translation, turning vague intent into specific action through hundreds of micro-decisions embedded in hundreds of prompts.

When the system becomes ambient, the human translation layer disappears. Or rather, it needs to have already happened, at configuration time rather than execution time. The commercial intent needs to be explicit before the system starts acting, not during.

This is a document most growth organizations have never written because they have never needed to.

Let me describe what it would look like. A commercial intent document for an ambient growth system would specify, at minimum:

Which customer segments the organization is pursuing, defined by observable, measurable attributes. Not "enterprise financial services" but "companies with annual revenue above $500M in SIC codes 6020-6159 with more than 50 active users of our platform and at least one executive sponsor identified in the CRM."

What the desired commercial outcome is for each segment, stated in terms the system can act on. Not "grow revenue" but "increase net revenue retention to 115% in this segment by the end of Q3, prioritizing seat expansion over SKU upsell, with no more than 8% of expansion accounts churning within six months of expansion."

What signals indicate readiness for each type of commercial action. Not "accounts that are engaged" but "accounts that have had more than 20 active users in the past 30 days, have completed onboarding for all purchased modules, have submitted fewer than 3 support tickets rated severity-1 in the past 90 days, and have a primary contact who has opened at least 2 of the last 5 marketing emails."

What the priority hierarchy is when objectives conflict. Not "balance growth and retention" but "when expansion signals and churn risk signals are both present for the same account, hold expansion outreach and route to customer success for health assessment before any commercial action."

What the boundaries are. What the system should never do, regardless of what the signals say. "Never send automated outreach to accounts with an open severity-1 support ticket." "Never discount below 15% without sales director approval." "Never contact the CFO directly during a contract negotiation."
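Taken together, the five elements above amount to a specification. Here is a minimal sketch of what one segment's entry might look like in code; every field name and threshold is hypothetical, assembled from the examples above, not a schema any real platform uses:

```python
# Hypothetical sketch of a machine-readable commercial intent spec.
# Field names and thresholds are illustrative, mirroring the prose above.

from dataclasses import dataclass

@dataclass
class SegmentIntent:
    name: str
    definition: dict        # observable, measurable segment attributes
    outcome: dict           # desired commercial result, stated numerically
    readiness_signals: dict # signals that gate each commercial action
    priority_rules: list    # what wins when objectives conflict
    boundaries: list        # actions the system must never take

enterprise_finserv = SegmentIntent(
    name="enterprise-financial-services",
    definition={
        "annual_revenue_min_usd": 500_000_000,
        "sic_code_range": ("6020", "6159"),   # inclusive
        "active_users_min": 50,
        "executive_sponsor_in_crm": True,
    },
    outcome={
        "net_revenue_retention_target": 1.15,
        "deadline": "end of Q3",
        "prioritize": "seat_expansion",       # over SKU upsell
        "max_post_expansion_churn_6mo": 0.08,
    },
    readiness_signals={
        "active_users_30d_min": 20,
        "onboarding_complete_all_modules": True,
        "sev1_tickets_90d_max": 2,            # "fewer than 3"
        "contact_opens_of_last5_min": 2,
    },
    priority_rules=[
        # expansion signal and churn-risk signal both present:
        "hold_expansion_and_route_to_customer_success",
    ],
    boundaries=[
        "never_outreach_with_open_sev1_ticket",
        "never_discount_below_15pct_without_sales_director_approval",
        "never_contact_cfo_during_contract_negotiation",
    ],
)
```

The point of the sketch is not the syntax. It is that every value in it is a decision someone has to make explicitly, which is precisely the work most leadership teams have avoided.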

Reading that list, you might think: this is just good operational discipline. Any well-run growth team should have this documented.

You are right. And almost none of them do.

The reason is not laziness. It is that the human execution layer has always absorbed the ambiguity. The experienced account executive who would never call a CFO during a contract negotiation did not need a document telling her not to. She knew. The marketing manager who would never push expansion on a struggling account did not need a rule. He had judgment.

The ambient system has no judgment. It has signals, models, and rules. If the rules are not there, the system will act on the signals and models alone. And the signals and models, as we discussed in the previous section, will optimize for whatever is most legible, which is usually not what the organization would choose if it sat down and thought about it.

I want to be direct about something. Writing a commercial intent document of the kind I have just described is hard. Really hard. It forces the commercial leadership team to make explicit a set of decisions they have been comfortable leaving implicit. It forces them to quantify things they prefer to leave qualitative. It forces them to prioritize when they would rather keep all options open. It requires, in the language of the previous chapter, that tacit knowledge be made explicit.

The engineering manager I described in Chapter 4 figured this out because an agentic coding tool forced her hand. She built system intent documentation because the alternative was an agent producing incoherent code. The same forcing function is now arriving in growth organizations. The ambient growth system that runs without commercial intent documentation will produce incoherent commercial outcomes. The debt will accumulate. The question is whether the commercial leader writes the document proactively or discovers the need for it reactively.

My prediction: most will discover it reactively, after six months of the system optimizing for the wrong things.

Who Owns the Customer Relationship When the System Is Ambient

Here is a question that sounds administrative until you start trying to answer it: when the ambient growth system sends an expansion email, books a meeting, hands off to sales, triggers an onboarding sequence, monitors health scores, and initiates a re-engagement campaign, all on the same account, over the course of a single month, who is responsible for that account's experience?

Marketing? Sales? Customer success? Product?

The honest answer, in most organizations running ambient growth systems, is: nobody. And everybody.

That answer is not cute. It is a diagnosis.

Traditional commercial organizations are built on handoffs. Marketing generates leads and hands them to sales. Sales closes deals and hands them to customer success. Customer success manages the relationship until the next renewal or expansion opportunity, at which point they hand back to sales. Each function owns a phase of the customer lifecycle. Each function is measured on its phase. Each function is staffed, budgeted, and led as a distinct organizational unit.

These handoffs make sense in a world where commercial actions are discrete events. A campaign launches. A lead comes in. A call gets made. A deal closes. An onboarding begins. Each event has a clear owner because each event is bounded in time and scope.

Ambient growth systems dissolve the boundaries between events. The system does not run a campaign and then stop. It continuously acts on every account, all the time, adjusting its behavior based on signals that flow across what used to be functional boundaries. A product usage signal (traditionally product or customer success territory) triggers a marketing action (email) that leads to a sales outcome (meeting) that creates a customer success obligation (onboarding the new capability). The signal chain crosses three org chart boundaries in a single automated sequence.

Who approved that sequence? The marketing VP approved the email templates. The sales VP approved the meeting-booking workflow. The customer success VP approved the health scoring model. Each approved their piece. Nobody approved the whole.

The organization is built for handoffs. The system does not hand off. That gap is not a coordination failure; it is a design mismatch. The organization is built for phases. The system operates continuously. The organization is built for functional ownership. The system operates across functions.

I have watched three different companies try to solve this, and each took a different approach.

One company tried to solve it with a committee. They created a "Revenue Operations Council" that met weekly to review the ambient system's activity and make cross-functional decisions. It lasted four months. The meetings became a forum for territorial disputes, marketing arguing that the system was booking too many meetings for accounts that were not ready, sales arguing that the system's email tone was too soft, customer success arguing that both marketing and sales were overwhelming accounts that needed support, not outreach. The committee did not make decisions. It staged arguments.

A second company tried to solve it by giving one function ownership over the system. They put the ambient growth platform under the CRO, who had authority over marketing, sales, and customer success. This worked better, structurally, because it concentrated decision-making authority. But it created a different problem. The CRO was a former VP of Sales. His instincts skewed toward pipeline generation and deal velocity. The system's behavior gradually drifted toward aggressive expansion outreach, because the person configuring it valued the signals that supported his native function. Customer success leaders, now reporting to someone whose instincts were sales-first, struggled to get retention and health signals prioritized. The signal debt accumulated in a specific direction.

The third company did something I found more interesting. They created a new role: a commercial system owner. This person did not own marketing, sales, or customer success. She owned the intent infrastructure. Her job was to maintain the commercial intent document, define the signal hierarchies, set the autonomy thresholds, and monitor the system's behavior for drift. She reported to the CEO, not to any functional leader.

She was, functionally, the equivalent of the engineering manager from Chapter 4 who started writing system intent documentation. She maintained the coherence of the commercial system in the same way that engineering manager maintained the coherence of the codebase. Not by doing the work herself, but by making sure the rules governing the work were consistent and aligned.

Did it work perfectly? No. The role was new, the person in it was figuring it out in real time, and the functional leaders resented the perceived loss of autonomy. But the company had the lowest signal debt of the three, because someone was explicitly responsible for the gap between system capability and strategic intent.

This role does not have a standard title yet. Some companies are calling it "Head of Revenue Systems." Some are folding it into Revenue Operations. Some are not creating it at all, which brings us back to the first answer: nobody owns it. And when nobody owns the gap, the gap grows.

The question "who owns the customer relationship" sounds like it is about org charts and reporting lines. It is, partly. But the real fight is about identity. Marketing has an identity. It is the creative function, the brand function, the demand generation function. Sales has an identity. It is the relationship function, the closing function, the revenue function. Customer success has an identity. It is the retention function, the advocacy function, the customer experience function.

Ambient growth systems do not respect these identities. The system does not know it is "doing marketing" when it sends an email. It does not know it is "doing sales" when it books a meeting. It does not know it is "doing customer success" when it flags a health risk. It is just acting on signals according to rules. The functional labels are a human organizational invention, and the system operates below the level at which those labels apply.

This means that the people in those functions experience the ambient system as an encroachment on their territory. The marketer who used to decide which accounts to target now sees the system making that decision. The salesperson who used to own the outreach sequence now gets a meeting handed to them by a system they did not instruct. The customer success manager who used to monitor account health now gets an alert generated by a model they did not build.

Each of these people is, individually, still doing valuable work. The marketer still shapes brand and messaging strategy. The salesperson still conducts the meeting and builds the human relationship. The customer success manager still has the contextual judgment about what an account needs. But the connective tissue between their work, the sequencing, the timing, the selection of which account gets which action at which moment, has been absorbed by the system.

They feel this as a loss of agency. And they are not wrong.

The organizational question is whether you address that loss of agency honestly, by redesigning roles around the new reality, or dishonestly, by pretending the old boundaries still hold while the system operates as though they do not.

Rewriting the Commercial Leader's Job

So what does the CMO actually do?

Not what the job description says. Not the conference-circuit version of the role. What does the chief marketing officer, or the chief revenue officer, or whatever title your organization uses for the person who owns commercial growth, actually do on a Tuesday morning when the ambient system has been running for six months?

I will tell you what they do not do: approve campaigns.

Under the old model, the commercial leader's operating rhythm was built around campaigns and initiatives. You planned a campaign. You allocated budget. You briefed the creative team. You reviewed the assets. You approved the targeting. You launched. You watched the metrics. You adjusted. You planned the next campaign. The work was episodic, structured around discrete actions with clear beginnings and endings.

Under ambient growth systems, the campaign is not an event. The system is always running. There is no launch date. There is no campaign brief. There are rules, thresholds, intent specifications, and autonomy boundaries that govern a continuous stream of commercial actions. The commercial leader's job is not to manage campaigns but to define and maintain the rules that govern the system's behavior.

This is a different kind of work. Let me try to make it concrete.

The first thing the commercial leader does is intent architecture. This is the work of writing and maintaining the commercial intent document I described earlier. It is the work of translating business strategy into operational rules precise enough for the system to act on. It is the hardest part of the job, because it requires the leader to make explicit decisions they have been comfortable leaving implicit.

What does intent architecture look like on a given day? It looks like the CMO sitting with the head of customer success, looking at the data on accounts that expanded and then churned, and asking: what did we get wrong about expansion readiness? What signals were we missing? How do we change the system's definition of "expansion-ready" so this does not happen again? And then writing the updated definition and deploying it to the system.

It looks like the CRO reviewing the system's behavior toward a newly defined market segment and asking: are we treating these accounts the way we would want them treated if I were personally managing each one? And when the answer is no, figuring out which part of the intent specification is wrong.

It looks like argument. Real argument, not committee-style posturing, about what the company's commercial priorities actually are. Because the system will not let you be vague. You can be vague in a board deck. You cannot be vague in an intent specification. The system will do something with whatever you give it, and if what you give it is ambiguous, it will resolve the ambiguity in whatever direction its optimization function prefers.

The second part of the job is signal calibration. The system operates on signals. Product usage. Engagement data. Support ticket patterns. Competitive intelligence. Each signal feeds the system's inferences. The commercial leader's job is to make sure the signals are calibrated correctly, that the system is weighting them in a way that reflects the actual commercial reality.

This is ongoing, not one-time. Signals degrade. A product usage metric that was a reliable predictor of expansion readiness six months ago may not be reliable anymore, because the product changed, or the customer base changed, or the competitive environment changed. The commercial leader has to watch for signal degradation the way a pilot watches instruments. Not the individual readings so much as the pattern. When the readings start to diverge from reality, something in the calibration has drifted.

I talked to a CRO who described this as "tuning a radio you can never turn off." There is no moment when the system is done and you can stop paying attention. The system runs continuously. Its behavior drifts continuously. The drift is usually small, which makes it hard to notice, which makes it easy to ignore, which is exactly how signal debt accumulates.
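What watching for signal degradation means operationally can be sketched in a few lines: track how often a signal's prediction actually came true in a baseline window versus a recent one, and flag the signal when the gap exceeds a tolerance. The function names, windows, and tolerance below are invented for illustration:

```python
# Hypothetical sketch of a signal-degradation check. Compare a signal's
# recent predictive hit rate against its historical baseline and flag
# drift when the gap exceeds a tolerance.

def hit_rate(outcomes):
    """Fraction of signal firings that preceded the predicted outcome
    (1 = prediction came true, 0 = it did not)."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

def signal_drift(baseline, recent, tolerance=0.10):
    """Return (drift, degraded): drift is baseline minus recent hit rate;
    degraded is True when the signal now predicts worse than tolerated."""
    drift = hit_rate(baseline) - hit_rate(recent)
    return drift, drift > tolerance

# Six months ago, a usage spike preceded expansion 70% of the time;
# in the recent window, only 40%. A candidate for recalibration.
baseline = [1, 1, 1, 0, 1, 1, 0, 1, 1, 0]   # 70% hit rate
recent   = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]   # 40% hit rate
drift, degraded = signal_drift(baseline, recent)
```

The mechanics are trivial. The leadership work is in deciding what counts as an outcome, how long the windows should be, and how much drift the organization will tolerate before it intervenes.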

The third part is boundary maintenance. What should the system never do? What requires human approval, always, regardless of what the signals say? Where are the hard edges of the system's autonomy?

These boundaries are not set once and forgotten. They change as the organization's risk tolerance changes, as the market changes, as the system's track record of good decisions grows or shrinks. A system that has been running for six months without a significant error may earn expanded autonomy. A system that just misjudged a high-value account needs its autonomy pulled back until the root cause is understood.

The commercial leader is the person who moves those boundaries. This is a judgment call, every time. Too much autonomy and the system makes costly mistakes. Too little and you have rebuilt the prompt paradigm, a human approving every action, which defeats the purpose of the ambient system entirely.

These three activities — intent architecture, signal calibration, boundary maintenance — are the job. They replace campaign management, creative review, and team coordination as the primary work of commercial leadership.

I can hear the objection forming. "What about brand? What about creative strategy? What about the human relationships that drive complex sales?"

They still matter. They all still matter. The marketer still shapes the voice and the story. The salesperson still builds trust in a conference room. The customer success manager still calls the account when something goes wrong. These human contributions are not replaced by the ambient system. They are enabled by it. The system handles the continuous, data-driven, operationally intensive work of selecting, timing, and sequencing commercial actions. The humans do the things that require taste, judgment, and relationship, things the system cannot do.

But the leadership job is no longer about managing the humans who do those things day-to-day. That work continues, of course. A CMO still has a team. A CRO still has direct reports. But the center of gravity of the leadership role has shifted from managing people executing campaigns to maintaining the intent infrastructure that governs the system.

Campaign launches have a clear end. Creative reviews produce something you can hold. Team rallies feel like leadership. Those satisfactions are real. Intent architecture does not produce any of them.

But it is more consequential. A well-tuned intent specification affects every commercial interaction the system has, across every account, every day. A single campaign affects one segment, once. The leverage is different by orders of magnitude.

The commercial leaders who will thrive in ambient growth organizations are the ones who find satisfaction in systemic impact rather than episodic impact. They are the ones who can think about commercial strategy at a level of precision that most commercial leaders have never been asked to reach. They are the ones who can make trade-offs explicit, hold competing priorities in tension without resolving them prematurely, and translate qualitative business judgment into quantitative operational rules.

This is a new skill set. It is not what most CMOs or CROs were hired for. It is not what most of them were trained for. And like the engineers in Chapter 4 who found that their world had shifted from writing code to writing specifications, some commercial leaders will thrive in the transition and some will not.

The ones who thrive will be the ones who recognize that the ambient shift has not made commercial leadership less important. It has made it harder. The work is more abstract, more precise, and more consequential per decision. It is leadership that expresses itself through infrastructure rather than through action.

The commercial leader of an ambient growth organization is an architect. Not of campaigns. Of intent. The commercial intent document nobody has written is the blueprint. Signal debt is what happens when you build without one. And the eleven days when nobody touched the growth system were not a success story. They were a warning.

The system ran itself. The question is whether it was running toward the right destination. Answering that question is the job now.

Chapter 6

Proving Ground Three: Knowledge Work

The Domain Where Ambient Intelligence Gets Philosophical

The Analyst Who Disappeared

In the spring of 2024, a financial services firm I know well made a quiet decision. The company's market intelligence group, a team of six analysts and a director, was going to pilot an ambient system for their recurring deliverables. These deliverables were not trivial. Every week, the team produced a competitive landscape report for the executive committee. Every month, they produced a deeper strategic assessment for the board. Every quarter, they assembled a risk outlook that fed into the company's planning process.

The analysts were good at their jobs. Most had been with the firm for five years or more. They had deep industry knowledge, strong relationships with internal stakeholders, and the ability to synthesize messy data into clear recommendations. Their director, a woman I will call Catherine, was widely trusted by the C-suite. When the CEO had a question about what a competitor's move meant, Catherine was the person he called.

The pilot started small. The team fed the ambient system their last twelve months of weekly reports, their source databases, their formatting templates, and a set of annotated examples showing how raw data became analytical conclusions. The system was configured to pull from the same data sources the analysts used, apply the same weighting criteria, and produce a draft weekly report in the team's house style.

The first draft was bad. Not wrong, exactly, but flat. It read like a summary of data rather than an analysis. The conclusions were obvious. The language was generic. Catherine's reaction, relayed to me later, was relief. "It confirmed what I already believed," she said. "You can't automate judgment."

Then the team started doing something they did not plan to do.

They started annotating the drafts. Not rewriting them from scratch, but marking up the specific places where the ambient system had gotten the analysis wrong, with notes explaining why. "This competitor move is flagged as neutral, but it should be flagged as aggressive because of the regulatory filing last week." "This risk is understated because the model doesn't account for the supply chain dependency in Southeast Asia." "This recommendation is too generic. Given our client concentration in healthcare, the implication is different."

The annotations were, in effect, the analysts teaching the system what they knew that it didn't. They were making their judgment visible in a way it had never been visible before. And the system learned from the annotations. Not in the way a human learns, not building genuine understanding, but in the operational sense that the next draft incorporated the patterns from the corrections.

By the eighth week, the weekly draft required about 40 percent of the revision it had needed in week one.

By the twelfth week, two of the six analysts had been reassigned to a new strategic projects group. The remaining four were spending most of their time on the monthly and quarterly deliverables, the ones that required deeper, less structured analysis. The weekly report was produced by the system, reviewed and adjusted by one analyst (rotating), and approved by Catherine. Total human time on the weekly report went from roughly 120 person-hours per week to about 15.

Here is where the story gets interesting, and uncomfortable.

Catherine told me that the quality of the weekly report improved. Not because the ambient system was smarter than her analysts, but because the system was more consistent. It never forgot to check a data source. It never had a bad Monday and missed an implication. It never got anchored on a hypothesis from two weeks ago and filtered new data through that lens. The analysts' judgment was better on their best days. The system's output was better on average.

She said this with something close to grief.

The two reassigned analysts did fine. They moved into work that was less structured and more interesting. But three of the remaining four experienced what I can only describe as a professional identity crisis. One told Catherine directly: "If the machine can do 80 percent of what I do, what is the other 20 percent actually worth?" Catherine did not have a good answer at the time.

What happened to Catherine herself is the part of this story I find most telling. Her calendar changed. She used to spend about a third of her week reviewing and editing analyst work. That time evaporated. She started spending it in meetings with business unit leaders, not presenting reports but discussing what the reports should be about. Which questions matter this quarter? What are we not watching that we should be? What assumptions in our strategic model are most fragile?

She was, without naming it, doing intent work. Defining the inputs to an ambient system rather than managing the people who produced the outputs. Her value to the organization arguably increased. Her sense of what her job was fell apart and was rebuilt, slowly, into something she did not have a name for.

The team that Catherine managed shrank from seven people to five. The output increased. The consistency improved. And the people left in the group were doing work that was harder, less defined, and more exposed to being wrong in ways that mattered.

Nobody in the company wrote a case study about this. There was no presentation to the board about "AI transformation in the intelligence function." It happened quietly, one process at a time, one deliverable at a time. The analyst function did not disappear with a bang. It thinned.

What Judgment Actually Is

Here is the question that makes knowledge workers uncomfortable: what is judgment?

Not in the philosophical sense. In the operational sense. When a senior analyst looks at the same data a junior analyst looked at and reaches a different, better conclusion, what did the senior analyst do that the junior analyst did not?

The standard answer is "experience." The senior analyst has seen more data, more patterns, more cycles. True. And useless as an answer. Experience describes inputs. Not what happens to them.

I want to propose a more useful decomposition. Expert judgment, the kind that knowledge workers get paid for, consists of two categories of cognitive operation, and the distinction between them determines what ambient intelligence can and cannot do.

The first category: pattern recognition on structured criteria. This is the part of judgment that involves applying known rules, weightings, and heuristics to data. An analyst checking whether a competitor's quarterly revenue indicates market share gain. The analyst has a set of criteria, sometimes formal, sometimes internalized through years of repetition, and applies those criteria to new information. The criteria can be listed. The weightings can be stated. The rules for edge cases can be written down. Not easily, and not quickly, but they can be made explicit.

This is the decomposable part of judgment. And it is fully ambient-capable. Once the criteria are explicit and the data sources are connected, an ambient system can apply them continuously, consistently, and at a scale no human team can match. Catherine's weekly report was produced by this part of judgment being offloaded to a machine.

The second category is harder.

It is the recognition that the criteria themselves are wrong. The analyst who looks at a competitor's flat revenue quarter and says: something is wrong with how we're reading this market. Not a pattern mismatch against known criteria. A sense that the criteria themselves have stopped describing reality.

In 2008, the quantitative risk models at every major bank were applying known criteria to known data and producing results that said the risk was manageable. The models were doing exactly what they were built to do. The judgment failure was not in the application of criteria. It was in the criteria themselves: the assumption that housing prices would not decline nationally, the assumption that securitized mortgage tranches were independent risks. The people who saw the crisis coming were not applying better criteria. They were questioning whether the criteria were valid.

That kind of judgment, the meta-judgment that asks whether the right questions are being asked, resists decomposition because it requires awareness of context that cannot be fully specified in advance. It requires the ability to notice what is absent, to feel the weight of an anomaly that does not fit any existing category. Not because machines are dumb. Because the operation is defined by its resistance to prior specification. The whole point is that it fires when the prior specification is inadequate.

This distinction matters because most knowledge work, as it is actually practiced day-to-day, is overwhelmingly composed of the first category. The prestigious, identity-defining work that knowledge workers imagine when they think of their jobs is the second category. The weekly report is category one. The moment when Catherine tells the CEO "I think we're asking the wrong question about that acquisition" is category two.

An honest accounting of time would reveal that most senior knowledge workers spend 70 to 80 percent of their working hours on category-one activities. Reading, summarizing, comparing, applying known criteria, formatting conclusions. The category-two moments are rare, high-stakes, and disproportionately valuable. They are also the moments that define the professional's reputation and self-image.

The ambient shift makes category-one work machine-executable. It does not touch category two. And it makes category two more valuable than it has been in decades, because when the production of analysis is ambient, the only remaining human premium is on the quality of the questions being asked and the validity of the criteria being applied.

Put differently: the ambient shift strips away the work that made knowledge workers feel busy and leaves behind the work that makes them feel exposed. Busy is comfortable. Exposed is not.

Judgment Scaffolding: Making the Implicit Explicit

If you read the previous two chapters, you have seen this pattern before. In software development, the shift from writing code to writing specifications required engineers to make their tacit knowledge explicit. A similar shift is required of knowledge workers. I am going to give it a name: judgment scaffolding.

Judgment scaffolding is the practice of articulating the criteria, weightings, edge-case rules, and contextual constraints that constitute your expert judgment in a form that an ambient system can apply. The output of this practice is an intent artifact, a structured document that captures not just what you decide but how you decide it and when you would decide differently.

It is, in practice, unpleasant work.

I say that because I have watched people try to do it, and the most common reaction, across disciplines, is frustration followed by a specific kind of vertigo. The frustration comes from the difficulty of the task. Most experts cannot immediately articulate their own decision criteria. They know what good analysis looks like the way a sommelier knows what good wine tastes like. They recognize it. They can point to it. Describing the recognition process in enough detail for someone else (or something else) to replicate it is a different skill entirely.

The vertigo comes after, when the expert realizes how much of their judgment actually can be described. The process of attempting to write it down reveals that a large portion of what felt like intuition was actually a set of rules applied so automatically that the expert had stopped noticing them. The sommelier who "just knows" the wine is corked is, in fact, running a rapid chemical-sensory assessment against stored references. They cannot describe every micro-step, but they can describe the criteria ("musty," "wet cardboard," "absence of fruit"). And once they describe the criteria, those criteria can be applied by someone, or something, trained to detect them.

Here is what a judgment scaffold looks like in practice for an analytical function. I will use a simplified version of what Catherine's team eventually produced.

The scaffold has four components.

Criteria. What factors do you consider when evaluating this type of input? For a competitive landscape report, Catherine's criteria included: revenue trajectory relative to market average, product launch cadence, pricing moves in the last 90 days, executive hiring patterns, patent filing volume, and regulatory exposure. Each criterion had a weight. Revenue trajectory was weighted highest for the weekly report. Patent filing volume was weighted higher for the quarterly outlook.

Decision rules. Given the criteria, how do you reach a conclusion? If a competitor's revenue growth exceeds the market average by more than two percentage points for two consecutive quarters, flag as aggressive. If a competitor reduces pricing on a core product by more than 10 percent within 30 days of our product launch, flag as defensive response. These rules were not invented for the scaffold. They existed in the analysts' heads. They had just never been written down in operational terms.

Edge-case handling. What are the situations where the standard rules do not apply? If the competitor is a recent acquisition target, weight executive hiring patterns at zero (because the hiring is driven by integration, not strategy). If the data source for pricing is a third-party estimate rather than public filing, reduce confidence level and flag for human review.

Context constraints. What external factors should change the analysis? If a major regulatory action is pending in the competitor's primary market, shift the entire analysis toward regulatory risk. If our own company is in a quiet period before earnings, restrict the distribution of the report and flag any conclusions that might be material.

Each component of the scaffold is a piece of the analyst's judgment made explicit, testable, and transferable to an ambient system. Together, they form an intent artifact that tells the system: here is how I would analyze this data, here is what I would pay attention to, and here is when I would override my own defaults.
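For readers who want to see the idea made concrete, here is a minimal sketch, in Python, of what the decomposable portion of such a scaffold might look like as an executable artifact. The field names, thresholds, and weightings below are illustrative reconstructions of the rules described above, not Catherine's actual document; a real scaffold would cover far more criteria and live alongside its prose rationale.

```python
from dataclasses import dataclass

# Hypothetical competitor snapshot. Field names are illustrative,
# chosen to mirror the criteria described in the text.
@dataclass
class CompetitorSnapshot:
    revenue_growth: list[float]      # quarterly growth in %, most recent last
    market_avg_growth: list[float]   # market average for the same quarters
    price_cut_pct: float             # largest core-product price cut, last 30 days
    days_since_our_launch: int
    is_acquisition_target: bool
    pricing_source_is_estimate: bool

def apply_scaffold(c: CompetitorSnapshot) -> dict:
    flags, review = [], []

    # Decision rule: growth above market average by more than 2 points
    # for two consecutive quarters -> flag as aggressive.
    recent = list(zip(c.revenue_growth[-2:], c.market_avg_growth[-2:]))
    if len(recent) == 2 and all(g - m > 2.0 for g, m in recent):
        flags.append("aggressive")

    # Decision rule: >10% price cut within 30 days of our launch
    # -> flag as defensive response.
    if c.price_cut_pct > 10.0 and c.days_since_our_launch <= 30:
        flags.append("defensive-response")

    # Edge case: recent acquisition target -> weight hiring signal at zero,
    # because hiring reflects integration, not strategy.
    hiring_weight = 0.0 if c.is_acquisition_target else 0.15

    # Edge case: third-party pricing estimate -> reduce confidence
    # and route to human review.
    confidence = "low" if c.pricing_source_is_estimate else "normal"
    if c.pricing_source_is_estimate:
        review.append("pricing-source")

    return {"flags": flags, "hiring_weight": hiring_weight,
            "confidence": confidence, "needs_review": review}
```

The point of writing the rules this way is not the code itself. It is that every criterion, weighting, and override becomes explicit, testable, and open to the kind of argument Catherine's analysts had about patent filings: you can see exactly what the scaffold claims, and exactly where to change it when the world moves.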

Building this scaffold took Catherine's team about six weeks of dedicated effort, in sessions of two to three hours each, with Catherine leading and the analysts contributing their individual expertise. The process surfaced disagreements. Two analysts weighted patent filing data differently. One analyst had an edge-case rule based on a market dynamic from 2019 that the others did not share. The discussion required to resolve these differences was itself valuable, producing a clearer shared model of the team's analytical approach than had ever existed.

The scaffold is not static. It changes as the market changes, as new data sources become available, as the team learns from cases where the ambient system got something wrong. Maintaining it is ongoing work, though less intensive than building it. Catherine estimates her team spends about four hours per week updating the scaffold, compared to the 120 hours they previously spent producing the weekly report.

I want to name something obvious that is easy to miss: the scaffold is a description of the decomposable part of judgment. It captures category one from the previous section. It does not and cannot capture category two. The scaffold tells the system how to analyze. It does not tell the system when to stop analyzing and start questioning whether the analysis itself is valid.

That boundary is the boundary of ambient intelligence in knowledge work. Everything inside the scaffold is machine-executable. Everything outside the scaffold is the remaining, irreducible human contribution.

Where does the scaffold live? In Catherine's case, it lives in a structured document that the ambient system references when generating reports. It is version-controlled. Changes are tracked. The document has an owner (Catherine) and contributors (the analysts). It is reviewed quarterly.

If this sounds like the "system intent documentation" that the engineering manager in Chapter 4 was creating, it should. It is the same practice adapted to a different domain. The engineer writes system intent documentation so the agentic coding system knows how to implement consistently. The analyst builds a judgment scaffold so the ambient analytical system knows how to analyze consistently. The underlying operation is identical: make implicit expert knowledge explicit and operational.

The difference is that software engineers have a culture of documentation, imperfect as it is. Analysts, lawyers, consultants, and strategists often do not. Their knowledge lives in their heads, their experience, their "feel" for the work. Asking them to externalize it feels wrong, like asking a jazz musician to write out their improvisational decisions in advance. The objection is understandable. It is also, for the decomposable portion of their judgment, incorrect.

The Institutions That Will Fight This Hardest

Let me tell you who is not going to like any of this.

Law firms. Management consultancies. Investment banks. Accounting firms. Credentialing bodies for financial analysts. Medical licensing boards. University tenure committees. Any institution whose economic model or status structure depends on the inseparability of the expert from the production of the expert's output.

The resistance will come from several directions, and some of it will be legitimate. Let me sort through them.

Billing models. The large professional services firms, law and consulting in particular, are built on billable hours. A partner at a major law firm sells the time of associates. An engagement manager at a consulting firm sells the time of analysts. The value proposition to the client is: you are buying access to smart people who will spend time on your problem, and the price is a function of how much time they spend and how senior they are.

If the production of analysis becomes ambient, the time component collapses. A legal research memo that took an associate forty hours now takes four hours of scaffold maintenance and two hours of human review. The work product is comparable or better. The hours are gone. And the hours were the thing the client was paying for.

The firms know this. Some are already adjusting. A few have shifted to flat-fee or outcome-based pricing for specific work products. But the infrastructure of professional services, the performance tracking, the promotion criteria, the partner compensation models, the client relationship management, all of it runs on hours. Changing the billing model means changing everything downstream of it. That is not a tweak. It is a reconstruction.

Credentialing bodies. The CFA Institute, the bar associations, the medical boards. These organizations define who is qualified to do knowledge work in their respective domains. Their power rests on the idea that performing the work requires credentials that can only be earned through extensive training and examination. The credentials certify a person's ability to produce expert output.

If the expert output is ambient-producible given a sufficiently good scaffold, what does the credential certify? It cannot certify the ability to produce the output, because the machine produces the output. It could certify the ability to build and maintain the scaffold, but that is not what the current exams test. The CFA exam tests whether you can calculate weighted average cost of capital and interpret financial statements. It does not test whether you can articulate the criteria by which you would evaluate a competitive landscape in a form that a machine can apply.

The credentialing bodies face a choice. They can update their exams and requirements to test for the skills that the ambient shift actually requires, scaffold construction, quality review, meta-judgment, intent articulation. Or they can defend the existing requirements and watch as the gap between what the credential tests and what the work demands grows until the credential becomes a barrier to entry rather than a signal of competence.

Most will choose the second path, at least initially, because the first path threatens the value of every credential they have already issued.

Status hierarchies. The ambient shift threatens professional identity by suggesting that a significant portion of the production, the visible output, is decomposable and transferable. Not all of it. But enough to make the production a less reliable signal of the professional's unique value. The partner who can dictate a brief without notes, whose legal reasoning flows from mind to page with minimal effort, is the highest-status professional in a law firm. The work is inseparable from the person. You are not buying a brief. You are buying that person's brief. Ambient systems loosen that bond.

I want to be clear about what is legitimate in this resistance and what is not.

Legitimate: the concern that ambient systems will make errors that require deep expertise to catch. The analyst who reviews an ambient report, nods at a wrong conclusion, and signs off because they lacked the experience to question it. The lawyer who approves an ambient-generated contract clause that contains a subtle liability. These are real risks, and the argument that credentialed expertise is needed to supervise ambient output is correct.

Legitimate: the concern that judgment scaffolds will be treated as complete when they are not. That organizations will assume the scaffold captures all expert knowledge when it only captures the decomposable portion. That the non-decomposable elements, the meta-judgment, the anomaly detection, the recognition that the question is wrong, will be undervalued because they are invisible in the scaffold.

Not legitimate: the argument that expert output should remain expensive because the expert spent years learning to produce it. This is an argument about the cost of inputs, not the value of outputs. The client does not care how long it took you to learn to write the brief. The client cares whether the brief is good. If the brief is good and it took less time, the client should pay less. The professional's years of training are valuable because they enable scaffold construction and quality supervision, not because they justify billing for production time that is no longer required.

Not legitimate: the argument that ambient knowledge work will "deskill" the profession. This confuses the production task with the profession. Radiology was not deskilled when AI started reading X-rays faster than humans. The radiologists who were doing nothing but reading routine X-rays all day were doing a task that machines turned out to be good at. The radiologists who were handling complex cases, integrating clinical context, and consulting with surgeons on ambiguous findings were doing work that became more valuable, not less.

The institutions that will fight hardest are the ones with the most to lose economically from the separation of production and judgment. Law firms billing at $800 per hour for associate time. Consulting firms selling teams of analysts at $300 per day per person. Accounting firms charging for audit hours. The production was the business model. When the production is ambient, the business model has to change.

Some firms will make this transition. They will sell judgment scaffolding as a service, ongoing scaffold maintenance, quality supervision, the meta-judgment that sits outside the scaffold. The pricing will be different. The margins may be higher, because the cost of production drops while the value of judgment stays constant or increases. But the organizational transformation required is significant, and the partners who built their careers on the old model will resist.

The firms that make the transition early will take market share from the ones that do not. That is the competitive pressure that will eventually force the issue. Not ideology. Not technology advocacy. Market share.

What You Are Actually Paid For Now

Here is the part where I talk directly to you, if you are a knowledge worker whose output is judgment.

You are probably reading this chapter with a mixture of recognition and resistance. Recognition because you have already felt some version of this shift, even if your organization has not named it. Resistance because the shift implies something uncomfortable about what your daily work is worth.

Let me be direct.

If you are a senior analyst, a strategy lead, a consulting partner, a legal director, a research head, the thing you are paid for has changed. It has changed whether or not your organization has acknowledged it. The change is not coming. For many functions, it is already here, running quietly in pilot programs and unofficial workarounds, thinning the work the way Catherine's team thinned.

You used to be paid to produce analysis. The quality of your analysis was a function of your knowledge, your rigor, and your time. You were both the factory and the quality inspector. The production and the judgment were inseparable because you were the only mechanism capable of both.

That inseparability is dissolving. Not completely. Not for every task. But for enough of your recurring output that the economics of your role have shifted under your feet.

What are you paid for now?

You are paid for the scaffold. The quality, precision, and completeness of the judgment scaffold you build and maintain. The intent artifact that tells the ambient system how to analyze, what to weight, when to flag, and what context to consider. This is not a lesser job. It is a harder one. Writing a good analysis is difficult. Describing how you write a good analysis, in enough detail that a system can replicate it, is more difficult. It requires you to understand your own cognitive process at a level of explicitness that most professionals have never attempted.

You are paid for the exceptions. The category-two judgment that the scaffold cannot contain. The moment when you look at the ambient system's output and say, "This is technically correct and completely wrong." The moment when you recognize that the question the system is answering is no longer the right question. The moment when you override the scaffold because the world has changed in a way the scaffold does not yet reflect.

You are paid for the update. The ongoing work of revising the scaffold as conditions change. A judgment scaffold from six months ago may contain assumptions that are no longer valid. Maintaining it requires continuous attention to the gap between what the scaffold describes and what the world is doing.

You are paid for the question. Not the answer. The ambient system produces answers. You produce questions. Which competitor should we be watching? What risk are we not measuring? What assumption in our planning model is most likely to be wrong this quarter? These questions are the inputs that determine whether the ambient system produces useful output or irrelevant output. A beautifully executed analysis of the wrong question is worthless.

And you are paid for the trust. You are the person the CEO calls, the person the board listens to, the person whose name on the report gives it weight. This is not an ambient function. It never will be. Trust is relational, personal, built over years through demonstrated judgment in specific contexts. When the CEO asks Catherine what a competitor's move means, the CEO is not asking for a data synthesis. The CEO is asking for Catherine's read, colored by Catherine's experience, calibrated by Catherine's track record.

This set of responsibilities, scaffold construction, exception handling, scaffold maintenance, question generation, and trust, is the redefined job. If you compare it to the old job, it looks smaller in volume and larger in consequence. Less time spent producing. More time spent thinking. Less busywork. More exposure.

That last point deserves emphasis. Under the old model, a bad week meant a mediocre report that got edited into adequacy by the review process. Under the new model, a bad scaffold means every ambient output for weeks or months is wrong in ways that may not be caught until the damage is done. The cost of error is not distributed across a team of analysts sharing the production burden. It is concentrated in the person who built the scaffold.

Some knowledge workers will thrive in this. The ones who always chafed at the production work, who saw the recurring deliverables as drudgery that kept them from the interesting problems, will find the new job liberating. They get to spend their time on questions and judgment and meta-analysis, the work they always considered their real contribution.

Others will struggle, and not because they lack talent. Some people process their thinking through production. The analyst who only sees the pattern while writing the report. The lawyer who finds the argument while drafting the brief. For these people, the production is not separate from the judgment. It is the mechanism through which the judgment occurs.

I think scaffold construction may serve that same cognitive function for some of them. The act of writing decision rules, of arguing with a colleague about whether patent filing volume should be weighted at 15 percent or 25 percent, of specifying what counts as an edge case and what does not, is itself an act of thinking. It is slower than writing a report. More deliberate. But the same mental motion is there: you are working through what you know by trying to make it precise enough for someone else to act on. For the analyst who thinks by writing, writing a scaffold may turn out to be a better version of the same process, because it forces the thinking to be more explicit, more contestable, more refined than a narrative report ever required. Whether that substitution works for everyone, I do not know. But the shape of the cognitive operation is closer than it first appears.

What I am certain about is this: the premium on clarity has never been higher.

In the old model, you could be somewhat unclear in your own thinking and still produce acceptable work, because the production process itself served as a forcing function. You figured it out as you wrote it. In the ambient model, the production process does not help you figure it out. The system produces exactly what the scaffold tells it to produce. If the scaffold is vague, the output is vague. If the scaffold contains a wrong assumption, the output contains a wrong conclusion. There is no room for "I'll know the right answer when I see it," because the system will show you exactly what your criteria specify, and if your criteria are imprecise, the result will be precisely imprecise.

This is why I said earlier that the new job is harder and more exposed. It is harder because clarity is hard. Knowing what you think is one thing. Being able to describe what you think in enough detail for a machine to act on it is a much higher bar. And it is more exposed because the scaffold is visible. Your judgment criteria are written down, version-controlled, and available for inspection. When the output is wrong, someone can trace the error back to the specific criterion or weighting or edge-case rule that caused it.

For leaders of knowledge-work teams, the managerial shift mirrors what happened to the engineering manager in Chapter 4. You are no longer primarily managing the production of analysis. You are maintaining the intent infrastructure that makes ambient analysis possible. Your job is to hold the vision of what good analysis looks like for your function, to make that vision explicit enough for both humans and systems to act on it, and to catch the drift when individual scaffolds start reflecting outdated assumptions or divergent criteria.

Catherine figured this out through necessity. She did not have a playbook. She is writing one, slowly, through trial and error. The version of her job that existed eighteen months ago is gone. The version that exists now is something she could not have described in advance but can recognize in retrospect.

If you are in a similar position, here is my honest assessment of where this leaves you. The work is not gone. The work is different. The skills that made you good at the old work are still relevant, but they feed into a different output. You need to learn to build scaffolds. You need to learn to articulate your own judgment with a precision you have probably never been required to demonstrate. You need to get comfortable with the fact that your value is no longer visible in the volume of your output but in the quality of the intent that drives it.

And you need to start now, because the organizations that build this capacity first will not just have better ambient systems. They will have something more durable: a clear, explicit, continuously maintained understanding of what good judgment looks like in their domain. That understanding is an asset that compounds over time. Every scaffold revision makes the next one easier. Every edge case documented prevents the next error. Every criteria discussion among experts sharpens the shared model.

The firms that treat judgment scaffolding as a core competence will pull ahead of the ones that treat it as an IT project. The gap will be obvious. Quickly.

Software development was the first proving ground, and it showed us how the ambient shift changes the relationship between intent and execution. Knowledge work is the second proving ground, and it shows us something deeper: what happens when the ambient shift reaches the work that people believe cannot be separated from the person doing it.

Some of that belief is correct. The non-decomposable elements of expert judgment are real, durable, and more valuable than they have been in a generation. But the belief that all of knowledge work is non-decomposable, that the entire profession is irreducible intuition from top to bottom, is a story professionals tell themselves. And the ambient shift is calling the question.

Chapter 7

Proving Ground Four: Creative Work

Authorship, Identity, and the Question That Has No Comfortable Answer

The Brief That Became the Work

In late 2023, a brand design studio in London with about thirty people discovered that the most valuable thing their designers could produce was a document nobody would ever publish.

The studio had a strong reputation for consumer packaging and visual identity systems. Their clients were mid-market food and beverage brands, the kind of companies that needed to look premium without looking pretentious. The studio's creative director, a man I will call Alistair, had built the team over eight years. They were proud of their craft. They talked about kerning the way surgeons talk about sutures.

What happened was not dramatic. Nobody announced a revolution. Alistair introduced an ambient generative system into the studio's concept phase. The system had been trained on the studio's portfolio, its style guides, its color theory preferences, its typographic principles. It could produce initial concept directions, dozens of them, in the time it used to take to sketch three.

The first week, the designers treated the output as noise. Too many options. Too generic. "Like Pinterest threw up," one designer told me.

The second week, Alistair did something that changed the dynamic. He stopped asking the team to generate initial concepts. Instead, he asked them to write better briefs. Not longer briefs. More precise briefs. He wanted the brief to contain the constraints, the aesthetic territory, the emotional register, the specific things the design should not be. He wanted the brief to be a creative act in itself, one that the ambient system could use as its starting point.

The third week, something shifted. The briefs got good. Really good. The designers discovered that when they poured their creative thinking into the specification of what the work should be, rather than into the direct production of the work, the ambient system's output was no longer noise. It was raw material that already lived in the right territory. Still needed shaping. Still needed human hands. But the gap between the first draft and the final product shrank.

By the sixth week, the studio's workflow had reorganized itself. The brief was the primary creative deliverable. The ambient system generated a field of options from the brief. The designer's role shifted from generating options to editing a field of options, selecting, combining, adjusting, rejecting. The final output was still a human product. The starting point was not.

Alistair described the change to me with a metaphor I have thought about since. "We used to be chefs who grew their own vegetables," he said. "Now we are chefs who select from a very large, very strange farmers' market. The cooking is still ours. The growing is not."

I pushed him on whether the quality suffered.

He said it hadn't. It had gotten more consistent at the middle of the range and slightly less distinctive at the top. The studio's best work, the work that won awards, still came from the designers pushing past what the ambient system suggested. But the baseline, the Tuesday afternoon brand refresh for a regional sparkling water company, was better than it used to be. Faster and more polished and more thoroughly considered, because the designers were spending their creative energy on the brief rather than grinding through initial sketches they mostly threw away.

The economics shifted too. The studio could take on more projects without adding designers. Their average project margin increased by about 20 percent. Alistair did not lay anyone off. He redeployed two junior designers into a new offering: strategic brand architecture consulting. They were now selling the brief, not just the design.

I want to stay with this for a moment because the implications are not obvious.

When the brief becomes the primary creative deliverable, creative authority moves upstream. The person who writes the brief is the person who shapes the work. The person who executes the brief is the system. The human designers are no longer producing the work in the traditional sense. They are curating the conditions under which the work gets produced, then selecting and refining the results.

Authorship moved. It did not disappear. The judgment is still human, the decisions are still human. They are simply being expressed at a different point in the process.

If you are a creative professional, that distinction might feel like the difference between being a painter and being the person who tells a painter what to paint. I understand why it feels that way. I am going to argue, in the rest of this chapter, that the feeling is wrong. But I want to honor the fact that it is real, and that it is doing real damage to real people right now.

What Authorship Was Always Made Of

Here is a question that seems simple and is not: what makes someone the author of a creative work?

The most intuitive answer is execution. The author is the person who made the thing. The painter held the brush. The writer typed the words. The designer moved the pixels. Authorship is the trace of a hand on a surface, whether physical or digital.

This answer is intuitive because it is often true. But it is not always true, and where it breaks down tells you something about what authorship actually consists of.

Consider a film director. The director of a feature film does not operate the camera. Does not edit the footage. Does not compose the score. Does not build the sets. Does not write the screenplay, usually. The director tells other people what to do, makes thousands of small and large decisions about what the final product should look and sound and feel like, and selects from among the options that the production team generates.

Nobody disputes that the director is the author of the film.

Why? Because authorship, in filmmaking, has always been located in intent and selection. The director decides what the thing should be. The director chooses among the versions of it that others produce. The execution is distributed across hundreds of people. The authorship is not.

Architecture is the same. The architect draws plans. The builders build. The architect is the author of the building. But it goes further than these prestige examples. A music producer in a recording studio shapes the sound of an album without playing a single instrument on it. A magazine editor commissions, sequences, and edits pieces she did not write, and the magazine is recognizably hers. A fashion creative director who has not sewn a garment in twenty years is still the author of the collection, because the collection is the expression of her decisions, not her stitching.

The point is not that execution does not matter. It does. The point is that we already accept, in many fields, that authorship lives somewhere other than the hands.

So what is it actually made of? I think it comes down to four operations.

Intent. What is this thing supposed to be? What is it trying to do? What feeling, function, or meaning should it carry? The person who defines the intent is the person who sets the creative direction.

Taste. Among the many possible versions of this thing that could exist, which ones are good and which ones are bad? Taste is the ability to discriminate among options. It is learned, developed, refined through years of exposure and practice. It is subjective, but it is not arbitrary. A person with good taste can usually articulate why one option is better than another, even if the articulation is partial.

Constraint-setting. What are the boundaries of the work? What is it not? What is excluded? Constraints are creative decisions. A sonnet is fourteen lines of iambic pentameter. That constraint is not a limitation on creativity. It is the generator of a specific kind of creativity that cannot exist without it. The person who sets the constraints shapes the space in which the work occurs.

Selection. Among the options that exist within the intent, that meet the taste criteria, and that respect the constraints, which one is the final work? Selection is the terminal creative act. It is the moment where all the other operations converge into a commitment.

Execution, the physical or digital production of the work, is the operation that ambient systems are absorbing. It has been the site of craft, of skill development, of professional identity for centuries. I am not dismissing it. But execution is one of five operations, and the other four are still entirely human.

What I am saying is that intent, taste, constraint-setting, and selection are what authorship has always been made of. Execution was the vehicle through which those operations became visible. In the absence of any other mechanism for realizing creative decisions, the person who executed was necessarily the person who exercised intent, taste, constraints, and selection, because there was no way to separate them.

Ambient creative systems separate them. For the first time in the history of most creative disciplines, you can exercise intent, taste, constraint-setting, and selection without performing the execution yourself. The film director has always done this. The architect has always done this. Now the graphic designer, the writer, the illustrator, the brand strategist can do it too.

This does not mean execution does not matter. The quality of the execution affects the quality of the final work. An ambient system that produces mediocre visual output will produce mediocre visual work regardless of how brilliant the intent is. But the trajectory of these systems is toward higher and higher quality execution. Systems that struggled with typographic coherence and consistent spatial hierarchy two years ago now handle both reliably. That trajectory shows no signs of reversing.

The creative professionals who understand this distinction are the ones who are already adapting. They are pouring their energy into the four operations that constitute authorship and treating execution as a variable they control through specification rather than through direct production.

The ones who cannot make this distinction, who believe that authorship is execution, are in trouble. Not because they are wrong about the history. Execution and authorship were fused for centuries. They are wrong about the future.

Creative Intent Architecture

I need a name for what Alistair's studio is doing. What Catherine's analysts were doing when they made their tacit judgment explicit in scaffolded decision criteria. What the engineering manager from earlier chapters was doing with system intent documentation. The practice is the same across domains, but the creative version has specific characteristics that deserve their own label.

Creative intent architecture.

Creative intent architecture is the practice of designing the generative conditions under which ambient creative production happens. It is the work of specifying what the ambient system is allowed to produce, what it is not allowed to produce, what criteria the output must meet, and what contextual factors should shape the generation.

It has four components. They map roughly to the four operations of authorship I described in the previous section. But they are operationalized differently, because they need to be legible to a system, not just to a human collaborator. And they are not equally difficult or equally intuitive, which means I am not going to spend equal time on each.

The first component is aesthetic criteria.

This is taste made explicit. When Alistair's designers write a brief that says "warm but not cozy, modern but not cold, approachable but not cheap," they are articulating aesthetic criteria. The ambient system uses these as constraints on its generation. This is harder than it sounds, because taste is usually implicit. Designers know what they like. Describing what they like in terms specific enough for a system to act on is a skill that most designers have not had to develop.

One of Alistair's senior designers told me she spent more time on a single brief in the new workflow than she used to spend on three initial concepts in the old workflow. "I had to figure out what I actually meant by 'editorial,'" she said. "I've been using that word for ten years. Turns out it means about six different things depending on the context."

The second component is constraint systems.

Constraints in creative intent architecture are not just negative rules (do not use red, do not use serif fonts). They are positive structural boundaries that define the space of acceptable output. A constraint system for a packaging design might include: must work at shelf scale (18 inches viewing distance), must read in 1.5 seconds, must differentiate from competitor X and competitor Y on the same shelf, must be reproducible in four-color offset printing. These constraints are not aesthetic. They are operational, commercial, and physical. They define what the work must survive.

The best constraint systems I have seen include something I think of as productive contradiction. A brief that says "premium but accessible" is not confused. It is defining a narrow band of creative territory that exists between two poles. The ambient system has to find options that satisfy both conditions simultaneously. This is where the most interesting creative output tends to come from, and the skill of constructing productive contradictions is a high-level creative skill.

The third component is quality thresholds. This is the one that gets the least attention and matters the most operationally.

How do you know when the output is good enough? In traditional creative work, the answer was "when the creative director says so." In ambient creative work, that answer still applies at the final selection stage. But before final selection, the system is generating many options, sometimes hundreds, and someone needs to define the threshold below which an option should not even be presented. Quality thresholds are the filters that sit between raw generation and human review.

At Alistair's studio, the hard quality thresholds include things like: text must be readable at specified size, color palette must maintain minimum contrast ratio, layout must conform to the golden ratio within a specified tolerance. These are measurable. A system can apply them without judgment.

The soft thresholds are where the real difficulty lives. "If the overall composition feels cluttered, reduce element count." "If the type treatment reads as aggressive rather than confident, soften." These require the system to make aesthetic judgments, and the system gets them wrong often enough that the threshold-setting itself becomes an iterative process.

What Alistair's team learned, through about two months of frustration, is that the right approach is to set hard thresholds tightly and soft thresholds loosely. Tight hard thresholds eliminate the obviously broken options before a human ever sees them. Loose soft thresholds let through a wider range of options in the subjective space, which means the designers review more options but spend less time reviewing garbage. The alternative, setting soft thresholds tight, produced an uncanny-valley effect where everything that came through was technically acceptable but creatively dead. The system had optimized for safety. The designers wanted range.

This insight, hard thresholds tight and soft thresholds loose, turns out to apply well beyond graphic design. I have seen writers arrive at the same conclusion when configuring ambient content systems. Hard rules about factual accuracy, character count, and formatting can be strict. Soft rules about tone and voice need breathing room, or the output collapses into bland compliance.
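The pattern can be made concrete in a few lines. What follows is a minimal sketch of the two-stage filter described above, tight hard thresholds as outright rejections, loose soft thresholds as a ranking signal only. Every field name and cutoff value here is invented for illustration; none of it is taken from Alistair's studio or any real system.

```python
# Two-stage filtering: hard thresholds reject, soft thresholds only rank.
# All field names and cutoff values are hypothetical illustrations.

def passes_hard_thresholds(option: dict) -> bool:
    """Tight, measurable rules: eliminate broken options before review."""
    return (
        option["min_text_size_pt"] >= 9.0      # text readable at spec size
        and option["contrast_ratio"] >= 4.5    # WCAG-style contrast floor
    )

def soft_score(option: dict) -> int:
    """Loose, subjective proxies: score rather than reject, to keep range."""
    score = 0
    if option["element_count"] <= 12:              # "not cluttered" proxy
        score += 1
    if option["tone"] in ("confident", "warm"):    # broad tone band
        score += 1
    return score

def select_for_review(options: list[dict], budget: int = 20) -> list[dict]:
    """Hard-filter first, then rank by soft score; humans do the taste work."""
    viable = [o for o in options if passes_hard_thresholds(o)]
    viable.sort(key=soft_score, reverse=True)
    return viable[:budget]
```

Note the asymmetry in the design: the hard rules are booleans that can silently discard hundreds of broken options, while the soft rules only reorder what survives, so a low-scoring but interesting option still reaches a human.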

The amount of time Alistair's team spent calibrating thresholds surprised me. It was roughly equal to the time they spent on aesthetic criteria. Most of the conversations I observed were about edge cases: what happens when a design meets all the hard thresholds but violates the spirit of the brief? What happens when it meets the soft thresholds beautifully but the color reproduction will not work in four-color offset? These conversations were, in effect, the team teaching itself what it actually valued, forced by the system's literal-mindedness into a degree of specificity that human-to-human creative direction never required.

The fourth component is contextual anchors. These are simpler to explain. Context changes everything in creative work. A packaging design for a luxury chocolate brand in Japan requires different visual language than the same brand in Brazil. Contextual anchors tell the ambient system which version of the creative space it should be operating in. They can be explicit (target audience: women 35-50, urban, health-conscious) or referential (the visual territory of Aesop, the typographic sensibility of Kinfolk). Referential anchors are especially powerful because they compress large amounts of aesthetic information into a single reference that the system can pattern-match against.

I have watched creative directors develop intent architectures, and the pattern is consistent: the first attempt is too vague, the second attempt is too rigid, and the third attempt starts to find the right balance between specificity and openness. Too vague, and the ambient system produces generic output that could be for any brand. Too rigid, and the system produces output that is technically compliant but creatively dead. The sweet spot is specific enough to exclude the wrong territory and open enough to allow surprise.

This is the point: creative intent architecture is itself a creative practice. It is not administrative work. It is not project management. It requires taste, judgment, experience, and imagination. The person who designs a great intent architecture is doing creative work at a different level of abstraction than the person who produces a great design, but it is creative work nonetheless.

The analogy that works best for me is urban planning versus architecture. An urban planner defines the zoning, the density limits, the setback requirements, the height restrictions, the use designations. The architect designs the building within those constraints. Both are creative. Both require skill and judgment. The urban planner works at a higher level of abstraction. The architect works at a higher level of specificity. Neither can do the other's job well. Neither is subordinate to the other.

Creative intent architecture is the urban planning layer of ambient creative work. The ambient system is the architecture layer. The creative director who builds the intent architecture is not managing a machine. They are designing the conditions under which creative production occurs. That is authorship relocated.

The Identity Reckoning Nobody Warned You About

Now for the hard part.

Everything I have described so far, the relocation of authorship, the components of creative intent architecture, the theoretical distinction between execution and the other operations of authorship, sounds reasonable when you read it on a page. It sounds like an argument. You can agree with it or disagree with it.

What it does not convey is what it feels like.

I have spent the last year talking to creative professionals across disciplines. Designers, writers, brand strategists, creative directors, art directors, illustrators, copywriters. People who have built their careers and their identities on their ability to make things. People whose sense of who they are is inseparable from the practice of creation.

The ambient shift hits creative professionals differently. An engineer who stops writing code and starts writing specifications has changed their output. Most engineers I know don't grieve that. Code is a means to an end.

For creative professionals, the relationship between the maker and the made is personal in a way that is hard to overstate.

A designer I spoke with in Portland put it this way: "When I design a logo, I am not solving a problem. I am making a thing that did not exist before. That thing came from me. It has my decisions in it, my eye, my hand. When you tell me that an AI can generate fifty logos in ten seconds and three of them are as good as what I would have made, you are not telling me about a tool. You are telling me that something I thought was mine is actually generic."

That's an identity wound.

Another designer, in Berlin, said something that stayed with me: "I became a designer because I loved making things. Not because I loved having opinions about things other people made." She paused. "Which is basically what you're describing."

She was right. I am describing a shift from making to specifying. And for people whose professional identity is built on the act of making, that shift is not a lateral move. It is a loss.

I want to sit with that for a moment rather than rushing past it.

The loss is real. The grief is real. The sense that something has been taken, not given, is real. Creative professionals did not ask for ambient systems. They did not request a redefinition of authorship. The technology arrived, and it brought a philosophical question that nobody had to answer before: if you are not the one making the thing, are you still a creative person?

The answer is yes. But the answer being yes does not make the question painless.

Here is what I have observed separating the creative leaders who are adapting from the ones who are stuck. It is not talent. It is not age. It is not technical literacy. It is the ability to distinguish between two things that feel identical but are not: attachment to the process of making, and commitment to the purpose of the work.

Attachment to the process of making is about the maker. I need to hold the brush because holding the brush is who I am. The physical or digital act of production is the site of my identity. Without it, I am diminished.

Commitment to the purpose of the work is about the work. I need this thing to be as good as it can possibly be. If the best path to that outcome involves me holding the brush, I will hold the brush. If it involves me specifying what the brush strokes should look like and selecting from the results, I will do that instead. My identity is in the quality and intention of the outcome, not in the mechanics of production.

Both of these orientations are legitimate. Both have produced great work. But only one of them survives the ambient shift intact.

The creative leaders I have seen adapt most successfully are the ones who were always primarily committed to the purpose. Alistair, the creative director in London, is one. When I asked him whether he missed the hands-on design work, he gave me a puzzled look. "I stopped doing hands-on design work five years ago," he said. "I've been directing. This is just directing with a different team."

That response is revealing. Alistair had already relocated his creative identity from execution to direction before the ambient shift arrived. The technology did not force a change in his self-concept. It validated one that was already in place.

For the designers on his team, the transition has been harder. The junior designers, who had been in the field for two to four years, adapted faster than the mid-career designers with eight to twelve years of experience. The juniors had less invested in the old workflow. The mid-career designers had built their reputations on execution quality. They were the ones who could kern a headline by eye, who could nail a color palette in three iterations, who could produce a layout that needed almost no revision. Those skills are not worthless in the new workflow. But they are no longer the primary expression of the designer's value. The primary expression is now the brief.

One mid-career designer at Alistair's studio quit. She told Alistair she did not want to be "a creative project manager." She went to a smaller studio that was not using ambient tools. I do not know whether that studio will be using them in two years.

Another mid-career designer stayed and became, in Alistair's words, "the best intent architect on the team." She discovered that her deep knowledge of visual systems, her years of hands-on experience, gave her an advantage in writing briefs that the junior designers could not match. She knew what to specify because she knew what could go wrong. She knew which constraints mattered because she had violated them and seen the consequences. Her execution experience became the foundation of her specification skill.

That is the pattern. Execution experience is not obsolete. It is the prerequisite for good specification. You cannot write a great brief for a brand identity system if you have never built one. You cannot set meaningful aesthetic criteria if you have never developed your own taste through years of making and failing and making again. The ambient shift does not eliminate the need for deep creative experience. It changes where that experience is applied.

But it does eliminate the daily practice through which that experience used to be maintained and renewed. And that is the part nobody has a good answer for yet.

If junior designers skip the execution phase and go straight to specification, how do they develop the taste and judgment that make specification effective? Where does the next generation of creative intent architects come from, if the training ground of execution is no longer part of the job?

I don't have a clean answer. And honestly, neither does anyone else. The experiments I have seen, studios running craft rotations, structured periods where juniors work without ambient tools and produce by hand, are too early to show results. Most are three to six months in. The designers going through these rotations report that the work feels artificial, like a training exercise rather than real production, because they know the ambient system is sitting right there. One program director at a design school in Amsterdam told me the dropout rate from their non-digital foundation year increased 30 percent in 2024, because students could not see the connection between hand rendering and the specification work they expected to do after graduation.

The alternative bet, that evaluation is a sufficient substitute for making, is unproven. Some creative directors believe that reviewing a thousand generated options teaches taste faster than producing ten options by hand. The argument has a surface logic: you see more, you compare more, you develop discrimination faster. But discrimination and generation are different cognitive operations, and whether one can fully substitute for the other is an open question to which decades of art education suggest the answer is probably no.

This is the live question, and the people pretending they have solved it haven't.

When Brand Voice Is an Intent Document

Something strange is happening in organizations that invested heavily in brand voice documentation over the past decade.

They are discovering that the work they did for reasons that had nothing to do with AI is turning out to be the most valuable preparation for the ambient shift in creative production.

A brand voice document, in its traditional form, is a guide that tells people how to write and speak on behalf of a brand. It specifies the brand's tone (warm, authoritative, playful, serious), its vocabulary preferences (say "people," not "consumers"), its sentence-level tendencies (short sentences, active voice, no jargon), its personality traits (confident but not arrogant, friendly but not casual), and its boundaries (never use humor about health topics, never make promises about outcomes).

Sound familiar?

It should. A well-built brand voice document is a creative intent architecture for language. It specifies the aesthetic criteria (tone, personality), the constraint systems (vocabulary, sentence structure, boundaries), the quality thresholds (what good brand voice sounds like versus what bad brand voice sounds like, with examples), and the contextual anchors (different rules for different channels, audiences, and situations).

Organizations have been writing these documents for decades. The good ones have been obsessively precise about them. Mailchimp's voice and tone guide became famous in the content strategy world around 2012. It did not just say "be funny." It specified the type of humor (dry, self-deprecating, never at the user's expense), the situations where humor was appropriate (success messages, empty states) and where it was not (error messages involving financial data, account security alerts), and the emotional register for each context.

That document, and others like it, was created for human writers. Its purpose was to ensure that anyone writing for the brand, whether an in-house copywriter or a freelancer hired for a single project, would produce work that sounded like the brand. Consistency across many human hands.

Now those documents have a new audience: ambient systems.

An ambient content production system that has access to a detailed, precise, example-rich brand voice document can produce on-brand content with far greater consistency than one working from a vague set of guidelines. The precision that brand strategists invested in for human consistency pays off directly for ambient consistency. The voice document is a ready-made intent artifact.

I have seen this play out at three organizations in the past year, and the pattern is striking.

The first was a direct-to-consumer e-commerce brand with about forty product lines. They had a brand voice document that ran to twenty-two pages. It included six tone dimensions, each with a spectrum (e.g., "irreverence: 7 out of 10 for social media, 4 out of 10 for customer service emails, 2 out of 10 for legal disclosures"). It included a "never say" list with over a hundred entries. It included thirty annotated examples of good and bad brand copy, with explanations of why each one worked or did not.
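To see why a document like that transfers so directly to an ambient system, it helps to picture it as structured data rather than prose. The fragment below is a hypothetical encoding of the same ideas, tone dimensions scored per channel, plus a "never say" list with a trivial check; the dimension names, channel scores, and banned terms are all invented for illustration, not taken from the brand's actual guide.

```python
# Hypothetical fragment of a machine-readable brand voice document.
# Dimensions, channels, scores, and banned terms are illustrative only.

BRAND_VOICE = {
    "tone_dimensions": {
        # dimension -> target intensity (0-10) per channel
        "irreverence": {"social": 7, "customer_service": 4, "legal": 2},
        "warmth":      {"social": 8, "customer_service": 9, "legal": 3},
    },
    "never_say": {"consumers", "utilize", "best-in-class"},
}

def tone_target(dimension: str, channel: str) -> int:
    """Look up the target intensity for a tone dimension on a channel."""
    return BRAND_VOICE["tone_dimensions"][dimension][channel]

def vocabulary_violations(text: str) -> list[str]:
    """Return any banned terms appearing in a draft, the simplest hard check."""
    words = {w.strip(".,!?;:").lower() for w in text.split()}
    return sorted(words & BRAND_VOICE["never_say"])
```

The vocabulary check is the easy part; the per-channel tone targets are what turn a vague adjective like "irreverent" into a constraint a generation system can actually be steered by.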

When they connected this document to an ambient content system, the first-draft quality was high enough that their content team could shift from writing to editing. Product descriptions, email subject lines, social media posts, ad variations. All generated by the system, all recognizably on-brand, all requiring human review and selection but not human generation.

Their head of content put it this way: "We spent two years on that document because we were sick of freelancers getting the tone wrong. Turns out we were writing the instruction manual for a system that didn't exist yet."

The second organization was a B2B SaaS company with a much less detailed voice document. Three pages. Vague. "Professional but approachable. Clear and concise. Avoid jargon." When they connected this to the same class of ambient content system, the output was generic. It could have been any B2B SaaS company. The system had no meaningful constraints to work with, so it produced the default: inoffensive, forgettable, interchangeable.

The contrast was instructive. The quality of the ambient output was directly proportional to the precision of the intent document. The technology was the same. The result was completely different. The differentiator was the quality of the specification.

The third organization is the one I find most interesting. A regional healthcare system with a brand team of three people. They had no formal voice document at all. Their brand consistency had been maintained through a single senior writer who had been with the organization for eleven years. She was the voice. When she wrote, it sounded like the brand. When anyone else wrote, it did not.

When the organization wanted to deploy an ambient content system, they realized they had a problem. The brand voice existed entirely in one person's head. It was tacit knowledge, the same kind that shows up everywhere the ambient shift reaches: engineers carrying architectural understanding they have never written down, analysts applying decision criteria they have never articulated. It had never been made explicit.

So they did what every other team in this situation has done. They had the senior writer annotate a hundred pieces of existing content, marking what made each one right, what the voice was doing in each sentence, why this word and not that word, where the tone shifted and why. It took her three months. The result was a forty-page document that she said captured about 70 percent of what she knew about the brand voice. The other 30 percent, she said, was "the stuff I just feel."

That 70 percent was enough. The ambient system, armed with the document, produced content that the senior writer rated as "good first drafts" about 60 percent of the time and "needs significant revision" about 40 percent of the time. Before the document, the rate had been roughly reversed.

The senior writer is still there. She still reviews everything. She still catches things the system gets wrong. But her output has roughly tripled, because she is editing and approving rather than writing from scratch. And the organization now has something it never had before: a written record of its brand voice that will survive the senior writer's eventual departure.

That last point is not trivial. Organizations lose brand consistency all the time because the knowledge of the brand voice lives in one or two people, and when those people leave, the voice leaves with them. A well-built brand voice document is institutional memory made operational. It is an asset that persists beyond any individual.

Here is what this means for the competitive landscape.

Organizations that have invested in precise, detailed, example-rich brand voice documents, tone of voice guides, creative principles, visual identity systems with clear rules and annotated examples, have a head start in the ambient creative shift. Their intent infrastructure already exists, at least for language and brand expression. They built it for one reason and are now benefiting from it for a completely different reason.

Organizations that skipped this work, that relied on talented individuals to carry the brand voice in their heads, that produced three-page guidelines full of vague adjectives, are starting from scratch. They need to do the slow, difficult work of making tacit creative knowledge explicit before they can benefit from ambient creative systems. The ambient system is only as good as the intent it receives.

This creates an unexpected competitive advantage. The companies that are most prepared for ambient creative production are not necessarily the ones that invested most in AI. They are the ones that invested most in articulating what their brand is. The ones that did the unglamorous work of writing twenty-page voice documents and annotating hundreds of examples and arguing about whether "quirky" means the same thing to everyone on the team.

The AI investment is table stakes. Everyone will have access to roughly equivalent ambient creative systems within a few years. The intent infrastructure is the differentiator. And the intent infrastructure takes years to build well, which means the lead time is real and the advantage is durable.

I want to close this section with an observation that connects back to an argument I have been making throughout this book. The bottleneck for ambient operation is not the AI model. It is the organizational infrastructure that sits between the model and the work. In creative domains, that infrastructure is the intent architecture: the brand voice documents, the creative principles, the style guides, the annotated examples, the constraint systems, the quality thresholds.

Organizations that have this infrastructure, whether they built it intentionally or inherited it from years of careful brand stewardship, are ready. Organizations that do not have it are not, regardless of how much they spend on AI tools.

The question is not "do you have ambient creative technology?" The question is "do you know what your brand is, in enough detail for a system to act on it?"

Most organizations, if they are honest, will find that the answer is no. And that the gap between what they think they know about their brand and what they can actually specify is larger than they expected.

Closing that gap is the work. It is creative work. It requires the best creative minds in the organization. And it starts with admitting that the brief, the specification, the intent document, is not the thing that comes before the creative work.

It is the creative work.

Chapter 8

The New Leadership Competency

Intent Architecture as the Strategic Skill of the Ambient Era

What the Four Proving Grounds Have in Common

Four domains. Four different types of work. One structural pattern.

In software development, the value of the engineer shifted from writing code to specifying what the code should do. In growth and marketing, the value shifted from running campaigns to defining the conditions under which campaigns should run themselves. In knowledge work, the value shifted from producing analysis to designing the judgment scaffolds that shape what ambient systems analyze and how. In creative work, the value shifted from making the thing to specifying what the thing should be, how it should feel, and what it must not become.

Each of these shifts looked different on the surface. The engineer writes system intent documentation. The growth leader defines targeting constraints and optimization boundaries. The analyst builds judgment scaffolds. The creative director writes briefs that are themselves creative acts. The daily texture of each job is specific to its domain. The vocabulary is different. The artifacts are different.

But the underlying structure is identical.

In every case, the ambient shift relocated leadership value from orchestrating execution to defining the conditions that make autonomous execution possible. In every case, the human who adapted successfully was the one who could answer a deceptively simple set of questions: What are we trying to achieve? What must we not do? How will we know if we are on track? And how much latitude should the system have before it needs to check in?

The engineers who thrived were the ones who could write specifications precise enough that an agentic coding system could implement a feature without asking clarifying questions. The creative directors who thrived were the ones who could construct briefs that were specific enough to exclude bad territory and open enough to allow surprise. The analysts who thrived were the ones who could articulate their judgment criteria clearly enough for an ambient system to apply them.

In every case, the leaders who failed were the ones who could not.

The failure was not technical. It was cognitive. They could not make explicit what they had always left implicit. They could not describe the shape of the thing they wanted because they had never had to. In the old workflow, the shape emerged through the process of making. You wrote the code, and the code revealed the architecture. You designed three concepts, and the third one showed you what you actually wanted. You ran the analysis yourself, and the running of it taught you what to look for.

The ambient shift strips that away. The system needs to know what you want before it starts working. Not after. Not during. Before.

That inversion, from learning-what-you-want-through-doing to knowing-what-you-want-before-doing, is the common structural challenge across all four proving grounds. It is the reason the ambient shift is hard. It is the reason some people adapt and others do not. And it is the reason a new competency is needed, one that does not exist in any traditional management curriculum, any MBA program, any leadership development workshop I have seen.

Intent Architecture: A Definition

I am going to name this competency.

Intent architecture is the ability to define, structure, and maintain the intent context that ambient systems require to act without supervision.

That sentence is dense. Let me unpack it.

"Define" means knowing what you want. Not in the vague, aspirational way that strategic planning documents tend to express it, but with enough precision that a system can act on it. "Increase revenue" is not defined intent. "Reduce churn among customers in the $50-100/month tier by 15% over six months, primarily through improved onboarding experience in the first fourteen days" is closer.

"Structure" means organizing that intent into a form the system can use. Raw human intent is messy. It lives in conversations, in intuitions, in half-formed preferences. Structuring it means breaking it into components: objectives, constraints, quality signals, scope boundaries. It means creating the documents, the briefs, the system intent artifacts that the engineering manager in Chapter 4 created, that Alistair's designers created, that Catherine's analysts created.
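To make "structure" concrete, here is a rough sketch, in Python, of what breaking intent into components might look like. Every name and value here is mine, invented for illustration; no real product or team in this book uses this schema.

```python
from dataclasses import dataclass, field

@dataclass
class IntentContext:
    """An illustrative shape for structured intent: objective,
    constraints, quality signals, and a scope boundary."""
    objective: str                        # what success means, in operational terms
    constraints: list[str] = field(default_factory=list)       # what the system must not do
    quality_signals: list[str] = field(default_factory=list)   # how "on track" is measured
    scope_boundary: str = "review-required"                    # autonomy before checking in

# The churn example from above, expressed as structured intent:
churn_intent = IntentContext(
    objective=("Reduce churn in the $50-100/month tier by 15% over six months, "
               "primarily via onboarding in the first fourteen days"),
    constraints=["Never discount below the published floor price"],
    quality_signals=["Day-14 activation rate", "Tier-level churn, weekly"],
)

assert churn_intent.constraints  # an intent with no constraints is a warning sign
```

The point of the sketch is not the schema. It is that each field forces a commitment that "increase revenue" lets you avoid.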

"Maintain" means keeping the intent context current as conditions change. Intent is not a one-time act. The market moves. The strategy shifts. A competitor does something unexpected in August. The intent architecture has to be updated, and someone has to own that update. A stale intent context is worse than no intent context, because the system will continue acting on outdated assumptions with complete confidence.

Now. Intent architecture sounds like it might be the same thing as strategic planning. Or goal-setting. Or design thinking. It is not. Each of those disciplines is adjacent, and each falls short.

Strategic planning is about direction. It answers "where are we going?" But strategic plans are written for humans. They contain ambiguity that humans resolve through judgment, conversation, and improvisation. A strategic plan that says "become the market leader in enterprise collaboration" is perfectly useful for a human executive team. It is useless for an ambient system, because the system cannot infer what "market leader" means in operational terms, what tradeoffs are acceptable on the path to getting there, or when a specific action would violate an unstated organizational value.

Goal-setting, whether OKRs or SMART goals or any other framework, gets closer. Goals are more specific than strategy. But goals describe outcomes, not operating conditions. A goal tells you what success looks like. It does not tell a system how to pursue that success, what constraints to observe, how to handle ambiguity, or when to stop and ask for guidance. Goals are the destination. Intent architecture is the map, the route, the rules of the road, and the fuel gauge.

Design thinking is about problem discovery and solution generation. It is a process for figuring out what problem you are solving. But design thinking assumes a human will execute the solution. It generates empathy, insights, and prototypes that inform human decisions. It does not produce machine-readable operating instructions. A design thinking workshop might produce the insight that "customers feel anxious during the checkout process." Intent architecture would take that insight and turn it into a constraint: "the checkout flow must never require the customer to leave the page, must display running totals at every step, and must not introduce any new information after the payment method is entered."

Intent architecture borrows from all three of these traditions. The strategic clarity of planning. The precision of goal-setting. The user-centered orientation of design thinking. But it adds something none of them contain: the translation of human intent into a specification that a non-human system can act on without further clarification.

That translation step is the new part. It is where the skill lives.

The Four Constituent Skills

Intent architecture breaks down into four skills. I have named them based on what I have observed across all four proving grounds. These are not theoretical categories. They are the things I have watched people do well or do badly in practice.

Strategic Clarity

Strategic clarity is the ability to say what success actually means.

This sounds easy. It is not. Most leaders operate with what I think of as comfortable ambiguity, a working understanding of what they are trying to achieve that is detailed enough for human conversations but nowhere near detailed enough for a system to act on. "We want to grow the enterprise segment" is comfortable ambiguity. Everyone on the team nods. Everyone pictures something slightly different. The differences get resolved through ongoing human interaction: a meeting here, a Slack thread there, a product review where the VP says "no, that is not what I meant."

Ambient systems do not attend meetings. They do not read Slack threads for subtext. They act on what they are given. If what they are given is "grow the enterprise segment," they will optimize for enterprise segment growth by whatever path their training and data suggest, which may or may not align with what the VP actually had in mind.

Strategic clarity means closing the gap between what the leader intends and what the leader has specified. It means answering questions like: Which enterprise customers? Defined by revenue tier, employee count, industry vertical, or something else? Grow by what metric? New logos, expansion revenue, seat count, product adoption? Over what time period? At what cost? At the expense of what other priorities?

The engineering manager in Chapter 4 had to do this. His team's agentic coding system could implement features. But "implement the rate limiter" was not enough. He had to specify the rate limiter's behavior under edge cases, its failure modes, its interaction with other system components. The specification had to be complete enough that the agent did not have to guess. When it guessed, it guessed wrong about 30 percent of the time, and the month-five production incident was a direct result of a guess the agent made about data consistency under high concurrency.

Alistair's designers had to do this too. "Make it feel premium" was comfortable ambiguity. "Warm but not cozy, modern but not cold, approachable but not cheap" was strategic clarity. Still subjective, still requiring taste to interpret, but specific enough to exclude the wrong territory.

Strategic clarity is the hardest of the four skills for most leaders, because it requires them to do something they have spent their careers learning how to avoid: commit to specifics before they know whether the specifics are right. In a human-driven workflow, you can stay vague and course-correct as you go. In an ambient workflow, the course-correction is expensive, because the system has already done significant work before you realize it went in the wrong direction.

Constraint Articulation

Constraint articulation is the ability to say what the system must not do.

It is the negative space of intent. And in my observation, it is where most intent architecture fails.

Leaders are trained to think about what they want. They are not trained to think about what they want to prevent. But ambient systems, left unconstrained, will optimize for the stated objective by whatever means available. This is not malice. It is not even error. It is the system doing exactly what it was told to do, in ways the leader did not anticipate because the leader never specified what was off-limits.

The classic example is an advertising optimization system told to minimize cost per acquisition. Without constraints, it will find the cheapest possible customers, who are often the lowest-value, highest-churn customers. The system hit its target. The business lost money. The constraint that was missing: "do not acquire customers with predicted lifetime value below $200."
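The missing constraint is small enough to write down. Here is a sketch of what it might look like as a check, assuming a hypothetical predicted-LTV score; the function name and the $200 floor are the book's example, not any real system's API.

```python
def passes_acquisition_constraints(predicted_ltv: float, cpa: float,
                                   ltv_floor: float = 200.0) -> bool:
    """Reject acquisitions that hit the CPA target but violate the LTV floor."""
    return predicted_ltv >= ltv_floor and cpa <= predicted_ltv

# The cheap-but-worthless customer the unconstrained optimizer would chase:
assert passes_acquisition_constraints(predicted_ltv=80.0, cpa=5.0) is False
# The customer the business actually wants:
assert passes_acquisition_constraints(predicted_ltv=600.0, cpa=120.0) is True
```

One line of constraint, and the objective "minimize cost per acquisition" stops being a trap.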

Constraint articulation gets harder as the domain gets more complex. In creative work, the constraints include aesthetic boundaries, brand values, cultural sensitivities, legal requirements, and operational limitations, all of which interact. A constraint that says "never use humor in healthcare communications" might be too broad. A constraint that says "never use humor when discussing treatment outcomes or medication side effects, but humor is acceptable in wellness tips and preventive care content" is more precise and more useful.

The healthcare brand team from the previous chapter learned this the hard way. Their ambient content system, armed with their senior writer's forty-page voice document, produced a cheerful, slightly playful blog post about managing chronic pain. The tone was technically on-brand according to the general guidelines. But the senior writer rejected it immediately. "You don't joke about someone's pain," she said. That constraint had been tacit, living in her judgment. It was not in the document. They added it.

The best constraint systems I have seen share a characteristic: they are built from failure. Each time the ambient system produces output that is technically correct but wrong in a way that matters, the leader adds a constraint. Over time, the constraint system becomes a detailed map of the organization's values, preferences, and non-negotiable boundaries. It is organizational self-knowledge, made legible.

This is why constraint articulation is a leadership skill, not a technical one. The constraints are not about the system. They are about the organization. What do we refuse to do, even if it would work? What tradeoffs are unacceptable? What values override efficiency? These are questions that only a leader can answer. The system does not care. The leader has to.

Signal Design

Signal design is the ability to define how the system will know if it is on track.

Every ambient system needs feedback loops. Without them, the system operates in the dark, optimizing toward objectives it has no way of verifying. Signal design is the practice of identifying the right metrics, indicators, and checkpoints that allow the system to self-assess and that allow the leader to monitor performance without micromanaging.

This is not the same as setting KPIs. KPIs are outcome measures. They tell you whether you arrived. Signals are process measures. They tell you whether you are heading in the right direction while you are still on the road.

In software development, the engineering manager from Chapter 4 designed signals into his team's workflow. Code coverage thresholds. Architecture conformance checks. Test pass rates. These were not just quality gates for the final output. They were mid-process signals that told the agentic system and the reviewing engineer whether the implementation was tracking toward the intended behavior.

In creative work, Alistair's studio used what they called "territory checks," intermediate outputs that the system generated specifically to verify that it was operating in the right aesthetic space before producing full concepts. A territory check might show three color palettes, three typographic directions, and three compositional grids. If the territory check was wrong, the designer adjusted the brief. If the territory check was right, the system proceeded to full concept generation. The territory check was a designed signal.

Signal design is tricky because the obvious signals are often the wrong ones. Lines of code produced per day is an obvious signal for software development. It is also useless, because it measures volume rather than value. Cost per click is an obvious signal for advertising. It is also misleading, because it can improve while customer quality deteriorates. Revenue per employee is an obvious signal for organizational productivity. It can rise while institutional knowledge collapses.

Good signal design requires you to think about what could go wrong and build an early warning for it. The production incident in Chapter 4, the data-consistency bug that only appeared under high concurrency, happened because the team had no signal that would have detected the problem before it hit production. After the incident, the engineering manager added a concurrency stress test as a required signal in the agentic workflow. The signal was not a new idea. What was new was making it a formal part of the system's operating loop rather than something a human engineer might or might not remember to check.

The skill of signal design is the skill of asking: what is the earliest possible indicator that something is going wrong? And can I build that indicator into the system's own feedback process so that I do not have to watch constantly?

If you can answer those questions well, you can extend autonomy to the system with confidence. If you cannot, you are either micromanaging (checking everything) or gambling (checking nothing). Neither scales.
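The structure of a designed signal loop is simple, even when the signals themselves are hard to choose. A minimal sketch, with signal names borrowed loosely from the Chapter 4 story (the names are illustrative, not a real pipeline):

```python
def signals_green(signals: dict[str, bool]) -> bool:
    """A system proceeds autonomously only while every designed signal passes;
    any red signal escalates to a human instead."""
    return all(signals.values())

workflow_signals = {
    "test_pass_rate_ok": True,
    "architecture_conformance_ok": True,
    "concurrency_stress_test_ok": False,  # the signal added after the incident
}

assert signals_green(workflow_signals) is False  # escalate, do not ship
```

The hard part is not this loop. It is deciding which keys belong in the dictionary.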

Scope Authority

Scope authority is the ability to decide how much autonomy to extend, and when to pull it back.

Of the four skills, this one is the most dynamic. Strategic clarity, constraint articulation, and signal design are all things you define and then maintain. Scope authority is something you exercise continuously, adjusting in real time as conditions change and as your confidence in the system grows or shrinks.

Think of it as a dial, not a switch. The question is never "should the system be autonomous?" The question is "how autonomous should the system be, for this task, in this context, at this moment?"

The ambient spectrum from Chapter 3 is relevant here. A system operating at Point 2 (template-assisted explicit) has very limited scope authority. The system fills in templates, but the human approves every output. A system operating at Point 4 (proactive with bounded autonomy) has significant scope authority. It acts on its own within defined boundaries and only checks in when it encounters something outside those boundaries.

Scope authority means knowing where on that spectrum to set the dial for each type of work, and having the judgment to adjust it.

Early in Alistair's studio's transition, the ambient system had narrow scope authority. It generated concepts from briefs, but every concept was reviewed before it went to the client. As the team gained confidence in the system's output and as the briefs got more precise, Alistair expanded the scope. For routine projects, recurring packaging updates for existing product lines, the system now generates concepts and the designer reviews a curated selection rather than the full output. For new brand identities, scope remains narrow. Every concept is reviewed.

The engineering manager in Chapter 4 made a similar judgment. For well-understood features with clear specifications, the agentic system could implement and submit pull requests with minimal human involvement. For features that touched core data models or that interacted with systems the agent had not previously encountered, the scope was pulled back. The agent would implement, but the review was intensive.

Scope authority is the skill that most directly replaces the old leadership task of delegation. In a human team, delegation is a judgment call about a person: is this person capable of handling this task independently? In an ambient system, the judgment call is about the intersection of the task, the system's demonstrated capability, the quality of the intent context, and the cost of failure.

A useful heuristic I have heard from several leaders: extend scope when the cost of a wrong answer is low and easy to detect, constrain scope when the cost of a wrong answer is high or hard to detect. The analytics team that lets the ambient system generate routine weekly reports without review is making a sound scope judgment. The healthcare content team that reviews every patient-facing document is also making a sound scope judgment. The difference is not about the technology. It is about the stakes.
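The heuristic fits in a few lines. A sketch, with my own labels for the two inputs; the examples in the comments are the ones from the paragraph above.

```python
def scope_decision(cost_of_error: str, error_detectability: str) -> str:
    """Extend scope only when errors are both cheap and easy to catch;
    constrain it when they are costly or hard to see.
    Inputs are "low"/"high" and "easy"/"hard" (illustrative labels)."""
    if cost_of_error == "low" and error_detectability == "easy":
        return "extend"
    return "constrain"

assert scope_decision("low", "easy") == "extend"      # routine weekly reports
assert scope_decision("high", "easy") == "constrain"  # patient-facing documents
assert scope_decision("low", "hard") == "constrain"   # subtle drift, cheap per instance
```

The function is trivial. The judgment lives in classifying the task honestly, which no function can do for you.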

Scope authority also means having the willingness to pull scope back when something goes wrong. The month-five production incident in Chapter 4 was a scope authority failure. The team had extended more autonomy than the system's intent context could support. After the incident, the engineering manager reduced scope for data-layer changes and added new constraints and signals before re-extending it. That was good scope authority in response to a failure.

The hardest version of scope authority is knowing when to extend scope to something that feels uncomfortable. The teams I have seen grow fastest are the ones where the leader extends scope slightly beyond the team's comfort zone, monitors closely, learns from what happens, and extends again. Too cautious, and the team never reaches higher levels on the ambient spectrum. Too aggressive, and you get production incidents.

There is no formula. This is judgment.

Why This Is Harder Than What It Replaces

Most writing about leadership in the age of AI makes it sound like the new skills are lighter, more elevated, more strategic than the old ones. "Stop doing and start thinking." "Move from execution to strategy." The implication is that the ambient shift promotes leaders from hard work to better work.

That is wrong. Intent architecture is harder than orchestrating execution. It is more cognitively demanding, more uncomfortable, and more exposed.

Here is why.

When you orchestrate execution, you manage a process. You assign tasks. You check progress. You resolve blockers. You adjust timelines. The work is reactive. Someone brings you a problem, you deal with it. The feedback is immediate. You know whether the sprint is on track. You know whether the campaign is performing. You know whether the deliverable is going to be ready by September.

Orchestrating execution also has a hidden benefit: it allows you to defer hard questions about intent. If the product roadmap is vague, you can figure it out as you go. You assign the first few tasks, see what the team produces, adjust the direction based on what you learn. The act of doing teaches you what you want. Many leaders are not even aware that this is what they are doing, because the feedback loop between doing and wanting is so tight that it feels like planning.

Intent architecture strips that crutch away.

When you are designing intent for an ambient system, you have to commit to what you want before the doing starts. You have to say, explicitly and precisely, what success looks like. You have to name the constraints. You have to define the signals. You have to decide on scope. And you have to do all of this before you have the benefit of seeing the work in progress.

That means you have to surface assumptions you have previously left implicit.

Every organization is full of implicit assumptions. The product team assumes that "quality" means fast performance. The design team assumes it means visual polish. The customer success team assumes it means reliability. These assumptions have never been reconciled because they never had to be. The human team worked through the contradictions in real time, negotiating, compromising, and improvising their way to a result that approximately satisfied everyone.

An ambient system will not do that. It will take your definition of quality and optimize for it. If your definition is inconsistent, the output will be inconsistent. If your definition is incomplete, the output will fill in the gaps with its own defaults, which may not match yours.

Intent architecture forces you to resolve contradictions that the old workflow allowed you to leave unresolved.

It also forces you to accept responsibility in a way that orchestrating execution does not. When a human team produces bad work, responsibility is distributed. The designer made a bad choice. The developer introduced a bug. The project manager missed a deadline. When an ambient system produces bad work because the intent was wrong, the responsibility is concentrated. The intent architect specified the wrong thing. The system did what it was told.

This is uncomfortable. It is supposed to be.

One leader I spoke with, a VP of marketing at a mid-size SaaS company, put it this way: "I used to be able to blame the team for a bad campaign. Not their fault, exactly, but the problem was always in the execution. Now the execution is fine. If the campaign is wrong, it is because I gave the system the wrong brief. There is nowhere to hide."

That is not a small thing. It is a fundamental change in the accountability structure of leadership. And most leaders I have talked to are not ready for it.

There is a second reason intent architecture is harder, and it is more subtle.

Orchestrating execution is a convergent task. You have a goal. You have a team. You bring the two together and manage the process of getting from here to there. The path may be complex, but the direction is clear. You are always moving toward a defined outcome.

Intent architecture is a divergent task, at least in its early stages. Before you can define what the system should do, you have to explore the space of what it could do. You have to consider possibilities you have not considered before. You have to ask questions like: "If we removed all human involvement from this process, what would we need to be true?" And: "What is the worst thing the system could do if it interpreted our intent in a way we did not anticipate?" And: "Which of our stated values actually constrain our behavior, and which are just things we say?"

These are hard questions. They require a kind of thinking that most operational leaders have not had to practice. Strategic planners think this way, but strategic planners do not usually have to translate their thinking into specifications precise enough for a system to act on. Intent architecture requires both the divergent thinking of strategy and the convergent precision of engineering. It is a rare combination.

I am not saying this to discourage anyone. I am saying it because the leaders who will succeed at intent architecture need to know what they are signing up for. It is not a promotion. It is a different and harder type of work. The reward is that it scales in a way that orchestrating execution never could. But the difficulty is real, and pretending otherwise does not help.

The Organizations That Will Pull Away

Not all organizations will make this transition at the same speed. Some are structurally ready for intent architecture. Others are structurally locked out of it. The difference between the two is already visible, though it does not look like an AI gap. It looks like an ordinary operational gap.

That is what makes it dangerous.

The organizations that will master intent architecture early share a set of characteristics. I have seen these characteristics in the small number of organizations that are already operating at Point 3 or above on the ambient spectrum, and I have seen them missing in the larger number of organizations that are stuck at Point 1 or 2.

The first characteristic is that leadership can articulate what the organization actually does, in operational terms, without resorting to abstractions.

This sounds basic. It is not. Ask a random VP at a large company to describe, in concrete operational terms, what success looks like for their function this quarter. Not the OKR. Not the mission statement. The actual conditions that would have to be true for them to say "yes, that was a good quarter." Most will struggle. They will give you the KPI. They will tell you the revenue number or the NPS target. But when you push for the operational specification, the thing a system could act on, they run out of words.

Organizations where leaders can do this, where the habit of operational specificity already exists, are ready. The intent architecture skill is a natural extension of how they already think. Organizations where leaders traffic in abstractions, where "strategic alignment" is the answer to every question, will find intent architecture agonizing.

The second characteristic is a high tolerance for making implicit knowledge explicit.

The senior writer at the healthcare system, the one who spent three months annotating a hundred pieces of content, the engineering manager who wrote system intent documentation, the creative director who insisted on twenty-two-page brand voice guides. These are people who believe that what they know should be written down, specified, shared, and debugged. Not everyone believes this. Some leaders prize tacit knowledge. They value the unspoken understanding that develops among people who have worked together for a long time. They see explicit documentation as bureaucratic overhead.

In a human-only organization, tacit knowledge works. People fill in gaps. They read context. They adjust.

Ambient systems do not. They need the explicit version. Organizations that resist making knowledge explicit will be unable to build the intent artifacts that ambient operation requires. Their systems will operate on thin, vague specifications, and the output will be thin and vague.

The third characteristic is a willingness to change evaluation criteria.

The engineering manager in Chapter 4 rewrote his team's performance evaluation criteria because the old criteria measured the wrong things. Lines of code no longer mattered. Specification quality, review thoroughness, system coherence contribution: those were the new measures. He did this on his own, without direction from HR or senior leadership.

Most organizations will not do this. Performance evaluation criteria are sticky. They are tied to compensation, to promotion decisions, to professional identity. Changing them is politically expensive and organizationally disruptive. But if you continue to evaluate people on execution output when the work has shifted to intent specification, you will reward the wrong behavior. Your best intent architects will be invisible on the metrics that matter for their careers. They will leave, or they will stop doing the new work and revert to the old.

Organizations that can move their evaluation systems quickly will retain and develop intent architects. Organizations that cannot will lose them to the ones that can.

The fourth characteristic is tolerance for delayed payoff.

I said in Chapter 3 that the payoff from intent infrastructure is real but delayed. Building a twenty-two-page brand voice document takes months. Writing system intent documentation takes weeks per system. Developing constraint systems and signal designs takes iteration, failure, and learning. The benefits show up later, sometimes much later, and they show up as things that did not go wrong rather than things that visibly went right.

Organizations that can invest in slow-payoff work, because their leadership has patience, or because their competitive position gives them room, or because their culture values building for the future, will pull away. Organizations that need to show ROI in the current quarter on every AI investment will never build the intent infrastructure that makes ambient operation possible. They will keep buying tools, deploying them in the prompt-and-response mode described in Chapter 1, and hitting the same 15-20% productivity ceiling over and over.

Here is how the gap compounds.

In year one, the difference between an ambient-ready organization and an ambient-locked one looks marginal. Maybe a 10% productivity difference. Maybe slightly faster cycle times. Maybe somewhat better content consistency. Nothing that would alarm a competitor.

In year two, the ambient-ready organization has refined its intent architecture across multiple functions. Its constraint systems are mature. Its signal designs have been debugged through experience. Its people have developed the cognitive skills of intent specification. The ambient systems are operating at Point 3 or above across a growing number of workflows. Output quality is high and consistent. Human attention is focused on the things that actually require human judgment. The ambient-locked organization is still writing prompts.

In year three, the gap has become structural. The ambient-ready organization is doing things the ambient-locked one literally cannot do: producing at a speed and scale and consistency that is only possible when the intent infrastructure is in place. And the ambient-locked organization cannot close the gap quickly, because intent architecture is not something you can buy. It is something you build, slowly, by doing the difficult cognitive work of making your organization's intent explicit and precise.

This is the pattern of the factory and the electric motor all over again. The technology arrived at every factory at roughly the same time. Thirty years later, some factories had tripled their output and others had gone bankrupt. The difference was not the motor. The difference was whether the factory reorganized itself around what the motor made possible.

The organizations that will pull away are the ones that are reorganizing now. They may not call it intent architecture. They may not have read this book. But they are doing the work: writing the specifications, building the constraint systems, designing the signals, practicing the scope judgments. They are building the organizational muscle that the ambient era demands.

The ones that wait will find that the gap is wider than they thought and harder to close than they expected. Not because the technology is expensive. Because the competency is slow to develop. You cannot hire intent architecture off the shelf. You cannot acquire it through a vendor contract. You cannot install it in a quarter.

You have to build it, through practice, through failure, through the patient and unglamorous work of making what you know into something a system can act on. The organizations that start now will have an advantage that compounds with every month that passes. The ones that start later will be playing catch-up against competitors who have had years to practice and refine the skill.

The technology is the same for everyone. The intent is not.

That is where the next decade of competition will be decided. Not in who has the best AI. In who knows, with the most precision and honesty, what they actually want.

Chapter 9

The Three-Initiative Test

The Practice of Intent Architecture Begins with a Humbling Exercise

The Exercise

Stop reading for a moment. Get a piece of paper. Or open a blank document. Whatever you use when you think.

Now write down the three biggest initiatives you are currently responsible for. Not all of them. The three that matter most. The ones that, if they succeed, define your year.

Got them?

Good. Now, for each one, write the intent behind it. Not the plan. Not the timeline. Not the milestones or the budget or the team assignments. The intent. The actual desired state of the world if this initiative succeeds.

Write it in plain language. No jargon. No acronyms. No shorthand that only your team would understand.

And write it with enough precision and depth that an ambient system, one that has never attended your staff meeting, never read your Slack channels, never heard the conversation you had with your CEO last quarter, could act on it without asking you a single clarifying question.

Take as long as you need. I will wait.

I mean it. Do the exercise before you read on. The rest of this chapter will be twice as useful if you do.

If you are still here, you either did the exercise or you skipped it. If you skipped it, I understand. Most people do. But I am going to ask you to go back and do it anyway, because what happens next only works if you have experienced the difficulty firsthand.

Here is what I predict happened, if you tried.

Your first sentence came easy. Something like "We want to launch the new platform by Q3" or "We want to reduce customer churn in the mid-market segment." Clear enough. Feels like intent.

Then you tried to write the second sentence, the one that specifies what "launch" means or what "reduce churn" means operationally, and you slowed down. You started qualifying. You wrote something, deleted it, wrote something else. The words felt either too specific (committing you to something you were not sure about) or too vague (not really saying anything a system could act on).

By the third initiative, you were probably frustrated. Or you had unconsciously switched from writing intent to writing a plan. Dates crept in. Milestones crept in. You started describing what your team would do instead of what the world would look like when they were done.

Most senior leaders I have given this exercise to cannot complete it in under forty-five minutes. Many cannot complete it at all, not to the standard I described. And these are people running hundred-million-dollar business units, people with MBAs and decades of operational experience, people who would tell you without hesitation that they know exactly what their initiatives are supposed to achieve.

They do know. In their heads. In the accumulated context of dozens of conversations, hundreds of decisions, years of pattern recognition.

What they cannot do is write it down in a form that stands on its own.

That gap is the subject of this chapter. And closing it is the single most important practical skill this book has been building toward.

The Six Dimensions of a Complete Intent Document

What does it actually take to write intent that is complete enough for an ambient system to act on?

After watching leaders attempt this across dozens of organizations, and studying the intent artifacts that actually worked, I have identified six dimensions that a complete intent document needs to cover. Miss any one of them and you leave a gap the system will fill with its own defaults. Those defaults may be wrong.

Here are the six.

1. Desired Outcome State

This is the most obvious dimension and the one people get wrong most often. The desired outcome state is not a goal. It is not a metric. It is a description of what the world looks like when the initiative has succeeded.

A goal says: "Reduce churn by 15%."

A desired outcome state says: "Customers in the $50-100/month tier who complete the onboarding sequence in their first fourteen days remain active users at the six-month mark at a rate of 85% or higher, up from the current 70%. Active use is defined as logging in at least three times per week and using at least two of the product's five core features. The improvement comes primarily from changes to the onboarding experience, not from pricing changes, contract lock-ins, or account management interventions."

See the difference. The goal tells you what number to hit. The outcome state tells you what the world looks like when you hit it, how to recognize it, and which paths are acceptable.

When I read intent documents that work, the outcome state section reads almost like a scene description. You can picture the situation it describes. You can tell whether you are looking at it or not. If you cannot picture it, it is not specific enough.

2. Success Signals

Success signals are not the same as the desired outcome. They are the intermediate indicators that tell you, while the initiative is still in progress, whether it is tracking toward the outcome state or drifting away from it.

For the churn reduction example: the desired outcome state describes what the world looks like at the six-month mark. The success signals describe what you should see at week two, week four, week eight.

A strong set of success signals for this initiative might include: "Completion rate of the onboarding sequence increases from 45% to 65% within the first month of changes. Users who complete the sequence show a 20% higher login frequency in weeks 3-4 compared to those who do not. Support ticket volume related to setup and configuration decreases by at least 30%."

The signals should be things you can measure while the work is happening, not after it is done. They should be leading indicators, not lagging ones. And they should be specific enough that an ambient system can monitor them automatically and flag when something is off.

A signal like "customer satisfaction improves" is not a success signal. It is a wish. A signal like "NPS scores from customers in their first thirty days increase from 32 to 40 or above, as measured by the post-onboarding survey sent on day 15" is a success signal.
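To make the distinction concrete, here is a minimal sketch of what "specific enough that an ambient system can monitor them automatically" might mean in practice. The signal names, values, and thresholds are illustrative, loosely drawn from the churn example above; nothing here is a real monitoring system.

```python
# Illustrative sketch: success signals as machine-checkable thresholds.
# All names and numbers are hypothetical, based on the churn example above.

SUCCESS_SIGNALS = [
    # (signal name, current measured value, threshold, direction)
    ("onboarding_completion_rate", 0.58, 0.65, "at_least"),
    ("day15_nps", 41, 40, "at_least"),
    ("setup_ticket_volume_change", -0.34, -0.30, "at_most"),
]

def check_signals(signals):
    """Return the signals that are off track, so a system can flag them."""
    off_track = []
    for name, value, threshold, direction in signals:
        ok = value >= threshold if direction == "at_least" else value <= threshold
        if not ok:
            off_track.append(name)
    return off_track

print(check_signals(SUCCESS_SIGNALS))  # → ['onboarding_completion_rate']
```

The point of the sketch is not the code. It is that every signal in it had to be named, quantified, and given a direction before a single line could be written. A wish like "customer satisfaction improves" cannot be expressed in this form at all.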

3. Explicit Constraints

Constraints define what the initiative must not do, even if doing it would move the numbers in the right direction.

This is where most leaders balk. Writing down what you will not do feels like limiting your options. It is. That's the point. An ambient system that is not constrained will find the most efficient path to the outcome, and the most efficient path often violates assumptions you have never articulated because you assumed everyone shared them.

For the churn reduction example, constraints might include: "Do not change pricing or contract terms. Do not add friction to the cancellation process. Do not send more than two emails per week during the onboarding period. Do not collect additional personal data beyond what the current signup flow requests. Do not make changes to the core product features themselves; only the onboarding experience is in scope."

Each of those constraints closes a door. Each door, if left open, is a door the system might walk through. The constraint against adding cancellation friction is the kind of thing every leader in the company would agree with if you asked them. But if you do not write it down, you are trusting the system to share your instincts about what is and is not acceptable. It does not have instincts. It has instructions.

I think of constraint articulation as the ethical backbone of the intent document. It's where your values stop being aspirational and start being operational.

4. Contextual Anchors

Contextual anchors are the facts about the current state of the world that the system needs to know in order to act appropriately. They are the background information that you carry around in your head and never think to mention because it is obvious to you.

For the churn example: "The current onboarding sequence was designed eighteen months ago for a product that had three core features. The product now has five. The sequence has not been updated. The mid-market segment ($50-100/month) represents 40% of total revenue but 60% of churn volume. The company is planning a pricing restructure in Q4, which means any changes to the onboarding experience need to be compatible with a potential tier consolidation. The primary competitor launched a free tier last month, which may be affecting new customer expectations about time-to-value."

None of that is intent. All of it shapes how the intent should be pursued. A system that does not know about the upcoming pricing restructure might spend three weeks building onboarding flows that reference tier names that will not exist in four months. A system that does not know about the competitor's free tier might optimize for a time-to-value benchmark that is no longer competitive.

Contextual anchors are the most perishable dimension. They change constantly. The competitor does something new. The market shifts. Internal plans change. Maintaining contextual anchors requires a discipline of regular review that most organizations do not currently have, because in the prompt-and-response world, you provide context in every conversation. In the ambient world, the context has to be maintained as a standing document that the system can reference at any time. And the cost of stale contextual anchors is not just inaccuracy. It is wasted work, sometimes weeks of it, that the system executed confidently in a direction that is no longer valid. By the time you catch it, the damage is done and the rework is expensive.

5. Scope of Autonomous Action

This is where you tell the system how much latitude it has.

Not all parts of an initiative deserve the same level of autonomy. Some actions are low-risk and easily reversible: the system can take them without checking in. Others are high-stakes and hard to undo: the system should propose them but wait for approval.

For the churn example: "The system may modify the copy, sequencing, and timing of onboarding emails without human approval, provided the modifications stay within the defined constraints. The system may redesign the in-product onboarding walkthrough, but proposed designs must be reviewed by a product designer before going live. The system may not change any onboarding step that involves payment information or account permissions without explicit approval from the VP of Product."

This dimension is the practical translation of the scope authority skill I described in the previous chapter. It is where you decide how much of the dial to turn, and for which types of action. Writing it down forces you to think through the risk profile of each part of the initiative, which is something most leaders do intuitively when managing a human team but rarely make explicit.
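The scope statement above can be read as a policy table: each category of action maps to a level of latitude, and anything unlisted defaults to the most restrictive level. The sketch below is one hypothetical encoding; the category names and levels are invented for illustration.

```python
# Illustrative sketch: scope of autonomous action as an explicit policy table.
# Action categories and levels are hypothetical, based on the example above.

AUTONOMY_POLICY = {
    "email_copy_change":    "act",                # low risk, easily reversible
    "walkthrough_redesign": "propose_then_wait",  # designer review required
    "payment_step_change":  "require_approval",   # VP of Product sign-off
}

def latitude(action_type):
    # Default to the most restrictive level for anything not listed.
    return AUTONOMY_POLICY.get(action_type, "require_approval")

print(latitude("email_copy_change"))  # → act
print(latitude("schema_migration"))   # → require_approval
```

The design choice worth noticing is the default: an action type nobody thought to classify gets the least autonomy, not the most.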

6. Escalation Criteria

Escalation criteria define the conditions under which the system should stop acting and ask for human guidance.

They are the trip wires. The "if you see this, pause and come find me" instructions.

Good escalation criteria are specific and measurable. "Escalate if something seems off" is not an escalation criterion. "Escalate if the onboarding completion rate drops below 35% for any seven-day period, if support ticket volume increases by more than 50% from the pre-change baseline, or if any A/B test shows a statistically significant negative result at the 95% confidence level" is an escalation criterion.

Escalation criteria are the safety net. They are what makes it possible to grant autonomy with confidence, because you know the system will pull you back in when something goes wrong in exactly the way you defined as unacceptable.

The production incident in Chapter 4, the data-consistency bug under high concurrency, happened in part because the team had not defined escalation criteria for that type of failure. After the incident, the engineering manager added concurrency stress tests as a mandatory check and defined specific performance thresholds that would trigger escalation. The escalation criterion was not complicated. It just had not existed.

Take a look at what you wrote earlier, if you did the exercise. How many of these six dimensions did your intent documents cover?

In my experience, most leaders cover the first dimension partially and the other five not at all. Some cover the first and second. Almost nobody covers the third, fourth, fifth, and sixth on their first attempt.

That is not a judgment. That is data. And it is the most useful data this chapter will give you.

Where Senior Leaders Break Down

I have watched several hundred leaders attempt some version of this exercise over the past two years. The failure modes are remarkably consistent. There are four, and they show up in roughly the same order, regardless of industry, seniority, or function.

Strategic Vagueness

This is the most common failure mode and usually the first one to appear. The leader writes something that sounds like intent but is actually a direction. "We want to become the leading platform for mid-market collaboration." "We want to build a world-class customer experience." "We want to accelerate our data strategy."

These are not intent documents. They are bumper stickers. They tell you roughly which way to face but nothing about how far to walk, what the destination looks like, or how to recognize it when you arrive.

The thing is, strategic vagueness is not laziness. It is a learned behavior. Most leaders have spent their careers communicating in exactly this register. In a human organization, vague strategic direction is perfectly functional. You say "world-class customer experience" in a meeting, and your team fills in the meaning based on shared context, past conversations, the culture of the organization, their own expertise. The vagueness is a feature, not a bug, because it gives the team room to exercise judgment.

Ambient systems do not have shared context. They do not fill in meaning from past conversations. They take the words you give them and act on those words. "World-class customer experience" means nothing to a system. It cannot generate a single action from that phrase. It needs to know what "world-class" looks like in operational terms, for which customers, measured how, within what constraints, at what cost.

Strategic vagueness is the prompt paradigm in miniature. You give the system a vague input and then spend your time correcting the outputs. The whole point of intent architecture is to move the precision upstream, so the outputs require less correction.

Milestone Substitution

The leader starts trying to write the desired outcome state and, within a sentence or two, slides into writing a plan. "We will launch Phase 1 by March, onboard the first ten enterprise customers by June, and have a self-serve signup flow by September."

That is a plan. A good one, maybe. But it is a sequence of actions, not a description of the desired state of the world.

Milestone substitution is sneaky because milestones are specific. They have dates and numbers and deliverables. They feel precise. But they describe the journey, not the destination. An ambient system given milestones will dutifully work toward hitting them, but it will have no way to tell whether hitting them is actually producing the intended outcome. If the milestones were wrong, if Phase 1 should have been scoped differently, if the first ten enterprise customers should have been in a different segment, the system has no way to know. You never told it what success looks like beyond the milestones.

I have seen this pattern so often that I think of it as the default mode of senior leadership communication. Leaders are trained to manage through milestones. They break big goals into smaller deliverables, track progress against timelines, and adjust when things fall behind. All of that is good management. None of it is intent architecture. Intent architecture starts where milestones end: with the question "if we hit every milestone perfectly, what is actually true about the world that was not true before?"

Constraint Avoidance

When I ask leaders to write down what their initiative must not do, I get one of two responses. Either they say "there are no constraints, we should be open to anything that works," or they write constraints so broad they are meaningless: "stay within budget," "comply with legal requirements."

The first response is always wrong. Every initiative has constraints. The leader just has not thought about them. In a human-led process, the constraints are enforced through culture, judgment, and conversation. Nobody on the marketing team would propose running ads that are misleading, not because there is a written constraint against it, but because the culture would not tolerate it and someone in the room would say "we can't do that." An ambient system is not in the room.

The second response, the too-broad constraint, is just vagueness wearing a different costume. "Stay within budget" is not operationally useful because the system needs to know what the budget is, what types of expenditures count against it, and whether there are sub-budgets for different categories of spend.

I will say something that might sound extreme: an organization that cannot articulate its constraints cannot articulate its values. Not in any form a system can respect. Constraint avoidance is the failure mode that keeps me up at night, because the damage it produces is the hardest to detect and the hardest to reverse.

Signal Blindness

Ask a leader "how will you know at the halfway point whether this initiative is on track?" and they give you the outcome metric. "We will know because churn will be down." That is the destination, not a signal. If churn is your only indicator, you will not know you are off course until you arrive somewhere else.

Signal blindness is the most technical failure mode, and fixing it usually requires the leader to work with someone in data or analytics who can identify the leading indicators that predict the lagging outcome. The leader knows what success feels like. The analyst knows how to measure the things that predict it. The intent document needs both.

Here is the important thing about these four failure modes: none of them are personal deficiencies.

Every one of them is a rational adaptation to the prompt paradigm. In a world where you manage through human conversation, strategic vagueness works. Milestone substitution works. Constraints can stay implicit. Signals can be informal. The prompt paradigm, whether you are prompting a human team or prompting an AI through a chat window, tolerates imprecision because the receiver of the prompt fills in the gaps.

The ambient shift removes the gap-filling. And when the gap-filling disappears, the gaps become visible.

That is what the exercise reveals. Not that you are bad at strategy. That you have never had to be this precise about it. No one has. The exercise is not a test of your current ability. It is a measure of a skill that almost nobody has developed, because until now, nobody needed it.

The Breakdown Is the Data

So you tried the exercise. You hit the wall. Your intent documents are incomplete, vague in some dimensions, missing others entirely.

Good.

I mean that without a trace of sarcasm. The places where your intent articulation broke down are the most valuable output of this entire exercise. More valuable than any finished intent document you could produce.

Here is why.

Every gap in your intent document, every place where you could not be specific, every constraint you could not articulate, every signal you could not define, marks a spot where a strategic assumption has been left implicit. And implicit assumptions are invisible. They are invisible to you, because you have internalized them so thoroughly that they feel like obvious truths rather than choices. They are invisible to your team, because different people have internalized different versions of the same assumption without ever comparing notes. And they are invisible to any system working on your behalf, because the system has no access to what you have not said.

Say you are running a product initiative and you tried to write the desired outcome state. You wrote: "The new dashboard gives enterprise customers real-time visibility into their usage patterns." And then you got stuck. What does "real-time" mean? Within one second? One minute? One hour? What counts as a "usage pattern"? Login frequency? Feature adoption? API call volume? All of those? What does "visibility" mean? A chart? A table? An alert? A scheduled report?

Each of those questions represents an assumption you have not made explicit. And here is the thing: you probably have answers to all of them. If someone on your team walked into your office and asked "what do we mean by real-time?" you would say "within five minutes for most metrics, within thirty seconds for anything billing-related." You know that. You just have not written it down. And because you have not written it down, different members of your team may be operating with different assumptions. Your engineering lead might be building for sub-second latency because that is what "real-time" means in her world. Your designer might be building for daily summaries because that is what the customer research suggested. Both are working hard. Both are wrong, in different directions, about what you actually want.

That gap, the gap between what you know and what you have specified, is the gap the ambient shift makes impossible to ignore. In the old workflow, you would discover the mismatch during a sprint review, course-correct, lose a week or two, and move on. Annoying but manageable. In an ambient workflow, the system has been building toward the wrong specification for days or weeks before anyone notices. The cost of the implicit assumption is much higher.

When leaders do this exercise in workshops, I ask them to mark every place they got stuck with a red flag, literal or metaphorical. Then I ask them to look at the pattern of flags. Where are the clusters? Which dimensions have the most gaps? Which initiatives are hardest to specify?

The patterns are revealing. A leader with flags concentrated in the constraint dimension is someone who has not thought carefully about what the organization will not do. That is a values conversation waiting to happen. A leader with flags concentrated in the success signals dimension is someone who does not have a clear causal model for how their initiative produces results. That is a strategy conversation waiting to happen. A leader with flags concentrated in contextual anchors is someone who is holding critical information in their own head that the rest of the organization does not have access to. That is a communication problem waiting to happen.

None of these are AI problems. They are leadership problems that have always existed but have been invisible because the human process papered over them. The ambient shift strips the paper away.

One VP of engineering I worked with put it bluntly: "I thought I was writing an intent document for the AI. I was actually writing a strategy document for myself. I figured out more about what I actually think we should do in two hours of this exercise than in a month of strategic planning sessions."

He was not being hyperbolic. The exercise works because it forces a different kind of thinking. Strategic planning sessions tend to operate at the level of comfortable ambiguity, which I talked about in the previous chapter. Everyone agrees on the general direction. Nobody pushes hard enough on the specifics to surface the disagreements. The intent articulation exercise, because it demands machine-level precision, pushes past the comfort zone. It makes you commit. And in committing, you discover what you actually believe.

So do not be discouraged by the gaps. Be curious about them. Each one is telling you something you need to know.

Building the Capability Across a Team

The solo exercise makes your assumptions visible to yourself. The team version makes them visible to everyone who is acting on them.

Here is why: your blind spots are not the same as your colleagues' blind spots. The assumptions you have left implicit are different from the assumptions your VP of Marketing has left implicit. When you bring multiple intent documents into the same room and compare them, the contradictions and gaps between them become visible. And those contradictions, the places where two leaders have different assumptions about what the same initiative is trying to achieve, are the most dangerous gaps of all, because they produce work that pulls in different directions without anyone realizing it.

I have a protocol for doing this. It has been refined through about thirty facilitation sessions over the past eighteen months. It is not complicated. It does not require a consultant. But it does require a facilitator, someone in the room whose job is to keep the group honest, and it requires a willingness to be uncomfortable.

Individual Preparation

Each member of your leadership team does the three-initiative exercise independently. They write intent documents for their three biggest initiatives, covering all six dimensions as completely as they can. They work alone. No collaboration. No comparing notes.

Give people at least a week. This is not a thirty-minute homework assignment. If you give people less than a few days, they will write what they think they should write rather than what they actually think. The time pressure creates surface-level responses. You want the responses that come after a leader has sat with the document, come back to it the next morning, and realized the third paragraph was wrong.

The Reveal Session

Bring the team together. Two hours minimum. Three is better.

Have each leader read their intent document for one initiative out loud. Not summarize. Read. The exact words.

Reading intent out loud, in front of peers, exposes vagueness that looks fine on a screen. A phrase like "drive meaningful engagement" sounds reasonable when you write it at your desk. It sounds hollow when you read it to six people who are listening carefully. The room creates accountability that solitary writing does not.

After each reading, the other leaders in the room have one job: identify the places where they would need to ask a clarifying question if they were the system acting on this intent. Not critiquing the strategy. Not debating the goal. Just flagging the gaps.

"What do you mean by engagement? What counts?"

"You said 'within budget' but did not specify the budget."

"You described the outcome for enterprise customers but this initiative also affects our mid-market segment. Is that intentional?"

Each question gets written down. No responses from the author in the moment. The author just listens and records the questions. This is the hardest part of facilitation: preventing the author from defending or explaining. The point is not to have a conversation about the initiative. The point is to collect the evidence of where the intent is not clear enough to stand on its own.

Cross-Initiative Comparison

After each leader has presented one initiative and collected questions, the facilitator looks for overlaps and contradictions between different leaders' intent documents.

I have never run a session where there were not at least three significant contradictions. Two leaders define the same customer segment differently. One leader's initiative depends on a constraint that another leader's initiative explicitly plans to change. One leader assumes a resource allocation that conflicts with another leader's success signals.

These contradictions are not failures of communication. They are the natural result of leaders optimizing independently, each with their own context, their own conversations with the CEO, their own understanding of the strategy. In a human-managed organization, these contradictions get resolved through ad-hoc coordination: a meeting here, a hallway conversation there. They rarely surface all at once.

Surfacing them all at once is the point. An ambient system acting on contradictory intent from two different leaders will not resolve the contradiction. It will act on whichever intent it encounters first, or it will produce incoherent output that reflects both intents simultaneously.

The facilitator's job in this phase is to map the contradictions without trying to resolve them immediately. Write them on a whiteboard. Let the team see the pattern. Let them sit with the fact that their organization's stated strategy contains internal conflicts that nobody had noticed.

Gap Prioritization

Not all gaps are equally urgent. A gap in the escalation criteria for a low-risk initiative is less pressing than a contradiction in the desired outcome state for the company's flagship product launch.

After the contradictions and gaps have been mapped, the team prioritizes. Which gaps, if left unresolved, create the most risk? Which contradictions, if left unaddressed, will produce the worst outcomes when ambient systems start acting on them?

This prioritization is a leadership conversation, not a technical one. It requires judgment about risk, about organizational values, about competitive dynamics. And it often surfaces strategic disagreements that the team has been avoiding.

I watched one leadership team spend forty minutes on a gap that turned out to be a fundamental disagreement about whether the company was a platform or a product. Both definitions had been floating around the organization for months. Both were embedded in different leaders' intent documents. Neither had been confronted directly. The intent exercise forced it into the open.

They did not resolve it in that session. That is fine. The point of the exercise is to find the gaps, not to fill all of them on day one.

Calibration Practice

After the initial sessions, you need a regular rhythm for calibrating intent documents against each other and against reality. Monthly. Sixty to ninety minutes. One or two leaders bring an updated intent document for one initiative. The team reviews it against the six dimensions, flags new gaps, and checks for consistency with other leaders' intent documents.

Over time, two things happen.

First, individual leaders get better at writing intent. The six dimensions become a habit. The vagueness decreases. The constraints get sharper. The signals get more specific. I have watched leaders go from producing two paragraphs of comfortable ambiguity in their first session to producing three-page intent documents that could actually direct an ambient system by their fourth or fifth session. The improvement is not linear. The second session is usually worse than the first, because the leader is now aware of how much precision is required and becomes paralyzed by it. The third session is better. By the fifth, the skill is internalized.

Second, the team develops a shared language for talking about intent. They stop saying "let's align on the strategy" and start saying "your success signals contradict my constraints, and we need to resolve that." They stop debating plans and start debating specifications. The conversations get shorter and more productive, because the intent documents do most of the work that previously happened through lengthy discussion.

From Documents to Infrastructure

At some point, individual intent documents need to become organizational intent infrastructure. This is the step where most teams stall, because it requires moving from a practice to a system.

Intent infrastructure means the intent documents are stored somewhere accessible, maintained on a schedule, and connected to each other. It means there is a process for updating contextual anchors when conditions change. It means there is a way to check new initiatives against existing intent documents for contradictions. It means someone, probably not a single person but a rotating responsibility, owns the health of the intent system.

This does not need to be fancy. I have seen it work with a shared folder and a spreadsheet that tracks which documents have been updated in the last thirty days. I have seen it work with a wiki that has a simple tagging system. The technology is the least important part. The discipline is what matters.
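The spreadsheet pattern above can be reduced to a few lines of code. This is a minimal sketch for illustration, not a prescribed tool; the document names, the registry shape, and the thirty-day window are assumptions drawn from the example.

```python
from datetime import datetime, timedelta

def stale_documents(last_updated, now, max_age_days=30):
    """Return intent documents not touched within the allowed window.

    `last_updated` maps document name -> datetime of its last edit.
    The structure is a stand-in for the shared folder and spreadsheet.
    """
    cutoff = now - timedelta(days=max_age_days)
    return sorted(name for name, updated in last_updated.items()
                  if updated < cutoff)

# Hypothetical registry entries, for illustration only.
registry = {
    "growth-initiative.md": datetime(2026, 1, 5),
    "pricing-constraints.md": datetime(2026, 3, 1),
}
print(stale_documents(registry, now=datetime(2026, 3, 10)))
# -> ['growth-initiative.md']
```

The point of the sketch is the same as the point of the spreadsheet: the mechanism is trivial, which is exactly why the discipline, not the technology, is the hard part.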

The engineering manager from Chapter 4 built a version of this for his team. His system intent documentation was exactly this: a set of persistent, maintained intent artifacts that the agentic coding system could reference. He did not call it intent infrastructure. He called it "stuff we write down so the agent does not break things." The label does not matter. The practice does.

Here is what the protocol gives you that individual practice does not.

Individual intent articulation makes your own assumptions visible to yourself. Team calibration makes everyone's assumptions visible to everyone. And organizational intent infrastructure makes those assumptions visible to the systems that act on them.

Each layer matters. Skip the first, and you are building on vagueness. Skip the second, and you have precise but contradictory intent across the organization. Skip the third, and you have good intent documents sitting in desk drawers, disconnected from the systems that need them.
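A mechanical first pass at catching cross-document contradictions might look like the sketch below. The document structure here is an assumption made for illustration: each document declares which parts of the business it assumes will stay stable and which it plans to change, and the check flags any overlap between the two across documents.

```python
def cross_document_contradictions(docs):
    """Flag pairs where one document assumes a surface stays stable
    while another document plans to change that same surface.

    Each doc is a dict with "owner", "assumes_stable", and
    "plans_to_change"; this schema is illustrative, not a standard.
    """
    flags = []
    for a in docs:
        for b in docs:
            if a is b:
                continue
            # Overlap between what A relies on and what B will change.
            for surface in sorted(a["assumes_stable"] & b["plans_to_change"]):
                flags.append((a["owner"], b["owner"], surface))
    return flags

# Hypothetical example: a growth initiative assumes the onboarding flow
# is unchanged, while the product team plans to redesign it.
docs = [
    {"owner": "growth", "assumes_stable": {"onboarding_flow"},
     "plans_to_change": {"pricing_page"}},
    {"owner": "product", "assumes_stable": set(),
     "plans_to_change": {"onboarding_flow"}},
]
print(cross_document_contradictions(docs))
# -> [('growth', 'product', 'onboarding_flow')]
```

Real contradictions are rarely this tidy, which is why the work ultimately belongs to a person reading the documents. But even a crude check like this one would surface the obvious collisions before a system acts on them.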

The teams I have seen do this well start small. One leadership team. Three initiatives each. One reveal session. One monthly calibration. They do not try to build organizational infrastructure on day one. They build the muscle first. The infrastructure follows when the team realizes, usually after the third or fourth session, that they keep losing their work, that the documents are getting out of date, and that someone needs to own the system. That realization, when it comes from the team rather than being imposed from above, creates infrastructure that actually gets maintained.

I want to close with the thing I should have said at the beginning.

This exercise is humbling. I have done it myself. The first time I tried to write intent for my three biggest priorities with the precision this chapter demands, I failed at least as badly as any leader I have coached through it. I had constraints I had never articulated. I had success signals I could not define. I had two priorities whose desired outcome states, when I finally wrote them honestly, contradicted each other.

That was the most productive afternoon I had all quarter. Not because I fixed the problems. I did not, not that day. But because I could see them. And once you can see them, you can stop being managed by them.

The whole argument of this book comes down to something simple. The technology is waiting. It is waiting for you to tell it, with precision and honesty, what you actually want. The gap between what AI can do and what AI is doing in your organization is not a technology gap. It is a self-knowledge gap. Every vague intent, every implicit assumption, every unwritten constraint is a ceiling you have placed on your own systems without realizing it.

The three-initiative exercise does not produce perfect intent documents. It produces the map of what you do not yet know about your own strategy. That map is the starting point for everything.

Do the exercise. Find the gaps. Sit with the discomfort. Then, one dimension at a time, start closing them.

The organizations that win the next decade will not be the ones with the best AI. They will be the ones that knew, most precisely, what to tell it.

Chapter 10

The Ambient Organization

What Changes When Intent Architecture Is the Operating System

From Competency to Operating System

You have spent the last nine chapters building a skill. Now the question changes.

Individual leaders who can write precise intent documents, who can calibrate signals, who can set autonomy boundaries for ambient systems, are valuable. But an organization full of individually skilled intent architects, each producing intent documents that sit in their own folders, is not an ambient organization. It is a collection of people who are good at something the organization has not decided to require.

The transition point is structural, not personal. You can see it clearly when it happens, because three things change at once.

First, intent documents stop being optional artifacts and become the required input for any initiative that receives funding. Not a project plan. Not a roadmap. Not a slide deck with arrows pointing right. The intent document is the thing the budget committee reads. If it does not exist, or if it is vague in the six dimensions from the previous chapter, the initiative does not get approved. This is a policy change, not a culture change. Someone with authority has to make it.

Second, intent quality becomes a reviewable output. Performance reviews for senior leaders include an assessment of the precision, completeness, and consistency of the intent documents they have produced. This sounds bureaucratic. It is. Bureaucracy is how organizations make things real. If intent architecture is a nice-to-have that leaders do when they feel like it, it stays a personal skill. If it is something your boss evaluates you on, it becomes organizational practice.

Third, and this is the one most people miss: the systems themselves are configured to refuse action without sufficient intent. The ambient growth system from Chapter 5 ran for eleven days on inferred intent because nobody had told it not to. In an ambient-native organization, the system is built to pause and request clarification when the intent it is operating on falls below a quality threshold. The system becomes the enforcement mechanism. Not just the tool.
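A system "built to pause" can implement the quality threshold as simply as a completeness check over the six dimensions. The sketch below illustrates the shape of that gate; the field names and the all-or-nothing rule are assumptions for illustration, not a specification from this book.

```python
# Dimension names follow the six dimensions discussed in the previous
# chapter; the dictionary keys and the pause rule are illustrative.
REQUIRED_DIMENSIONS = (
    "outcome_state", "success_signals", "constraints",
    "contextual_anchors", "autonomy_scope", "escalation_criteria",
)

def intent_gaps(intent_doc):
    """Return the dimensions missing or empty in an intent document."""
    return [d for d in REQUIRED_DIMENSIONS if not intent_doc.get(d)]

def act_or_pause(intent_doc, proposed_action):
    """Refuse autonomous action unless every dimension is specified.

    The system becomes the enforcement mechanism: insufficient intent
    produces a pause and a clarification request, not an inferred plan.
    """
    gaps = intent_gaps(intent_doc)
    if gaps:
        return {"status": "paused", "clarification_needed": gaps}
    return {"status": "acting", "action": proposed_action}

# A hypothetical half-written intent document.
draft = {"outcome_state": "expand mid-market revenue",
         "success_signals": ["net revenue retention"]}
print(act_or_pause(draft, "send expansion outreach"))
# -> {'status': 'paused', 'clarification_needed': ['constraints',
#     'contextual_anchors', 'autonomy_scope', 'escalation_criteria']}
```

The growth system from Chapter 5 had no such gate, which is why it ran for eleven days on inferred intent. The gate does not make the intent good; it only refuses to substitute inference for the parts that are missing.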

I have seen this pattern complete itself in exactly two organizations. That number is low. It is supposed to be low. We are early. Both are mid-size technology companies, both under a thousand employees. Neither announced a transformation initiative. Neither hired a consultant to design the new operating model. What happened in both cases was that a specific leader, in one case the COO and in the other the head of product, got tired of cleaning up the messes created by ambient systems running on vague intent. They started requiring intent documents for their own teams. The practice worked well enough that other leaders copied it. After about nine months, the CEO made it official.

The pattern reminds me of something I read about the factory electrification era. The plants that reorganized earliest around electric motors were not the biggest or the best-funded. They were the ones where a single floor manager got frustrated enough with the old layout to rearrange his own section. Other floor managers saw it working and copied it. The front office formalized the new layout only after it had already spread organically across most of the plant. The structural shift started in the middle, not at the top.

That is what I am seeing now. The organizational transition happens when enough mid-level leaders are doing it that formalizing the practice is just acknowledging reality.

Most organizations are not there yet. Most organizations are still in the phase where individual leaders are doing the three-initiative exercise from the previous chapter, getting better at intent articulation, and wondering why nobody else seems to care. If that is where you are, keep going. The structural transition requires a critical mass of people who have the skill before the organization has a reason to build the infrastructure.

The New Roles Nobody Has Hired For Yet

Here is a job posting that does not exist yet:

Intent Architect, Enterprise Operations. Responsible for maintaining the organization's library of intent documents across all active initiatives. Works with initiative owners to ensure intent documents meet quality standards across all six dimensions. Identifies contradictions between intent documents from different business units. Flags stale contextual anchors. Reports to the Chief Operating Officer.

That job does not exist because the category does not exist. But the work is already being done, poorly and part-time, by people who have other titles.

In one of the two organizations I mentioned above, the person doing this work is a former project manager who noticed that most project failures traced back to vague or contradictory intent specifications. She started reviewing intent documents before initiative kickoffs, checking for gaps in constraint articulation and contradictions with other active initiatives. Her official title is still Senior Program Manager. Her actual work is about 70% intent architecture.

This is the clearest of the emerging roles, and the one I can describe with the most confidence. This person does not own strategy. They do not decide what the organization should do. They own the quality and coherence of how the organization's strategic decisions are expressed. They are an editor, not an author. They sit in a staff function, usually attached to operations or to the office of the CEO, and their authority comes from the ability to send an intent document back to a leader with the note "the system cannot act on this, and here is why." The senior program manager I mentioned has sent that note eleven times in the past six months. Nine of the eleven resulted in intent documents that were materially rewritten before launch. The other two resulted in arguments that surfaced genuine strategic disagreements the leadership team had been avoiding.

The second role taking shape is what I have been calling a signal calibrator. This person monitors the feedback loops between ambient systems and the intent documents that direct them. When the success signals defined in an intent document stop correlating with the desired outcome, the signal calibrator notices. When an ambient system's actions begin drifting from the intent document because the contextual anchors are stale, the signal calibrator flags it. The CRO in Chapter 5 described this work as "tuning a radio you can never turn off." He was doing it himself, personally, in addition to running the commercial organization. That is not sustainable.

This role is the least settled of the three. The organizations I have seen trying to staff it cannot agree on whether it belongs in data, in strategy, or in the COO's office. That confusion is itself a data point. The work cuts across every function, which means every function claims it and no function wants to fund it.

The third role is the ambient audit lead. This person reviews what ambient systems actually did, after the fact, and compares it to what they were supposed to do. Think of it as internal audit for AI-driven operations. The marketing team in Chapter 5 reviewed the activity log on day twelve and found the system had made reasonable decisions. An ambient audit lead would have been reviewing those logs on day one, day two, day three, checking each decision against the intent document, flagging the moment the system's inferred priorities began to diverge from the organization's stated ones.

Where do these roles sit? That question is more revealing than it sounds.

In a traditional organization, you would stick them in IT, or in a new "AI Center of Excellence," or in some other functional silo. That is the wrong answer, for the same reason the Revenue Operations Council from Chapter 5 failed after four months: because these roles are cross-functional by nature. An intent architect who sits in IT will never have the strategic context to evaluate whether an intent document's desired outcome state is actually what the business needs. A signal calibrator who reports to the CMO will calibrate commercial signals but miss the contradictions with product or engineering intent.

The organizations that are getting this right, the very few of them, place these roles in the office of the COO or in a small team that reports directly to the CEO. The reporting line matters because it determines whose agenda takes priority when intent documents from different leaders contradict each other.

One thing I want to be clear about: these are not AI roles. They are leadership infrastructure roles. The intent architect does not need to know how a large language model works. They need to know how to read an intent document and spot the gaps. The signal calibrator does not need to build dashboards. They need to know what the numbers should look like and notice when they do not.

Reimagining Planning, Governance, and Accountability

What does a planning cycle look like when the primary deliverable is an intent document rather than a project plan?

Shorter, for one thing.

Annual planning in most organizations is a three-to-four-month exercise that produces a set of project plans, budgets, and OKRs. The plans describe what the organization will do. The budgets describe what the organization will spend. The OKRs describe what the organization hopes will happen as a result. Then the organization spends the year executing the plans, tracking the budgets, and discovering in Q3 that the OKRs were either wrong or unmeasurable.

In an ambient-native organization, the planning cycle produces intent documents. The intent documents describe the desired outcome states, the success signals, the constraints, the contextual anchors, the scope of autonomous action, and the escalation criteria for each initiative. These documents are the primary output. Project plans, to the extent they exist, are generated downstream, often by the ambient systems themselves, as implementation paths toward the intent.

This changes the planning conversation. Instead of debating timelines and resource allocations, the leadership team debates outcome states and constraints. Instead of asking "can we get this done by September?" they ask "is this intent document precise enough that a system can pursue it?" Instead of negotiating headcount, they negotiate autonomy boundaries.

Board-level AI oversight in this world looks different from what most boards are doing today. A board reviewing project plans asks: are we on schedule, are we on budget, are we hitting milestones? A board reviewing intent documents asks whether the desired outcome states are defined with enough precision that systems can act on them, whether constraints are appropriate, and whether escalation criteria are tight enough to prevent unsupervised drift.

The mechanism that matters is timing. A board that reviews intent quality catches strategic vagueness before it enters the operational system. A board that reviews milestones catches it after. Consider the growth system from Chapter 5. If someone had reviewed that system's intent document before launch, the eleven-day drift would have been caught in the first hour of the planning cycle. The constraint against unsupervised commercial outreach to at-risk accounts would have been written down before the system ever sent an email. The cost of that drift, in customer relationships that the head of customer success later identified as damaged, in repositioning work, in the six months of higher churn among aggressively expanded accounts, was orders of magnitude larger than the cost of a one-hour intent review. Most boards are still asking "how is our AI adoption going?" That is approximately as useful as asking "how is our electricity adoption going?" in 1925.

Accountability changes too. In a traditional organization, a leader is accountable for execution. Did you hit the plan? Did you deliver the project? Did you make the number? In an ambient organization, a leader is accountable for intent quality. Did you specify the outcome clearly enough that the system could pursue it? When the system drifted, was it because the intent was vague, or because the system made a bad inference? If the intent was vague, that is a leadership failure, not a technology failure.

This reframing feels uncomfortable to leaders who have built careers on execution excellence. Execution still matters. But execution is increasingly what the systems do. The leader's contribution is upstream: the precision and coherence of the intent that directs the execution. Being accountable for intent quality rather than execution quality is not a demotion. It is a recognition that the bottleneck has moved. The factory floor runs itself. The question is whether the blueprints are any good.

Governance in this world also requires something most organizations have never built: an ambient feedback loop. The systems need a way to surface intent gaps in real time. Not a quarterly review. Not an annual audit. A continuous signal that says "I am acting on intent document #47, and here is what I cannot determine from the specification."

The engineering manager in Chapter 4 stumbled into a version of this. His agentic coding system would flag task specifications that were ambiguous, asking for clarification before proceeding. He called those flags "annoying at first and then indispensable." The system was surfacing his intent gaps in real time. The production incident in month five happened on a task where the specification did not anticipate high-concurrency conditions. After the incident, he added concurrency behavior to his standard specification template. The system taught him what he had failed to specify.

Scale that pattern across an organization and you get something new: a governance model where the systems themselves participate in intent quality assurance. Not by making decisions about strategy, but by reporting, continuously and specifically, on the places where the strategy is not clear enough for them to act.

The Monday Morning That Is Actually Different

You are in the same seat. Same desk, same coffee, same Monday morning.

But it is different now.

You open your laptop and the first thing you see is not a list of unread emails. It is a brief from your ambient system, generated overnight, telling you three things. First: the growth initiative you own is tracking within the success signal bands you defined in your intent document. No action needed. Second: one of your contextual anchors is stale. The competitor pricing data you referenced was updated on Friday, and the new numbers change the constraint you set around your own pricing flexibility. The system has paused the pricing-related actions in your initiative and is waiting for you to update the constraint. Third: there is a contradiction between your intent document and the one your VP of Product filed last week. Your desired outcome state assumes the onboarding flow will remain unchanged through Q3. Her intent document includes a redesign of the onboarding flow starting next month. The system flagged the conflict because it cannot pursue both intents simultaneously.

You spend twenty minutes reading the brief. You update the pricing constraint, which takes five minutes because you just need to change one number and one sentence. You send a message to your VP of Product suggesting a fifteen-minute call to resolve the onboarding contradiction. You flag the stale contextual anchor for the signal calibrator on your team so she can check whether other intent documents are affected.

By 9:15, you are done with system management for the morning. The system is back in motion, operating on updated intent. The contradiction will be resolved by noon, once you and the VP of Product agree on whether the onboarding redesign happens before or after your initiative's success signals are measured.

You spend the rest of the morning on something the system cannot do. You are preparing for a conversation with your largest customer's CTO, who is concerned about the direction of your product roadmap. This conversation requires judgment, relationship, empathy, context that cannot be written in any document. It requires you.

At lunch, you check the system brief again. The signal calibrator has confirmed that the competitor pricing update does not affect any other active intent documents. The system has resumed full operation on your initiative. The pipeline it influenced last week, acting within the boundaries you set, is $140,000 larger than it was on Friday.

You did not touch the pipeline. You did not write the emails. You did not select the accounts. You did not decide the timing or the channel or the message. The system did all of that. But it did it inside a container you built, a container made of outcome states, success signals, constraints, contextual anchors, autonomy boundaries, and escalation criteria. The system's work is your work, because the system's direction is your direction, specified precisely enough that you can trust it.

In Chapter 1, you were sitting in the same seat, feeling a dissatisfaction you could not name. You had the tools. You had the adoption metrics. You had the hours saved. But you were still the one translating, initiating, correcting, filling in the gaps between what the AI could do and what you needed it to do. The cognitive tax was on you, all of it, every time.

That tax is gone now. Not because the AI got smarter, though it did. Because you got clearer. You told the system what you wanted with a precision that let it work without asking. And the work it does while you are in the room with your customer's CTO, the work it does while you are thinking, while you are building relationships, while you are exercising the judgment that no specification can capture, is work that would not have happened at all in the old model. Not because the technology could not do it. Because nobody had told the technology what to do.

That is the Monday morning this book has been building toward.

The Only Question Left

I said at the beginning of this book that the AI is not the problem. The paradigm is the problem. I have spent ten chapters explaining what I mean by that, and what a different paradigm looks like, and what it demands of leaders and organizations.

But if I had to compress the entire argument into a single question, it would not be the question most people expect.

It is not "how do I use AI better?"

It is not "which tools should I buy?" or "how do I train my team?" or "what is our AI strategy?"

The question is this: Can you define what you actually want, precisely enough that a system can pursue it while you go do something only you can do?

That is it. That is the whole thing.

If you can, you will lead in the ambient era. Your systems will work while you think. Your intent will compound, the way interest compounds, producing returns you did not have to personally execute. Your organization will operate at a speed and coherence that is impossible when every action requires a human to initiate it, translate it, and verify it.

If you cannot, you will stay in the prompt paradigm. You will keep typing. You will keep translating. You will keep correcting outputs that were never quite right because the inputs were never quite precise. You will work harder than your competitors who figured this out, and you will produce less. Not because you lack the technology. Because you lack the self-knowledge to direct it.

I am asking you to do something most leaders have never done. To write down, in explicit and operational language, what you actually want. Not what sounds good in a board meeting. Not what fits on a strategy slide. What you actually want, including the trade-offs you would rather not confront, the constraints you would rather keep flexible, the success signals you would rather leave vague because defining them means committing to a specific version of success and giving up the others.

That is hard. It is supposed to be hard. If it were easy, you would have done it already.

The VP of engineering from the last chapter said the intent exercise taught him more about his own strategy in two hours than a month of planning sessions. That is the payoff. Not better AI outputs. Better self-knowledge. A clearer picture of what you want, what you will not accept, and what you are willing to let a system do on your behalf.

The technology will keep getting better. Every quarter, the models improve. None of that matters if the intent they are acting on is vague, contradictory, or stale. Better engines do not help if you have not decided where you are going.

We are in the bolted-on phase. Most of us have been here since November 2022, and we have been calling it progress. The models are extraordinary. The paradigm is inherited from 1960s teletype terminals. The gap between what the technology can do and what organizations are directing it to do is the largest unrealized opportunity I have seen in twenty years of watching this industry.

Closing that gap is not a technology project. It is a leadership project. And the decade is not waiting.

ABOUT THE AUTHOR

Jean-Philippe LeBlanc is SVP Engineering at CircleCI, one of the world's largest continuous integration platforms. He leads engineering strategy across the R&D organization and sits at the center of the shift he writes about in this book, watching teams discover in real time that the bottleneck is never the technology.

He is also a founder. His most recent work is in Agent-Led Growth, a framework for organizations deploying autonomous agents in revenue functions. He has built products across publishing, performance intelligence, and engineering culture for over a decade.

After the Prompt draws on years of direct observation across engineering, growth, and knowledge-work organizations navigating this shift… and on the uncomfortable realization that the gap between what these systems can do and what organizations tell them to do is, above all, a leadership problem.

He lives in Montreal, Canada.

End of book

Thank you for reading.