Managing AI Agents Like Teammates

The five things only a human can give an AI agent: a job, context, autonomy, tools, and oversight.

Why this paper exists

This is the third paper in a series for executives whose organizations are starting to put AI agents to work. The earlier papers cover what an agent actually is, and how to choose which ones are worth building. This one is about the part that catches leaders off guard.

Once you have agents doing real work, the job that decides whether it goes well is not building them. It is managing them. And that job lands on leaders and managers, not on engineers. The model is somebody else's problem. The job description, the context, the limits, the access, and the standard the work is held to are yours.

That is good news, because it is work you already know how to do. The whole paper rests on one idea: an AI agent needs the same things a new teammate needs, and you already know how to give them. What follows is the five things, why each one matters, what goes wrong when you skip it, and how to tell whether you are doing it well.

The five things at a glance

When you bring a capable new person onto a team, you give them five things. A job, so they know what they are responsible for. Context, so they understand why it matters. Autonomy, so they know what they can decide alone. Tools, so they can actually do the work. And oversight, so someone is watching the results and the direction.

An AI agent needs the same five. Get them right and the agent earns its place on the team. Skip them and you have hired a fast, confident new starter and then left the building. The rest of this paper takes them one at a time.

The five things

1. Job

Define what the agent is actually for. "Use AI in finance" is not a job. "Reconcile these two ledgers every morning and flag anything over a threshold for review" is a job. The first is a wish. The second is something you could hand to a person and hold them to.

A vague brief makes a person drift. It makes an agent drift faster, all day, at full confidence, which is how a loose instruction turns into an expensive one. A person with an unclear job will usually sense the confusion and come back to ask. An agent will not. It will take whatever you gave it and run, and it will sound completely sure while doing the wrong thing well.

So most of the work here is upstream of any technology. It is writing the job down clearly enough that two people would read it the same way: what the agent does, what it produces, what counts as done, and what it should never touch. If you cannot write that paragraph, you are not ready to hand the work to an agent, and you were probably not ready to hand it to a new hire either.

You can tell you have this right when the job has an edge. A good agent job says what is out of scope as clearly as what is in it. "Answer routine supplier questions from the approved list, and send anything about price or contracts to a person" is a job with an edge. "Help with supplier questions" is not.

2. Context

An agent needs to know what matters and why, not only the steps. Give a new hire the procedure without the reasons and they will make decisions that are confident, defensible, and wrong. An agent does exactly the same, for the same reason: the procedure covers the normal case, and the reasons are what you need for everything else.

This is where most agent projects quietly succeed or fail. The context that matters is rarely written down. It is the definitions your company uses that the rest of the industry does not. The exceptions that everyone senior knows and no document records. The "we tried that in 2021 and it went badly" rules that live in three people's heads. An agent has none of this unless you give it, and the work of getting it out of people's heads and in front of the agent is slow, human, and unavoidable. There is no shortcut, and the teams that look for one are the teams whose agents embarrass them later.

You can tell you have this right when the agent handles the second-most-common case well, not just the obvious one. Anything can manage the happy path. The value, and the risk, lives in the exceptions.

3. Autonomy

How much should the agent decide on its own, and when should that change? This is the call that takes the most judgment, and it is squarely an executive call rather than a technical one.

Start from the cost of being wrong. If a decision is easy to reverse and cheap to get wrong, let the agent make it and check the results later. If it is a door that only swings one way, expensive or impossible to undo, have the agent propose and a person commit. Most work is a mix, so autonomy is not one setting you choose once. It is something you tune task by task, and loosen over time as the agent earns trust, the same way you would with a junior who keeps getting it right.

The common mistake is treating autonomy as a switch: fully manual, which wastes the agent, or fully automatic, which scares everyone the first time it is wrong in public. The useful version sits in between and moves. A new agent on a high-stakes process should propose and wait. The same agent, six months and a thousand good decisions later, can act and report. You are not setting a permission. You are managing a relationship that earns more rope as it proves itself.

You can tell you have this right when people can describe, without checking, exactly which decisions the agent makes alone and which come to a human. If the line is fuzzy in conversation, it is fuzzy in practice.
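For teams that want the autonomy line written down rather than remembered, the rule of thumb in this section can be sketched in a few lines. This is only an illustration under stated assumptions: the names `Decision`, `Mode`, and `autonomy_mode`, the `cheap_limit` threshold, and the `trust_after` count are all hypothetical, and the choice to keep one-way doors with a person no matter how much trust the agent has earned is one reasonable reading of the rule, not the only one.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    ACT_AND_REPORT = "act and report"
    PROPOSE_AND_WAIT = "propose and wait"


@dataclass(frozen=True)
class Decision:
    name: str
    reversible: bool      # can the action be undone?
    cost_if_wrong: float  # rough estimate, in whatever unit you budget in


def autonomy_mode(decision, cheap_limit, good_decisions=0, trust_after=None):
    """Return how the agent should handle one kind of decision.

    cheap_limit and trust_after are hypothetical tuning values that a
    manager sets per task; they are not defaults from any product.
    """
    # Assumption: a door that only swings one way always goes to a person.
    if not decision.reversible:
        return Mode.PROPOSE_AND_WAIT
    # Easy to reverse and cheap to get wrong: act now, check results later.
    if decision.cost_if_wrong <= cheap_limit:
        return Mode.ACT_AND_REPORT
    # Reversible but costly: propose until the track record is long enough.
    if trust_after is not None and good_decisions >= trust_after:
        return Mode.ACT_AND_REPORT
    return Mode.PROPOSE_AND_WAIT


if __name__ == "__main__":
    reschedule = Decision("adjust a payment date", reversible=True, cost_if_wrong=5000)
    print(autonomy_mode(reschedule, cheap_limit=100).value)
    print(autonomy_mode(reschedule, cheap_limit=100,
                        good_decisions=1000, trust_after=1000).value)
```

The point of writing it this way is that the line between "acts alone" and "comes to a human" becomes something people can read and argue about, and loosening it is a deliberate change to one number rather than a quiet drift.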

4. Tools

An agent with a clear job, good context, and the right autonomy can still only talk until you give it access. Real work needs real tools: the systems where the work actually happens, and the permissions to use them. An agent that can read the supplier database and send from a monitored mailbox can handle supplier queries. An agent that can only describe how it would handle them is a very expensive memo.

This is also the moment governance stops being abstract. The day an agent can act in your systems, the question changes from what it might do in theory to what you actually allowed and whether you can see it. That makes equipping the agent and governing it the same piece of work, done at the same time. You decide what it can reach, what it can change, and what unmistakable trail it leaves behind. Hand out access without that, and you have given a fast, tireless worker the keys to systems nobody is watching.

You can tell you have this right when you can answer two questions instantly for any agent: what can it touch, and where would I look to see what it did. If either answer takes research, the access ran ahead of the oversight.
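One way to make those two questions answerable on the spot is to keep the permission list and the audit trail in the same place, so that granting access and recording activity cannot drift apart. The sketch below is a minimal illustration, not a real access-control product: the class `AgentAccess`, its method names, and the system names are all hypothetical.

```python
from datetime import datetime, timezone


class AgentAccess:
    """Hypothetical wrapper: one allowlist and one trail per agent."""

    def __init__(self, agent, allowed):
        self.agent = agent
        self.allowed = frozenset(allowed)  # (system, action) pairs
        self.trail = []                    # every attempt, allowed or not

    def what_can_it_touch(self):
        """Answer to the first question: the full list of granted access."""
        return sorted(self.allowed)

    def attempt(self, system, action, detail=""):
        """Record the attempt first, then say whether it is permitted."""
        permitted = (system, action) in self.allowed
        self.trail.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "agent": self.agent,
            "system": system,
            "action": action,
            "detail": detail,
            "permitted": permitted,
        })
        return permitted


if __name__ == "__main__":
    access = AgentAccess(
        "supplier-queries",
        {("supplier_db", "read"), ("supplier_mailbox", "send")},
    )
    access.attempt("supplier_db", "read", "look up delivery terms")
    access.attempt("supplier_db", "write", "change payment terms")  # refused
    for entry in access.trail:
        print(entry["system"], entry["action"], entry["permitted"])
```

Refused attempts are logged alongside permitted ones on purpose: an agent repeatedly reaching for something it was not given is itself a signal worth seeing.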

5. Oversight

Agents drift. The data moves, the business moves, and an agent that was right in spring can be quietly wrong by autumn, with nobody having changed a thing. The world changed around it. So oversight is not a sign-off you do once at launch and file away. It is a standing job.

Someone has to keep asking whether the outputs are still good, whether the process is still sound, and whether the behavior has wandered from what you intended. This is the part that never transfers to the machine, because the standard for good is yours to set and yours to defend. An agent can tell you what it did. It cannot tell you whether that was the right thing to want, and it cannot notice that the definition of right has shifted. A person has to own that, with the time and the authority to act on what they find.

You can tell you have this right when monitoring has a name attached and a rhythm: who looks, how often, and what they are allowed to change when something is off. "We'll keep an eye on it" is not monitoring. It is how drift goes unnoticed until a customer finds it first.

What this changes for the leadership team

Look at the five again. None of them is technical. Setting a job, supplying context, judging stakes, granting access, and holding a standard are management work, not engineering work. That is the real shift in what AI agents ask of an organization. The thing slowing down your agents is almost never the model. It is whether your organization can manage well, clearly, and quickly, at the scale agents now run at.

For most leaders that should be encouraging, because it is the job you already have. It is also a warning. A company that manages its people badly will manage its agents badly too, and the consequences will arrive faster and in more places at once. An agent runs on whatever clarity, or confusion, already exists around it. It does not fix a vague organization. It scales one.

A worked example

Take a team that wants an agent to handle the first line of inbound supplier queries. The instinct is to start with the tool: which platform, which model. Start with the five instead.

The job: triage incoming queries, answer the routine ones from approved material, and pass the rest to a named person. The context: the supplier policy, the approved answers, and the handful of situations that always go to a human. The autonomy: answer freely inside the approved set, never make a commercial commitment, and escalate anything that touches price or contract. The tools: read access to the supplier database and send access to one mailbox somebody watches, nothing wider. The monitoring: a weekly look at a sample of answers, plus an alert whenever the agent escalates.

An agent like that can be live in a few weeks. The technology was never the slow part. The five decisions were, and this team made them on purpose. (Real field stories, where leaders describe what this looked like inside their own companies, are the subject of a later paper in this series and the book it feeds.)

How to use this

The five are a checklist you run before you build, and a diagnostic you run on what you already have.

Before you build, take one candidate process and write the five on a single page. Be honest about the job: if you cannot state what is out of scope, you are not done. When you set autonomy, resist the urge to be brave. It is cheaper to start the agent narrow and widen it than to start it wide and explain the mistake. When you list tools, list the access and the audit trail in the same breath; if you find yourself granting access now and promising oversight later, stop and design the oversight first.
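For teams that keep this page in a shared repository rather than a document, the checklist can be sketched as a small structure that reports its own gaps. Everything here is an assumption made for illustration: the names `AgentBrief`, `Tool`, and `gaps`, the field layout, and the wording of the messages. The example values come from the supplier-query example earlier in the paper, with the mailbox audit trail and the monitoring owner deliberately left blank to show what an unfinished page looks like.

```python
from dataclasses import dataclass


@dataclass
class Tool:
    system: str
    access: str
    audit_trail: str = ""  # where you would look to see what the agent did


@dataclass
class AgentBrief:
    job: str
    out_of_scope: list
    context: list
    acts_alone_on: list
    escalates: list
    tools: list
    monitor_owner: str = ""
    monitor_cadence: str = ""

    def gaps(self):
        """Return the reasons this page is not finished yet."""
        problems = []
        if not self.job.strip():
            problems.append("job is not stated")
        if not self.out_of_scope:
            problems.append("job has no edge: nothing is out of scope")
        if not self.context:
            problems.append("no context captured")
        if not self.acts_alone_on or not self.escalates:
            problems.append("autonomy line is not drawn on both sides")
        if not self.tools:
            problems.append("no tools listed")
        for tool in self.tools:
            if not tool.audit_trail.strip():
                problems.append(f"access to {tool.system} has no audit trail")
        if not self.monitor_owner.strip() or not self.monitor_cadence.strip():
            problems.append("monitoring has no named owner or no rhythm")
        return problems


brief = AgentBrief(
    job="Triage inbound supplier queries and answer routine ones from approved material",
    out_of_scope=["commercial commitments", "price", "contracts"],
    context=["supplier policy", "approved answers", "situations that always go to a human"],
    acts_alone_on=["answers inside the approved set"],
    escalates=["anything that touches price or contract"],
    tools=[
        Tool("supplier database", "read", audit_trail="query log"),
        Tool("monitored mailbox", "send"),  # audit trail left blank on purpose
    ],
    monitor_owner="",  # nobody named yet
    monitor_cadence="weekly sample of answers",
)

if __name__ == "__main__":
    for problem in brief.gaps():
        print("-", problem)
```

An empty result from `gaps()` does not mean the page is good, only that nothing is missing. Whether the job is the right job and the context is the right context still takes a person to judge.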

On something you have already shipped, score each of the five from honest experience, not intention. Where do answers actually come from, and would you bet they are right? Does the agent handle the awkward second case or only the obvious one? Can your team describe the autonomy line without looking it up? Could you, right now, see what the agent did yesterday? Has anyone looked at its output this month, and do they have the authority to change it? The five usually fail in the same two places: context that was never captured, and monitoring that was never assigned. Those are where to look first.

What to do this week

Pick one process you would consider handing to an agent. Before you look at a single tool, put the five on one page: the job, the context it needs, where its autonomy starts and stops, the tools and permissions it requires, and how you will keep an eye on it. If you cannot fill the page, the AI was never the problem. The work was never defined well enough to hand to anyone, person or agent. Filling that page is where managing your first AI teammate actually starts.

Frequently asked questions

Do executives need to be technical to manage AI agents? No. Setting a job, giving context, judging stakes, granting access, and holding a standard are management decisions, not engineering ones. The technical build sits underneath them. If you can manage a person, you can manage an agent.

How is an AI agent different from a copilot or a chatbot? A copilot helps a person who stays in charge of each step. An agent takes a job and works through the steps itself, using tools, with the autonomy you give it. You are handing over an outcome rather than asking for suggestions, which is exactly why how you manage it matters.

Where do most AI agent initiatives go wrong? At the start, context and autonomy. A team gives the agent a tool and a loose job, skips the context that lives in people's heads, and sets autonomy to all-or-nothing instead of matching it to the stakes. The fix comes before any technology. Once an agent is live, the usual gap is monitoring that was never assigned to anyone.

What about governance? It lives inside tools and monitoring. The moment an agent can act in your systems, permissions and oversight are the controls you have. Granting an agent access and governing what it does are one set of decisions, which is why governance cannot be a separate thing you handle later.

How many agents can one person manage? More than you would expect, and fewer than the vendors imply. The limit is not the number of agents. It is how much context and monitoring each one genuinely needs. A well-defined agent on a stable process needs little ongoing attention. An ambitious agent on a shifting process needs a lot. Plan for the second until it proves it is the first.