What people actually mean by "AI agents"

"Agent" might be the most overloaded word in tech right now. Vendors slap it on everything, demos make it look like magic, and somewhere underneath the hype there's a genuinely useful idea getting buried. So let me try to do the unglamorous thing and say plainly what an agent actually is and, just as importantly, when you shouldn't reach for one.

Here's the whole idea in one sentence: an agent is a model put in a loop, with tools and a goal. That's it. Instead of answering your question in a single shot, it plans a step, takes an action — calls a tool, runs some code, queries a database — looks at what came back, and decides what to do next. It keeps going around that loop until the goal is met or it gives up. Anthropic's own write-up on building agents lands on a similar definition: systems where the model dynamically directs its own process and tool use, rather than following a fixed script.
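To make the loop concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `scripted_model` is a hard-coded stand-in for a real model call, `TOOLS` is a toy registry with one lookup function, and `max_steps` is my way of expressing "or it gives up."

```python
from typing import Callable, Optional

# Hypothetical tool registry: plain functions the agent is allowed to call.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup": lambda key: {"capital_of_france": "Paris"}.get(key, "not found"),
}

History = list[tuple[str, str, str]]  # (action, argument, observation)


def scripted_model(goal: str, history: History) -> tuple[str, str]:
    """Stand-in for a real model call; returns (action, argument).

    A real agent would send the goal and the history to a model here.
    """
    if not history:
        return ("lookup", "capital_of_france")
    # One observation is enough for this toy goal: finish with it.
    return ("finish", history[-1][2])


def run_agent(goal: str, model, tools, max_steps: int = 5) -> Optional[str]:
    history: History = []
    for _ in range(max_steps):
        action, arg = model(goal, history)           # plan the next step
        if action == "finish":
            return arg                               # goal met, stop
        observation = tools[action](arg)             # act with a tool
        history.append((action, arg, observation))   # observe the result
    return None                                      # step budget spent: give up


answer = run_agent("What is the capital of France?", scripted_model, TOOLS)
print(answer)  # prints "Paris"
```

Swap the scripted function for a real model call and the toy lookup for real tools, and the shape stays the same: the loop itself is only a handful of lines.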

One-shot vs. the loop

A plain chatbot is one-shot. You ask, it answers from what it already knows, and the exchange is over. That's fine for "explain this concept" or "rewrite this paragraph." But ask it "what were our top three products by margin last quarter, and email the summary to the team" and a one-shot answer can't be right — the model doesn't have your numbers, and it can't send email. It would have to hallucinate the figures.

An agent handles that differently. It breaks the goal into steps, and for each step it can actually do something: query the warehouse, get real numbers back, notice the query returned nothing, fix the filter, try again, format the result, then call the send-email tool. The loop is what turns a model from something that talks about work into something that does work.
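The "notice the query returned nothing, fix the filter, try again" step can be sketched like this. The warehouse is a hypothetical in-memory list, the product names and margins are made up, and the list of filter guesses stands in for the model deciding how to correct itself; a real agent would choose the next filter at runtime.

```python
# Hypothetical in-memory "warehouse" rows: (product, quarter, margin).
ROWS = [
    ("widget", "2024-Q1", 0.42),
    ("gadget", "2024-Q1", 0.35),
    ("gizmo", "2024-Q1", 0.51),
    ("doohickey", "2024-Q1", 0.12),
]


def query_margins(quarter: str) -> list[tuple[str, float]]:
    """Toy query tool: return (product, margin) rows for one quarter."""
    return [(product, margin) for product, q, margin in ROWS if q == quarter]


def top_products(quarter_guesses: list[str], n: int = 3):
    """Try each filter in turn and stop at the first one that returns rows."""
    for quarter in quarter_guesses:
        rows = query_margins(quarter)   # act
        if not rows:                    # observe: empty result, filter is wrong
            continue                    # adjust and try again
        ranked = sorted(rows, key=lambda row: row[1], reverse=True)
        return quarter, [product for product, _ in ranked[:n]]
    return None, []                     # nothing matched: give up


print(top_products(["Q1 2024", "2024-Q1"]))
# prints ('2024-Q1', ['gizmo', 'widget', 'gadget'])
```

A one-shot answer has no chance to see that the first filter matched nothing. The loop does, and that observation is what makes the second attempt possible.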

The loop: plan a step, act with a tool, observe the result, repeat — until the goal is met and it stops.

Why this feels familiar to me

The first time I read a clean definition of an agent, my reaction wasn't "the future is here" — it was "oh, this is an orchestrator." I build and run an automated data analytics platform where every job is scheduled, retried on failure, and watched from a console that shows what ran, how long it took, and whether it succeeded. An agent is the same skeleton with one difference: instead of a human author wiring the steps in advance, the model decides the next step at runtime.

That difference is the whole pitch and the whole risk. The flexibility is real — an agent can handle a task you didn't fully script. But a fixed pipeline can't suddenly decide to do something you never intended, and an agent can. Which is exactly why the disciplines I already apply to pipelines aren't optional for agents — they're the price of letting one loose.

An agent is an orchestrator that writes its own next step. That's the superpower and the liability in the same sentence.

Where agents genuinely earn their keep

Agents shine on tasks that are multi-step, tool-using, and well-bounded — where the path isn't fully knowable up front but the goal and the guardrails are clear. Think "triage these incoming support tickets, look each customer up, and draft a reply," or "investigate why this dashboard number looks off by checking the source query and the last few loads." The loop lets the model adapt as it learns what's actually there, instead of failing on the first surprise.

Where they're overkill or risky is the mirror image: a one-shot question doesn't need a loop, and a high-stakes, irreversible, or vaguely specified task is where the loop turns dangerous. If the goal is fuzzy, the agent will cheerfully optimise for the wrong thing. If an action can't be undone (moving money, deleting records, sending something to a customer), you want a human between the model and the button.

Good fit for an agent

Multiple steps where the path isn't fully known up front
Real tools to call — data, APIs, code — not just text
A clear, checkable goal and bounded scope
Cheap to verify, easy to undo if it's wrong

Poor fit (keep it simple, or keep a human in)

One-shot answers a plain prompt already nails
High-stakes or irreversible actions — money, deletes, sends
Vague goals with no way to tell "done" from "wrong"
Hard real-time limits the loop can't meet

Being honest about the limitations

If you've only seen the demos, agents look unstoppable. In day-to-day use they're more like a fast, eager junior who occasionally wanders off. Three failure modes are worth naming plainly. They drift: over a long loop the model loses the thread and starts solving a subtly different problem. They over-act: given a hammer-shaped tool, they'll find nails — taking actions you didn't ask for because the loop rewards doing something. And they compound errors: a wrong observation early can send the next ten steps confidently in the wrong direction.

None of that makes them useless — it makes them something to engineer around. The same orchestration disciplines apply, and they're not negotiable.

Bounded permissions. Give the agent the narrowest set of tools and access it needs. A reporting agent doesn't need delete rights. Least privilege isn't a nice-to-have; it's the blast-radius limiter.
Observability. Log every step — what it planned, which tool it called, what came back. If you can't replay why an agent did something, you can't trust it or fix it. It's the run console all over again.
Human gates. Put a person in front of anything irreversible or high-stakes. The agent proposes; a human approves the action that can't be taken back.
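
The three disciplines above can live in one thin wrapper around the tools. This is a sketch under my own assumptions, not a reference design: `GuardedTools`, the tool names, and the `approve` callback are all hypothetical, and the always-refusing reviewer is there only so the example runs without a human.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class GuardedTools:
    tools: dict[str, Callable[[str], str]]
    allowed: set[str]                     # bounded permissions
    needs_approval: set[str]              # human gates
    approve: Callable[[str, str], bool]   # hypothetical "ask a person" callback
    log: list[dict] = field(default_factory=list)  # observability

    def call(self, name: str, arg: str) -> str:
        entry = {"tool": name, "arg": arg}
        self.log.append(entry)            # record every attempt, even refused ones
        if name not in self.allowed:
            entry["outcome"] = "denied"
            raise PermissionError(f"tool {name!r} is not permitted")
        if name in self.needs_approval and not self.approve(name, arg):
            entry["outcome"] = "rejected"
            return "rejected by human reviewer"
        result = self.tools[name](arg)
        entry["outcome"] = "ok"
        entry["result"] = result
        return result


guarded = GuardedTools(
    tools={
        "read_report": lambda arg: f"report for {arg}",
        "send_email": lambda arg: f"sent to {arg}",
        "delete_rows": lambda arg: f"deleted {arg}",
    },
    allowed={"read_report", "send_email"},   # a reporting agent gets no delete rights
    needs_approval={"send_email"},           # sends go past a human first
    approve=lambda name, arg: False,         # stand-in reviewer that always says no
)

report = guarded.call("read_report", "Q1")
email = guarded.call("send_email", "team@example.com")
try:
    guarded.call("delete_rows", "orders")
except PermissionError as exc:
    denied = str(exc)

for entry in guarded.log:
    print(entry["tool"], entry["outcome"])
```

The log records refused calls as well as successful ones. That is deliberate: when you replay a run, what the agent tried to do matters as much as what it actually did.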

My one-line takeaway

An agent isn't a magic worker; it's a model in a loop with tools. Treat it like a pipeline you'd actually put in production: bounded permissions, full observability, and a human on the dangerous buttons. Do that and it's genuine leverage. Skip it and you've automated a way to be confidently wrong, faster.

So when someone tells you they've "built an agent," the useful follow-up isn't "wow" — it's "what's the goal, what tools can it touch, and what happens when it's wrong?" Strip away the hype and that's all an agent is: a loop, some tools, and a goal. The interesting engineering — the part that decides whether it helps or hurts — is everything you wrap around that loop to keep it honest.