AI coding assistants in practice: autocomplete, chat, and agents

I build a lot of small tools — pipelines, dashboards, the odd internal app. Over the last year, AI assistants went from a novelty I tolerated to something woven through how I work. But "AI coding assistant" is now a label stretched over three quite different things, and conflating them is the fastest route to disappointment. So here's the practitioner's view: what the three modes actually are, when each one earns its keep, and the one rule that applies to all of them.

The categories blur at the edges — most popular tools now offer more than one mode — but it's worth separating them by what they do to your code and how much they do unattended. Roughly: autocomplete finishes the line you're typing; chat talks through a problem and hands you a snippet; agents read your repo, change several files, and run commands on your behalf. The difference that matters is autonomy — how far the tool runs before it stops and asks you.

All three are the same underlying models, dialled to different amounts of autonomy. Pick by how much you want done before the tool checks in with you.

Mode 1 — Inline autocomplete

This is the one most developers met first. As you type, the editor offers ghost text — sometimes the rest of the line, sometimes a whole block — and you accept it with a keystroke. GitHub Copilot popularised the pattern, and it now ships as completions plus "next edit" suggestions inside most editors (GitHub's own docs are the clearest description of how this surface behaves). Cursor, an AI-native editor built as a VS Code fork, leans hard into the same idea with a fast, high-acceptance completion engine.

Autocomplete shines where the answer is obvious from context but tedious to type: boilerplate, repetitive transforms, the predictable second half of a pattern you've already started, a config block that looks like the last one. It stays in your flow — no context switch, no prompt to write. The friction is near zero, and so is the blast radius: it only ever touches the few lines under your cursor, and you read every one as it appears.

Where it breaks down is anything that needs a plan. Ask autocomplete to "implement the whole feature" and it can't — it doesn't have a turn to think, only the next few tokens. It will also happily complete a line in a confidently wrong direction if your surrounding code nudged it there. It's a brilliant typist, not a designer.

Mode 2 — Chat / ask

The second mode is a conversation. You ask a question in a side panel or a separate window — "what does this regex do?", "why is this test flaky?", "what's a cleaner way to structure this?" — and you get an explanation and usually a snippet to copy back. This is where I do my thinking-out-loud: design discussions, debugging a stack trace, comparing two approaches before I commit to either.

Chat is strongest for understanding and deciding rather than producing. It's a patient rubber duck that occasionally knows something you don't. It's great for unfamiliar code, an error message you've never seen, or sketching the shape of a solution before any code exists. Because it hands you a snippet rather than editing your project, you stay firmly in control of what actually lands.

Chat doesn't touch your repo, and that's a feature. The gap between "here's a snippet" and "it's in your codebase" is exactly where you do your reviewing.

Its limits are the flip side of that strength. Chat doesn't see your whole codebase unless you paste it in, so its advice can be locally sensible but globally wrong — confident about a function it can't actually see. And the copy-paste-adapt loop gets slow once a change spans more than a file or two. That's precisely the gap the third mode was built to close.

Mode 3 — Agentic tools

An agent is the assistant let off the leash, within guardrails. You describe an outcome; it reads the relevant files itself, plans an approach across several of them, makes the edits, runs your tests or build, reads the failures, and loops back to fix them, checking in with you at the points you've configured. Anthropic's Claude Code is a terminal-native example of this category, and it's the one I reach for; Cursor's agent mode and GitHub Copilot's agent mode do comparable work inside the editor. Anthropic's product page and the independent comparisons rounding up these tools (Built In has a fair one) describe the same shift: from completing your line to carrying out a task.
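To make that loop concrete, here is a minimal sketch in Python. It is not how Claude Code or any other product is implemented. `run_agent_loop`, `CheckResult`, and the four callables are hypothetical names made up for illustration, and the toy stand-ins at the bottom take the place of a real model and a real test runner.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class CheckResult:
    """Outcome of running the project's tests or build (hypothetical type)."""
    passed: bool
    failures: List[str] = field(default_factory=list)


def run_agent_loop(
    propose_edit: Callable[[List[str]], str],
    apply_edit: Callable[[str], None],
    run_checks: Callable[[], CheckResult],
    approve: Callable[[str], bool],
    max_attempts: int = 5,
) -> Tuple[str, int]:
    """Propose, ask, apply, check, and retry until checks pass or attempts run out."""
    failures: List[str] = []
    for attempt in range(1, max_attempts + 1):
        # The proposal sees the failures from the previous round, if any.
        edit = propose_edit(failures)
        # The check-in point: nothing is applied without approval.
        if not approve(edit):
            return ("stopped_by_user", attempt)
        apply_edit(edit)
        result = run_checks()
        if result.passed:
            return ("done", attempt)
        failures = result.failures
    return ("gave_up", max_attempts)


# Toy stand-ins so the sketch runs on its own (assumptions, not a real agent).
applied: List[str] = []


def toy_propose(failures: List[str]) -> str:
    # First guess is wrong; the second responds to the reported failure.
    return "add null check" if failures else "rename helper"


def toy_checks() -> CheckResult:
    if "add null check" in applied:
        return CheckResult(passed=True)
    return CheckResult(passed=False, failures=["test_handles_none failed"])


outcome = run_agent_loop(toy_propose, applied.append, toy_checks, approve=lambda edit: True)
print(outcome)  # ('done', 2)
```

The design choice worth noticing is that `approve` sits before `apply_edit`, and that the loop has a hard cap. Those two lines are the guardrails; everything else is legwork.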

Agents earn their place on larger, multi-file changes with a clear definition of done: a rename that ripples through a dozen files, scaffolding a new module to match an existing one, wiring up a migration, writing a test suite for code that has none. The leverage is real — the agent does the legwork of finding every place that needs to change, which is the part humans get wrong by omission.

But this is also the mode that demands the most supervision, because it's the only one that acts. It can edit files you didn't expect and run commands you didn't anticipate. Sensible tools default to caution, asking before they write or run, and that default exists for a reason. An agent confidently heading down a wrong path is more expensive than a wrong autocomplete, simply because it does more before you look. Scope it tightly, work on a branch, and keep your tests as the thing it has to satisfy.
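"Tests as the thing it has to satisfy" can be as simple as an exit-code gate. The sketch below assumes your checks are a single command that exits with status 0 on success; `checks_pass` is a hypothetical helper, and the two stand-in commands exist only so the example runs anywhere. In practice you would pass your own test or build command.

```python
import subprocess
import sys
from typing import List


def checks_pass(command: List[str], timeout_s: float = 600.0) -> bool:
    """Run a check command and report whether it exited with status 0.

    A timeout counts as a failure. A missing executable still raises,
    on the assumption that a misconfigured gate should fail loudly.
    """
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return False
    return completed.returncode == 0


# Stand-in commands (assumptions): replace with your real test or build command.
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

print(checks_pass(passing))  # True
print(checks_pass(failing))  # False
```

The gate tells you the change satisfies the tests, not that the change is right. Reading the diff is still on you.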

The one rule for all three

Whatever the mode, you are still the author. Read every line, run the tests, and understand why it works before it ships. AI moves the cost of producing code towards zero, which means the value, and the responsibility, shifts almost entirely onto review and judgement.

Choosing by task size and risk

I don't pick a tool so much as pick a mode for the task in front of me. The rough heuristic: match the autonomy to the size of the change and the cost of getting it wrong.

Reach for less autonomy when…

The change is small or local — a line, a block, a single file.
You already know the answer and just want it typed faster.
You're exploring or learning, and want to decide each step.
The code path is high-risk and you want eyes on every edit.

Reach for more autonomy when…

The change spans many files and the pattern is clear.
"Done" is well-defined — tests pass, build is green.
The legwork (finding every call site) is the hard part.
You can sandbox it: a branch, a review gate, good tests.

Notice the second list has a precondition the first doesn't: you can only safely hand off more when you can check the result cheaply. An agent without a test suite to satisfy is just generating plausible code at speed. The guardrails aren't bureaucracy; they're what make the autonomy worth having.
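The heuristic can be written down as a small function. This is a sketch of a rule of thumb, not a standard: the `Task` fields, the `suggest_mode` name, and the two-file threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A rough description of the change in front of you (hypothetical fields)."""
    files_touched: int
    answer_known: bool = False      # you know the code, you just want it typed
    exploring: bool = False         # you are learning and want to decide each step
    high_risk: bool = False         # you want eyes on every edit
    done_is_testable: bool = False  # "done" means tests pass, build is green
    can_sandbox: bool = False       # a branch, a review gate, good tests


def suggest_mode(task: Task) -> str:
    """Match autonomy to the size of the change and the cost of getting it wrong."""
    # Exploring or learning: stay in conversation and decide each step yourself.
    if task.exploring:
        return "chat"
    # More autonomy requires that the result can be checked cheaply.
    wide = task.files_touched > 2  # assumed threshold: "more than a file or two"
    checkable = task.done_is_testable and task.can_sandbox
    if wide and checkable and not task.high_risk:
        return "agent"
    # Small, local, and already decided: just get it typed faster.
    if task.files_touched <= 1 and task.answer_known:
        return "autocomplete"
    # Everything else: talk it through before anything lands.
    return "chat"


rename = Task(files_touched=12, done_is_testable=True, can_sandbox=True)
untested = Task(files_touched=12, can_sandbox=True)
boilerplate = Task(files_touched=1, answer_known=True)

print(suggest_mode(rename))       # agent
print(suggest_mode(untested))     # chat
print(suggest_mode(boilerplate))  # autocomplete
```

The second example is the point of the precondition: the same twelve-file change drops back to chat as soon as there is no cheap way to verify it.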

Where each mode breaks down

Every mode has a failure shape, and knowing it is half of using the tool well. Autocomplete fails silently and locally — a subtly wrong line you accepted on muscle memory. Chat fails out of context — advice that's right in isolation and wrong for your actual codebase. Agents fail at scale — a confident, wide-reaching change built on a misread assumption. The mitigations are different too: for autocomplete, slow down and read; for chat, give it more context or verify against the real code; for agents, narrow the scope and lean on tests.

What none of them fix is the need to understand your own system. The assistant is leverage on judgement you already have, not a substitute for it. On a good day an agent saves me an afternoon of mechanical edits. On a bad day it would have shipped a confident mistake if I hadn't read the diff. Both of those are true at once, and holding both is the whole skill.

How I actually work

Autocomplete is always on. It's the lowest-friction win — boilerplate and known patterns, no thought required, every line still under my eyes.
Chat is my thinking partner. For understanding unfamiliar code, debugging, and arguing through a design before I write it.
Agents do the heavy mechanical lifting. Multi-file changes with a clear finish line, on a branch, with tests as the contract — and I read the entire diff before it lands.

I won't pretend I'm neutral about which agent I prefer — I do most of my building with an agentic CLI — but the honest answer to "which tool should I use?" is "which mode does this task need?". The categories aren't competitors so much as gears. The developers getting the most out of all this aren't the ones who picked the one true tool; they're the ones who learned when to shift, and never stopped reading the output. The hype keeps moving. The verify-everything habit is the part that ages well.