What AI coding tools actually replace (and what they don't)
The conversation about AI and engineering jobs has settled into two camps. One says AI is going to replace developers wholesale by next year. The other says AI doesn't replace anyone — it just makes them slightly faster. Both camps are wrong in symmetric and useful ways.
The tools — Claude Code, Cursor, Windsurf, the agentic coding products — are doing real work. They're also leaving real work undone. Two years into using them in production, here's the line we'd draw, and what it means if you're hiring or being hired.
What AI coding tools actually replace
Junior throughput on well-defined tasks. "Add a column to the users table called email_verified, write a migration, update the type, and add it to the response shape" is a task that took a junior engineer most of an afternoon two years ago. It now takes a senior with Claude Code about eight minutes. The work was real; it was also rote.
This is the most consequential change. It's not that AI does the work better than a junior would have — it's that the work is no longer the gate that holds up everything else. The senior who used to delegate the column-add and wait for it to come back tomorrow now does it themselves, in-flight, and keeps moving on the harder problem.
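For concreteness, here's roughly what that column-add amounts to: a minimal sketch assuming a Postgres migration and a TypeScript response type. The table and field names come from the example above; everything else is hypothetical.

```typescript
// Migration (SQL, run through whatever your stack uses):
//   ALTER TABLE users
//     ADD COLUMN email_verified boolean NOT NULL DEFAULT false;

// The response shape picks up the new field.
interface UserResponse {
  id: string;
  email: string;
  emailVerified: boolean; // new
}

// And the row-to-response mapping carries it through.
function toUserResponse(row: {
  id: string;
  email: string;
  email_verified: boolean;
}): UserResponse {
  return {
    id: row.id,
    email: row.email,
    emailVerified: row.email_verified,
  };
}
```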
Translation across boilerplate. Form components, API client wrappers, validation schemas, ORM scaffolds, type definitions matching API responses — all the work that's been "the same shape but different details" since 2010. AI does this in seconds. It does it well. Reviewing AI-generated boilerplate is faster than writing it.
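A representative slice of that boilerplate, sketched here with zod as the schema library (our choice for the example, not a claim about your stack):

```typescript
import { z } from "zod";

// Validation schema plus the type derived from it: the canonical
// "same shape, different details" boilerplate.
const createBookingSchema = z.object({
  customerEmail: z.string().email(),
  roomId: z.string().uuid(),
  startsAt: z.coerce.date(),
  endsAt: z.coerce.date(),
  notes: z.string().max(500).optional(),
});

type CreateBooking = z.infer<typeof createBookingSchema>;

// At the API boundary: parse, don't trust.
function parseBooking(body: unknown): CreateBooking {
  return createBookingSchema.parse(body); // throws on invalid input
}
```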
Lookups and scaffolding. "How do I configure cron jobs on Vercel?" used to be a 15-minute trip through three tabs and a half-read GitHub issue. AI gets you the right answer in 30 seconds, with an example that's usually correct. Yes, you should still read the docs. But the cost of starting from cold has collapsed.
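And here's what that cron answer looks like, sketched for a Next.js app on Vercel. The vercel.json crons block and the Authorization check follow Vercel's documented pattern as we understand it; verify against the current docs rather than trusting this sketch.

```typescript
// vercel.json registers the schedule (UTC):
//   { "crons": [{ "path": "/api/cron/sync", "schedule": "0 5 * * *" }] }

// app/api/cron/sync/route.ts: the handler Vercel invokes on schedule.
export async function GET(request: Request): Promise<Response> {
  // Vercel sends Authorization: Bearer <CRON_SECRET> when configured.
  const auth = request.headers.get("authorization");
  if (auth !== `Bearer ${process.env.CRON_SECRET}`) {
    return new Response("Unauthorized", { status: 401 });
  }

  await runNightlySync(); // hypothetical: the job the cron exists for
  return Response.json({ ok: true });
}

async function runNightlySync(): Promise<void> {
  // ...the actual work
}
```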
The first 80% of a CRUD app. For straightforward business apps — a booking system, an internal admin dashboard, a customer-facing form that writes to a database — AI tools reliably get you to a runnable prototype in a single sitting. This is real. It's the reason AI-assisted product development has become a practical category.
If you sum these up, what the AI coding tools have replaced is the work that has a definite right answer and that anyone with sufficient training could produce. That's a lot of work. It's also not the same work as "engineering."
What AI coding tools don't replace
Scope decisions. "What should this app actually do?" is still the hardest question in software, and it remains entirely human. AI will happily build whatever you describe. The risk of building the wrong thing well has, if anything, gone up: you can now get to "wrong thing, well-built" in two weeks instead of two months.
The senior judgement here is the same as it's always been: figuring out what the user actually needs as opposed to what they're asking for, recognising which features are MVP-critical and which are vanity, and saying "no" to ideas that sound good but don't compound.
Architecture decisions you'll regret in six months. AI tools are excellent at writing the next file. They're noticeably worse at deciding whether the shape of the codebase is the right shape. Should this be a monolith or three services? Is the database schema going to survive the next product pivot? Should this background job be a queue worker or a cron, and where does the retry budget live?
These are decisions where the cost of being wrong shows up months later, in the form of a refactor that takes weeks. AI will not save you from making them. It will, sometimes, make them confidently and incorrectly on your behalf.
Debugging the weird stuff. The bugs AI tools handle well are bugs with reproducible local behaviour: a typo, a missing import, a misconfigured package.json. The bugs they handle poorly are the ones that matter — the race condition that only happens at 2 am under load, the timezone bug that shows up for one customer in Tonga, the production memory leak that doesn't reproduce locally.
These bugs require knowing what to suspect, which only comes from having seen them before in different forms. AI rarely finds them, even when the bug's pattern is in its training data, because matching the pattern requires correctly framing what you're looking at in the first place. That framing is a senior skill.
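To make the Tonga example concrete, here's a hypothetical sketch of the class. Neither function is wrong; the bug only exists once you ask which "today" the feature means:

```typescript
// "Today" according to the server (UTC) vs according to the user.
// Pacific/Tongatapu is UTC+13, so for up to 13 hours a day the two
// disagree about what the date is.

function todayUtc(): string {
  return new Date().toISOString().slice(0, 10); // "YYYY-MM-DD"
}

function todayIn(timeZone: string): string {
  // The en-CA locale formats dates as YYYY-MM-DD.
  return new Intl.DateTimeFormat("en-CA", { timeZone }).format(new Date());
}

// At 2024-03-01T12:00:00Z:
//   todayUtc()                   -> "2024-03-01"
//   todayIn("Pacific/Tongatapu") -> "2024-03-02"
// A daily quota keyed on todayUtc() resets at the wrong moment for
// exactly one customer, and nothing reproduces locally unless you
// already suspect the timezone.
```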
Code review with taste. There's a class of code review feedback AI does well: "missing null check," "this should be a type, not a string," "consider extracting this into a function." That's helpful. It's also the easy 30%.
The other 70% — "this is technically correct but it makes the next change three times harder," or "we have a pattern for this in lib/ that you should use instead," or "this introduces a subtle inconsistency with how we handle auth elsewhere" — requires taste. Taste is the accumulated judgement about what good code looks like in this codebase, with this team, against this product roadmap. AI doesn't have it. Senior engineers do, and they earn it the slow way.
The boring 20% that decides production. We covered this in detail in our production-readiness post, so we'll be brief here: error handling, observability, auth boundaries, database integrity, rollbacks. AI tools generate these inconsistently and without the wider context — what failure modes you're protecting against, what your team's on-call patience is, how often you'll need to roll back a deploy.
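To give "inconsistently" a concrete shape, here's a minimal sketch of the handling a demo skips and production needs. Every name in it is hypothetical; the point is what's present (timeouts, error classification, contextual logging), not the specific shapes:

```typescript
class ValidationError extends Error {}

interface BookingRequest { id: string; payload: unknown }
interface ApiResult { status: number; body: unknown }

// Reject if the wrapped promise doesn't settle in time.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T> {
  return Promise.race([
    p,
    new Promise<T>((_, reject) =>
      setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms)
    ),
  ]);
}

async function handleCreateBooking(
  req: BookingRequest,
  createBooking: (r: BookingRequest) => Promise<unknown>
): Promise<ApiResult> {
  const started = Date.now();
  try {
    const booking = await withTimeout(createBooking(req), 5_000);
    return { status: 201, body: booking };
  } catch (err) {
    // Classify: caller's fault vs ours. This decides the status code,
    // what gets alerted on, and whether a retry is safe.
    if (err instanceof ValidationError) {
      return { status: 400, body: { error: err.message } };
    }
    // Log with enough context to debug at 2 am without a local repro.
    console.error("booking.create failed", {
      requestId: req.id,
      elapsedMs: Date.now() - started,
      err,
    });
    return { status: 500, body: { error: "internal error" } };
  }
}
```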
This is the work that distinguishes a production system from a demo. AI hasn't replaced any of it.
What this implies for hiring
The implication isn't "AI replaces juniors." Junior engineers still learn the same things — they just learn them faster, with AI as a force multiplier rather than a manual that gets thrown out at chapter four. The implication is harder.
The implication is that throughput per senior has roughly doubled. A senior who used to ship two features a week now ships four, because the rote work below their level is done in minutes instead of days. This means the most expensive thing in software is no longer "writing the code." It's deciding what code to write. That's a permanent shift toward judgement-as-the-product.
So if you're hiring: you need fewer engineers than you did three years ago, and the ones you hire need to be more senior than they used to be. The role of "junior who learns by doing rote work over six months" has compressed. There's still a path in, but it's faster and it requires more taste, sooner.
If you're being hired: the premium for senior judgement has gone up. The work that used to differentiate you — clean code, careful error handling, good abstractions — is now table stakes that anyone with AI assistance can produce. The work that now differentiates you is everything that isn't the code: scope, architecture, debugging at the system level, code review with taste, and the boring 20% that decides whether a production system survives its first crisis.
What we tell founders
When founders ask us "should I hire one senior or three juniors with AI?", we tell them, almost always: one senior. Not because AI has made juniors useless. It hasn't. But because the bottleneck on a small team is no longer typing speed; it's the quality of the decisions being made. And one experienced person, augmented with AI, makes higher-quality decisions than three less-experienced people, augmented with AI.
This is a different answer from the one we'd have given in 2022. The work changed. The hiring should too.
If you've built an AI-assisted MVP and want a senior read on whether you're ready to scale it past the first ten customers, that's exactly what we do at Think and Form. We run a one-week production-readiness pass and write up where the system is and isn't ready. Get in touch at admin@thinkandform.co.nz.