Path to self-driving software
I came back from holiday and every engineer I follow on Twitter was using coding agents. Claude Code, Codex, Gemini, all of them getting better. I’d been using AI in code editors, but not in an agentic way. This time I gave it a proper go.
We had a bunch of bugs to clear before starting the year fresh. Small fixes, contained changes. I pointed an agent at them and let it run.
It worked surprisingly well. The agent would read files, write code, run tests, loop when something broke. It just kept going until the problem was solved.
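The loop itself is not mysterious. Here's a minimal sketch of that cycle, where `step` stands in for whatever call the agent makes to the model; everything in it is illustrative, not how any particular tool is built:

```python
import subprocess
from typing import Callable

def agent_loop(step: Callable[[str], None], max_iters: int = 20) -> bool:
    """Minimal agentic cycle: let the model edit the repo, run the
    tests, feed failures back in, stop when the suite passes."""
    feedback = ""
    for _ in range(max_iters):
        step(feedback)  # the model call: reads files, writes code
        result = subprocess.run(["pytest", "-q"],
                                capture_output=True, text=True)
        if result.returncode == 0:
            return True  # green suite: the problem is solved
        feedback = result.stdout + result.stderr  # loop on the break
    return False
```

The whole trick is in that feedback edge: test output goes back into the next step, so the agent keeps going until the problem is solved or it runs out of turns.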
But when I tried to go further, I hit friction. I’d imagine many others did too.
Most of our code wasn’t written yesterday. It carries context that lives in people’s heads. Local dev looks different from staging, staging looks different from prod. Even amongst humans, there’s no standard way of doing things. And all my life as an engineer, I’ve worked on one thing at a time. Take a task, write the code, open a PR. That’s the rhythm.
AI agents make parallelisation possible. Not for complex work, but for the simpler stuff. But when the agent writes code, how do I know it’s right? How do I check on what it’s doing without waiting until the end? The terminal scrolls by. I can’t parse it. I don’t want to stop it, but I don’t trust it either.
What I wanted was to verify the work the way I’ve always verified work. Open it in my editor. Read the code. See the diff. Use the tools I already know.
That’s the gap I started thinking about.
Adoption becomes autonomy
The transition to AI isn’t a flip. It’s not about replacing everything overnight.
It’s humans and agents working together, gradually building toward autonomy. The more you work with agents, the more context you generate. The more context they have, the more they can do.
I wasn’t looking at what others were doing with AI and trying to copy it. I was thinking about how to bring agents into the workflows we already have.
It should feel familiar. Same git, same editor, same workflow. When you take over from an agent, you’re working the way you always have.
It should enable parallelisation. This is a real shift in how you operate. Humans take one task at a time. But with isolated workspaces, you can spin up multiple agents on different tasks, check back when ready, review and merge what works. It takes time to get comfortable with. But it’s how leverage actually scales.
Eventually, with strong enough verification (tests, deploys, rollbacks), software starts driving itself. But you don’t get there by rewriting everything. You get there by working together.
An environment for humans and agents
Agents should work the way you already work. Same tools, same patterns. Environments you control. The ability to check in whenever you need to.
You already branch for features. You already isolate work so it doesn’t conflict. A workspace for agents is that same pattern: a branch, a set of repos, everything kept separate from your other work. You can have many running at the same time.
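One way to get that isolation with plain git is worktrees: each agent gets its own branch in its own checkout, nothing shared but history. A sketch, with paths and branch names that are purely illustrative:

```python
import subprocess

def make_workspace(task_slug: str) -> str:
    """Create an isolated workspace: a fresh branch off main in its
    own checkout, via git worktree."""
    path = f"../workspaces/{task_slug}"
    subprocess.run(
        ["git", "worktree", "add", "-b", f"agent/{task_slug}", path, "main"],
        check=True,
    )
    return path

# Three agents, three workspaces, no conflicts between them.
for task in ["fix-login-redirect", "bump-deps", "quieten-flaky-test"]:
    make_workspace(task)
```

Each checkout is a normal git directory: cd in, work, remove the worktree once the branch merges.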
You already run things in containers. An agent can run in one too, with full permissions inside, isolated from your system. It can install packages, run commands, do whatever it needs. When you want to check on its work, you open the same files in your editor.
You already spin up Postgres, Redis, whatever the app needs. Agents need the same. They should be able to run tests against real services, not mocks. The environment should feel like yours because it is yours.
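Here's a sketch of what one agent's environment might look like, driving the docker CLI from Python. The images, network, and mount layout are assumptions, not a prescription:

```python
import subprocess

def run(cmd: list[str]) -> None:
    subprocess.run(cmd, check=True)

def start_environment(workspace: str, name: str) -> None:
    """One agent's world: real services plus a sandboxed container
    with full permissions inside and the workspace mounted in."""
    run(["docker", "network", "create", name])
    # Real services, not mocks: tests hit an actual Postgres and Redis.
    run(["docker", "run", "-d", "--name", f"{name}-pg",
         "--network", name, "-e", "POSTGRES_PASSWORD=dev", "postgres:16"])
    run(["docker", "run", "-d", "--name", f"{name}-redis",
         "--network", name, "redis:7"])
    # The agent's container: isolated from the host, free inside it.
    run(["docker", "run", "-d", "--name", f"{name}-agent",
         "--network", name, "-v", f"{workspace}:/work", "-w", "/work",
         "ubuntu:24.04", "sleep", "infinity"])
```

Because the workspace is bind-mounted, the files the agent edits are the same files on your disk. Checking on its work is just opening that directory.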
The handoff is what matters most. One command opens the workspace in your editor. Same files, same git history, same workflow. You verify the work using tools you already know.
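Concretely, the handoff can be as thin as this, assuming VS Code's `code` CLI (any editor's launcher works the same way):

```python
import subprocess

def hand_off(workspace: str) -> None:
    """Open the agent's workspace in the editor and show its diff
    against main: the same review surface as any human PR."""
    subprocess.run(["code", workspace], check=True)
    subprocess.run(["git", "-C", workspace, "diff", "main...HEAD"],
                   check=True)
```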
The agent works in your world, not some parallel universe you need to translate from.
Self-driving software
You start writing things down that you never bothered to document before. Agent docs, notes on how things connect, context that used to live only in your head. Now something else needs to understand your codebase, and that something is actually reading what you write.
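What those docs look like is mundane. Conventions vary by tool (Claude Code reads CLAUDE.md; others use AGENTS.md), but the content is the same kind of thing. A made-up example:

```markdown
# Notes for agents

- Local dev runs via docker compose; staging and prod are deployed
  differently. Do not assume local behaviour matches prod.
- Run `make test` before finishing; the integration tests need
  Postgres running.
- New API handlers follow the shape of the existing ones in api/.
  Copy that pattern rather than inventing a new one.
```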
The agents get better at your codebase specifically. They pick up your conventions, your preferences, the way you structure things. Knowledge that used to live in one person’s head starts living in the system.
This is the flywheel. Adoption generates context. Context improves capability. Capability makes adoption easier. Teams that start accumulating this context now will have compounding advantages later.
That’s the bet. Not rewriting everything for AI, but working together and building up context over time.
I took a bunch of Waymo rides last year. It’s a car. I could see the road, the steering wheel turning, the world outside. Familiar, but also kind of magical.
Self-driving software should feel the same. Same git, same editor, same workflow. You can open the workspace, see the code, understand what changed. Familiar, but the work moves forward on its own.