Engineering · May 8, 2026

What AI actually does (and doesn't do) when we build your software

By Aaron McClendon, Founder & CTO, Arkitekt AI

Every prospect call lately includes some version of the same question: *how much of this is actually built by AI?* It's a fair thing to ask. The honest answer is somewhere between the marketing pitch you've heard from larger shops and the skepticism you're probably bringing to the conversation.

Here's how it actually works on our side.

AI writes a lot of first drafts. It ships none of them.

When we kick off a build, agentic tools handle a real chunk of the early work: scaffolding services, drafting database migrations, writing the first pass of tests, generating CRUD endpoints, stubbing out integrations. On a typical small internal tool, that might be 60-70% of the initial code volume.

What AI doesn't do is decide what to build, choose the architecture, or merge anything into main. That stays with a human engineer, and we're deliberate about it.

The 2025 DORA report on AI-assisted development backs up what we see on the ground: AI adoption is now near-universal among developers, but the teams getting real throughput and stability gains are the ones with strong review practices and clear guardrails around where AI gets to operate. The teams without those guardrails often get worse, not better.

The METR study is the one we point clients to

There's a randomized study from METR that found experienced open-source developers using early-2025 AI tools took 19% longer to finish tasks in codebases they already knew well, even though they *believed* afterward that AI had made them about 20% faster. We bring this up because it matches our experience. AI is great at the unfamiliar, the boilerplate, and the tedious. It's mediocre at nuanced changes inside code it didn't write, and it's confidently wrong often enough that you can't trust it without review.
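To make that perception gap concrete, here is a small worked example. It is only a sketch: the 100-minute baseline is a made-up number, the function name is illustrative, and it assumes both figures are read as a percentage change in task completion time.

```python
def minutes_with_ai(baseline_minutes: int, percent_change: int) -> float:
    """Task time after applying a percentage change to the no-AI baseline.

    Integer percentages keep the arithmetic exact for this illustration.
    """
    return baseline_minutes * (100 + percent_change) / 100


BASELINE = 100  # hypothetical: minutes the task takes without AI

measured = minutes_with_ai(BASELINE, 19)    # study result: 19% longer
perceived = minutes_with_ai(BASELINE, -20)  # self-report: 20% less time
gap = measured - perceived

print(f"measured:  {measured:.0f} min")
print(f"perceived: {perceived:.0f} min")
print(f"gap:       {gap:.0f} min per {BASELINE}-minute task")
```

Under those assumptions, the distance between how long the work feels and how long it takes is more than a third of the original task, which is why we don't rely on gut feel to judge where AI is helping.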

So our rule is simple: an agent can draft, but a person reads every line that ships. No exceptions for "small" changes. Most of our review time goes into the parts AI handled, not the parts the engineer wrote.
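As a rough illustration, that rule can be written as a merge gate. This is a minimal sketch, not our actual tooling: `ChangedFile`, `read_by`, `unread_files`, and `can_ship` are hypothetical names, and how a real pipeline records who read what is left as an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ChangedFile:
    path: str
    lines_changed: int
    drafted_by_agent: bool  # hypothetical label; how drafts get tagged is an assumption
    read_by: set[str] = field(default_factory=set)  # humans who read the full diff


def unread_files(files: list[ChangedFile]) -> list[str]:
    """Paths that no person has read yet.

    lines_changed is deliberately ignored: there is no exemption
    for "small" changes, whoever (or whatever) drafted them.
    """
    return [f.path for f in files if not f.read_by]


def can_ship(files: list[ChangedFile]) -> bool:
    """A change ships only when every file in it has a human reader."""
    return not unread_files(files)


# Example: a one-line agent-drafted tweak still blocks the merge.
change = [
    ChangedFile("migrations/0001_init.sql", 40, True, {"reviewer_a"}),
    ChangedFile("README.md", 1, True),
]
blocked = unread_files(change)
print(blocked, can_ship(change))
```

The design choice worth noticing is what the check leaves out: it never looks at the size of the change, so a one-line edit gets the same treatment as a new service.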

What "days to deliver" actually requires

When we tell a client we can have something working in a week, that speed isn't the agent. It's the stack underneath it. We build on a small set of patterns we know cold: a managed Postgres, a typed API layer, a queue, a deployment pipeline that's already configured, a monitoring setup that's already wired up. The agent fills in the business logic on top.

CIO's piece on agentic workflows in 2026 calls this shift "delegate, review and own." That's roughly right. Our engineers spend less time typing and more time deciding what's worth typing, then validating that what came back is actually correct.

What this means for you

If you're evaluating an AI-assisted shop, ask two questions: who reviews the code before it ships, and what happens when the agent gets something wrong in production? If the answer to either is hand-wavy, keep looking.

The magic isn't the AI. The magic is knowing where not to use it.

Arkitekt AI builds production-grade custom software on managed infrastructure, delivered autonomously at AI speed. If you're paying for tools that almost fit, let's talk.

arkitekt-ai.com

Source: “Inside Big Software's fight for its life,” Ashley Stewart, Business Insider, April 7, 2026.
