Karpathy Says AI Coding Agents Made Programming 'Unrecognizable' Since December

Karpathy claims December 2025 was when AI coding agents crossed from unreliable to functional. Bloomberg reports a productivity panic in tech.

Andrej Karpathy, the former Tesla AI chief and OpenAI co-founder, said Wednesday that programming has changed more in the last two months than in decades, arguing that AI coding agents "basically didn't work before December and basically work since." Karpathy pointed to a specific shift in model quality, long-term coherence, and what he called "tenacity," the ability for agents to power through large tasks without losing the thread. Bloomberg reported Thursday that a "productivity panic" is spreading through tech companies as AI coding tools like Claude Code and Cursor redraw the boundaries of what software engineers actually do.

The claim carries weight because of who is making it. Karpathy helped build Tesla's Autopilot, co-founded OpenAI, and runs one of the most-watched AI accounts on the internet. Three weeks ago he rebranded "vibe coding" as "agentic engineering." Now he's saying something bigger: the era of humans typing code into editors, the default since computers were invented, is ending.

The Breakdown

  • Karpathy says AI coding agents crossed from unreliable to functional in December 2025, turning weekend projects into 30-minute tasks.
  • Deep technical expertise becomes a bigger multiplier with agents, not a smaller one, Karpathy argues.
  • Production engineers push back: debugging AI-generated code takes 3x longer and governance frameworks barely exist.
  • Bloomberg reports a 'productivity panic' in tech as young software workers face a 16% job posting decline.


The December threshold

Karpathy drew a hard line in time. Not a gradual improvement. Not the usual creep of better tooling. He pinpointed December 2025 as the month AI coding agents crossed from unreliable to functional, a break from everything that came before.

To illustrate, he described handing an AI agent a single dense prompt: log into a local DGX Spark, set up SSH keys, download and benchmark a vision model, build a video analysis dashboard for home security cameras, wire up system services, and write a markdown report. The agent ran for roughly 30 minutes. It hit errors, researched solutions online, and resolved them one by one; it wrote the code, tested it, configured the services, and came back with a finished report. Karpathy didn't touch anything.

"All of this could easily have been a weekend project just three months ago," he wrote. "Today it's something you kick off and forget about for 30 minutes." Walk away. Make coffee. Come back to a finished product.

DHH, the Ruby on Rails creator who has been shipping software since the Reagan administration, agreed. "Biggest and fastest change in the 40 years I've tried to make computers do my bidding," he wrote. Fun, too, apparently.

But the DGX Spark example is a showcase, and Karpathy knows it. Clean-sheet project. No legacy constraints. One user. Testable output. The more revealing detail sits in his description of the new daily workflow. Developers now spin up multiple AI agents, assign them tasks in plain English, and review their output in parallel. The agents carry tools, memory, and instructions. They research on their own. They course-correct when they hit walls.

The biggest prize, Karpathy argued, is in "ascending the layers of abstraction," building orchestrator systems that manage multiple parallel coding instances. Think of it as delegation with a very fast, very literal staff. The person at the top still needs to know what good output looks like. But the mechanics of typing, searching documentation, debugging syntax errors, all of it collapses into a monitoring function.
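The orchestrator pattern Karpathy describes can be sketched in a few lines. Everything below is a hypothetical illustration, not any real agent API: `run_agent` is a stand-in for a call out to a coding agent, and the task strings are invented examples.

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(task: str) -> str:
    """Stand-in for dispatching a plain-English task to a coding agent.
    A real implementation would call an agent API and return its output."""
    return f"[draft output for: {task}]"

def orchestrate(tasks: list[str]) -> dict[str, str]:
    """Fan tasks out to agents in parallel, then collect drafts for human review."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        results = pool.map(run_agent, tasks)
    return dict(zip(tasks, results))

if __name__ == "__main__":
    drafts = orchestrate([
        "benchmark the vision model and summarize latency",
        "wire the dashboard to the camera feed",
        "write the markdown report",
    ])
    for task, output in drafts.items():
        # the human's job compresses to this review loop
        print(task, "->", output)
```

The point of the sketch is the shape of the work, not the plumbing: decomposition happens up front, execution runs in parallel, and the person at the top is left with the review loop at the end.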

He coined the term "agentic engineering" for this practice three weeks ago. The rebrand drew immediate scrutiny over security, particularly given an AI coding tool market that Goldman Sachs valued at $45 billion earlier this year and a thin track record on autonomous code safety. Karpathy is now making a bolder claim: the programming workflow most engineers learned, the one that has persisted since the 1960s, just broke.

Not prompters, multipliers

When a follower asked whether teams of hundreds would be "replaced by a few chosen prompters," Karpathy pushed back hard. The word "prompters" was doing the shift a disservice, he said, and represented a misunderstanding of what's actually happening.

"At the top tiers, deep technical expertise may be even more of a multiplier than before," he argued. The tools don't replace knowledge. They amplify it.

That's a specific claim about who benefits. Not everyone with a keyboard gets the same returns. The people who already understand what an agent is doing on their behalf, who know which tools are at its disposal, what's hard and what's easy, those are the ones completing weekend projects in 30 minutes. For everyone else, the gains shrink and the failure modes get stranger.

Karpathy was blunt about the mechanism. "It's not magic, it's delegation." Anyone who has managed a team recognizes the dynamic. The people who decompose work well for junior engineers decompose it well for agents too. The ones who micromanage humans micromanage machines, and get about as far.

The argument fits what early adoption data showed. A 2025 study of vibe coding practices found startups using AI coding tools at rates 20% higher than large companies, with some building 95% AI-generated codebases. But the highest-performing teams weren't the ones handing everything off blindly. They were the ones with enough technical depth to decompose tasks precisely and catch mistakes before they compounded.

Karpathy acknowledged the limits openly. AI coding works "a lot better in some scenarios than others," he said, particularly for well-specified tasks where you can verify and test the output. Building intuition for what to hand off and what to keep, that's the new core skill. And it cannot be prompted into existence.

The people who disagree

Not everyone in the replies shared the enthusiasm. The split was predictable: builders on greenfield projects were emboldened; engineers maintaining production systems were defensive.

Daniel Ost flagged what he called the obvious problem. "When AI fails, debugging takes 3x longer because you're trying to understand code you never wrote." Anyone who has inherited a colleague's codebase knows that particular frustration. Now multiply it by an agent that made architectural choices it cannot explain, and the time cost compounds fast.

Yacine Mahdid, an AI researcher, put it shorter. "You can outsource your thinking but you cannot outsource your understanding."

Rafał Kobyliński, a developer working with production code, challenged Karpathy directly. On UI, networking, and concurrency, the things that actually break in production, he said he was getting "hardly better results than last year." Karpathy's response was revealing. Rather than contesting the point, he suggested Kobyliński might be "holding it wrong" and pointed to new browser-based agent tools and better context management techniques. That's an answer from someone who knows the criticism has teeth but doesn't want to concede the narrative.

The exchange exposed a tension sitting right at the center of the agentic engineering pitch. Karpathy's showcase, a greenfield home project with no legacy code, no team dependencies, no compliance requirements, and no users filing bug reports at 2 AM, lands very differently than a Fortune 500 production environment carrying 15 years of accumulated technical debt. Setting up a video dashboard from scratch tests one set of capabilities. Keeping a banking application running while three concurrent regulatory audits demand attention tests something else entirely.

Leandro Alvarenga pressed on governance. If agents get credentials, how do you ensure real sandboxing? How do you audit code that the agent researched and generated autonomously? How do you maintain accountability when multiple layers of orchestration sit between the developer and the final commit?

Nobody offered answers. Including Karpathy. The thread moved on to workflow tips.

The productivity panic

Bloomberg's timing was not accidental. The outlet reported the following day that AI coding agents are fueling a "productivity panic" across tech, with companies scrambling to determine whether these tools compress development timelines, eliminate roles, or both. Cursor had announced a major update days earlier as the AI coding tool market accelerated. Anthropic's CEO told Fortune that AI coding would deliver an impact comparable to the printing press.

The anxiety inside these companies isn't about whether agents work. That argument is settling. It's about what happens to the people between the orchestrator and the output.

A record share of unemployed Americans now hold bachelor's degrees while productivity metrics keep climbing. That gap between output and opportunity widened again in January. Stanford research from August 2025 showed young workers in software and customer service already experiencing a 16% decline in job postings. The SF Standard asked the question on February 19 without any cushioning: what's left for software engineers when AI writes the code?

Karpathy's answer, that expertise becomes a bigger multiplier, offers cold comfort if you're a junior developer who planned to build that expertise through years of hands-on coding. The very activity that taught you how to delegate is now the activity being delegated. That's the math nobody wants to show on a whiteboard.

The delegation economy

Strip away the excitement and the anxiety, and Karpathy's post describes a specific economic force. The cost of executing known programming tasks is collapsing toward zero. The value of knowing which tasks to execute, and how to verify the results, keeps climbing. One number falls while the other rises. The people who sit at that intersection get richer.

The pattern has precedent. Architects didn't disappear when CAD software replaced hand drafting. Drafters did. Designers with good taste and structural instincts gained speed. People who spent their days rendering elevations by hand lost their purpose. The analogy is imperfect, every analogy is, but the underlying force runs the same direction: when execution gets cheap, judgment gets expensive. And the transition never waits for the people it displaces to retrain.

Karpathy himself said this is "nowhere near 'business as usual' time in software." He's right about that much. December may or may not have been the clean inflection point he claims. Agents still choke on production code. Security frameworks for autonomous code generation barely exist. Governance remains an afterthought at most companies deploying these tools.

But 30 minutes for a weekend project is a data point, not a prediction. And the engineers watching that number shrink aren't debating whether the shift is real. They're counting the distance between their desk and the delegation line.

Frequently Asked Questions

What changed about AI coding agents in December 2025?

Karpathy says models gained higher quality, long-term coherence, and the ability to power through complex multi-step tasks without losing context. Before December, agents required constant supervision. Now they can autonomously research solutions, debug errors, write code, and configure services over 30-minute sessions.

What is agentic engineering?

A term Karpathy coined in February 2026 for managing multiple AI coding agents in parallel rather than writing code directly. Developers assign tasks in English, provide context and tools, then review output. The skill is in decomposing work correctly and verifying results, not typing syntax.

Do AI coding agents work on production code?

Results are mixed. Karpathy acknowledges agents work best on well-specified tasks with testable output. Developer Rafał Kobyliński reported hardly better results than last year on production UI, networking, and concurrency. Greenfield projects see the biggest gains while enterprise codebases with technical debt remain challenging.

Will AI coding agents replace software engineers?

Karpathy argues technical expertise becomes more valuable because skilled engineers extract more from agents. But Stanford data shows a 16% decline in software job postings for young workers. The risk concentrates among junior developers who need hands-on coding to build the expertise that makes delegation effective.

What is the productivity panic Bloomberg reported?

Bloomberg reported February 26 that tech companies are scrambling to determine whether AI coding tools compress timelines, eliminate roles, or both. The concern intensified as Cursor announced a major update and Anthropic's CEO compared AI coding's impact to the printing press.
