Is English the New Programming Language?

Programming computers in English sounds impossible. But Andrej Karpathy built a working iOS app without knowing Swift, using only natural language prompts. He calls it Software 3.0. These AI systems think like humans, complete with superhuman memory and distinctly human mistakes.

Software 3.0: Programming Computers in English

💡 TL;DR - The 30-Second Version

💬 Programming computers now happens in plain English through prompts, marking the third major shift in software development in 70 years.

📊 Karpathy built iOS apps without knowing Swift and created functional web tools in hours using natural language descriptions.

🏭 At Tesla, neural networks ate through the Autopilot stack, deleting thousands of lines of C++ code and replacing traditional functions with learned weights.

⚡ AI models work like 1960s mainframes - expensive, centralized in the cloud, accessed through time-sharing via APIs.

🧠 Language models are "people spirits" with superhuman memory but human-like flaws, insisting 9.11 is greater than 9.9.

🚀 The future belongs to partial autonomy tools where humans control an "autonomy slider" rather than fully automated systems.

Andrej Karpathy stood before a packed auditorium in San Francisco and dropped a simple truth: we're programming computers in English now. The former Tesla AI director and Stanford researcher didn't sugarcoat it. Software has changed twice in 70 years. The third revolution is happening right now.

Most people missed it. They see ChatGPT as a fancy search engine or a homework helper. Karpathy sees something else entirely. He sees a new computer that runs on words instead of code.


He calls it Software 3.0. The progression makes sense when you step back. Software 1.0 was traditional programming - humans writing explicit instructions for computers in languages like Python or Java. Software 2.0 emerged with neural networks, where humans curated datasets and algorithms learned patterns. Now Software 3.0 flips everything again. The prompt is the program. English is the programming language.
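To make the contrast concrete, here is a minimal sketch, assuming a hypothetical `llm` client with a `complete()` method rather than any specific vendor's API. The 1.0 version is explicit logic a human wrote; the 3.0 version is a program written in English.

```python
# Software 1.0: explicit, hand-written instructions.
def classify_sentiment_v1(review: str) -> str:
    """Only as smart as the rules the programmer thought to write."""
    negative_words = {"terrible", "awful", "refund", "broken"}
    words = set(review.lower().split())
    return "negative" if words & negative_words else "positive"

# Software 3.0: the prompt IS the program, written in English.
PROMPT = (
    "You are a sentiment classifier. Read the review below and reply "
    "with exactly one word: positive or negative.\n\nReview: {review}"
)

def classify_sentiment_v3(review: str, llm) -> str:
    """`llm` stands in for any hosted model's completion endpoint."""
    return llm.complete(PROMPT.format(review=review)).strip().lower()
```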

This isn't just a clever analogy. Karpathy spent years watching neural networks eat through Tesla's Autopilot codebase. Functions written in C++ got replaced by neural network weights. The pattern repeated across the entire stack. Now he sees it happening again with large language models.

The New Operating System

Language models aren't just better search engines. They're operating systems. Karpathy draws the parallel deliberately. Like Windows or macOS, they manage memory and compute resources. The context window acts as RAM. The model orchestrates different capabilities - reasoning, tool use, multimodal processing - just like an OS manages applications.

The comparison extends further. You can take an AI application like Cursor and run it on different models - GPT, Claude, or Gemini. It's like downloading software that works across Windows, Mac, and Linux. Same app, different underlying system.
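A sketch of that portability, using an illustrative interface rather than any vendor's actual SDK: the application targets a thin contract, and any model that satisfies it can be swapped in underneath.

```python
from typing import Protocol

class LLMBackend(Protocol):
    """The contract an app targets -- analogous to an OS system-call layer."""
    def complete(self, prompt: str) -> str: ...

def summarize(backend: LLMBackend, document: str) -> str:
    # The app logic never names GPT, Claude, or Gemini directly,
    # just as portable software avoids OS-specific calls.
    return backend.complete(f"Summarize in one sentence:\n{document}")

# Any object with a matching complete() method -- a GPT wrapper, a Claude
# wrapper, a local model -- can be passed in without changing the app.
```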

We're stuck in the 1960s equivalent of computing right now. Models are expensive, so they live in the cloud. Everyone shares time on these massive systems through APIs. Personal AI computing hasn't arrived yet, though Karpathy sees early signs with devices like Mac minis running local models.

The distribution model resembles electricity more than traditional software. Labs spend enormous capital training models, then sell access through metered APIs. Users demand utility-level reliability - low latency, high uptime, consistent quality. When models go down, it creates what Karpathy calls "intelligence brownouts" across the planet.

Programming People Spirits

Here's where things get weird. Karpathy describes language models as "people spirits" - stochastic simulations of human behavior trained on human text. This isn't mysticism. It's practical psychology for working with these systems.

These artificial minds have superhuman memory and encyclopedic knowledge. They can recall SHA hashes and phone books like the savant in Rain Man. But they also hallucinate, insist that 9.11 is greater than 9.9, and claim there are two Rs in "strawberry." They display what researchers call jagged intelligence - genius in some areas, elementary mistakes in others.

Most importantly, they suffer from a kind of digital amnesia. Unlike human coworkers who build institutional knowledge over time, language models start fresh with each conversation. Their context window is working memory, not long-term storage. You have to program this working memory directly.
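Here's what programming that working memory looks like in practice, again assuming a hypothetical `llm.chat()` endpoint: because the model is stateless, the application must resend the entire conversation - the context window - on every single call.

```python
history: list[dict] = []  # this list IS the model's working memory

def ask(llm, user_message: str) -> str:
    history.append({"role": "user", "content": user_message})
    reply = llm.chat(messages=history)  # full history ships with every call
    history.append({"role": "assistant", "content": reply})
    return reply  # drop the list and the "coworker" forgets everything
```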

The Partial Autonomy Revolution

The real opportunity isn't full automation. It's partial autonomy. Karpathy learned this lesson watching self-driving cars. He rode in a perfect Waymo demo in 2013 - no interventions for 30 minutes around Palo Alto. Twelve years later, we still don't have widespread autonomous vehicles.

Software autonomy will follow a similar timeline. The future belongs to tools that let humans and AI work together, with humans controlling the autonomy slider. Cursor exemplifies this approach. You can use it for simple autocomplete, targeted code changes, or full repository modifications. The user decides how much control to surrender.

Successful AI applications share common patterns. They manage context automatically, orchestrate multiple model calls behind the scenes, and provide custom interfaces for human oversight. Most importantly, they optimize the generation-verification loop. AI generates, humans verify. The faster this cycle runs, the more productive everyone becomes.
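A toy version of that loop, assuming the same hypothetical `llm.complete()` endpoint as above: generation is instant; the bottleneck worth designing for is how quickly the human can verify and steer.

```python
def generate_and_verify(llm, task: str, max_rounds: int = 3) -> str | None:
    """AI generates, human verifies; feedback steers the next round."""
    feedback = ""
    for _ in range(max_rounds):
        draft = llm.complete(f"Task: {task}\nReviewer feedback: {feedback}")
        print(draft)
        verdict = input("Accept? [y / or type feedback] ").strip()
        if verdict.lower() == "y":
            return draft        # verified by a human; safe to ship
        feedback = verdict      # the human stays on the autonomy slider
    return None                 # never auto-accept unverified output
```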

Everyone Is a Programmer Now

The most radical shift isn't technical - it's social. Programming used to require years of study. Learning syntax, debugging, understanding computer science fundamentals. That barrier just disappeared.

Karpathy calls it "vibe coding" - building software through natural language descriptions rather than formal programming knowledge. He built an iOS app without knowing Swift. Kids are creating applications by describing what they want in plain English.

This democratization terrifies some programmers and thrills others. The optimists see it as a gateway drug to deeper technical skills. The pessimists worry about quality and understanding. Both perspectives miss the bigger picture. We're not replacing programmers. We're expanding the definition of programming.

Building for Digital Spirits

If AI systems are new consumers of digital information, we need to redesign everything for them. Karpathy advocates for simple changes that unlock enormous capabilities.

Documentation should be published in clean markdown that models can ingest directly, not locked inside pages formatted only for human readers. Websites should include LLM-friendly descriptions alongside traditional content. APIs should provide clear, parseable instructions instead of forcing AI to navigate visual interfaces.
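What might that look like? As a purely illustrative sketch - the service, endpoint, and fields below are made up - an LLM-facing page strips away navigation and layout and states the contract directly:

```
# Acme Billing API (machine-readable summary)

To create an invoice, POST JSON to /v1/invoices:
{"customer_id": "<id>", "amount_cents": 1999, "currency": "usd"}

Auth: send `Authorization: Bearer <api_key>` on every request.
Errors return {"error": {"code": "...", "message": "..."}}.
Human-oriented docs: https://example.com/docs
```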

Some companies are already adapting. Vercel and Stripe offer model-specific documentation. Tools like git-ingest convert GitHub repositories into AI-readable formats. The Model Context Protocol from Anthropic creates standardized ways for AI systems to access information.

This isn't about replacing human interfaces. It's about meeting AI systems halfway. Yes, future models will navigate websites and click buttons. But making information directly accessible reduces costs and improves reliability.

The Iron Man Paradigm

Karpathy's favorite metaphor is the Iron Man suit. Tony Stark can pilot it manually or let it operate autonomously. The suit augments human capabilities while maintaining the option for full automation.

This captures the sweet spot for AI applications. Build augmentation tools that can gradually become more autonomous. Focus on the suit, not the robot. Keep humans in the loop while progressively expanding AI capabilities.

The industry obsession with fully autonomous agents misses this point. A thousand-line code diff is useless if a human can't verify it quickly. Better to generate smaller changes that humans can review and approve rapidly.

Why this matters:

  • We're not just getting better tools - we're building a fundamentally different kind of computer that thinks in language rather than logic.
  • The companies that figure out human-AI collaboration will dominate the next decade, while those chasing full automation will struggle with the complexity of keeping fallible systems on track.


❓ Frequently Asked Questions

Q: How much coding knowledge do I need to start "vibe coding"?

A: None. Karpathy built a working iOS app without knowing Swift and created Menugen.app in a few hours using natural language descriptions. The barrier is now understanding what you want to build, not how to code it.

Q: What's the difference between Software 2.0 and 3.0?

A: Software 2.0 uses neural network weights trained on data. Software 3.0 uses prompts written in English to program language models. The key shift: you write instructions in natural language instead of tuning datasets.

Q: Why does Karpathy compare this to the 1960s computing era?

A: Computing in the 1960s was expensive, centralized in mainframes, and accessed through time-sharing. Today's AI models cost millions to train, live in the cloud, and serve multiple users through APIs - same pattern, 60 years later.

Q: What happened to Tesla's Autopilot code when neural networks took over?

A: The neural networks literally "ate through" the C++ codebase. Functions for stitching camera images and processing data across time got replaced by neural network weights. Thousands of lines of traditional code disappeared.

Q: How long did it take Waymo to go from perfect demos to real deployment?

A: Karpathy rode in a flawless 30-minute Waymo demo in 2013. Twelve years later, we still don't have widespread autonomous vehicles. Even current Waymo cars require significant human oversight and remote assistance.

Q: What makes language models "people spirits" instead of just advanced search engines?

A: They're trained on human text, so they simulate human reasoning patterns. They have superhuman memory but make distinctly human-like errors - insisting 9.11 is greater than 9.9 or claiming "strawberry" has two Rs.

Q: What's an "intelligence brownout"?

A: When major AI models go down, global productivity drops as millions of users lose access to AI assistance. Like electrical brownouts reduce power reliability, model outages reduce the planet's available intelligence capacity.

Q: Why does the generation-verification loop matter so much?

A: AI generates content instantly, but humans still verify quality. The faster you can review and approve AI output, the more productive you become. Tools like Cursor optimize this with visual diffs and keyboard shortcuts.
