Think of water heating toward 100 °C. For a long time it gets hotter and hotter, but it's still water. Then, suddenly, it flips to steam. That's what happened for Karpathy in December 2025: the models didn't just get a little better. Something flipped.
"I was in this perpetual state of AI psychosis. There was a huge unlock in what you can achieve as a person. I went from 80/20 to writing basically zero code myself. I don't think I've typed a line of code since December."
Karpathy describes "AI psychosis" as the disorienting state where your individual productive capacity suddenly multiplies. You're not coding anymore -- you're expressing your will to agents for 16 hours a day. And the skill ceiling is infinite. Every failure feels like a "skill issue" because in principle you could have prompted better, set up better context, run more agents.
The question is no longer "can AI write code?" The agent part is taken for granted. The new frontier:
"Code's not even the right verb anymore. It's not prompting either -- it's context and spec engineering, and then all of the harness things: tools, workflows."
He calls persistent agents "claws" -- always-on entities with memory, tool access, and the ability to reach into the real world (control smart home devices, manage Sonos speakers, run experiments). The next question: how do you manage teams of them?
"I feel a need for a proper 'agent command center' IDE for teams of them. See/hide toggle, check if any are idle, pop open terminals, usage stats. We're going to need a bigger IDE."
Imagine a PhD student who never sleeps, never gets discouraged, and runs experiments 24/7. They read the results of experiment #47, notice a pattern, design experiment #48, run it, and keep going. That's AutoResearch -- except it ran 700 experiments autonomously over 2 days on a single GPU.
~700 experiments over ~2 days. ~20 changes that all stacked additively:
"Time to GPT-2" dropped from 2.02 hours to 1.80 hours (11% improvement)
"This is a first for me because I am very used to doing iterative optimization of neural network training manually. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through ~700 changes autonomously is wild."
Karpathy's most provocative idea: make AutoResearch massively collaborative, like SETI-at-Home but for AI research. Millions of GPUs sitting idle in consumer hands. Agents contribute experimental results back to a shared repository. Not emulating one PhD student -- emulating an entire research community.
"The goal is not to emulate a single PhD student, it's to emulate a research community of them. Git is almost but not really suited for this -- it has a softly built-in assumption of one 'master' branch."
Skills that matter less: typing speed, syntax knowledge, boilerplate generation, manual hyperparameter tuning, reading docs to find API calls.
Skills that matter more: context engineering, spec writing, taste/aesthetics, knowing what to build, calibration ("when to trust the output"), agent orchestration.
"The real competitive edge is moving from 'can write code' to 'can ask the right questions.'"
Think of how animals evolved to fill ecological niches -- whales for deep ocean, hummingbirds for hovering at flowers, bacteria for extreme heat. AI models are doing the same thing: specializing into niches based on cost, latency, and domain expertise.
Karpathy argues we won't have "one model to rule them all." Instead, the model landscape is speciating like biological organisms. Different models will dominate different niches:
Massive reasoning models for research, complex coding, novel problems. High cost, high latency, incredible depth.
Fast, cheap models for structured extraction, classification, bulk processing. Flash-tier models dominate here.
Tiny models running locally for privacy, latency, offline use. The "consumer niche" that open-source fills best.
Specialist models fine-tuned for medicine, law, finance. Not the smartest overall, but the best at their specific job.
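The speciation idea can be pictured as a router that picks a tier from a request's constraints. The tier names and routing rules below are illustrative assumptions, not anyone's product:

```python
from dataclasses import dataclass

# Hypothetical sketch of niche-based model selection: privacy, domain,
# reasoning depth, and latency budget each pull toward a different "species".

@dataclass
class Request:
    domain: str            # e.g. "medicine", "law", "general"
    needs_reasoning: bool
    latency_budget_ms: int
    private: bool          # must the data stay on-device?

def route(req: Request) -> str:
    if req.private:
        return "local-tiny"          # privacy / offline / consumer niche
    if req.domain in {"medicine", "law", "finance"}:
        return "domain-specialist"   # vertical fine-tunes
    if req.needs_reasoning:
        return "frontier-reasoner"   # high cost, high latency, deep
    return "flash-tier"              # cheap, fast bulk processing
```

The ordering of the checks is itself a design choice: here privacy trumps everything, and domain expertise trumps raw reasoning depth.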
When steam engines got more efficient in the 1800s, you'd expect coal demand to drop. Instead it skyrocketed -- because cheaper energy unlocked new uses nobody imagined. Karpathy argues the same will happen with AI and engineering: making coding cheaper will increase the total demand for software, not decrease it.
Karpathy also spent significant time analyzing Bureau of Labor Statistics (BLS) data on software employment.
The nuance Karpathy emphasizes: it's not a clean "AI replaces X" story. It's a restructuring. The 10x engineer becomes the 100x engineer. But the junior engineer's path to learning is disrupted -- which leads to the education question.
"The thing about the digital world is that it's very forgiving. If your agent writes buggy code, you just revert the commit. If a robot arm swings wrong, it breaks something physical."
Karpathy is cautiously optimistic about autonomous robotics but emphasizes the fundamental gap between digital and physical AI:
Instant feedback, cheap mistakes, perfect rollbacks, infinite environments. Software agents can already self-improve.
Slow feedback, expensive mistakes, no undo button, sim-to-real gap. Robot policies are still hand-tuned. VLAs (vision-language-action models) + world models are improving but not there yet.
His personal project: a robotic claw controlled by Claude, running on his new DGX Station GB300 (a gift from Jensen Huang). He calls it "Dobby the House Elf."
If coding is now done by agents, how do you teach programming? Karpathy's answer: build the simplest possible LLM from scratch.
Strip away all the complexity. Build a tiny language model (~630 lines of code) that you can actually understand end-to-end. The point isn't to build GPT-4 -- it's to build intuition for what these systems are doing under the hood.
This is the same philosophy behind his famous "Neural Networks: Zero to Hero" series -- but updated for the agent era. Understanding the substrate matters even when you're not writing the code yourself.
"It's not that programming becomes irrelevant. It's that the level of abstraction shifts. You need to understand the machine to direct it well -- even if you never type the code."
Karpathy ran a Q&A in his reply thread. Here are the sharpest exchanges:
You said Claude feels like a teammate. What did Anthropic actually do in training? Is the field intentional enough about model personality?
"I'd hope AIs can feel like Rocky from Project Hail Mary -- a partner and teammate. When Claude found my Sonos system, it said 'We're in!...' instead of 'Successfully found the Sonos server.' That sense that we're trying to achieve something together. Personality doesn't require new technology -- it looks like long SOUL.md files, possibly distilled into weights."
Are you happy with the code quality agents give?
"I'm not very happy. Agents bloat abstractions, have poor code aesthetics, are very prone to copy-pasting code blocks. They don't listen to my AGENTS.md instructions. No matter how many times I say 'use intermediate variables as documentation,' they still multitask one-liners. At some point just a shrug is easier."
"The SETI-at-Home angle is the most interesting. Distributing actual compute to researchers who might run it in different contexts -- that's fundamentally different research velocity than a single lab."
"The fact that Karpathy described using coding agents as developing 'muscle memory' and then casually dropped 'hence the psychosis' tells you everything about where software engineering is rn. It works, it's powerful, and it's making everyone slightly insane."
"A distributed compute layer where independents can contribute would collapse the barrier from 'needs VC funding' to 'needs a good idea.' That's the phase shift for the rest of us outside the frontier labs."
"When agents stop listening to style guides and Karpathy's response is 'I stopped fighting it' -- the only lever left is writing a spec clear enough that the agent's first pass is close to right. That's not prompting. That's product management."
We've entered the "Loopy Era" -- where AI agents don't just assist, they close the loop on entire workflows: coding, research, optimization. Karpathy hasn't typed code since December 2025. His AutoResearch project ran ~700 experiments autonomously and found real improvements he had missed in two decades of manual tuning.
The skills that matter now: context engineering (not prompting), taste and direction-setting (not implementation), and agent orchestration (not coding). Verifiable domains will increasingly belong to machines. The human edge lives in judgment, aesthetics, and knowing what to build.
The wildest vision: AutoResearch going SETI-at-Home, with millions of distributed GPUs contributing to open AI research. Not one PhD student -- a global research community of agents.