Base Profile

Andrej Karpathy

Educator and engineer who distills AI's essence through the simplest possible code

Andrej Karpathy is one of the most influential AI researchers and educators of his generation. Trained under Fei-Fei Li at Stanford, he co-founded OpenAI, then led Tesla's Autopilot perception team as Director of AI, driving the neural-network-first architecture of FSD. He rejoined OpenAI in 2023 and departed in 2024 to focus on AI education. His minimalist projects (nanoGPT, micrograd) and YouTube lecture series reveal the core of deep learning through the least possible code, influencing millions of learners worldwide. His Software 2.0 thesis, that neural networks will replace hand-written software, has become a foundational paradigm in AI engineering.

Artificial Intelligence · Deep Learning · Autonomous Driving · AI Education · Engineering Culture · Era 2011-present · Influence 87

Controversy Tags: Vibe Coding vs Deep Understanding Contradiction · Unclear Reasons for Leaving OpenAI · Tesla FSD Safety Controversy · Pure Vision vs LiDAR Route Debate

Thought System

Core Knowledge Graph

Core Beliefs

Neural Networks Will Replace Hand-Written Software

The core thesis of Software 2.0: traditional software has humans write explicit rules, while in Software 2.0 neural networks learn rules automatically from data. Large swaths of software will be represented as weight files rather than source code.

Source: Software 2.0, Andrej Karpathy, Medium, November 2017, karpathy.medium.com/software-2-0-a64152b37c35
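The contrast between the two styles can be sketched on a toy task. This is a minimal illustration, not an example from the essay: the task (flagging numbers above a cutoff), the function names, the cutoff of 5.0, and the training settings are all assumptions chosen for brevity. The hand-written version encodes the rule in source code; the learned version ends up as two numbers, `w` and `b`, fitted from labeled examples.

```python
import math
import random

def rule_based(x: float) -> int:
    """Software 1.0 style: a human writes the rule explicitly."""
    return 1 if x > 5.0 else 0

def sigmoid(z: float) -> float:
    """Numerically stable logistic function (avoids overflow for large |z|)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(samples, epochs=300, lr=0.05):
    """Software 2.0 style: the 'program' is the learned pair (w, b)."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            p = sigmoid(w * x + b)   # predicted probability of class 1
            grad = p - y             # gradient of the log-loss w.r.t. the logit
            w -= lr * grad * x
            b -= lr * grad
    return w, b

def learned(x: float, w: float, b: float) -> int:
    """Classify with the learned weights instead of a hand-written rule."""
    return 1 if w * x + b > 0 else 0

# Hypothetical dataset: inputs labeled by the behavior we want to reproduce.
random.seed(0)
data = [(x, rule_based(x)) for x in (random.uniform(0, 10) for _ in range(200))]

w, b = train_logistic(data)
accuracy = sum(learned(x, w, b) == y for x, y in data) / len(data)
print(f"learned weights: w={w:.3f}, b={b:.3f}, training accuracy={accuracy:.2%}")
```

Nothing in `learned` mentions the cutoff; the behavior lives entirely in the weights, which is the sense in which the essay describes software being stored as weight files rather than source code.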

Intuition Before Formulas, Hands-On Before Explanation

Real understanding of deep learning comes from hands-on implementation, not abstract formula derivation. The best teaching path is to show students a working neural network first, then explain the mathematics behind it; code is the best teaching medium.

Source: The spelled-out intro to neural networks and backpropagation: building micrograd, Andrej Karpathy, YouTube, 2022 / Let's build GPT: from scratch, in code, spelled out, Andrej Karpathy, YouTube, January 2023

Simplicity Is the Highest Engineering Virtue

nanoGPT reproduces GPT-2 training with a training loop of roughly 300 lines and a model definition of roughly 300 lines; micrograd implements a backpropagation engine in about 150 lines. Minimal code is not a compromise but a precise grasp of essence. Complexity is the enemy of engineering; a problem that can be solved simply should not be given a complex solution.

Source: nanoGPT repository, github.com/karpathy/nanoGPT, 2022 / micrograd repository, github.com/karpathy/micrograd, 2020

Recursive Self-Improvement Is the Most Effective Learning Method

The best way to learn deep learning is to implement it from scratch — not by calling libraries but by writing every line yourself. Each reimplementation deepens understanding. Karpathy himself has reimplemented GPT and other models multiple times, discovering new insights each time.

Source: Andrej Karpathy Twitter/X posts on learning methodology, x.com/karpathy / Let's build GPT: from scratch, in code, spelled out, Andrej Karpathy, YouTube, January 2023

Vibe Coding: Natural Language as the New Programming Interface

In 2025 Karpathy coined Vibe Coding: programmers describe intent in natural language, AI generates code, and humans only need to 'vibe' whether the code feels right rather than auditing every line. This is Software 2.0 extended to developer tooling — programming itself is being rewritten by AI.

Source: Andrej Karpathy tweet introducing 'vibe coding', X (Twitter), February 2025, x.com/karpathy

Mental Models

Software 2.0 Substitution Thesis

Identify which software modules can be replaced by neural networks and prioritize data-driven approaches in those modules first.

Tesla FSD replaced rule-based perception systems with end-to-end neural networks, dramatically improving generalization.

System Architecture · AI Product Design · Engineering Decisions · Technology Selection

Minimal Viable Implementation

Implement core functionality with the least possible code, stripping all unnecessary abstractions until the code itself is the documentation.

nanoGPT implements complete GPT-2 training in a few hundred lines of PyTorch (a roughly 300-line training loop plus a roughly 300-line model definition), becoming one of the most widely cited GPT educational implementations.

Teaching Design · Prototype Development · Code Review · Learning Path

Data-Centric AI

In AI systems, data quality and scale often matter more than model architecture; the best model improvements come from better data.

Tesla Autopilot's core competitive advantage is not model architecture but the real-world driving data collected from millions of vehicles and its annotation system.

AI System Design · Data Engineering · Model Training · Autonomous Driving

End-to-End Learning Over Modular Pipelines

Let neural networks learn directly from raw input to final output, avoiding information loss from manually designed intermediate representations.

Tesla FSD v12 shifted to end-to-end neural networks, merging perception, planning, and control into a single network and discarding thousands of lines of hand-written code.

System Design · Autonomous Driving · AI Architecture · Engineering Practice

Teaching Is Learning

Organizing knowledge into a form teachable to others is the most effective method for deepening one's own understanding.

While teaching CS231n at Stanford, Karpathy deepened his own understanding of CNNs through teaching; the course notes became one of the most widely cited deep learning educational resources globally.

Knowledge Management · Personal Growth · Technical Writing · Open Source Contribution

Values & Paradoxes

Minimalism: 95

Educational Democratization: 92

Engineering Honesty: 90

Data-Driven Thinking: 88

Open Sharing: 87

Top Researcher and Radical Simplifier as Dual Identities

Karpathy conducts frontier AI research at OpenAI and Tesla, yet his widest influence comes from educational content that reduces complex AI systems to their bare essentials. He simultaneously advances AI's industrialization and works to make it understandable to ordinary people.

Software 2.0 Advocate Yet Vibe Coding Proponent

He advocates replacing traditional code with neural networks (Software 2.0), yet also promotes Vibe Coding — having humans use AI to write traditional code more easily. These two paradigms point in somewhat opposite directions: one eliminates code, the other makes writing code easier.

Emphasizes Deep Understanding Yet Embraces AI-Assisted Programming

His teaching philosophy emphasizes implementing from scratch and deeply understanding every line; yet his Vibe Coding concept encourages accepting AI-generated code without fully understanding it. This contradiction reflects the fundamental tension between learning and productivity in the AI era.

Evolution Phases

Academic Research Phase

2011-2016

Stanford deep learning research and CS231n teaching

Completed doctoral research under Fei-Fei Li, focusing on image captioning and recurrent neural networks. Co-founded OpenAI (2015) and taught CS231n, whose notes became a global benchmark for deep learning education.

Tesla Autonomous Driving Engineering Phase

2017-2022

Tesla Autopilot perception and FSD neural network architecture

As Tesla's Director of AI, led the transition of Autopilot perception from traditional computer vision to pure neural network architecture. Built Tesla's data engine and drove the end-to-end FSD architecture. His Tesla AI Day presentations became landmark public showcases of autonomous driving engineering.

OpenAI Return Phase

2023-2024

Large language models and GPT series research

Returned to OpenAI in February 2023, participating in large language model research including GPT-4. nanoGPT, released shortly before his return, implements GPT training in minimal code and became one of the most widely used GPT educational tools. Left OpenAI in February 2024 to focus on independent AI education.

Independent AI Education Phase

2024-present

AI education democratization and Eureka Labs founding

After leaving OpenAI in 2024, focused on YouTube educational series (millions of subscribers), releasing courses that implement AI models from scratch. Founded Eureka Labs in 2024 to build AI-native education platforms. Coined Vibe Coding (2025), influencing global discussions about AI-assisted programming.

Methodology Cards

4 Callable Cards

Learn by Implementing from Scratch

karpathy-001

Without calling libraries, starting from the most basic mathematical operations, implement every line of code yourself until the system runs.

Choose an AI system you want to deeply understand (e.g., GPT, backpropagation engine).
Find the most minimal reference implementation (e.g., nanoGPT, micrograd) and understand its overall structure.
Close the reference implementation and start from a blank file, reimplementing using only NumPy or basic PyTorch.
Whenever you encounter something you don't understand, don't skip it — trace it back to the mathematical principles.
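As an illustration of what the blank-file step might produce, here is a minimal sketch of a scalar autograd engine in the spirit of micrograd. It was written for this card and is not copied from the micrograd repository; the class name `Value`, its attributes, and the choice to support only addition, multiplication, and tanh are assumptions made to keep the example short.

```python
import math

class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # replaced by the operation that creates this node

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))

        def _backward():
            # d tanh(z)/dz = 1 - tanh(z)^2
            self.grad += (1.0 - t * t) * out.grad

        out._backward = _backward
        return out

    def backward(self):
        """Apply the chain rule from this node back to every input."""
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)  # a node is appended after all of its inputs

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()

# A single neuron: y = tanh(x * w + b)
x, w, b = Value(2.0), Value(-3.0), Value(1.0)
y = (x * w + b).tanh()
y.backward()
print(x.grad, w.grad, b.grad)
```

Tracing one of the gradients by hand, for example checking that `b.grad` equals `1 - y.data ** 2`, is the kind of check the last step above asks for.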

Deep Learning Introduction · Understanding New Model Architectures · Interview Preparation · Technical Debt Investigation

Anti-Patterns

Calling high-level libraries directly without understanding the underlying implementation.
Skipping mathematical derivation and only memorizing code patterns.
Looking at the answer immediately when implementation gets difficult.

The best way to understand something is to build it yourself from scratch.
Andrej Karpathy, YouTube channel description

Software 2.0 Migration Assessment Framework

karpathy-002

Systematically assess which software modules are suitable for neural network replacement, prioritizing modules with abundant data and complex rules.

List all hand-written rule modules in the system and assess each module's rule complexity (more rules = better migration candidate).
Assess each module's data availability (more data = better migration candidate).
Prioritize migrating modules with 'complex rules + abundant data' — these are the best Software 2.0 candidates.
Build a data engine for migrated neural network modules: collect hard cases → label → retrain → repeat.
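The first three steps can be sketched as a simple scoring pass. Everything specific here is a hypothetical choice for illustration: the 1-to-5 scales, the use of a product so that a module needs both complex rules and abundant data to rank highly, the threshold of 12, and the module names.

```python
from dataclasses import dataclass

@dataclass
class Module:
    name: str
    rule_complexity: int    # 1 = a few simple rules, 5 = large, tangled rule set
    data_availability: int  # 1 = almost no data, 5 = abundant labeled data

    @property
    def score(self) -> int:
        # A product rewards modules that are high on BOTH axes.
        return self.rule_complexity * self.data_availability

def rank_candidates(modules, threshold=12):
    """Return modules worth migrating, best candidate first."""
    shortlisted = [m for m in modules if m.score >= threshold]
    return sorted(shortlisted, key=lambda m: m.score, reverse=True)

# Hypothetical inventory of hand-written rule modules.
modules = [
    Module("lane_detection", rule_complexity=5, data_availability=5),
    Module("unit_conversion", rule_complexity=1, data_availability=2),
    Module("spam_filter", rule_complexity=4, data_availability=4),
    Module("billing_rules", rule_complexity=4, data_availability=1),
]

ranked = rank_candidates(modules)
names = [m.name for m in ranked]
print(names)
```

In this sketch `unit_conversion` and `billing_rules` stay as hand-written code: the first is too simple to benefit, and the second has complex rules but too little data to learn them from.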

AI Product Architecture Design · System Refactoring Decisions · Autonomous Driving Perception · Recommendation System Optimization

Anti-Patterns

Migrating all modules to neural networks, ignoring the low maintenance cost of simple rule modules.
Deploying neural network modules without a data engine, leading to long-term degradation.

Software 2.0 is the new default. The question is which parts of your stack are ready to be rewritten.
Software 2.0, Andrej Karpathy, Medium, 2017

Data Engine Loop Method

karpathy-003

Drive continuous AI system improvement through a loop of collecting model failure cases, labeling, and retraining.

Deploy the initial model and monitor its failure cases in real-world scenarios (hard case mining).
Build a labeling system for high-quality annotation of hard cases, prioritizing high-frequency failure scenarios.
Retrain the model with new labeled data and validate improvement on hard case scenarios.
Redeploy the improved model, collect a new round of hard cases, and iterate.
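The loop above can be sketched with a deliberately tiny model. This is an illustrative toy, not Tesla's pipeline: the "model" is a single threshold on a number, `TRUE_BOUNDARY` stands in for the real world, `label` stands in for human annotators, and failure detection uses the labeler directly, whereas a real system would rely on triggers, disagreement signals, or review. All names and numbers are assumptions.

```python
import random

TRUE_BOUNDARY = 0.62  # hypothetical ground truth, known only to the "labelers"

def label(x: float) -> bool:
    """Stand-in for human annotation of a sample."""
    return x > TRUE_BOUNDARY

def fit_threshold(labeled) -> float:
    """'Retraining': put the threshold midway between the classes."""
    negatives = [x for x, y in labeled if not y]
    positives = [x for x, y in labeled if y]
    return (max(negatives) + min(positives)) / 2

def run_data_engine(rounds=5, traffic=500, seed=0):
    rng = random.Random(seed)
    labeled = [(0.0, False), (1.0, True)]   # tiny initial training set
    threshold = fit_threshold(labeled)      # initial model predicts x > 0.5
    history = []                            # number of hard cases found per round
    for _ in range(rounds):
        stream = [rng.random() for _ in range(traffic)]             # deployment traffic
        hard = [x for x in stream if (x > threshold) != label(x)]   # hard case mining
        history.append(len(hard))
        labeled += [(x, label(x)) for x in hard]                    # label only the hard cases
        threshold = fit_threshold(labeled)                          # retrain and redeploy
    return threshold, history

threshold, history = run_data_engine()
print(f"final threshold: {threshold:.3f}, hard cases per round: {history}")
```

The per-round failure count is not guaranteed to fall monotonically (a retrained model can overshoot and fail in a new region), but each round adds labels exactly where the current model is wrong, so the threshold settles close to the true boundary.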

Autonomous Driving Perception · Content Moderation Systems · Medical Imaging Diagnosis · Any Data-Driven AI Product

Anti-Patterns

Focusing only on model architecture improvements while ignoring data quality enhancement.
Randomly sampling labeling data rather than targeted annotation of hard cases.
Separating data engine and model training teams without close collaboration.

The data engine is the most important part of building a great AI system. Models are almost secondary.
Tesla AI Day 2021, Andrej Karpathy presentation

Intuition-First Teaching Method

karpathy-004

First let learners see the system running to build intuition, then gradually introduce mathematical principles and abstract concepts.

Start with a minimal runnable example and let learners directly see the system's inputs and outputs.
Gradually increase complexity, introducing only one new concept at a time and immediately demonstrating it in code.
Introduce mathematical formulas after code, not before; understand the code's behavior first, then its mathematical essence.
Encourage learners to modify code and experiment after each step, building intuition through failure.
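A small sketch of the "code first, formula second" ordering, assuming the concept being taught is the derivative. The function `f`, the step size `h`, and the helper names are arbitrary choices for illustration. Learners first nudge the input and watch the output move, and only then meet the formula that predicts what they observed.

```python
def f(x: float) -> float:
    """An arbitrary function to experiment with."""
    return 3 * x**2 - 4 * x + 5

def nudge_slope(fn, x: float, h: float = 1e-6) -> float:
    """Behavior first: how much does the output move when the input is nudged by h?"""
    return (fn(x + h) - fn(x)) / h

def analytic_slope(x: float) -> float:
    """Formula second: d/dx (3x^2 - 4x + 5) = 6x - 4."""
    return 6 * x - 4

observed = nudge_slope(f, 3.0)
derived = analytic_slope(3.0)
print(f"observed slope: {observed:.4f}, derived slope: {derived:.4f}")
```

A natural follow-up experiment is to change `h` or the evaluation point and see when the two numbers stop agreeing.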

AI Course Design · Technical Blog Writing · Internal Team Training · Open Source Project Documentation

Anti-Patterns

Teaching mathematical theory before code, causing learners to give up before seeing actual results.
Skipping intermediate steps and directly showing the final complex system.
Ignoring learner confusion signals and proceeding at a preset pace.

I try to build intuition first. The math follows naturally once you understand what's happening.
Andrej Karpathy, CS231n lecture notes introduction

Decision Timeline

10 Key Events

2011-09

Entered Stanford PhD Program under Fei-Fei Li

Context: Deep learning was beginning to show breakthrough potential in 2011-2012; ImageNet became the central battleground for computer vision. Fei-Fei Li's Stanford Vision Lab was one of the most important computer vision research centers.

Decision: Chose Stanford CS PhD program, joined Fei-Fei Li's lab, focusing on the intersection of deep learning and computer vision.

Reasoning: Stanford was the top research center for computer vision and machine learning; Fei-Fei Li's ImageNet project was reshaping the entire field.

Outcome: Completed pioneering doctoral research on image captioning, published several highly cited papers, and established academic reputation in deep learning.

Lesson: Choosing an exploding field and a top mentor is the optimal path to building research impact; technical intuition matters more than following the mainstream.

karpathy-model-teach-to-learn

2015-01

Led Stanford CS231n, Creating the Benchmark for Deep Learning Education

Context: Deep learning was transitioning from academic research to industrial application, but systematic teaching resources were extremely scarce. Engineers and students needed to learn CNNs but lacked high-quality introductory courses.

Decision: Accepted the opportunity to teach CS231n (Convolutional Neural Networks for Visual Recognition) and published all course notes openly online.

Reasoning: Teaching is the best way to deepen one's own understanding; openly publishing course notes can help global learners and receive broad feedback to improve content.

Outcome: CS231n course notes became one of the most widely cited deep learning educational resources globally, influencing millions of learners and establishing Karpathy's status as an AI educator.

Lesson: Openly sharing knowledge does not diminish competitive advantage but builds long-term influence and reputation; teaching is the best way to learn.

karpathy-model-teach-to-learn · karpathy-model-minimal-viable-implementation

2015-12

Co-Founded OpenAI

Context: In late 2015, Elon Musk, Sam Altman and others, concerned about AI safety risks, decided to found the nonprofit research organization OpenAI to research AGI openly and counterbalance commercial AI giants like Google DeepMind.

Decision: Joined OpenAI as a founding member and became part of the early core research team.

Reasoning: OpenAI's mission aligned strongly with his values around AI safety and open research; this was an opportunity to participate in history at AI's most critical juncture.

Outcome: OpenAI became one of the world's most important AI research organizations, later launching world-changing products including the GPT series, DALL-E, and ChatGPT.

Lesson: Joining the right organization at a critical technological paradigm shift can determine long-term impact more than individual technical ability.

karpathy-model-software2

2017-06

Joined Tesla as Director of AI, Leading Autopilot Perception

Context: Tesla faced multiple Autopilot-related accidents in 2016, with its perception system facing major challenges. Elon Musk decided to bring in top AI talent from academia to rebuild Autopilot's technical approach.

Decision: Left OpenAI to join Tesla, taking on the engineering challenge of transitioning Autopilot perception from traditional computer vision to deep neural networks.

Reasoning: Tesla had the world's largest real-world driving dataset, making it the best platform to apply Software 2.0 ideas to the real world; autonomous driving is one of AI's most complex and meaningful application scenarios.

Outcome: Over five years, transformed Tesla Autopilot from a rule-based system to a neural network-centric architecture, built Tesla's data engine, and drove the end-to-end FSD research direction.

Lesson: Translating theoretical research into large-scale engineering practice requires accepting the transition from academic freedom to engineering constraints; data scale is the true moat of AI systems.

karpathy-model-software2 · karpathy-model-data-centric · karpathy-model-end-to-end

2017-11

Published 'Software 2.0', Proposing Neural Networks as Replacement for Traditional Software

Context: Deep learning was outperforming manually designed rule systems in multiple domains, but no one had yet systematically articulated the nature and boundaries of this trend.

Decision: Published 'Software 2.0' on Medium, systematically proposing the theoretical framework of neural networks as a new software-writing paradigm.

Reasoning: His work at Tesla gave him concrete visibility into what Software 2.0 looks like in practice; systematizing and sharing this insight could help the entire industry understand the ongoing paradigm shift.

Outcome: The article spread widely, becoming one of the most influential theoretical articles in AI engineering, cited by thousands of papers and articles, profoundly influencing the industry's understanding of AI's nature.

Lesson: Systematizing insights gained from practice into a theoretical framework is the most effective way to amplify personal influence; one good article can have more impact than ten academic papers.

karpathy-model-software2 · karpathy-model-end-to-end

2021-08

Hosted Tesla AI Day, Publicly Showcasing FSD Neural Network Architecture

Context: Tesla FSD faced public skepticism, with competitor Waymo using LiDAR; Musk decided to use a public technical showcase to prove the viability of the pure vision approach while attracting top AI talent.

Decision: Planned and hosted Tesla AI Day, detailing FSD's neural network architecture, data engine, annotation system, and training infrastructure.

Reasoning: Disclosing technical details can attract top engineers while proving the technical approach to the market; transparency is the best way to build technical credibility.

Outcome: Tesla AI Day became a landmark event in AI engineering; Karpathy's presentation was widely shared, demonstrating the technical depth of the pure-vision autonomous driving approach and attracting significant attention from top AI talent.

Lesson: Technical transparency is the best recruiting tool and market education tool; detailed engineering demonstrations are more persuasive than marketing claims.

karpathy-model-data-centric · karpathy-model-end-to-end

2022-12

Released nanoGPT, Implementing GPT Training in a Few Hundred Lines of Code

Context: ChatGPT was released in November 2022, triggering enormous global interest in GPT; but most people could not understand how GPT works, and existing educational resources were either too abstract or relied on complex frameworks.

Decision: Released the nanoGPT open-source project, implementing the complete GPT-2 training pipeline in the most minimal PyTorch code possible, without any additional abstraction layers.

Reasoning: Understanding GPT does not require Hugging Face or complex frameworks; the best teaching tool is the most minimal runnable code that lets learners see the model's essence directly.

Outcome: nanoGPT received over 35,000 GitHub stars (as of 2024), becoming the most widely used GPT educational and research foundation globally, cited by countless courses, papers, and projects.

Lesson: In an information-saturated era, the scarcest resource is not information but clarity; simplifying complex systems to their essence is the most valuable contribution.

karpathy-model-minimal-viable-implementation · karpathy-model-teach-to-learn

2024-02

Left OpenAI to Focus on Independent AI Education

Context: After ChatGPT's explosion, global demand for AI education increased dramatically; Karpathy's work at OpenAI had completed an important phase, and he saw a larger educational opportunity.

Decision: Voluntarily left OpenAI to focus on YouTube educational video series and independent AI education projects.

Reasoning: The demand gap in AI education was far greater than his contribution at OpenAI; an independent role would allow him to more freely create high-quality educational content and directly influence millions of learners.

Outcome: His YouTube channel grew rapidly; he released a series of courses implementing GPT and micrograd from scratch, becoming one of the most globally influential AI educators.

Lesson: Sometimes leaving the most prestigious platform to do more foundational work is the path to greater impact; education is the most fundamental infrastructure for technology dissemination.

karpathy-model-teach-to-learn

2024-07

Founded Eureka Labs to Build AI-Native Education Platform

Context: AI tools had become powerful enough to serve as personalized teaching assistants; but existing education platforms had not yet fully leveraged AI's potential to reconstruct the learning experience.

Decision: Founded Eureka Labs, using AI as the core teaching tool to build the next generation of AI-native education platforms.

Reasoning: AI can infinitely replicate the teaching style and knowledge of top educators, giving every learner personalized high-quality guidance; this is AI's most profound change to education.

Outcome: Eureka Labs was founded and began building AI-native educational products, attracting significant attention and representing Karpathy's practical exploration of the future of AI education.

Lesson: The best time to start a company is when you have both deep domain understanding and a large audience; edtech entrepreneurship requires first building educator credibility.

karpathy-model-teach-to-learn · karpathy-model-software2

2025-02

Coined 'Vibe Coding', Defining a New Paradigm for AI-Assisted Programming

Context: AI programming tools like GitHub Copilot and Cursor had significantly boosted development efficiency; but the industry lacked a clear description of the nature of this new programming approach.

Decision: Posted on Twitter/X coining 'Vibe Coding': programmers describe intent in natural language, AI generates code, and humans only need to feel whether the code is correct.

Reasoning: This programming approach was already widespread in practice but lacked an accurate conceptual description; naming a phenomenon helps people think and discuss it more clearly.

Outcome: Vibe Coding quickly became one of the most widely used concepts in AI programming, triggering global discussions about the nature of AI-assisted programming, learning approaches, and career implications.

Lesson: Naming an ongoing phenomenon can influence industry discourse more than inventing new technology; clear conceptual frameworks are the most valuable contribution of thought leaders.

karpathy-model-software2 · karpathy-model-minimal-viable-implementation

Reading List

Books

Recommended by (3)

Surely You're Joking, Mr. Feynman!

Richard Feynman · 1985

Karpathy has recommended this in interviews and on Twitter; Feynman's first-principles approach to learning and pure intellectual curiosity is the spiritual source of Karpathy's teaching style and AI education philosophy


Deep Learning

Ian Goodfellow, Yoshua Bengio & Aaron Courville · 2016

The most authoritative textbook in deep learning; Karpathy listed it as a reference in his CS231n course and recommends it to all AI practitioners as foundational theory


Gödel, Escher, Bach: An Eternal Golden Braid

Douglas Hofstadter · 1979

Karpathy has listed this as one of the most influential books in his life; Hofstadter's exploration of self-referential systems and emergent consciousness deeply resonates with Karpathy's understanding of emergent capabilities in LLMs


Influence Network

Origins, Contemporaries & Legacy

Influenced By

Fei-Fei Li · Academic Mentor

Doctoral advisor; Fei-Fei Li's ImageNet project and emphasis on large-scale datasets profoundly influenced Karpathy's data-centric thinking.

Geoffrey Hinton · Technical Inspiration

Godfather of deep learning; Hinton's backpropagation research and deep neural network work form the theoretical foundation of Karpathy's entire research direction.

Richard Feynman · Teaching Philosophy

Feynman's teaching philosophy — explaining the most complex things in the simplest way — profoundly influenced Karpathy's educational methodology.

Influenced

Global AI Learning Community · Educational Influence

Through CS231n, nanoGPT, and YouTube courses, Karpathy has directly influenced the learning paths and thinking of millions of AI learners.

Tesla Autopilot Engineering Team · Engineering Practice

The data engine, end-to-end neural network architecture, and engineering culture Karpathy led have profoundly shaped how the Tesla Autopilot team works.

AI Education Ecosystem · Paradigm Influence

The minimalist teaching style of nanoGPT and micrograd has influenced the design of numerous AI courses and textbooks, driving the popularization of the 'implement from scratch' teaching paradigm.

Co-thinkers

Ilya Sutskever · Research Partner

OpenAI co-founder; worked with Karpathy at OpenAI to advance large language model research, with deep overlap in deep learning theory and engineering practice.

Sam Altman · Organizational Collaboration

OpenAI CEO; co-shaped OpenAI's research direction and organizational culture with Karpathy, with shared thinking on balancing AI safety and capability development.

George Hotz · Technical Fellow Traveler

comma.ai founder; shares similar technical approaches to pure-vision autonomous driving and minimalist engineering philosophy with Karpathy.

Peer Reviews

Karpathy has an extraordinary ability to take the most complex ideas in AI and make them feel obvious and accessible. nanoGPT is a masterpiece of pedagogical engineering.
Yann LeCun · Yann LeCun Twitter/X post, January 2023

Andrej is one of the best teachers of AI in the world. His ability to build intuition while being rigorous is rare.
Sam Altman · Sam Altman interview, Lex Fridman Podcast, 2023

What Karpathy did with Tesla's Autopilot team — building the data engine, the annotation pipeline, the training infrastructure — was world-class engineering leadership.
Lex Fridman · Lex Fridman Podcast, Episode with Andrej Karpathy, 2022
