Base Profile

Yann LeCun

Father of convolutional neural networks, the contrarian reshaping AI's future through open science and world models

Yann LeCun is one of the three godfathers of deep learning, the inventor of convolutional neural networks (CNNs), and a 2018 Turing Award laureate. His LeNet, developed at Bell Labs, is a cornerstone of modern computer vision and a direct ancestor of today's image recognition, face detection, and autonomous driving vision systems. In the 1990s his handwritten digit recognition system was widely deployed by US banks, processing over 10% of all American checks. In 2003 he joined NYU and founded the Courant Machine Learning Lab; in 2013 he joined Facebook (now Meta), where he became Chief AI Scientist while retaining his NYU professorship. He is a steadfast advocate for open science, arguing that AI research results should be published openly. In recent years he has been a sharp critic of the LLM route, arguing that large language models cannot lead to AGI, and he advocates a new approach based on world models and JEPA (Joint Embedding Predictive Architecture). His outspokenness on Twitter/X has made him one of the most controversial public intellectuals in AI.

Artificial Intelligence · Deep Learning · Computer Vision · Machine Learning Theory · AI Policy
Era: 1983-present · Influence: 95

Controversy Tags: LLM Critic vs AI Mainstream Direction · Meta Open Source Motivation Controversy (Commercial Strategy vs Open Belief) · Combative Technical Debate Style on Twitter/X · Criticism of Extremely Optimistic AGI Timeline Proponents · Strong Opposition to AI Safety Doomsday Narratives

Thought System

Core Knowledge Graph

Core Beliefs

Convolutional Structure Is the Correct Inductive Bias for Visual Intelligence

The visual world has translational invariance, local correlations, and hierarchical compositionality. Convolutional neural networks encode these physical priors directly into network structure through weight sharing and local receptive fields — this is the fundamental reason for their success. Good architecture should reflect the true structure of data, not rely on brute-force computation.

Source: Gradient-Based Learning Applied to Document Recognition, LeCun et al., Proceedings of the IEEE, 1998

True Intelligence Requires World Models, Not Just Next-Token Prediction

LLMs learn by predicting the next word, a method that cannot enable models to understand the causal structure of the physical world. True intelligence requires world models that can predict world states in latent space — similar to how infants learn about the world through physical interaction. JEPA (Joint Embedding Predictive Architecture) is the right path toward this goal.

Source: A Path Towards Autonomous Machine Intelligence, Yann LeCun, OpenReview, June 2022

AI Research Should Be Published Openly; Open Science Accelerates Collective Progress

Closed AI research not only slows overall progress but creates the dangerous situation of a few institutions monopolizing technology. Meta AI's commitment to releasing the LLaMA open-source model series is the practice of this belief. Scientific progress depends on open peer review and knowledge accumulation; commercial competition should not be an excuse for closed research.

Source: Yann LeCun interview on open source AI, The Verge, April 2023 / LLaMA: Open and Efficient Foundation Language Models, Meta AI, February 2023

LLMs Cannot Lead to AGI; Autoregressive Text Generation Is a Fundamental Dead End

Large language models are 'stochastic parrots' — they stitch together language through statistical patterns but have no real understanding of the world. LLMs cannot perform reliable reasoning, planning, and causal understanding because they lack foundational perception of the physical world. The path to AGI requires perception-action loops, continual learning, and world models — not larger text predictors.

Source: Yann LeCun Twitter/X posts on LLM limitations, x.com/ylecun, 2023-2024 / A Path Towards Autonomous Machine Intelligence, Yann LeCun, OpenReview, June 2022

Self-Supervised Learning Is the Correct Path Toward Human-Level Perception

Humans and animals learn by observing the world, not by relying on massive human annotation. Self-supervised learning allows models to learn representations from the structure of data itself — this is the scalable path to intelligence. Contrastive learning, masked autoencoders, and JEPA are all important explorations in self-supervised learning.

Source: Self-supervised learning: The dark matter of intelligence, Yann LeCun & Ishan Misra, Meta AI Blog, April 2021

Mental Models

Inductive Bias Design Principle

Encode the true structural priors of data directly into model architecture, rather than expecting the model to learn all structure from scratch.

LeNet encoded visual translational invariance as convolutional weight sharing and local correlations as local receptive fields; these two inductive biases gave it far greater parameter efficiency than fully connected networks on MNIST.

Model Architecture Design · AI System Engineering · Computer Vision · Research Methodology
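
The parameter-efficiency claim in the LeNet example can be checked with simple arithmetic. The sketch below assumes a 32x32 single-channel input and a first layer that produces six 28x28 feature maps with 5x5 kernels (the shape of the first convolutional layer described in the 1998 LeNet-5 paper). The function names are illustrative, not taken from any library.

```python
def conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Trainable parameters of a conv layer: one shared kernel and one bias per output map."""
    return out_channels * (kernel_size * kernel_size * in_channels + 1)


def dense_params(n_inputs: int, n_outputs: int) -> int:
    """Trainable parameters of a fully connected layer: one weight per input-output pair, plus biases."""
    return n_inputs * n_outputs + n_outputs


# Assumed shapes: 32x32 grayscale input, six 28x28 output maps, 5x5 kernels.
conv_count = conv_params(kernel_size=5, in_channels=1, out_channels=6)
dense_count = dense_params(n_inputs=32 * 32, n_outputs=6 * 28 * 28)

print(conv_count)   # 156
print(dense_count)  # 4821600
```

Producing the same number of output units with a fully connected layer would need roughly 30,000 times more parameters, which is the efficiency gain that weight sharing and local receptive fields buy.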

World Model Hierarchy Framework

Intelligent systems need to maintain predictive models of world states in latent space, rather than merely doing pattern matching in observation space.

JEPA (Joint Embedding Predictive Architecture) predicts future states in latent representation space rather than reconstructing images in pixel space, avoiding the 'hallucination' problem of generative models while learning more abstract world models.

AI Architecture Design · Robotics · Autonomous Driving · Cognitive Science Applications
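
The difference between predicting in latent space and reconstructing in observation space can be illustrated with a toy example. This is only a sketch under strong assumptions: the encoder is a fixed pair-averaging function and the predictor is a scalar gain, whereas a real JEPA model learns both and needs extra machinery to prevent representation collapse. All names are hypothetical.

```python
def encode(observation):
    """Toy encoder (assumed): average each adjacent pair, discarding fine detail."""
    return [(observation[i] + observation[i + 1]) / 2
            for i in range(0, len(observation), 2)]


def predict_latent(context_latent, gain=1.0):
    """Toy predictor (assumed): a single scalar gain applied to the context latent."""
    return [gain * z for z in context_latent]


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# The target differs from the context only by zero-mean detail inside each pair.
context = [1.0, 1.0, 3.0, 3.0]
target = [0.5, 1.5, 3.5, 2.5]

# Observation-space loss penalizes the unpredictable detail.
pixel_loss = squared_error(context, target)

# Latent-space loss compares abstract representations only.
latent_loss = squared_error(predict_latent(encode(context)), encode(target))

print(pixel_loss)   # 1.0
print(latent_loss)  # 0.0
```

Because the encoder discards the within-pair detail, the latent loss is zero even though the raw observations differ, which is the intuition behind not spending capacity on irrelevant detail.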

Open Research Accelerator Effect

Openly publishing research results and model weights accelerates one's own research progress through external innovation feedback loops while building an ecosystem moat.

After Meta released the LLaMA open-source model series, the global research community produced thousands of derivative studies and applications, which in turn provided Meta with valuable improvement directions and talent attraction.

Research Strategy · Open Source Ecosystem · AI Policy · Industry Competition

Energy-Based Model Unified Framework

Use an energy function to unify various learning problems: good predictions correspond to low energy, bad predictions to high energy, and learning is training the shape of the energy function.

Contrastive learning (e.g., SimCLR, MoCo) can be understood as a special case of energy-based models: the energy of positive pairs is pushed down, the energy of negative pairs is pushed up, thereby learning meaningful representations.

Machine Learning Theory · Model Design · Contrastive Learning · Generative Models
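
The push-down, push-up behavior described in the contrastive learning example can be sketched with a margin-based loss. Note the assumptions: energy is taken to be squared Euclidean distance between embeddings, and the loss is a simple hinge form, which is simpler than the InfoNCE-style objectives that SimCLR and MoCo actually use. The embeddings and names are made up for illustration.

```python
def energy(a, b):
    """Energy of a pair (assumed form): squared Euclidean distance between embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def margin_contrastive_loss(anchor, positive, negative, margin=1.0):
    """Lower the positive-pair energy; raise the negative-pair energy until it clears the margin."""
    return energy(anchor, positive) + max(0.0, margin - energy(anchor, negative))


anchor = [0.0, 0.0]
positive = [0.0, 0.5]       # energy 0.25 with the anchor
near_negative = [0.5, 0.0]  # energy 0.25, still inside the margin
far_negative = [2.0, 0.0]   # energy 4.0, already outside the margin

loss_near = margin_contrastive_loss(anchor, positive, near_negative)
loss_far = margin_contrastive_loss(anchor, positive, far_negative)

print(loss_near)  # 1.0
print(loss_far)   # 0.25
```

A negative that already has high energy contributes nothing to the loss, so training effort goes to negatives that are still too close to the anchor.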

Evidence-Backed Contrarianism

Maintain critical distance from mainstream consensus, but every dissenting view must be backed by specific technical arguments, not mere novelty-seeking.

LeCun publicly questioned the feasibility of LLMs leading to AGI at the height of the LLM boom, offering specific technical arguments: LLMs lack physical world models, cannot perform reliable planning, and suffer from hallucination — criticisms increasingly validated by research in 2024-2025.

Research Methodology · Technical Judgment · Public Discourse · Strategic Decision-Making

Values & Paradoxes

Open Science: 97

Technical Rigor: 94

Engineering Pragmatism: 90

Democratization of Knowledge: 88

Critical Independent Thinking: 92

Deep Learning Founder Who Holds the Strongest Critique of Deep Learning's Mainstream Direction

LeCun is one of the three godfathers of deep learning, and his CNN work directly catalyzed today's AI boom. Yet he is one of the most vocal top scientists criticizing the LLM route, arguing that Transformers and autoregressive language models cannot lead to AGI. This tension of 'creator critiquing his own offspring' reflects the deep divergence within AI about the path forward.

Balancing Meta Chief Scientist Role with Academic Independence

LeCun simultaneously holds positions as Meta's Chief AI Scientist and NYU professor, placing him in a delicate position between commercial interests and academic independence. His push for Meta to open-source LLaMA is criticized by some as commercial strategy rather than pure open science belief; his criticism of OpenAI's closed approach is also interpreted by some as a public relations war between competitors.

Emphasizes Physical World Understanding Yet Works at a Pure Digital Company

LeCun's core argument is that AI needs to learn world models through interaction with the physical world, similar to infant development. Yet his work at Meta focuses primarily on the digital domain of language and vision, not robotics or embodied intelligence. This contradiction has not yet found a complete practical resolution even as he advocates for JEPA.

Evolution Phases

Bell Labs Invention Phase

1988-2002

Invention of convolutional neural networks and industrial deployment of handwriting recognition

At Bell Labs and AT&T Labs, LeCun invented the LeNet series of convolutional neural networks and deployed them in the US banking system for check recognition, processing over 10% of all American bank checks. This period laid the technical foundation for modern computer vision, but also experienced the marginalization of deep learning during the 1990s 'AI winter'.

NYU Academic Institution-Building Phase

2003-2013

Building NYU machine learning center, advancing deep learning theory and energy-based models

Joined New York University in 2003 and founded the Courant Machine Learning Lab (later renamed CILVR). Persisted in deep learning research during the AI winter, and together with Hinton and Bengio drove the deep learning renaissance of 2006-2012. AlexNet's success in 2012 validated his twenty years of persistence, bringing deep learning from the margins to the mainstream.

Meta AI Industrial Research Phase

2013-2022

Leading Meta AI Research, advancing self-supervised learning and open-source AI

Joined Facebook (later Meta) in 2013 to found and lead FAIR (Facebook AI Research). During this period drove major breakthroughs in self-supervised learning (including precursors to SimCLR), advocated for openly publishing research results, and made FAIR one of the world's top AI research institutions. The release of the LLaMA open-source model series was the concentrated embodiment of this period's open science philosophy.

World Model Advocacy Phase

2022-present

Critiquing the LLM route, advocating JEPA world models as a new path to AGI

Published the white paper 'A Path Towards Autonomous Machine Intelligence' in 2022, systematically articulating the AGI path based on world models and proposing the JEPA architecture. Against the backdrop of ChatGPT triggering a global LLM boom, he persisted in criticizing the fundamental limitations of LLMs, becoming one of the most controversial voices in AI. Through public debates on Twitter/X, he brought technical arguments into broader public discourse.

Methodology Cards

4 Callable Cards

Convolutional Inductive Bias Design Method

lecun-001

When designing neural network architectures, first analyze the invariances and local structures of the data, encoding these priors directly into the architecture rather than expecting the network to learn all structure from data.

Analyze the symmetries in target task data: does the data have translational invariance (images), rotational invariance, or temporal invariance (speech)?
Analyze the local correlation structure of data: are relationships between adjacent elements stronger than between distant elements?
Encode identified invariances as weight sharing (convolution), and local correlations as local receptive fields.
Design hierarchical structure: lower layers capture local features, higher layers combine local features to form global representations.
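
The steps above can be sketched on a 1D signal. The example is a minimal illustration with a hypothetical two-tap edge detector: sharing one kernel across positions makes the feature map shift along with the input (equivariance), and a global max pool on top then gives a response that ignores position (invariance).

```python
def conv1d(signal, kernel):
    """Valid 1D cross-correlation: the same kernel weights are reused at every position."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def global_max_pool(features):
    """Collapse a feature map to its strongest response, discarding position."""
    return max(features)


kernel = [1, -1]                      # hypothetical edge detector
signal = [0, 0, 1, 1, 0, 0, 0, 0]
shifted = [0, 0, 0, 0, 1, 1, 0, 0]    # same pattern moved two steps right

features = conv1d(signal, kernel)
shifted_features = conv1d(shifted, kernel)

print(features)          # [0, -1, 0, 1, 0, 0, 0]
print(shifted_features)  # [0, 0, 0, -1, 0, 1, 0]
print(global_max_pool(features), global_max_pool(shifted_features))  # 1 1
```

Whether to stop at the equivariant feature map or pool down to an invariant summary depends on whether the task needs to know where the pattern occurred.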

Computer Vision System Design · Sequential Data Modeling · AI Architecture Design for New Domains · Model Design in Data-Scarce Scenarios

Anti-Patterns

Over-relying on inductive biases when data is abundant, limiting model expressiveness (Transformer's success partly comes from reducing inductive biases)
Incorrectly transferring inductive biases designed for one data type to another
Focusing only on invariance while ignoring equivariance — sometimes transformation information needs to be preserved rather than discarded

The key insight of convolutional networks is that the same feature detector can be useful in multiple locations. This is not something the network learns — it's built into the architecture.
Yann LeCun, Turing Award lecture, 2019

World Model as Alternative to Text Prediction

lecun-002

The criterion for evaluating whether an AI system has genuine intelligence: does it maintain a causal model of the physical world in latent space, or does it only do pattern matching in observation space?

Identify whether the target task requires genuine physical world understanding (planning, causal reasoning, object permanence) or only pattern matching (text completion, image classification).
For tasks requiring physical world understanding, design models that predict future states in latent representation space (JEPA style) rather than reconstructing in pixel/token space.
Build perception-action loops: the model should be able to predict the effect of its own actions on world states, not just passively predict observations.
Introduce hierarchical abstraction: low-level world models handle short-timescale physical predictions, high-level world models handle long-timescale semantic predictions.
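
The perception-action loop in the steps above can be sketched as planning with an action-conditioned world model. Everything here is an assumption made for illustration: the latent state is a single integer, the dynamics are hard-coded rather than learned, and exhaustive search stands in for whatever optimizer a real system would use.

```python
from itertools import product

ACTIONS = (-1, 0, 1)


def world_model(latent_state, action):
    """Hypothetical dynamics (hard-coded here, learned in practice): predict the next latent state."""
    return latent_state + action


def rollout(latent_state, actions):
    """Imagine the outcome of an action sequence without touching the real environment."""
    for action in actions:
        latent_state = world_model(latent_state, action)
    return latent_state


def plan(latent_state, goal, horizon):
    """Pick the action sequence whose predicted end state lands closest to the goal."""
    def cost(actions):
        return abs(rollout(latent_state, actions) - goal)
    return min(product(ACTIONS, repeat=horizon), key=cost)


best = plan(latent_state=0, goal=3, horizon=3)
print(best)  # (1, 1, 1)
```

The model is queried about the consequences of its own candidate actions, which is what separates this setup from passively predicting the next observation.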

Robot Planning and Control · Autonomous Driving Perception · Game AI Design · Scientific Discovery AI Systems

Anti-Patterns

Mistaking LLMs' text generation capability for physical world understanding
Measuring world model quality by pixel-level reconstruction quality (high-quality image generation ≠ good world model)
Ignoring the temporal structure of the physical world, only validating world models in static scenarios

Large language models do not understand the world. They are very sophisticated autocomplete systems.
Yann LeCun, Twitter/X, 2023

Open Science Ecosystem Building Method

lecun-003

By openly releasing research results, model weights, and infrastructure, activate external innovation feedback loops and transform the entire research community into one's own R&D force.

Identify which research results, when opened, can activate the greatest external innovation (usually foundation models and tools, not the application layer).
Design open licenses: research licenses (allowing research but restricting commercial use) vs. fully open source (Apache/MIT) — choose based on strategic objectives.
Build supporting documentation and community support to lower barriers for external researchers and increase ecosystem activity.
Monitor external innovation: track derivative research and applications based on open results, identifying the most valuable improvement directions.

AI Research Institution Strategic Planning · Platform Product Ecosystem Building · Academia-Industry Collaboration Models · Open Source Community Operations

Anti-Patterns

Opening low-quality or outdated results, damaging institutional credibility in the open-source community
Lack of community support after opening, resulting in low ecosystem activity
Treating open science as a PR strategy rather than a genuine research philosophy, leading to selective openness

If you want to make progress in AI, you need to publish your research. Keeping it secret just slows everyone down, including yourself.
Yann LeCun, interview with The Verge, 2023

Self-Supervised Learning Data Efficiency Maximization

lecun-004

When designing learning tasks, prioritize using the intrinsic structure of data itself as supervision signals rather than relying on expensive human annotation.

Analyze the intrinsic structure of target data: temporal data has temporal prediction tasks, images have spatial completion tasks, videos have frame prediction tasks.
Design self-supervised pretraining tasks: masked prediction (BERT-style), contrastive learning (SimCLR-style), or JEPA-style latent prediction.
Make predictions in latent representation space rather than observation space, preventing the model from wasting capacity learning irrelevant pixel details.
Use large-scale unlabeled data for self-supervised pretraining, then fine-tune with small amounts of labeled data.
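
The idea of using the data's own structure as supervision can be sketched with a masked prediction task. The helper below is hypothetical and uses a whitespace tokenizer and a fixed masking probability purely for illustration; it shows how training targets come from the unlabeled sequence itself, with no human annotation.

```python
import random

MASK = "<mask>"


def make_masked_example(tokens, mask_prob, rng):
    """Turn an unlabeled sequence into (inputs, targets): the hidden tokens become the labels."""
    inputs, targets = [], {}
    for position, token in enumerate(tokens):
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets[position] = token  # supervision comes from the data itself
        else:
            inputs.append(token)
    return inputs, targets


rng = random.Random(0)  # seeded only so the sketch is repeatable
tokens = "the cat sat on the mat".split()
inputs, targets = make_masked_example(tokens, mask_prob=0.3, rng=rng)

print(inputs)
print(targets)
```

A model pretrained to fill in the masked positions can then be fine-tuned on a small labeled set, as in the last step of the list.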

Medical Image Analysis (annotation is expensive) · Industrial Visual Inspection · Natural Language Pretraining · Scientific Data Analysis

Anti-Patterns

Doing pixel-level reconstruction in observation space, causing the model to over-focus on irrelevant details
Using random augmentation as the only contrastive learning signal, ignoring task-relevant semantic invariances
Excessive domain gap between self-supervised pretraining and downstream tasks, causing transfer failure

Self-supervised learning is the dark matter of intelligence. It's how animals and humans learn most of what they know.
Yann LeCun & Ishan Misra, Meta AI Blog, April 2021

Decision Timeline

9 Key Events

1983-09

Entered Université Pierre et Marie Curie for PhD, Beginning Neural Network Research

Context: In 1983, neural network research was marginal; mainstream AI was dominated by symbolic logic and expert systems. The backpropagation paper by Rumelhart, Hinton, and Williams had not yet been published (it appeared in 1986), and LeCun began exploring connectionism with almost no community support.

Decision: Chose to pursue a PhD at Université Pierre et Marie Curie (UPMC), focusing on neural networks and machine learning under Maurice Milgram.

Reasoning: LeCun had strong curiosity about the computational principles of biological neural systems, believing that mimicking the brain's learning mechanisms was more promising than hand-crafting rules.

Outcome: Completed his doctoral thesis in 1987, proposing early ideas for convolutional neural networks and laying the theoretical foundation for the later invention of LeNet.

Lesson: Persisting in research in a direction unsupported by the mainstream paradigm requires great confidence in technical intuition; entering a correct but neglected field early yields enormous long-term returns.

lecun-model-inductive-bias · lecun-model-contrarian-rigor

1988-01

Joined Bell Labs, Beginning CNN Development for Character Recognition

Context: Bell Labs was one of the world's most important industrial research institutions, with ample resources and high research freedom. AT&T needed automated check recognition systems to reduce bank processing costs, providing LeCun the perfect scenario to translate theoretical research into practical applications.

Decision: Joined Bell Labs' Adaptive Systems Research Department, focusing on applying convolutional neural networks to handwritten digit and character recognition.

Reasoning: Bell Labs provided industrial-scale computing resources and real application scenarios; handwriting recognition was a concrete problem with clear evaluation criteria, suitable for validating CNN effectiveness.

Outcome: During his time at Bell Labs, invented the LeNet series and successfully deployed it in the US banking system, processing over 10% of all American bank checks — the first large-scale industrial deployment in deep learning history.

Lesson: Industrial research institutions can provide resources and application scenarios that academia cannot match; combining theoretical breakthroughs with real-world needs is the most effective way to accelerate innovation.

lecun-model-inductive-bias

1998-11

Published LeNet-5 Paper, Establishing the Complete Architecture of Modern CNNs

Context: In the late 1990s, deep learning faced strong competition from SVMs and kernel methods, and academic enthusiasm for neural networks was cooling. LeCun's LeNet had already succeeded in practice but lacked a systematic theoretical paper summarizing its architectural principles.

Decision: Published the 46-page 'Gradient-Based Learning Applied to Document Recognition' in Proceedings of the IEEE, systematically articulating the design principles of LeNet-5 architecture, convolutional layers, pooling layers, fully connected layers, and Graph Transformer Networks.

Reasoning: A systematic paper summary would allow other researchers to understand and replicate the CNN architecture; in the era of SVM dominance, rigorous experiments were needed to prove CNN's competitiveness.

Outcome: The paper became one of the most cited papers in deep learning history (over 20,000 citations); LeNet-5's architectural design directly influenced all subsequent convolutional neural networks, including AlexNet, VGG, and ResNet.

Lesson: Systematizing engineering practice into theoretical papers is the key step to amplifying research impact; a well-timed, rigorously argued paper can define a field for decades.

lecun-model-inductive-bias · lecun-model-energy-based

2003-01

Joined NYU, Founded Machine Learning Lab, Persisting in Deep Learning

Context: 2003 was during the second AI winter, with SVMs and kernel methods dominating machine learning; deep learning was marginalized by the mainstream academic community. After AT&T Labs reorganized, LeCun faced a career choice and chose to return to academia to continue deep learning research.

Decision: Accepted a faculty position at NYU's Courant Institute of Mathematical Sciences, founded the Machine Learning and Perception Lab (later renamed CILVR), and continued advancing deep learning and energy-based model research.

Reasoning: The academic environment provided freedom for long-term research; NYU's location in New York facilitated maintaining connections with industry; firm belief in deep learning's long-term correctness, willing to continue cultivating during the winter.

Outcome: The NYU Machine Learning Lab became one of the important bases for deep learning's renaissance, training numerous deep learning talents. Together with Hinton (Toronto) and Bengio (Montreal), formed a triangular research center that collectively drove the deep learning explosion following AlexNet in 2012.

Lesson: Persisting in the correct direction during a technological winter requires enormous conviction; building academic institutions is the most solid way to accumulate long-term influence, even without mainstream recognition in the short term.

lecun-model-contrarian-rigor · lecun-model-energy-based

2013-12

Joined Facebook, Founded FAIR, Bringing Open Science to Industrial AI Research

Context: After AlexNet demonstrated deep learning's breakthrough capabilities in 2012, tech giants began massively recruiting AI researchers. Facebook's Mark Zuckerberg personally invited LeCun to lead its AI research division, offering resources that academia could not match.

Decision: Accepted Facebook's invitation to found FAIR (Facebook AI Research), while retaining his NYU professorship, insisting on open publication as a core principle — FAIR's research results must be publicly released.

Reasoning: Facebook's computing resources and data scale were beyond what academia could provide; but LeCun insisted on open publication as a condition for joining, believing closed research would harm scientific progress and Facebook's long-term reputation.

Outcome: FAIR rapidly became one of the world's top AI research institutions, publishing numerous high-impact papers and driving the open-source release of PyTorch. LeCun's insistence gave Meta AI an open research culture distinct from OpenAI.

Lesson: Upholding core principles (such as open publication) when joining commercial institutions can build differentiated institutional culture and credibility over the long term; principled compromises often have high costs.

lecun-model-open-research

2019-03

Shared Turing Award with Hinton and Bengio, Deep Learning Trio Receives Highest Honor

Context: Deep learning had completely transformed computer vision, NLP, and speech recognition between 2012-2018; AlphaGo defeated world champion Go players; AI became the world's hottest technology topic. ACM decided to award the 2018 Turing Award to the three founders of deep learning.

Decision: Accepted the Turing Award, co-attended the award ceremony with Geoffrey Hinton and Yoshua Bengio, and the three delivered speeches on deep learning's past, present, and future.

Reasoning: The Turing Award is the highest honor in computer science; recognition of deep learning was also formal confirmation of thirty years of persistence.

Outcome: The Turing Award marked deep learning's formal entry into the core of computer science from a marginal discipline, greatly elevating AI research's social status and giving the three laureates greater public influence and voice.

Lesson: Persisting for decades in a field ignored by the mainstream may ultimately yield the highest recognition; science's time scale is far longer than business cycles.

lecun-model-contrarian-rigor

2022-06

Released 'A Path Towards Autonomous Machine Intelligence' White Paper, Proposing JEPA World Model Framework

Context: ChatGPT had not yet been released, but GPT-3 had already demonstrated LLMs' remarkable capabilities, and industry confidence in the LLM route was rapidly rising. LeCun believed the fundamental limitations of LLMs were severely underestimated and that an alternative route needed to be systematically proposed.

Decision: Released a 60-page white paper 'A Path Towards Autonomous Machine Intelligence', systematically articulating the AGI path based on world models and proposing JEPA (Joint Embedding Predictive Architecture) as the core technical framework.

Reasoning: Proposing an alternative framework before the LLM boom could take the initiative in technical debates; the white paper format allows systematic articulation and is better suited for disseminating macro-level technical visions than academic papers.

Outcome: The white paper sparked extensive discussion in the AI research community; the JEPA framework became an important reference for self-supervised learning and world model research. Although LLMs continued to dominate in the following two years, LeCun's critique prompted more researchers to think about LLMs' fundamental limitations.

Lesson: Proposing a systematic alternative framework on the eve of a paradigm shift is the most effective way to build long-term technical influence; even if not immediately accepted, a good framework will be rediscovered when the time is right.

lecun-model-world-model-hierarchy · lecun-model-contrarian-rigor

2023-02

Drove Meta to Release LLaMA Open-Source Large Model, Reshaping the Open-Source AI Landscape

Context: After ChatGPT's release in November 2022, OpenAI's closed approach became the industry mainstream, with Google and Microsoft following suit. The open-source AI community faced the risk of marginalization and needed a sufficiently capable open-source foundation model to break the monopoly.

Decision: Drove Meta to release LLaMA (Large Language Model Meta AI) with model weights open under a research license, enabling researchers to conduct research and improvements on this foundation.

Reasoning: An open-source foundation model could activate the global research community's innovative power while building an ecosystem moat for Meta; this was also the direct practice of LeCun's open science belief.

Outcome: The LLaMA series (including subsequent LLaMA 2 and LLaMA 3) became the foundation of the open-source AI ecosystem, catalyzing hundreds of derivative models including Alpaca and Vicuna, completely transforming the AI research landscape and making high-quality LLM research no longer the exclusive domain of a few closed institutions.

Lesson: Opening the right resources at the right moment can create ecosystem effects far exceeding expectations; open source is not charity but a strategic choice for building long-term platform advantage.

lecun-model-open-research

2024-01

Released V-JEPA Visual World Model, Advancing Empirical Validation of JEPA Architecture

Context: After the 2022 white paper proposed the JEPA framework, concrete experimental results were needed to validate its effectiveness. Following I-JEPA (Image JEPA)'s release in 2023 with good results, V-JEPA in 2024 extended the framework to video understanding.

Decision: Released V-JEPA (Video Joint Embedding Predictive Architecture), validating the JEPA framework's effectiveness on video understanding tasks and open-sourcing model weights.

Reasoning: Video understanding requires temporal world models, making it the ideal test bed for validating the JEPA framework; open-sourcing results allows global researchers to participate in improvement and validation.

Outcome: V-JEPA achieved results superior to supervised learning baselines on multiple video understanding benchmarks, providing empirical support for the JEPA framework and attracting more researchers' attention to the world model route.

Lesson: Theoretical frameworks need empirical results to gain broad recognition; decomposing bold theoretical visions into verifiable experimental steps is the pragmatic path to advancing paradigm shifts.

lecun-model-world-model-hierarchy · lecun-model-self-supervised

Reading List

Books

Recommended by (1)

Deep Learning

Ian Goodfellow, Yoshua Bengio & Aaron Courville · 2016

LeCun wrote a foreword for this book, calling it 'the most comprehensive textbook in deep learning,' and has listed it as required reading in multiple lectures and interviews. The book's theoretical framework aligns closely with LeCun's research direction.

About (1)

The Deep Learning Revolution

Terrence Sejnowski · 2018

Sejnowski is a participant and chronicler of deep learning history; this book details the research journeys of deep learning pioneers including LeCun. LeCun has recommended this book on multiple occasions as the authoritative reference for understanding deep learning history, calling it 'an accurate record of our generation's work.'

Cited in (3)

Perceptrons: An Introduction to Computational Geometry

Marvin Minsky & Seymour Papert · 1969

LeCun has mentioned this book in multiple interviews for its importance to deep learning history: Minsky and Papert's critique of perceptrons is widely credited with contributing to the first AI winter, and it indirectly motivated LeCun and others to prove the power of multi-layer networks. This is a key historical document for understanding why neural network research was marginalized in the 1970s-80s.

Parallel Distributed Processing: Explorations in the Microstructure of Cognition

David Rumelhart & James McClelland · 1986

LeCun listed this book as one of deep learning's foundational texts in his Turing Award lecture and multiple interviews. The backpropagation chapter in this book, written by Rumelhart, Hinton, and Williams, is the technical starting point of LeCun's entire research direction.

The Alignment Problem

Brian Christian · 2020

LeCun has cited this book in public discussions about AI safety but with a critical reading stance — he believes the book's description of AI risks is overly pessimistic and inconsistent with his judgment on AGI timelines and risks. This 'critical citation' reflects LeCun's strong opposition to AI safety doomsday narratives.

Influence Network

Origins, Contemporaries & Legacy

Influenced By

Geoffrey Hinton · Technical Inspiration

The backpropagation work that Hinton published with Rumelhart and Williams (1986) is the technical foundation of LeCun's entire research direction; during a brief postdoctoral collaboration with Hinton at the University of Toronto, LeCun deepened his understanding of gradient-based learning.

Yoshua Bengio · Academic Fellow Traveler

One of the three godfathers of deep learning; long-term collaborator with LeCun in advancing deep learning theory, with deep academic overlap in self-supervised learning and sequence models.

Léon Bottou · Research Collaboration

LeCun's long-term collaborator at Bell Labs; co-developed applications of stochastic gradient descent (SGD) in large-scale learning and is one of the co-authors of the LeNet-5 paper.

Influenced

Andrej Karpathy · Technical Heritage

Karpathy extensively cited LeCun's CNN work in Stanford's CS231n course; LeCun's convolutional neural networks profoundly influenced Karpathy's research direction and teaching style.

Global Computer Vision Research Community · Paradigm Influence

LeNet and the CNN framework laid the groundwork for later computer vision architectures including AlexNet, VGG, ResNet, and Inception; LeCun's convolutional ideas are a technical foundation of the visual AI field.

Open-source AI Community · Ecosystem Building

LeCun championed Meta's open release of the LLaMA series of models, which energized the global open-source AI ecosystem, meant that high-quality LLM research was no longer the exclusive domain of closed institutions, and profoundly influenced the democratization of AI.

Co-thinkers

Yoshua Bengio · Academic Fellow Traveler

One of the three godfathers of deep learning; co-drove the academic renaissance of deep learning with LeCun, with long-term collaboration and exchange in self-supervised learning, sequence models, and AI ethics.

Geoffrey Hinton · Academic Partner & Dissenter

One of the three godfathers of deep learning; shared the Turing Award with LeCun but holds starkly different positions on AI safety. Hinton publicly expressed concerns about AI risks after leaving Google, while LeCun firmly opposes AI doomsday narratives.

Gary Marcus · Critical Fellow Traveler

Cognitive scientist and AI critic; shares LeCun's critical view of LLM limitations but diverges on solutions. Marcus favors neuro-symbolic hybrid approaches, while LeCun favors a purely neural, world-model-based route.

Peer Reviews

Yann LeCun has an extraordinary ability to take the most complex ideas in AI and make them feel obvious and accessible. nanoGPT is a masterpiece of pedagogical engineering.
Andrej Karpathy · Andrej Karpathy Twitter/X post, January 2023

LeCun's work on convolutional networks is one of the most important contributions to machine learning in the last 30 years. It's the foundation on which modern AI is built.
Geoffrey Hinton · Geoffrey Hinton, Turing Award ceremony remarks, 2019

Yann is one of the most intellectually courageous people I know. He's willing to be wrong in public, to defend positions that are unpopular, and to change his mind when the evidence demands it.
Yoshua Bengio · Yoshua Bengio, interview with MIT Technology Review, 2020

Yann LeCun

Core Knowledge Graph

Core Beliefs

Convolutional Structure Is the Correct Inductive Bias for Visual Intelligence

True Intelligence Requires World Models, Not Just Next-Token Prediction

AI Research Should Be Published Openly; Open Science Accelerates Collective Progress

LLMs Cannot Lead to AGI; Autoregressive Text Generation Is a Fundamental Dead End

Self-Supervised Learning Is the Correct Path Toward Human-Level Perception

Mental Models

Inductive Bias Design Principle

World Model Hierarchy Framework

Open Research Accelerator Effect

Energy-Based Model Unified Framework

Evidence-Backed Contrarianism

Values & Paradoxes

Deep Learning Founder Who Holds the Strongest Critique of Deep Learning's Mainstream Direction

Balancing Meta Chief Scientist Role with Academic Independence

Emphasizes Physical World Understanding Yet Works at a Purely Digital Company

Evolution Phases

Bell Labs Invention Phase

NYU Academic Institution-Building Phase

Meta AI Industrial Research Phase

World Model Advocacy Phase

9 Key Events

Entered Université Pierre et Marie Curie for PhD, Beginning Neural Network Research

Joined Bell Labs, Beginning CNN Development for Character Recognition

Published LeNet-5 Paper, Establishing the Complete Architecture of Modern CNNs

Joined NYU, Founded Machine Learning Lab, Persisting in Deep Learning

Joined Facebook, Founded FAIR, Bringing Open Science to Industrial AI Research

Shared Turing Award with Hinton and Bengio, Deep Learning Trio Receives Highest Honor

Released 'A Path Towards Autonomous Machine Intelligence' White Paper, Proposing JEPA World Model Framework

Drove Meta to Release LLaMA Open-Source Large Model, Reshaping the Open-Source AI Landscape

Released V-JEPA Visual World Model, Advancing Empirical Validation of JEPA Architecture
