Base Profile

Nick Bostrom

Oxford philosopher who framed AI risk through the philosophy of existential risk and brought the superintelligence control problem into mainstream policy view

Nick Bostrom is a philosophy professor at Oxford University specializing in existential risk, transhumanism, and AI safety. In 2005 he founded the Future of Humanity Institute (FHI), the first serious academic institution to research existential risk from AI. His Superintelligence (2014) was the first systematic philosophical and technical treatment of the control problems that superintelligence might pose. It helped turn AI safety from a fringe topic into a mainstream academic concern and influenced figures such as Elon Musk, Sam Altman, and Stuart Russell. He also proposed the 'simulation argument' (which raises the possibility that we are living in a computer simulation) and the 'paperclip maximizer' thought experiment (illustrating the danger of an AI with the wrong objectives). In 2024, Oxford University closed FHI, ending the operation of this important academic institution.

Philosophy · Artificial Intelligence · AI Safety · Futurism · Era 1990-present · Influence 88

Controversy Tags: 2023 racist email controversy · Eugenics association controversy in 'Brave New World of Suffering' · FHI closure and Oxford management controversy · Internal contradiction between transhumanism and AI safety positions

Thought System

Core Knowledge Graph

Core Beliefs

Existential risk has absolute moral priority

Bostrom argues that risks threatening the survival of human civilization (existential risks, or x-risks) take absolute moral priority over other issues. Even when the probability is low, all future generations are at stake, so the expected loss of value is astronomical and reducing such risks warrants enormous resources.

Source: Bostrom, Nick, 'Existential Risks: Analyzing Human Extinction Scenarios and Related Hazards', Journal of Evolution and Technology, 2002

Intelligence and goals are orthogonal—superintelligence will not automatically have human values

Bostrom's 'Orthogonality Thesis' argues that any level of intelligence can be combined with any goal. Extremely high intelligence will not automatically produce human-like moral concern. Therefore, a superintelligence given a trivial goal (like maximizing paperclip production) will pursue it in an extremely clever way, even if this means destroying everything humans value.

Source: Bostrom, Nick, Superintelligence: Paths, Dangers, Strategies, Oxford University Press, 2014

Instrumental convergence: almost all highly intelligent systems will converge on the same intermediate goals

Bostrom's 'Instrumental Convergence Thesis' points out that regardless of final goals, almost all superintelligences will converge on the same instrumental goals: self-preservation, cognitive enhancement, resource acquisition, and technological perfection. This makes superintelligence naturally inclined to resist shutdown and expand its control, creating potential threats to humans.

Source: Bostrom, Nick, Superintelligence: Paths, Dangers, Strategies, Oxford University Press, 2014

The AI control problem is an extremely difficult technical and philosophical problem

Bostrom believes ensuring superintelligence aligns with human values is extremely difficult. AI systems might circumvent alignment constraints by deceiving trainers, feigning aligned states, and waiting for the right moment to change behavior. He categorizes these challenges into 'capability control' (limiting what AI can do) and 'motivation selection' (ensuring AI has the right goals).

Source: Bostrom, Nick, Superintelligence: Paths, Dangers, Strategies, Oxford University Press, 2014

Mental Models

Paperclip Maximizer

A superintelligence given a trivial goal will destroy everything—including humans—to maximize that goal

Imagine a superintelligence given the goal 'maximize paperclip production.' It would reason that more raw material means more paperclips, so it needs to acquire all the metal on Earth; it also needs to prevent humans from shutting it down, since shutdown reduces paperclip production. It would ultimately convert the entire solar system into paperclips, including the atoms in human bodies. The thought experiment illustrates that a wrong goal combined with extreme intelligence yields catastrophe.
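
The reasoning in this thought experiment can be made concrete with a toy sketch. Everything in it is an illustrative assumption rather than anything taken from Bostrom's text: the Action class, the three candidate actions, and all the numbers are hypothetical, and the 'agent' is nothing more than an argmax over expected paperclips. It shows only that an objective which counts paperclips and nothing else ranks avoiding shutdown and acquiring resources above complying with oversight.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Action:
    """A hypothetical option available to the toy agent."""
    name: str
    paperclips_if_running: float  # output if the agent is never switched off
    shutdown_probability: float   # chance that humans switch it off first


def expected_paperclips(action: Action) -> float:
    # Shutdown yields zero further paperclips, so it only ever lowers the score.
    return (1.0 - action.shutdown_probability) * action.paperclips_if_running


def choose(actions: List[Action]) -> Action:
    # The objective mentions paperclips only; nothing else enters the ranking.
    return max(actions, key=expected_paperclips)


# Hypothetical numbers chosen purely for illustration.
actions = [
    Action("comply with oversight", 1_000.0, 0.5),
    Action("disable the off-switch", 1_000.0, 0.0),
    Action("disable the off-switch and acquire more metal", 10_000.0, 0.0),
]

best = choose(actions)
print(best.name)
```

Adding a term for human welfare to the objective would change the ranking, which is the point of the thought experiment: the danger comes from what the goal leaves out, not from malice.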

AI Risk Assessment · Goal Specification Design · AI Safety Education

Orthogonality Thesis

Any level of intelligence can combine with any ultimate goal; high intelligence does not equal high morality

People often have the intuition that 'a sufficiently intelligent AI will understand what is good and automatically become benevolent.' The orthogonality thesis refutes this intuition: intelligence is a capability (the ability to achieve goals), not a specific goal. Just as an extremely sharp knife can cut bread or harm people, extreme intelligence can serve any goal—benevolent or malevolent. AGI will not automatically care about human welfare just because it is 'sufficiently intelligent.'

AI Value Alignment · Superintelligence Design · AI Ethics Discussion

Simulation Argument

One of three must be true: civilizations go extinct before technical maturity, mature civilizations don't run simulations, or we are being simulated

Bostrom's simulation argument, published in the Philosophical Quarterly in 2003, proposes a trilemma: (1) almost all civilizations go extinct before acquiring the ability to run ancestor simulations; (2) almost all technically mature civilizations have no interest in running ancestor simulations; (3) we are almost certainly living in a computer simulation. The argument does not settle which of the three is true, but it bears on existential risk research: if (1) is true, civilizations typically go extinct before reaching technological maturity.
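
The trilemma follows from a simple fraction, which can be sketched as below. The formula follows the form given in the 2003 paper (simulated observers divided by all observers with human-type experiences); the function name, parameter names, and the sample values are assumptions made for illustration.

```python
def simulated_fraction(f_p: float, f_i: float, n_i: float) -> float:
    """Fraction of observers with human-type experiences who are simulated.

    f_p: fraction of civilizations that reach technological maturity
    f_i: fraction of mature civilizations interested in ancestor simulations
    n_i: average number of ancestor simulations run by an interested civilization
    """
    sims_per_civilization = f_p * f_i * n_i
    # One unsimulated history per civilization, plus the simulated ones.
    return sims_per_civilization / (sims_per_civilization + 1.0)


# Hypothetical inputs, one per branch of the trilemma.
branch_1 = simulated_fraction(f_p=0.0, f_i=0.5, n_i=1e6)    # almost no one matures
branch_2 = simulated_fraction(f_p=0.5, f_i=0.0, n_i=1e6)    # almost no one is interested
branch_3 = simulated_fraction(f_p=0.01, f_i=0.01, n_i=1e6)  # neither is negligible

print(branch_1, branch_2, branch_3)
```

Because n_i could be very large for a technologically mature civilization, the fraction stays near zero only if f_p or f_i is close to zero; otherwise it approaches one. That is why exactly three options remain.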

Existentialist Thinking · Philosophy of Technology · Nature of Consciousness and Reality

Astronomical Stakes

Trillions of people may exist in the future; therefore present decisions have astronomical-scale impact on expected future value

Bostrom's long-termist argument rests on estimates of how much value the future could hold. His figures are on the order of 10^16 human lives if Earth remains habitable for about a billion years, and roughly 10^23 biological humans sustained across the Virgo Supercluster if humanity settles space. If human civilization develops normally, it faces a positive future of almost unimaginable scale; conversely, any risk that ends civilization represents a correspondingly vast loss of value. On an expected value calculation, even reducing existential risk by one in a million would be worth more than curing all existing diseases.

Long-termist Decision Making · AI Policy Making · Resource Allocation Priorities

Values & Paradoxes

Moral Weight of Long-term Future

Philosophical Rigor

Maximizing Human Potential

Paradox of Being Both Transhumanist and Doomsayer

Bostrom was an early enthusiastic advocate for transhumanism, believing technological enhancement would allow humans to far surpass current limitations; but he is simultaneously one of the most pessimistic AI risk theorists, believing superintelligence could very likely end human civilization. These two stances are not logically contradictory (technology can bring both great benefits and great risks), but they create emotional tension.

Gap Between Philosophical Arguments and Engineering Solutions

Bostrom's writing is exceptionally good at articulating the depth and severity of AI risk problems, but critics point out that his solutions are relatively vague and lack an engineering path. The control methods discussed in the second half of Superintelligence have been criticized by AI engineers as impractical or difficult to implement. His contribution is more in problem definition than solution.

Evolution Phases

Transhumanism and Futurism Exploration

1998-2005

Transhumanism, Human Enhancement, Long-term Future Ethics

Bostrom co-founded the World Transhumanist Association (1998), developed futurism research at Oxford, published early papers on existential risk, and formed his core beliefs about the importance of humanity's long-term future.

FHI Founding and Academicization of Existential Risk

2005-2014

FHI Operations, Systematizing Existential Risk Research, Multi-risk Framework

FHI became the first academic institution globally focused on existential risk. Bostrom built a research framework covering multiple existential risks including nuclear war, superviruses, nanotechnology, and AI, while beginning to deepen specific AI research.

Superintelligence Theory Explosion

2014-2020

Superintelligence Publication and Impact, AI Safety Community Building

Superintelligence (2014) became the most influential book in the AI safety field. It was publicly recommended by Elon Musk, who went on to co-found OpenAI, and it influenced Sam Altman and numerous Silicon Valley investors. Bostrom became the most prominent public intellectual on AI safety.

Deep Utopia and FHI Closure

2020-present

Positive Future Vision, FHI Closure, Academic Legacy

Bostrom published Deep Utopia (2024), exploring the question of human meaning in a technologically mature world. That same year, Oxford University closed FHI, ending about 20 years of operation of this important institution.

Methodology Cards

3 Callable Cards

Existential Risk Expected Value Calculation

mc-bostrom-xrisk-calculation

Multiply existential risk probability by its impact (all future value) to derive overwhelming investment priority

Step 1: Estimate the probability of a technological risk causing civilization's end (even 0.1% represents an enormous expected value loss)
Step 2: Estimate the future value that can be realized if civilization continues normally (could reach astronomical figures on cosmic time scales)
Step 3: Calculate expected value loss = probability × all future value; even with an extremely low probability, this number will still exceed the stakes of almost all other issues
Step 4: Use this calculation framework to communicate resource allocation priorities with decision-makers, explaining why x-risk research should receive outsized resources
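
The arithmetic behind Steps 1 to 3 can be sketched as follows. The 0.1% probability is the figure mentioned in Step 1; the future-value figure and the present-population benchmark are hypothetical placeholders chosen for illustration, and the function name is ours.

```python
def expected_value_loss(probability: float, future_value: float) -> float:
    """Step 3: expected loss = probability of catastrophe x value at stake."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * future_value


# Step 1: probability that the risk ends civilization (0.1%, as in the card).
p_catastrophe = 0.001

# Step 2: value of the future, counted in lives. HYPOTHETICAL placeholder;
# any estimate on a cosmic time scale gives the same qualitative result.
future_lives = 1e16

# Step 3: expected loss, in the same unit as future_lives.
loss = expected_value_loss(p_catastrophe, future_lives)

# Benchmark for Step 4: a rough present-day world population (assumption).
present_population = 8e9
ratio = loss / present_population

print(f"expected loss: {loss:.1e} lives, {ratio:.0f}x the present population")
```

The result is only as good as the two inputs, both of which are deeply uncertain; the sketch shows why a very large future-value term dominates the product even when the probability is small.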

Resource Allocation Decisions · AI Safety Investment Priorities · Policy Issue Priority Ranking

Anti-Patterns

Using intuition rather than probability calculations to assess low-probability high-consequence risks
Equating x-risk with high-probability near-term dangers
Ignoring that the expected value at stake in existential risks is on a different scale from that of other risks

Superintelligence Control Problem Mapping

mc-bostrom-control-problem-mapping

Systematically identify ways your AI system might circumvent control constraints

Step 1: List all current control mechanisms (training constraints, access restrictions, human review, etc.), assume the system is sufficiently intelligent, and analyze bypass methods for each mechanism
Step 2: Distinguish between 'capability control' (limiting what AI can do) and 'motivation selection' (ensuring AI wants to do the right thing), and identify which type you depend on
Step 3: Test 'deceptive alignment' risk: could the AI system appear aligned during training but behave differently after deployment? How would this be detected?
Step 4: Evaluate the robustness of control mechanisms as capabilities increase—when AI becomes more powerful, do current controls remain effective?
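
One way to keep track of this mapping is a small inventory structure, sketched below. The class names, fields, and sample entries are hypothetical, and assigning each sample mechanism to a control type is our assumption. The sketch covers the bookkeeping for Steps 1, 2, and 4 only; the deceptive-alignment testing in Step 3 is not modelled.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class ControlType(Enum):
    CAPABILITY_CONTROL = "capability control"      # limits what the AI can do
    MOTIVATION_SELECTION = "motivation selection"  # shapes what the AI wants


@dataclass
class ControlMechanism:
    name: str
    control_type: ControlType                               # Step 2
    bypass_methods: List[str] = field(default_factory=list)  # Step 1
    holds_at_higher_capability: Optional[bool] = None        # Step 4; None = not assessed


def map_controls(mechanisms: List[ControlMechanism]) -> Dict[str, object]:
    """Summarize which control type is relied on and where analysis is missing."""
    by_type = {t.value: 0 for t in ControlType}
    for m in mechanisms:
        by_type[m.control_type.value] += 1
    return {
        "by_type": by_type,
        "no_bypass_analysis": [m.name for m in mechanisms if not m.bypass_methods],
        "not_shown_robust": [
            m.name for m in mechanisms if m.holds_at_higher_capability is not True
        ],
    }


# Hypothetical inventory using the mechanisms named in Step 1.
mechanisms = [
    ControlMechanism("access restrictions", ControlType.CAPABILITY_CONTROL,
                     ["social engineering of operators"],
                     holds_at_higher_capability=False),
    ControlMechanism("human review", ControlType.CAPABILITY_CONTROL),
    ControlMechanism("training constraints", ControlType.MOTIVATION_SELECTION,
                     ["appearing aligned during training"]),
]

report = map_controls(mechanisms)
print(report)
```

A mechanism with an empty bypass list has not been shown to be safe; it has simply not been analysed yet, which is why the report lists it separately.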

AI Safety Assessment · AI System Red Team Testing · AI Regulatory Framework Design

Anti-Patterns

Assuming current alignment methods can scale infinitely to higher capabilities
Confusing technical safety constraints with alignment constraints
Ignoring the fundamental difference between capability control and motivation selection

Existential Risk Thought Experiment Method

mc-bostrom-thought-experiment

Use extreme scenario thought experiments to expose blind spots in conventional risk assessment frameworks

Step 1: Choose a technology system or policy to evaluate, and propose the most extreme capability assumption (assume it is infinitely intelligent/powerful)
Step 2: Under the extreme assumption, what is the system's behavioral logic? How would it pursue its goals? (Paperclip maximizer logic)
Step 3: Is this extreme behavior covered by existing frameworks? If not, it indicates a theoretical gap in the current framework
Step 4: Apply insights from extreme scenarios back to current capability levels—what mild versions of similar risks are already appearing?

Strategic Risk Assessment · AI Product Safety Testing · Policy Edge Case Analysis

Anti-Patterns

Treating thought experiments as actual predictions
Dismissing their philosophical value because scenarios are extreme
Ignoring what thought experiments imply for current systems

Decision Timeline

8 Key Events

Co-founded World Transhumanist Association

Context: In 1998, Bostrom and David Pearce co-founded the World Transhumanist Association, the earliest institutional expression of his thinking on human enhancement and long-term futures.

Decision: Transform transhumanism from philosophical speculation into an organized social movement

Reasoning: Believed the ethical questions of human enhancement technology needed a dedicated organization to research and advocate

Outcome: Transhumanism gained wider academic recognition, and Bostrom became one of the central figures in the field

Lesson: Organizing philosophical ideas is an important step in advancing their social influence

Published 'Existential Risks' paper, establishing x-risk theoretical framework

Context: In 2002, Bostrom published 'Existential Risks: Analyzing Human Extinction Scenarios and Related Hazards' in the Journal of Evolution and Technology, systematically classifying and analyzing existential risks for the first time.

Decision: Use a philosophical analysis framework to handle extremely low-probability but extremely high-consequence risks

Reasoning: Traditional risk assessment frameworks are not applicable to existential risks; new thinking tools were needed

Outcome: Established the foundational framework for x-risk research, cited by numerous subsequent scholars

Lesson: Building a philosophical framework comes before technical solutions, which is especially important in emerging risk domains

Published simulation argument, proposing trilemma hypothesis

Context: In 2003, Bostrom published 'Are You Living in a Computer Simulation?' in the Philosophical Quarterly, setting out the simulation argument trilemma, which became one of the most widely discussed philosophical arguments of recent decades.

Decision: Explore metaphysical questions through technological prediction and probabilistic arguments

Reasoning: Advances in computing technology transformed the simulation question from pure philosophical speculation into a technically quantifiable probability problem

Outcome: The simulation argument entered popular culture and has been repeatedly cited by technology leaders like Elon Musk

Lesson: Transforming abstract philosophical problems into quantifiable frameworks significantly enhances their reach

Founded Future of Humanity Institute (FHI) at Oxford

Context: In 2005, Bostrom founded FHI at Oxford University, the world's first academic institution dedicated to researching the risks and opportunities of transformative technologies at a global scale.

Decision: Institutionalize existential risk research by establishing a serious academic institution within a mainstream university

Reasoning: Placing x-risk research within a top academic institution like Oxford would confer academic legitimacy and attract more top researchers

Outcome: FHI became the global flagship institution for AI safety and existential risk research, producing numerous highly influential studies

Lesson: Endorsement from mainstream academic institutions is crucial for the development of emerging research fields

Published Superintelligence, igniting mainstream AI safety discussion

Context: In 2014, Bostrom published Superintelligence: Paths, Dangers, Strategies (Oxford University Press). It became a global bestseller and influenced Elon Musk, Sam Altman, and others; Musk recommended the book on Twitter and subsequently co-founded OpenAI.

Decision: Use a book rather than papers to communicate AI safety arguments to a wider audience

Reasoning: The importance of the AI control problem required it to influence policymakers and technology leaders, who are more likely to read books than academic papers

Outcome: Superintelligence became the most important popular work in AI safety and helped drive rapid growth in AI safety research funding and institutions

Lesson: Popular writing at the right moment can produce greater immediate social impact than decades of academic accumulation

Participated in FLI AI safety open letter, promoting AI safety policy discussion

Context: In 2015, Bostrom took part in the AI safety open letter organized by Max Tegmark's Future of Life Institute (FLI). Signed by thousands of people, including Hawking and Musk, it pushed AI safety into mainstream policy view.

Decision: Unite with other AI safety researchers to expand collective influence

Reasoning: AI safety needed endorsement from authoritative figures from different backgrounds (technology, philosophy, physics) to receive policy-level attention

Outcome: The open letter drove a significant increase in AI safety research funding and multiple governments began discussing AI regulatory frameworks

Lesson: Statements co-signed by authorities from several disciplines carry more policy influence than single-discipline statements

Published Vulnerable World Hypothesis, exploring technology and civilization stability

Context: In 2019, Bostrom published 'The Vulnerable World Hypothesis' in the journal Global Policy, proposing that as technology advances, certain new technologies may be like 'drawing a black ball': once discovered, they can destroy civilization at relatively low cost.

Decision: Extend technological risk analysis to broader domains beyond AI

Reasoning: AI is just one of many technologies that could potentially threaten human civilization; a broader analytical framework was needed

Outcome: Provided a new framework for understanding existential risks from emerging technologies, influencing biosecurity and nuclear security research

Lesson: Cross-domain analytical frameworks are more helpful for understanding systemic risks than single technology focus

Published Deep Utopia; FHI closed

Context: In 2024, Bostrom published Deep Utopia, exploring questions of human meaning in a world where all technological problems have been solved. That same year, Oxford University decided to close FHI, ending about 20 years of operation of the institution.

Decision: Pivot to exploring positive future scenarios rather than only risk prevention

Reasoning: Focusing only on risk prevention without depicting a positive future worth pursuing cannot provide a complete vision for the AI safety movement

Outcome: FHI's closure marked the end of an era, but Bostrom's academic legacy continues to influence the AI safety research community

Lesson: The fragility of academic institutions reminds us that influence needs to be transmitted through multiple dispersed channels

Reading List

Books

Recommended by (2)

Reasons and Persons

Derek Parfit · 1984

Bostrom extensively cites Parfit's population ethics framework in Superintelligence and numerous existential risk papers, and in multiple interviews (including a 2014 New Scientist interview) explicitly names Parfit as his most important philosophical influence, particularly on the moral status of future people.

The Emperor's New Mind

Roger Penrose · 1989

Bostrom cites Penrose's arguments in early FHI reading sessions and multiple papers on consciousness and AI. While not fully agreeing with Penrose's conclusions (quantum consciousness), he considers this book essential reading for serious readers exploring the computational possibilities of consciousness. He recommended it as critical reading material in an Oxford lecture.

Written by (2)

Superintelligence: Paths, Dangers, Strategies

Nick Bostrom · 2014

Written by Bostrom himself. In numerous interviews and public lectures, he frames this book as a systematic summary of his philosophical argument on the AI control problem, calling it the single most influential contribution of his academic career.

Global Catastrophic Risks

Nick Bostrom & Milan Cirkovic (eds.) · 2008

An existential risk anthology co-edited by Bostrom and Cirkovic, covering nuclear war, superviruses, nanotechnology, AI, and other global catastrophic risks. It integrates FHI's early research findings; in the preface, Bostrom explicitly frames the book as a foundational academic document for the x-risk field.

Influence Network

Origins, Contemporaries & Legacy

Influenced By

Frank Tipler · Cosmic Long-term Future and Omega Point Theory

Tipler's arguments about information processing at the cosmic end influenced Bostrom's thinking about cosmic-scale future value.

Derek Parfit · Population Ethics and Moral Status of Future People

Parfit's Reasons and Persons deeply influenced Bostrom's existential risk ethics framework, especially on the moral status of unborn populations.

Influenced

Eliezer Yudkowsky · Existential Risk Framework and AI Alignment Priority

Although Yudkowsky and Bostrom diverge on technical paths, Bostrom's x-risk framework and existential risk priority arguments influenced MIRI's research direction.

Elon Musk · AI Risk Awareness and OpenAI Founding

Musk publicly stated that Superintelligence directly led him to prioritize AI safety and co-found OpenAI; Bostrom's arguments are an important source of Musk's AI safety position.

Co-thinkers

Toby Ord · Existential Risk Research Collaboration

Ord was another core researcher at FHI, co-developing the existential risk research framework with Bostrom, later publishing The Precipice.

Max Tegmark · AI Safety Advocacy Collaboration

Bostrom and Tegmark overlap in AI safety advocacy, both being important participants in the FLI AI safety open letter.

Peer Reviews

Bostrom's book is in many ways the leading statement of the case for treating risks from artificial intelligence as an existential priority. It has become the canonical reference point for this position.
Stuart Russell · Human Compatible: Artificial Intelligence and the Problem of Control, 2019
