Nick Bostrom: Methodologies, Key Decisions & Mental Models

Nick Bostrom

Oxford philosopher who framed AI risk as an existential risk and brought the superintelligence control problem into mainstream academic and policy debate

Nick Bostrom is a philosopher, long based at Oxford University, specializing in existential risk, transhumanism, and AI safety. He founded the Future of Humanity Institute (FHI) in 2005, one of the first academic institutions to research existential risk from AI seriously. His book Superintelligence (2014) was among the first systematic philosophical and technical treatments of the control problem that superintelligence might pose. It helped turn AI safety from a fringe topic into a mainstream academic concern and influenced figures like Elon Musk, Sam Altman, and Stuart Russell. He also proposed the 'simulation argument' (concerning the hypothesis that we may be living in a computer simulation) and the 'paperclip maximizer' thought experiment (illustrating the danger of an AI pursuing the wrong objective). In 2024, Oxford University closed FHI.

Methodologies

Existential Risk Expected Value Calculation - Multiply the probability of an existential catastrophe by its impact (the loss of all future value) to estimate the expected loss; because the impact is so large, even a small reduction in probability ranks as an overwhelming investment priority
Superintelligence Control Problem Mapping - Systematically identify the ways an AI system might circumvent its control constraints
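
The expected value calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Bostrom's own model: the function name and every number are hypothetical placeholders chosen to show how a tiny change in probability can outweigh a large but bounded benefit.

```python
def expected_future_lives_saved(risk_reduction: float, future_lives: float) -> float:
    """Expected number of future lives preserved by lowering the
    probability of existential catastrophe by `risk_reduction`."""
    if not 0.0 <= risk_reduction <= 1.0:
        raise ValueError("risk_reduction must be a probability between 0 and 1")
    if future_lives < 0:
        raise ValueError("future_lives must be non-negative")
    return risk_reduction * future_lives


# Hypothetical inputs, chosen only for illustration.
FUTURE_LIVES = 1e16       # assumed size of the long-term future, in lives
RISK_REDUCTION = 1e-6     # assumed one-in-a-million drop in extinction risk
DIRECT_BENEFIT = 1e6      # assumed certain benefit of an alternative project, in lives

x_risk_value = expected_future_lives_saved(RISK_REDUCTION, FUTURE_LIVES)

# Under these assumptions the x-risk project dominates by orders of magnitude.
print(f"expected lives saved via risk reduction: {x_risk_value:.3g}")
print(f"ratio to the direct benefit: {x_risk_value / DIRECT_BENEFIT:.3g}")
```

The conclusion is only as strong as the inputs: the result scales linearly with the assumed size of the future and with the assumed risk reduction, both of which are highly uncertain.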

Key decisions and timeline

Co-founded the World Transhumanist Association (1998) - Organizing philosophical ideas is an important step in advancing their social influence
Published the 'Existential Risks' paper (2002), establishing the x-risk theoretical framework - Building philosophical frameworks precedes technical solutions, which is especially important for emerging risk domains
Published the simulation argument (2003), which poses a trilemma - Turning an abstract philosophical problem into a quantifiable framework significantly enhances its reach
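
As a rough illustration of what "quantifiable" means for the simulation argument, the sketch below computes the fraction of human-type observers who would be simulated, using the simplified form in which the argument is usually presented. The function and parameter names are hypothetical, and the input values are placeholders, not estimates.

```python
def simulated_fraction(f_p: float, f_i: float, n_i: float) -> float:
    """Fraction of human-type observers living in simulations.

    f_p: fraction of human-level civilizations that reach a posthuman stage
    f_i: fraction of posthuman civilizations interested in running
         ancestor-simulations
    n_i: average number of ancestor-simulations run by an interested
         civilization
    """
    for name, value in (("f_p", f_p), ("f_i", f_i)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    if n_i < 0:
        raise ValueError("n_i must be non-negative")
    expected_sims = f_p * f_i * n_i          # simulated histories per real one
    return expected_sims / (expected_sims + 1.0)


# Placeholder inputs: unless f_p or f_i is close to zero, a large n_i
# pushes the fraction toward 1, which is the structure of the trilemma.
print(simulated_fraction(f_p=0.1, f_i=0.1, n_i=1e6))
print(simulated_fraction(f_p=0.0, f_i=0.1, n_i=1e6))
```

The three ways out of a high simulated fraction correspond to the three horns of the trilemma: almost no civilizations reach a posthuman stage, almost none run such simulations, or we are almost certainly in one.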

Beliefs and mental models

Belief 1 - Bostrom argues that risks threatening the survival of human civilization (existential risks, or x-risks) deserve overriding moral priority over other issues. Even if the probability is low, all future generations are at stake, so the expected loss is astronomical. Reducing such risks therefore warrants enormous resources.
Belief 2 - Bostrom's 'Orthogonality Thesis' holds that more or less any level of intelligence can be combined with more or less any final goal. Extremely high intelligence will not automatically produce human-like moral concern. A superintelligence given a trivial goal (like maximizing paperclip production) would therefore pursue it with extreme ingenuity, even if this means destroying everything humans value.
Belief 3 - Bostrom's 'Instrumental Convergence Thesis' holds that, regardless of their final goals, a wide range of superintelligences would converge on the same instrumental goals, such as self-preservation, goal-content integrity, cognitive enhancement, technological perfection, and resource acquisition. This makes a superintelligence naturally inclined to resist shutdown and expand its control, creating a potential threat to humans.
Model 1
Model 2
Model 3

Co-thinkers

Max Tegmark