Title: AI as a Precise Art
AI as a Precise Art
Eliezer Yudkowsky
Singularity Institute for Artificial Intelligence
singinst.org
- Cards: 70% blue, 30% red, in a randomized sequence
- Subjects paid 5¢ for each correct guess
- Subjects only guessed blue 76% of the time (on average)
- Optimal strategy is "Always guess blue"
- Strategy need not resemble the cards: a noisy strategy doesn't help in a noisy environment

(Tversky, A. and Edwards, W. 1966. "Information versus reward in binary choice." Journal of Experimental Psychology, 71, 680-683.)
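A quick way to see why the deterministic strategy wins: with 70% blue cards, always guessing blue is right 70% of the time, while matching the base rate is right only 0.7 × 0.7 + 0.3 × 0.3 = 58% of the time. A minimal simulation sketch (the 70/30 split is the slide's figure; everything else here is my own illustration):

```python
import random

P_BLUE = 0.70  # proportion of blue cards (figure from the slide)

def accuracy(strategy, n=100_000):
    """Empirical fraction of correct guesses for a guessing strategy."""
    correct = 0
    for _ in range(n):
        card = "blue" if random.random() < P_BLUE else "red"
        correct += (strategy() == card)
    return correct / n

def always_blue():
    return "blue"

def probability_matching():
    # Guess blue about as often as blue actually appears.
    return "blue" if random.random() < P_BLUE else "red"

print("always blue:         ", round(accuracy(always_blue), 3))           # ~0.70
print("probability matching:", round(accuracy(probability_matching), 3))  # ~0.58
```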
- Vernor Vinge: You can't predict any entity smarter than you, or you would be that smart yourself
- Deep Blue played better chess than its programmers, from which it follows that the programmers couldn't predict its exact moves
- Why go to all that work to write a program whose moves you can't predict? Why not just use a random move generator?
- It takes a vast amount of work to craft AI actions that are predictably so good that you can't predict them
- We run a program because we know something about the output and yet don't know the output itself
- Gilovich: "If we wish to disbelieve, we ask if the evidence compels us to accept the discomforting belief. If we wish to believe, we ask if the evidence prohibits us from keeping our preferred belief."
- The less you know, the less likely you are to get good results, but the easier it is to allow yourself to believe in good results.

(Gilovich, T. 2000, June. "Motivated skepticism and motivated credulity: Differential standards of evidence in the evaluation of desired and undesired propositions." Address presented at the 12th Annual Convention of the American Psychological Society, Miami Beach, Florida. Quoted in Brenner, L. A., Koehler, D. J. and Rottenstreich, Y. 2002. "Remarks on support theory: Recent advances and future directions." In Gilovich, T., Griffin, D. and Kahneman, D., eds. 2003. Heuristics and Biases: The Psychology of Intuitive Judgment. Cambridge, U.K.: Cambridge University Press.)
Mind Projection Fallacy
- If I am ignorant about a phenomenon,
- this is a fact about my state of mind,
- not a fact about the phenomenon.
- Confusion exists in the mind, not in reality.
- There are mysterious questions.
- Never mysterious answers.

(Inspired by Jaynes, E. T. 2003. Probability Theory: The Logic of Science. Cambridge: Cambridge University Press.)
6"The influence of animal or vegetable life on
matter is infinitely beyond the range of any
scientific inquiry hitherto entered on. Its power
of directing the motions of moving particles, in
the demonstrated daily miracle of our human
free-will, and in the growth of generation after
generation of plants from a single seed, are
infinitely different from any possible result of
the fortuitous concurrence of atoms... Modern
biologists were coming once more to the
acceptance of something and that was a vital
principle."
Intelligence Explosion
- Hypothesis: The smarter you are, the more creativity you can apply to the task of making yourself even smarter.
- Prediction: A positive feedback cycle rapidly leading to superintelligence.
- Extreme case of the more common belief that reflectivity / self-modification is one of the Great Keys to AI.

(Good, I. J. 1965. "Speculations Concerning the First Ultraintelligent Machine." Pp. 31-88 in Advances in Computers, 6, F. L. Alt and M. Rubinoff, eds. New York: Academic Press.)
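A deliberately toy illustration of the feedback claim (my own sketch; the growth factor is an arbitrary assumption, and nothing here is from Good's paper or the talk): if each round of self-improvement adds capability in proportion to the capability already present, the gains compound instead of accumulating linearly.

```python
# Toy model (illustrative assumption): each round of self-improvement adds
# capability proportional to current capability, so growth compounds.
capability = 1.0
gain_per_round = 0.5  # hypothetical improvement factor, chosen arbitrarily

for round_number in range(1, 11):
    capability += gain_per_round * capability
    print(f"round {round_number:2d}: capability = {capability:8.1f}")

# Linear accumulation after 10 rounds would give 1.0 + 10 * 0.5 = 6.0;
# the compounding loop instead reaches about 57.7 (1.5 ** 10).
```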
- If a transistor operates today, the chance that it will fail before tomorrow is greater than 10⁻⁶ (1 failure per 3,000 years)
- But a modern chip has millions of transistors
- This is possible because most causes of transistor failure are not conditionally independent across transistors
- Similarly, an AI that remains stable over millions of self-modifications cannot permit any significant probability of failure that applies independently to each modification
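The arithmetic behind the analogy, as a minimal sketch (the 10⁻⁶ daily failure bound and the 155-million-part figure are the deck's numbers; the other component counts are illustrative): if each of N components independently fails with probability p per day, the chance that nothing fails is (1 - p)^N, which collapses toward zero long before N reaches the hundreds of millions.

```python
# Survival probability under an *independent* failure assumption: (1 - p) ** n.
p_fail = 1e-6          # per-component daily failure chance (slide's bound)

for n in (1, 10_000, 1_000_000, 155_000_000):
    p_all_survive = (1 - p_fail) ** n
    print(f"{n:>11,} components: P(no failure today) = {p_all_survive:.3g}")

# With 155 million parts the independent-failure assumption gives
# P(no failure) on the order of exp(-155) -- effectively zero. Real chips
# survive, so per-transistor failures cannot be independent; likewise a
# self-modifying AI cannot afford an independent per-modification failure
# probability over millions of modifications.
```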
- A modern chip may have 155 million interdependent parts, with no patches after it leaves the factory
- A formal proof of ten billion steps can still be correct (try this with an informal proof!)
- Humans are too slow to check a billion-step proof
- Automated theorem-provers don't exploit enough regularity in the search space to handle large theorems
- Human mathematicians can do large proofs...
- ...but not reliably
Solution: Human/AI synergy
- Human generates lemmas, mysteriously avoiding the exponential explosion of the search space
- Complex theorem-prover generates a formal proof leading to the next lemma
- Simple verifier checks the proof
- Could an AGI use a similar combination of abilities to carry out deterministic self-modifications?
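The division of labor on this slide follows the small-trusted-kernel pattern familiar from proof assistants: the lemma-finder and the prover may be arbitrarily complex and untrusted, because a short verifier re-checks every formal step against fixed rules. A minimal sketch of that pattern under toy assumptions (the string-based inference rule and all names below are my own illustration, not anything from the talk):

```python
# Trusted-kernel pattern: a complex, untrusted prover emits a proof object;
# a small verifier re-checks every step against fixed inference rules.
# Toy system: facts are strings, steps are ("mp", premise, implication,
# conclusion) tuples applying modus ponens.

AXIOMS = {"A", "A->B", "B->C"}           # hypothetical starting facts
GOAL = "C"

# --- untrusted "prover": may be arbitrarily clever or arbitrarily buggy ---
proof = [
    ("mp", "A", "A->B", "B"),            # from A and A->B, conclude B
    ("mp", "B", "B->C", "C"),            # from B and B->C, conclude C
]

# --- trusted verifier: short enough to audit by hand ---
def verify(axioms, proof, goal):
    known = set(axioms)
    for rule, premise, implication, conclusion in proof:
        ok = (
            rule == "mp"
            and premise in known
            and implication in known
            and implication == f"{premise}->{conclusion}"
        )
        if not ok:
            return False                 # reject any step the rules don't license
        known.add(conclusion)
    return goal in known

print(verify(AXIOMS, proof, GOAL))       # True only if every step checks out
```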
- The inside of a chip is a deterministic environment
- Possible to achieve determinism for things that happen inside the chip
- Success in the external world is not deterministic, but the AI can guarantee that its future self will try to accomplish the same things: this cognition happens within the chip
- The AI cannot predict its future self's exact action, but it knows the criterion that the future action will fit
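One way to read the last bullet: instead of predicting which action the future self will pick, the current system verifies that whatever the successor's selection rule chooses will satisfy a fixed criterion. A minimal sketch of that idea (the predicate, the toy action space, and the exhaustive check are all my own illustrative assumptions; a real system would need a proof, not enumeration):

```python
# Verify a property of a successor's choices without predicting the choices.
# Toy setting: actions are integers; the criterion is a fixed predicate.

def criterion(action: int) -> bool:
    """Hypothetical goal criterion the future action must satisfy."""
    return action % 2 == 0                # toy stand-in: "action is even"

def successor_policy(situation: int) -> int:
    """Hypothetical future self: we don't know which action it returns."""
    candidates = [a for a in range(10) if a % 2 == 0]
    return max(candidates, key=lambda a: (a * situation) % 7)  # opaque choice

# We cannot say *which* action successor_policy picks in each situation,
# but we can check that every action it could pick satisfies the criterion.
situations = range(100)                   # toy stand-in for "all situations"
assert all(criterion(successor_policy(s)) for s in situations)
print("criterion guaranteed for every checked situation")
```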
Difficult to formalize this argument!
- The Bayesian framework breaks down on infinite recursion
- Not clear how to calculate the expected utility of changing the code that calculates the expected utility of changing the code...
- Yet humans don't seem to break down when imagining changes to themselves
- Never mind an algorithm that does it efficiently: how would you do it at all?
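The regress in the second bullet can be written out directly. In the standard form, an action is scored by expected utility; but when the action m rewrites the very code that computes that score, the naive evaluation of m refers to the rewritten evaluator, whose evaluation refers to the next rewrite, and so on with no base case. A rough way to write the problem (my notation, not a formula from the talk):

```latex
% Standard expected utility of an action a over outcomes o:
\[
  EU(a) \;=\; \sum_{o} P(o \mid a)\, U(o)
\]
% Naive expected utility of a self-modification m that replaces the EU
% calculator itself: the outcomes o contain a successor agent choosing by a
% new calculator EU', so evaluating m seems to require already trusting EU',
% whose own self-modifications are scored by EU'', and so on:
\[
  EU(m) \;=\; \sum_{o} P(o \mid m)\, U(o),
  \quad\text{where } U(o) \text{ depends on acts chosen by } EU',
  \text{ which depends on } EU'', \dots
\]
```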
Wanted: Reflective Decision Theory
- We have a deep understanding of:
  - Bayesian probability theory
  - Bayesian decision theory
  - Causality and conditional independence
- We need an equally deep understanding of:
  - Reflectivity
  - Self-modification
- Designing AI will be a precise art when we know how to make an AI design itself
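For reference, the two pieces of machinery the slide says we already understand deeply, written out in their standard textbook forms (not notation from the talk): Bayesian updating and expected-utility choice. The open problem is the analogous equation for an agent whose action is "rewrite the code that evaluates this equation."

```latex
% Bayesian updating of a hypothesis H on evidence E:
\[
  P(H \mid E) \;=\; \frac{P(E \mid H)\, P(H)}{P(E)}
\]
% Bayesian decision theory: choose the action with maximal expected utility.
\[
  a^{*} \;=\; \arg\max_{a} \sum_{o} P(o \mid a)\, U(o)
\]
```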
Thank you.

Eliezer Yudkowsky
Singularity Institute for Artificial Intelligence
singinst.org