Title: Artificial Intelligence and Computer Games
1. Artificial Intelligence and Computer Games
- John Laird
- EECS 494
- University of Michigan
2. What is AI?
- The study of computational systems that exhibit intelligence.
- Theories and computational models.
- What is intelligence?
- What people do. Leads to the Turing Test, but it is hard to separate intelligent behavior from other behavior.
- Behave rationally: use available knowledge to maximize goal achievement. Often leads to optimization techniques.
- A set of capabilities: problem solving, learning, planning, ...
3. Different Practices of AI
- The study of rational behavior and processing.
- The study of human behavior and cognitive processing.
- The study of other approaches: neural, evolutionary.
- Computational models of component processes: knowledge bases, inference engines, search techniques, machine learning techniques.
- Understanding the connection between domains and techniques.
- Computational constraints vs. desired behavior.
- Application of techniques to real-world problems.
4. Roles of AI in Games
- Opponents
- Teammates
- Strategic Opponents
- Support Characters
- Autonomous Characters
- Commentators
- Camera Control
- Plot and Story Guides/Directors
5. Basic Outline
- Discuss a variety of AI techniques relevant to games.
- Build up from simple systems and domains to complex agents and domains.
- Reactive → goals → search → planning → learning.
- Cover different roles of AI in games:
- Opponents in an action game
- Assistant/friend in an action game
- Opponents in a strategy game
- Assistant/friend in a strategy game
- Dungeon master, director, god, ...
- Little or nothing on AI programming languages/systems: LISP, PROLOG, CLIPS, Soar, CYC, ...
6. Goals of an AI Action-Game Opponent
- Provide a challenging opponent:
- Not always as challenging as a human -- Quake monsters.
- In what ways should it be subhuman?
- Not too challenging:
- Should not be superhuman in accuracy, precision, sensing, ...
- Should not be too predictable:
- Through randomness.
- Through multiple, fine-grained responses.
- Through adaptation and learning.
7. AI Agent in a Game
- Each time through the control loop, tick each agent.
- Define an API for agent sensing and acting.
- Encapsulate all agent data structures,
- so agents can't trash each other or the game.
- Share global data structures on maps, etc.

[Figure: the agents and the player connect to the game through the sensing/acting API.]
8. Structure of an Intelligent Agent
- Sensing: perceive features of the environment.
- Thinking: decide what action to take to achieve its goals, given the current situation and its knowledge.
- Acting: doing things in the world.
- Thinking has to make up for limitations in sensing and acting.
- The more accurate the models of sensing and acting, the more realistic the behavior.
9. Why not just C code?
- Doesn't easily localize the tests for the next action to take.
- Hard to add new variables and new actions:
- You end up retesting all variables every time through.
10. Sensing Limitations and Complexities
- Limited sensor distance.
- Limited field of view:
- Must point the sensor at a location and keep it on.
- Obstacles.
- Complex room structures:
- Detecting and computing paths to doors.
- Noise in sensors.
- Different sensors give different information and have different limitations:
- Sound: omni-directional; gives direction, distance, speech, ...
- Vision: limited field of view; 2 1/2-D, color, texture, motion, ...
- Smell: omni-directional; chemical makeup.
- Need to integrate the different sources to build a complete picture.
11. Perfect Agent: Unrealistic
- Sensing: has perfect information about the opponent.
- Thinking: has enough time to do any calculation;
- knows everything relevant about the world.
- Action: flawless, limitless action:
- Teleport anywhere, anytime.
"I know what to do!"
12. Simple Behavior
- Random motion:
- Just roll the dice to pick when and which direction to move.
- Simple pattern:
- Follow invisible tracks (Galaxians).
- Tracking:
- Pure pursuit: move toward the agent's current position (the heat-seeking missile).
- Lead pursuit: move to a position in front of the agent.
- Collision: move toward where the agent will be.
- Weave: every N seconds, move X degrees off the opponent's bearing.
- Spiral: head 90-M degrees off the opponent's bearing.
- Evasive: the opposite of any tracking.
- Delay in sensing gives different effects.
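The tracking behaviors above reduce to a choice of aim point. A minimal 2-D sketch in Python; the function names and coordinate setup are illustrative, not from the slides:

```python
import math

def pure_pursuit(px, py, tx, ty):
    """Heading (radians) straight at the target's current position."""
    return math.atan2(ty - py, tx - px)

def lead_pursuit(px, py, tx, ty, tvx, tvy, lead_time):
    """Heading at a point in front of the target, extrapolated lead_time
    seconds along the target's current velocity."""
    return pure_pursuit(px, py, tx + tvx * lead_time, ty + tvy * lead_time)

# Pursuer at the origin, target due east and moving north:
# pure pursuit aims due east, lead pursuit aims ahead of the target.
east = pure_pursuit(0, 0, 10, 0)
ahead = lead_pursuit(0, 0, 10, 0, 0, 10, 1.0)
```

Collision behavior is the limiting case where `lead_time` is the estimated time to intercept rather than a fixed constant.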
13. Random
14. Simple Patterns
15. Pure Pursuit
16. Lead Pursuit
17. Collision
18. Moving in the World: Path Following
- Just try moving toward the goal.

[Figure: straight-line movement from Source to Goal.]
19. Problem

[Figure: an obstacle between Source and Goal defeats straight-line movement.]
20. (No Transcript)
21. (No Transcript)
22. Path Planning
- Find a path from one point to another using an internal model.
- Satisficing: try to find a good way to achieve a goal.
- Optimizing: try to find the best way to achieve a goal.
23. Path Finding

[Figure: grid of rooms annotated with the movement costs and accumulated path-cost estimates used by the search.]
24. Analysis
- Find the shortest path through a maze of rooms.
- The approach is A*:
- At each step, calculate the cost of each expanded path.
- Also calculate an estimate of the remaining cost of the path.
- Extend the path with the lowest cost estimate.
- Cost can be more than just distance:
- Climbing and swimming are harder (2x).
- Monster-filled rooms are really bad (5x).
- Can add a cost for turning, which creates smoother paths,
- but cost must be a numeric calculation.
- Must guarantee that the estimate is not an overestimate;
- then A* will always find the shortest path.
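The A* procedure above can be sketched as follows. The tiny grid, the 2x/5x cost multipliers, and the function names are illustrative assumptions, not from the slides:

```python
import heapq

def a_star(start, goal, neighbors, cost, heuristic):
    """A*: always expand the frontier path with the lowest g + h estimate.
    If the heuristic never overestimates, the first path to reach the
    goal is a shortest path."""
    frontier = [(heuristic(start, goal), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nxt in neighbors(node):
            new_g = g + cost(node, nxt)
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g + heuristic(nxt, goal),
                                          new_g, nxt, path + [nxt]))
    return None, float("inf")

# 3x3 grid of rooms: swimming ('~') costs 2x, a monster room ('M') costs 5x.
grid = ["..M",
        ".~.",
        "..."]

def neighbors(p):
    x, y = p
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= x + dx < 3 and 0 <= y + dy < 3:
            yield (x + dx, y + dy)

def cost(a, b):
    return {'~': 2, 'M': 5}.get(grid[b[1]][b[0]], 1)

def manhattan(a, b):
    # Admissible: every step costs at least 1.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

path, total = a_star((0, 0), (2, 2), neighbors, cost, manhattan)
```

The cheapest route skirts both the swimming cell and the monster room, giving a total cost of 4.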
25. Goals for Tutorial
- Exposure to AI for tactical decision making:
- Not state-of-the-art AI, but relevant to computer games.
- Concepts, not code.
- Analysis of strengths and weaknesses.
- Pointers to more detailed references.
- Enough to be dangerous.
- What's missing?
- Sensing models
- Path planning and spatial reasoning
- Scripting languages
- Teamwork, flocking, ...
- Personality, emotion, ...
- How all of these pieces fit together
26. Plan for the Tutorial
- Cover a variety of AI decision-making techniques:
- Finite-State Machines
- Decision Trees
- Neural Networks
- Genetic Algorithms
- Rule-Based Systems
- Fuzzy Logic
- Planning Systems
- Maybe Soar
- Describe each within the context of a simple game scenario.
- Describe implementation issues.
- Evaluate their strengths and weaknesses.
27. Which AI Techniques Are Missing?
- Agent-based approaches
- A-Life approaches
- Bayesian, decision-theoretic approaches
- Blackboards
- Complex partial-order planning
- Logics
- Prolog, Lisp
- Also not covering scripting
28. Types of Behavior to Capture
- Wander randomly if you don't see or hear an enemy.
- When you see an enemy, attack.
- When you hear an enemy, chase it.
- When you die, respawn.
- When health is low and you see an enemy, retreat.
- Extensions:
- When you see power-ups while wandering, collect them.
29. (No Transcript)
30. Conflicting Goals for AI in Games
- Goal driven
- Reactive
- Human characteristics
- Knowledge intensive
- Low CPU and memory usage
- Fast, easy development
31. Dimensions of Comparison
- Complexity:
- Execution
- Specification
- Expressiveness:
- Propositional
- Predicates and variables
32. Complexity
- Complexity of execution:
- How fast does it run as more knowledge is added?
- How much memory is required as more knowledge is added?
- Complexity of specification:
- How hard is it to write the code?
- As more knowledge is added, how much more code needs to be added?
- Memory of prior events
33. Expressiveness of Specification
- What can easily be written?
- Propositional: statements about specific objects in the world, with no variables.
- Jim is in room7; Jim has the rocket launcher; the rocket launcher does splash damage.
- Go to room8 if you are in room7, through door14.
- Predicate logic: allows general statements using variables.
- All rooms have doors.
- All splash-damage weapons can be used around corners.
- All rocket launchers do splash damage.
- Go to a room connected to the current room.
34. Memory
- Can it remember prior events?
- For how long?
- How does it forget?
35. General References
- AI:
- Winston, Artificial Intelligence, 3rd Edition, Addison-Wesley, 1992.
- Russell and Norvig, Artificial Intelligence: A Modern Approach, Prentice Hall, 1995.
- Nilsson, Artificial Intelligence: A New Synthesis, Morgan Kaufmann, 1998.
- AI and computer games:
- LaMothe, Tricks of the Windows Game Programming Gurus, SAMS, 1999, Chapter 12, pp. 713-796.
- DeLoura, Game Programming Gems, Charles River Media, 2000, Section 3, pp. 219-350.
- www.gameai.com
- www.gamedev.net/
- Chris Miles
36. Finite State Machines
- John Laird and Michael van Lent
- University of Michigan
- AI: Tactical Decision-Making Techniques
37. (No Transcript)
38. (No Transcript)
39. (No Transcript)
40. Example FSM with Retreat

Events: E = Enemy Seen, S = Sound Heard, D = Die, L = Low Health. Each feature with N values can require N times as many states.

[Figure: state diagram with states Wander (-E,-D,-S,-L), Wander-L (-E,-D,-S,L), Attack-E (E,-D,-S,-L), Attack-ES (E,-D,S,-L), Retreat-E (E,-D,-S,L), Retreat-ES (E,-D,S,L), Retreat-S (-E,-D,S,L), Chase (-E,-D,S,-L), and Spawn (D, with -E,-S,-L); the arcs between states are labeled with the events E, -E, S, -S, L, -L, and D that trigger each transition.]
41. Extended FSM: Save Values

Events: E = Enemy Seen, S = Sound Heard, D = Die, L = Low Health. Maintain a memory of the current values of all events, so a transition can test a new event against the remembered old events.

[Figure: reduced state diagram with states Wander (-E,-D,-S), Attack (E,-L), Retreat (L), Chase (S,-L), and Spawn (D, with -E,-S,-L); arcs labeled with triggering events.]
42. Augmented FSM: Action on Transition

Events: E = Enemy Seen, S = Sound Heard, D = Die, L = Low Health. Execute an action during each transition.

[Figure: the same reduced state diagram as slide 41, with an action attached to a transition arc.]
43. (No Transcript)
44. (No Transcript)
45. (No Transcript)
46. Extended and Augmented FSMs
- Use a C++ class for each state.
- Methods implement the state's actions and transitions.
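A minimal sketch of the state-class idea in Python (the slides suggest C++). The event names follow the E/S/D/L legend from the earlier FSM slides; the respawn event name is an assumption:

```python
class State:
    """One FSM state: a name plus a transition table keyed by event."""
    def __init__(self, name):
        self.name = name
        self.transitions = {}

    def on(self, event, next_state):
        self.transitions[event] = next_state
        return self        # allow chained .on(...) calls

class FSM:
    def __init__(self, start):
        self.state = start

    def handle(self, event):
        # Events with no matching arc leave the machine in its current state.
        self.state = self.state.transitions.get(event, self.state)
        return self.state.name

wander, attack, retreat, spawn = (State(n) for n in
                                  ("Wander", "Attack", "Retreat", "Spawn"))
wander.on("E", attack).on("D", spawn)
attack.on("-E", wander).on("L", retreat).on("D", spawn)
retreat.on("-L", attack).on("D", spawn)
spawn.on("respawned", wander)   # the respawn event name is an assumption

fsm = FSM(wander)
```

An augmented FSM would attach an action callback to each transition arc and invoke it inside `handle` before changing state.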
47. FSM Evaluation
- Advantages:
- Very fast: one array access.
- Can be compiled into a compact data structure:
- Dynamic memory: just the current state.
- Static memory: the state diagram (array implementation).
- Can create tools so non-programmers can build behavior.
- Non-deterministic FSMs can make behavior unpredictable.
- Disadvantages:
- The number of states can grow very fast:
- exponentially with the number of events (s = 2^e),
- and the number of arcs can grow even faster (a = s^2).
- Hard to encode complex memories.
- Propositional representation:
- Difficult to express "pick up the better weapon" or "attack the closest enemy".
48. References
- Web references:
- www.gamasutra.com/features/19970601/build_brains_into_games.htm
- csr.uvic.ca/mmania/machines/intro.htm
- www.erlang.se/documentation/doc-4.7.3/doc/design_principles/fsm.html
- www.microconsultants.com/tips/fsm/fsmartcl.htm
- DeLoura, Game Programming Gems, Charles River Media, 2000, Sections 3.0-3.1, pp. 221-248.
49. Decision Trees
- John Laird and Michael van Lent
- University of Michigan
- AI: Tactical Decision-Making Techniques
50. Classification Problems
- Task: classify objects as one of a discrete set of categories.
- Input: a set of facts about the object to be classified.
- Is today sunny, overcast, or rainy?
- Is the temperature today hot, mild, or cold?
- Is the humidity today high or normal?
- Output: the category this object fits into.
- Should I play tennis today or not?
- Put today into the play-tennis category or the no-tennis category.
51. Example Problem
- Classify a day as a suitable day to play tennis.
- Facts about any given day include:
- Outlook: <Sunny, Overcast, Rain>
- Temperature: <Hot, Mild, Cool>
- Humidity: <High, Normal>
- Wind: <Weak, Strong>
- Output categories include:
- PlayTennis = Yes
- PlayTennis = No
- Outlook=Overcast, Temp=Mild, Humidity=Normal, Wind=Weak => PlayTennis=Yes
- Outlook=Rain, Temp=Cool, Humidity=High, Wind=Strong => PlayTennis=No
- Outlook=Sunny, Temp=Hot, Humidity=High, Wind=Weak => PlayTennis=No
52. Classifying with a Decision Tree

[Figure: decision tree for the tennis example. The root tests Outlook (Sunny, Overcast, Rain); the Sunny branch tests Temp (Hot, Mild, Cool) and the Rain branch tests Wind (Strong, Weak), with Yes/No classifications at the leaves.]
53. Decision Trees
- Nodes represent attribute tests,
- with one child for each possible value of the attribute.
- Leaves represent classifications.
- Classify by descending from the root to a leaf:
- At the root, apply the attribute test associated with the root.
- Descend the branch corresponding to the instance's value.
- Repeat for the subtree rooted at the new node.
- When a leaf is reached, return the classification of that leaf.
- A decision tree is a disjunction of conjunctions of constraints on the attribute values of an instance.
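The descend-to-a-leaf procedure can be sketched as follows. The tree literal uses the classic play-tennis tree with Humidity and Wind tests, which may differ in detail from the tree drawn on the slide; the dictionary shape is an implementation choice:

```python
def classify(node, instance):
    """Descend from the root: at each internal node, follow the branch that
    matches the instance's value for the tested attribute, until a leaf
    (represented here as a plain string) is reached."""
    while isinstance(node, dict):
        node = node["branches"][instance[node["test"]]]
    return node

tree = {"test": "Outlook",
        "branches": {
            "Overcast": "Yes",
            "Sunny": {"test": "Humidity",
                      "branches": {"High": "No", "Normal": "Yes"}},
            "Rain": {"test": "Wind",
                     "branches": {"Strong": "No", "Weak": "Yes"}}}}
```

Each root-to-leaf path is a conjunction of attribute constraints; the set of paths ending in "Yes" is the disjunction the slide describes.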
54. Decision Trees Are Good When:
- Inputs are attribute-value pairs,
- with a fairly small number of values.
- Numeric or continuous values cause problems,
- though the algorithms can be extended to learn thresholds.
- Outputs are discrete values,
- again with a fairly small number of values.
- It is difficult to represent numeric or continuous outputs.
- Disjunction is required:
- Decision trees easily handle disjunction.
- Training examples contain errors.
- Learning decision trees:
- More later.
55. Decision-Making as Classification
- How does classification relate to deciding what to do in a situation?
- Treat each output command as a separate classification problem:
- Given the inputs, walk => <forward, backward, stop>
- Given the inputs, turn => <left, right, none>
- Given the inputs, run => <yes, no>
- Given the inputs, weapon => <blaster, shotgun>
- Given the inputs, fire => <yes, no>
56. Decision-Making with Decision Trees
- Separate decision tree for each output.
- Poll each decision tree for its current output:
- Poll each tree multiple times a second,
- or event-triggered, like an FSM.
- Need the current value of each input attribute/value:
- All sensor inputs describe the state of the world.
- Store the state of the environment:
- the most recent sensor inputs.
- Constantly update this state.
- Constantly poll the decision trees to decide on outputs.
- Constantly send outputs to be executed.
57. Sense, Think, Act Cycle
- Sense:
- Gather input sensor changes.
- Update the state with the new values.
- Think:
- Poll each decision tree.
- Act:
- Execute any changes to actions.

[Figure: the Sense → Think → Act cycle.]
58. Example FSM with Retreat

Events: E = Enemy, S = Sound, D = Die, L = Low Health. Each new feature can double the number of states.

[Figure: the same state diagram as slide 40, with states Wander, Wander-L, Attack-E, Attack-ES, Retreat-E, Retreat-ES, Retreat-S, Chase, and Spawn.]
59. Decision Tree for Quake
- Input sensors: E <t,f>, L <t,f>, S <t,f>, D <t,f>
- Categories (actions): Attack, Retreat, Chase, Spawn, Wander

[Figure: decision tree. D? t: Spawn; f: test E?. E? t: test L? (t: Retreat, f: Attack); E? f: test S? (t: test L? (t: Retreat, f: Chase), f: Wander).]
60. Learning Decision Trees
- Decision trees are usually learned by induction:
- Generalize from examples.
- Induction doesn't guarantee correct decision trees.
- Bias towards smaller decision trees:
- Occam's Razor: prefer the simplest theory that fits the data.
- Too expensive to find the very smallest decision tree.
- Learning is non-incremental:
- Need to store all the examples.
- ID3 is the basic learning algorithm;
- C4.5 is an updated and extended version.
61. Induction
- If X is true in every example, X must always be true.
- More examples are better.
- Errors in examples cause difficulty.
- Note that induction can result in errors.
- Inductive learning of decision trees:
- Create a decision tree that classifies the available examples.
- Use this decision tree to classify new instances.
- Avoid overfitting the available examples:
- A tree with one root-to-leaf path per example is perfect on the examples, but not so good on new instances.
62. Induction Requires Examples
- Where do examples come from?
- The programmer/designer provides examples.
- Capture a human's decisions.
- The number of examples needed depends on the difficulty of the concept.
- More is always better.
- Training set vs. testing set:
- Train on most (75%) of the examples.
- Use the rest to validate the learned decision trees.
63. ID3 Learning Algorithm
- ID3 has two parameters:
- the list of examples,
- the list of attributes to be tested.
- Generates the tree recursively,
- choosing the attribute that best divides the examples at each step.
- ID3(examples, attributes):
- if all examples are in the same category, return a leaf node with that category
- if attributes is empty, return a leaf node with the most common category in examples
- best = Choose-Attribute(examples, attributes)
- tree = new tree with best as the root attribute test
- foreach value v_i of best:
- examples_i = subset of examples with best = v_i
- subtree = ID3(examples_i, attributes - best)
- add a branch to tree with best = v_i and subtree beneath it
- return tree
64. Entropy
- Entropy: how mixed is a set of examples?
- All one category: Entropy = 0.
- Evenly divided: Entropy = log2(# of examples).
- Given a set of examples S, Entropy(S) = -Σ_i p_i log2 p_i, where p_i is the proportion of S belonging to class i.
- 14 days with 9 in play-tennis and 5 in no-tennis:
- Entropy([9, 5]) = 0.940
- 14 examples with 14 in play-tennis and 0 in no-tennis:
- Entropy([14, 0]) = 0
65. Information Gain
- Information gain measures the reduction in entropy:
- Gain(S, A) = Entropy(S) - Σ_v (|S_v| / |S|) Entropy(S_v)
- Example: 14 days, Entropy([9, 5]) = 0.940
- Measure the information gain of Wind <weak, strong>:
- Wind = weak for 8 days [6+, 2-]
- Wind = strong for 6 days [3+, 3-]
- Gain(S, Wind) = 0.048
- Measure the information gain of Humidity <high, normal>:
- 7 days with high humidity [3+, 4-]
- 7 days with normal humidity [6+, 1-]
- Gain(S, Humidity) = 0.151
- Humidity has a higher information gain than Wind,
- so choose Humidity as the next attribute to be tested.
66. Learning Example
- Learn a decision tree to replace the FSM.
- Four attributes: Enemy, Die, Sound, Low Health,
- each with two values: true, false.
- Five categories: Attack, Retreat, Chase, Wander, Spawn.
- Use all 16 possible states as examples:
- Attack (2), Retreat (3), Chase (1), Wander (2), Spawn (8).
- Entropy of the 16 examples (max entropy = 4):
- Entropy([2, 3, 1, 2, 8]) = 1.953
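Putting the pieces together, a runnable sketch of ID3 over these 16 examples. The class labels are derived from the FSM's behavior, and the dictionary-based tree shape is an implementation choice, not from the slides:

```python
import math
from collections import Counter

def entropy(examples):
    counts = Counter(e["class"] for e in examples)
    n = len(examples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def gain(examples, attr):
    n = len(examples)
    remainder = sum(
        len(sub) / n * entropy(sub)
        for v in {e[attr] for e in examples}
        for sub in [[e for e in examples if e[attr] == v]])
    return entropy(examples) - remainder

def id3(examples, attributes):
    categories = {e["class"] for e in examples}
    if len(categories) == 1:            # all examples in the same category
        return categories.pop()
    if not attributes:                  # no tests left: majority category
        return Counter(e["class"] for e in examples).most_common(1)[0][0]
    best = max(attributes, key=lambda a: gain(examples, a))
    return {"test": best,
            "branches": {v: id3([e for e in examples if e[best] == v],
                                [a for a in attributes if a != best])
                         for v in {e[best] for e in examples}}}

# All 16 E/S/D/L combinations, labeled with the FSM's behavior.
examples = []
for E in (True, False):
    for S in (True, False):
        for D in (True, False):
            for L in (True, False):
                if D:                    cls = "Spawn"
                elif (E or S) and L:     cls = "Retreat"
                elif E:                  cls = "Attack"
                elif S:                  cls = "Chase"
                else:                    cls = "Wander"
                examples.append({"E": E, "S": S, "D": D, "L": L, "class": cls})

tree = id3(examples, ["E", "S", "D", "L"])
```

As on the slides, the entropy of the 16 examples is 1.953, Die has the highest information gain (1.0), and the learned tree tests D at the root with the true branch going straight to Spawn.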
67. Example FSM with Retreat

Events: E = Enemy, S = Sound, D = Die, L = Low Health. Each new feature can double the number of states.

[Figure: the same state diagram as slide 40, repeated for reference while learning the decision tree.]
68. Learning Example (2)
- Information gain of Enemy: 0.328
- Information gain of Die: 1.0
- Information gain of Sound: 0.203
- Information gain of Low Health: 0.375
- So Die should be the root test.
69. Learned Decision Tree

[Figure: partial tree. D? t: Spawn; f: still to be expanded.]
- 8 examples left [2, 3, 1, 2]; entropy = 1.906
- 3 attributes remaining: Enemy, Sound, Low Health
- Information gain of Enemy: 0.656
- Information gain of Sound: 0.406
- Information gain of Low Health: 0.75
70. Learned Decision Tree (2)

[Figure: partial tree. D? t: Spawn; f: test L?.]
- 4 examples on each side; entropy is 0.811 for L = t and 1.50 for L = f
- 2 attributes remaining: Enemy, Sound
- Information gain of Enemy (L = f): 1.406
- Information gain of Sound (L = t): 0.906
71. Learned Decision Tree (3)

[Figure: full tree. D? t: Spawn; f: test L?. L? t: test S? (t: Retreat, f: test E? (t: Retreat, f: Wander)); L? f: test E? (t: Attack, f: test S? (t: Chase, f: Wander)).]
72. Decision Tree Evaluation
- Advantages:
- Simpler, more compact representation.
- State memory:
- Can create internal sensors, e.g. Enemy-Recently-Sensed.
- Easy to create and understand.
- Can also be represented as rules.
- Decision trees can be learned.
- Disadvantages:
- The decision-tree engine requires more coding than an FSM.
- Needs as many examples as possible.
- Higher CPU cost.
- Learned decision trees may contain errors.
73. References
- Mitchell, Machine Learning, McGraw Hill, 1997.
- Russell and Norvig, Artificial Intelligence: A Modern Approach, Prentice Hall, 1995.
- Quinlan, "Induction of decision trees," Machine Learning 1:81-106, 1986.
- Quinlan, "Combining instance-based and model-based learning," 10th International Conference on Machine Learning, 1993.
74. Neural Networks
- John Laird and Michael van Lent
- University of Michigan
- AI: Tactical Decision-Making Techniques
75. Inspiration
- Mimic natural intelligence:
- Networks of simple neurons,
- highly interconnected,
- with adjustable weights on the connections.
- Learn rather than program.
- The architecture is different:
- The brain is massively parallel: 10^11 neurons.
- Neurons are slow: they fire 10-100 times a second.
- The brain is still much faster overall:
- 10^14 neuron firings per second for the brain,
- 10^6 perceptron firings per second for a computer.
76. Simulated Neuron
- Inputs (a_j) come from other perceptrons, with weights (W_i,j).
- Learning occurs by adjusting the weights.
- The perceptron calculates the weighted sum of its inputs: in_i = Σ_j W_i,j a_j.
- A threshold function calculates the output a_i:
- Step function: if in_i > t then a_i = 1 else a_i = 0.
- Sigmoid: g(x) = 1 / (1 + e^-x).
- The output becomes input for the next layer of perceptrons.

[Figure: inputs a_j scaled by weights W_i,j are summed into in_i = Σ_j W_i,j a_j; the output is a_i = g(in_i).]
77. Network Structure
- A single perceptron can represent AND and OR, but not XOR.
- Combinations of perceptrons are more powerful.
- Perceptrons are usually organized in layers:
- Input layer: takes the external input.
- Hidden layer(s).
- Output layer: produces the external output.
- Feed-forward vs. recurrent:
- Feed-forward: outputs only connect to later layers;
- learning is easier.
- Recurrent: outputs can connect to earlier layers or the same layer;
- provides internal state.
78. Neural Network for Quake
- Four input perceptrons, one for each condition: Enemy, Dead, Sound, Low Health.
- Four-perceptron hidden layer, fully connected.
- Five output perceptrons, one for each action: Attack, Wander, Spawn, Retreat, Chase.
- Choose the action with the highest output,
- or use probabilistic action selection.

[Figure: the three layers of the network, fully connected between adjacent layers.]
79. Back Propagation
- Learning from examples:
- Examples consist of an input and the correct output.
- Learn when the network's output doesn't match the correct output:
- Adjust the weights to reduce the difference,
- changing each weight only a small amount, scaled by the learning rate α.
- Basic perceptron learning:
- W_i,j = W_i,j + ΔW_i,j
- ΔW_i,j = α (t - o) a_j
- If the output is too high, (t - o) is negative, so W_i,j will be reduced.
- If the output is too low, (t - o) is positive, so W_i,j will be increased.
- If a_j is negative, the opposite happens.
80. Back Propagation Algorithm
- repeat
- foreach e in examples do
- O = Run-Network(network, e)
- // Calculate the error term for the output layer
- foreach perceptron k in the output layer do
- Err_k = o_k (1 - o_k)(t_k - o_k)
- // Calculate the error term for the hidden layer
- foreach perceptron h in the hidden layer do
- Err_h = o_h (1 - o_h) Σ_k W_k,h Err_k
- // Update the weights of all neurons
- foreach perceptron do
- W_i,j = W_i,j + α x_i,j Err_j
- until the network has converged
81. Neural Net Example
- A single perceptron to represent OR:
- Two inputs; one output (1 if either input is 1).
- Step function: if the weighted sum is > 0.5, output a 1.

[Worked step: inputs (1, 0) with weights (0.1, 0.6) give Σ W_j a_j = 0.1, so the output is g(0.1) = 0 -- an error, since 1 OR 0 = 1.]
82. Neural Net Example
- W_j = W_j + ΔW_j, with ΔW_j = α (t - o) a_j
- W1 = 0.1 + 0.1(1 - 0)(1) = 0.2
- W2 = 0.6 + 0.1(1 - 0)(0) = 0.6

[Worked step: inputs (0, 1) with weights (0.2, 0.6) give Σ W_j a_j = 0.6, so the output is g(0.6) = 1.]
- No error, so no training occurs.
83. Neural Net Example

[Worked step: inputs (1, 0) with weights (0.2, 0.6) give Σ W_j a_j = 0.2, so the output is g(0.2) = 0.]
- Error, so training occurs:
- W1 = 0.2 + 0.1(1 - 0)(1) = 0.3
- W2 = 0.6 + 0.1(1 - 0)(0) = 0.6

[Worked step: inputs (1, 1) with weights (0.3, 0.6) give Σ W_j a_j = 0.9, so the output is g(0.9) = 1.]
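The worked OR example can be reproduced with the basic perceptron rule, using α = 0.1 and the 0.5 threshold from the slides (function names are illustrative):

```python
def output(w, x, threshold=0.5):
    """Step-function perceptron: 1 if the weighted input sum clears the threshold."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > threshold else 0

def train_step(w, x, target, alpha=0.1):
    """Basic perceptron rule: W_j = W_j + alpha * (t - o) * a_j."""
    o = output(w, x)
    return [wi + alpha * (target - o) * xi for wi, xi in zip(w, x)]

w = [0.1, 0.6]                 # initial weights from the slides
w = train_step(w, [1, 0], 1)   # output 0 but target 1: W1 rises to 0.2
w = train_step(w, [0, 1], 1)   # sum 0.6 > 0.5, already correct: no change
w = train_step(w, [1, 0], 1)   # still wrong: W1 rises to 0.3
```

After these three presentations the perceptron classifies all four OR cases correctly.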
84. Neural Network Evaluation
- Advantages:
- Handles errors well;
- graceful degradation.
- Can learn novel solutions.
- Disadvantages:
- "Neural networks are the second-best way to do anything."
- Can't understand how or why the learned network works.
- Examples must match the real problems.
- Need as many examples as possible.
- Learning takes lots of processing,
- but it is incremental, so learning during play might be possible.
85. References
- Mitchell, Machine Learning, McGraw Hill, 1997.
- Russell and Norvig, Artificial Intelligence: A Modern Approach, Prentice Hall, 1995.
- Hertz, Krogh & Palmer, Introduction to the Theory of Neural Computation, Addison-Wesley, 1991.
- Cowan & Sharp, "Neural nets and artificial intelligence," Daedalus 117:85-121, 1988.
86. Genetic Algorithms
- John Laird and Michael van Lent
- University of Michigan
- AI: Tactical Decision-Making Techniques
87. Inspiration
- Evolution creates individuals with higher fitness:
- A population of individuals, each with a genetic code.
- Successful individuals (higher fitness) are more likely to breed.
- Certain codes result in higher fitness;
- it is very hard to know ahead of time which combination of genes gives high fitness.
- Children combine traits of their parents:
- Crossover.
- Mutation.
- Optimize through artificial evolution:
- Define fitness according to the function to be optimized.
- Encode possible solutions as individual genetic codes.
- Evolve better solutions through simulated evolution.
88. Genetic Algorithm
- initialize population p with random genes
- repeat
- foreach p_i in p:
- f_i = fitness(p_i)
- repeat
- parent1 = select(p, f)
- parent2 = select(p, f)
- child1, child2 = crossover(parent1, parent2)
- if (random < mutate_probability) child1 = mutate(child1)
- if (random < mutate_probability) child2 = mutate(child2)
- add child1, child2 to p'
- until p' is full
- p = p'
- Fitness(gene): the fitness function.
- Select(population, fitness): weighted selection of parents.
- Crossover(gene, gene): crosses over two genes.
- Mutate(gene): randomly mutates a gene.
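A runnable sketch of this loop on a toy "one-max" fitness function (count the 1 bits). The elitism step and all parameter values are assumptions added to make the sketch converge reliably; they are not part of the slide's pseudocode:

```python
import random

def genetic_algorithm(fitness, gene_len, pop_size=20, generations=60,
                      mutate_prob=0.02, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(gene_len)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        fits = [fitness(g) for g in pop]
        nxt = [best]                            # elitism: keep the best so far
        while len(nxt) < pop_size:
            # Fitness-weighted (roulette-wheel) parent selection;
            # +1 so zero-fitness genes can still be chosen.
            p1 = rng.choices(pop, weights=[f + 1 for f in fits], k=1)[0]
            p2 = rng.choices(pop, weights=[f + 1 for f in fits], k=1)[0]
            # Two-point crossover: swap the genes between two random cuts.
            i, j = sorted(rng.sample(range(gene_len + 1), 2))
            for child in (p1[:i] + p2[i:j] + p1[j:],
                          p2[:i] + p1[i:j] + p2[j:]):
                # Mutation: small probability of flipping each bit.
                nxt.append([b ^ 1 if rng.random() < mutate_prob else b
                            for b in child])
        pop = nxt[:pop_size]
        best = max(pop + [best], key=fitness)
    return best

best = genetic_algorithm(fitness=sum, gene_len=16)
```

With these settings the best gene found is close to the all-ones optimum, far above the random-initialization average of 8 one bits.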
89. Genetic Operators
- Crossover:
- Select two points at random.
- Swap the genes between the two points.
- Mutate:
- Small probability of randomly changing each part of a gene.
90. Representation
- A gene is typically a string of symbols,
- frequently a bit string.
- A gene can also be a simple function or program:
- evolutionary programming.
- Every possible gene must encode a valid solution:
- Crossover should result in valid genes.
- Mutation should result in valid genes.
- Intermediate genes should have intermediate fitness values:
- A genetic algorithm is a hill-climbing technique;
- smooth fitness functions provide a hill to climb.
91. Example FSM with Retreat

Events: E = Enemy, S = Sound, D = Die, L = Low Health. Each new feature can double the number of states.

[Figure: the same state diagram as slide 40, repeated as the target behavior for the rule encoding below.]
92. Representing Rules as Bit Strings
- Conditions:
- Enemy <t,f>: bits 1 and 2.
- 10: Enemy = t; 01: Enemy = f; 11: Enemy = t or f; 00: Enemy has no value.
- Sound <t,f>: bits 3 and 4.
- Die <t,f>: bits 5 and 6.
- Low Health <t,f>: bits 7 and 8.
- Classification:
- Action <attack, retreat, chase, wander, spawn>: bits 9-13,
- e.g. 10000: Action = attack.
- 11 11 10 11 00001: if Die = t then Action = spawn.
- Encode 1 rule per gene, or many rules per gene.
- Fitness: % of examples classified correctly.
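A sketch of matching this bit-string encoding against a situation (function names are illustrative):

```python
ACTIONS = ["attack", "retreat", "chase", "wander", "spawn"]

def matches(rule_bits, situation):
    """Each condition is two bits: '10' matches only true, '01' only false,
    '11' matches either. situation is (E, S, D, L) as booleans."""
    for i, value in enumerate(situation):
        t_bit, f_bit = rule_bits[2 * i], rule_bits[2 * i + 1]
        if value and t_bit != "1":
            return False
        if not value and f_bit != "1":
            return False
    return True

def actions(rule_bits):
    """Bits 9-13 name the permitted actions."""
    return [a for a, bit in zip(ACTIONS, rule_bits[8:13]) if bit == "1"]

rule = "1111101100001"   # if Die = t then Action = spawn
```

A fitness function would count, over the 16 example situations, how often a gene's matching rules pick the correct action.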
93. Genetic Algorithm Example
- Initial population:
- 10 11 11 11 11010: E => Attack or Retreat or Wander
- 11 10 10 11 10100: S, D => Attack or Chase
- 01 00 01 10 01100: -E, -D, L => Retreat or Chase
- 10 10 10 11 00010: E, S, D => Wander
- ...
- Parent selection:
- 10 11 11 11 11010: sometimes correct
- 11 10 10 11 10100: never correct
- 01 00 01 10 01100: sometimes correct
- 10 10 10 11 00010: never correct
- ...
94. Genetic Algorithm Example
- Crossover:
- 10 11 11 11 11010: sometimes correct
- 01 00 01 10 01100: sometimes correct
- yields 10 10 01 10 01010: E, S, -D, L => Retreat or Wander
- and 01 01 11 11 11100: -E, -S => Attack or Retreat or Chase
- Mutate:
- 10 10 01 10 01000: E, S, -D, L => Retreat
- 01 01 11 11 11100: -E, -S => Attack or Retreat or Chase
- Add to the next generation:
- 10 10 01 10 01000: always correct
- 01 01 11 11 11100: never correct
- ...
95. Cloak and Dagger DNA
- Simple turn-based strategy game.
- Genes encode programs to play the game:
- 12 simple instructions.
- Fitness measures the success of the programs,
- measured against the rest of the population:
- play members of the population against each other.
- Crossover and mutation create new programs:
- Crossover: combine the first n instructions from parent 1 with the last m instructions from parent 2.
- A child can have more or fewer instructions than its parents.
- Mutation: replace an instruction with a random instruction.
96. Genetic Algorithm Evaluation
- Advantages:
- Powerful optimization technique.
- Can learn novel solutions.
- No examples required to learn.
- Disadvantages:
- "Genetic algorithms are the third-best way to do anything."
- Finding the correct representation can be tricky.
- The fitness function must be carefully chosen.
- Evolution takes lots of processing;
- you can't really run a GA during game play.
- Solutions may or may not be understandable.
97. References
- Mitchell, Machine Learning, McGraw Hill, 1997.
- Holland, Adaptation in Natural and Artificial Systems, MIT Press, 1975.
- Back, Evolutionary Algorithms in Theory and Practice, Oxford University Press, 1996.
- Booker, Goldberg & Holland, "Classifier systems and genetic algorithms," Artificial Intelligence 40:235-282, 1989.
98. Rule-Based Systems (Production Systems)
- John Laird and Michael van Lent
- University of Michigan
- AI: Tactical Decision-Making Techniques
99. History of Rule-Based Systems
- Originally developed by Allen Newell and Herb Simon
- to model human problem solving (PSG).
- Followed by the OPS languages (OPS1-5, OPS-83) and descendants: CLIPS, ECLIPS, ...
- Used extensively in building expert systems and knowledge-based systems.
- Used in psychological modeling:
- ACT-R, Soar, EPIC, ...
- Actively used in many areas of AI applications.
- Less used in research (except by us!).
100. (No Transcript)
101. (No Transcript)
102. Basic Cycle

[Figure: Match produces the rule instantiations that match working memory; Conflict Resolution picks a selected rule; Act makes changes to working memory, which feed back into Match.]
103. (No Transcript)
104. Simple Approach
- No rules with the same variable in multiple conditions.
- Restricts what you can write, but might be OK for simple systems.
105. Limits of the Simple Approach
- Can't use variables in conditions,
- so you can't pick something from a set:
- "If the closest enemy is carrying the same weapon that I am carrying, then ..."
- Must have pre-computed data structures or function calls for comparisons.
- More work for the sensor module.
106. Picking the Rule to Fire
- Simple approach:
- Run through the rules one at a time and test their conditions.
- Pick the first one that matches.
- Time to match depends on:
- the number of rules,
- the complexity of the conditions,
- the number of rules that don't match.
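The simple approach can be sketched as a linear scan. The rules and working-memory layout below are illustrative, with rule order doing the conflict resolution:

```python
# Working memory as a dict; each rule is (name, condition test, action).
# More specific rules are listed first, so the first match wins.
rules = [
    ("retreat", lambda wm: wm["enemy_visible"] and wm["low_health"], "retreat"),
    ("attack",  lambda wm: wm["enemy_visible"],                      "attack"),
    ("wander",  lambda wm: True,                                     "wander"),
]

def first_match(rules, wm):
    """Scan the rules in order and fire the first whose condition holds;
    matching time grows with the number of rules that fail to match."""
    for name, condition, action in rules:
        if condition(wm):
            return action
    return None
```

Note that putting `attack` before `retreat` would silently change the behavior, which is why rule order is a fragile conflict-resolution strategy for large systems.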
107. (No Transcript)
108. Picking the Next Rule to Fire
- If conditions contain only simple tests, compile the rules into a match net.
- Process changes to working memory by hashing them into the tests.

[Figure: tests A, B, C, and D feed bit vectors for the rules R1 (if A, B, C then ...) and R2 (if A, B, D then ...); when all of a rule's bits are set, the rule is added to the conflict set.]
- Expected cost: linear in the number of changes to working memory.
109. More Complex Rule-Based Systems
- Allow complex conditions with multiple variables.
- Function calls in conditions and actions.
- Can compute many relations using rules.
- Examples: OPS5, OPS83, CLIPS, ART, ECLIPS, Soar, EPIC, ...
110. OPS5 Working Memory Syntax
- Working memory is a set of working memory elements (WMEs): records.
- A WME is a class name plus a list of attribute-value pairs:
- (self ^health 100 ^weapon blaster)
- (enemy ^visible true ^name joe ^range 400)
- (item ^visible true ^name small-health ^type health ^range 500)
- (command ^action ^arg1 ^arg2)
111. Literalize
- (literalize self health weapon ammo heading)
- Declares the legal attributes for a class.
- OPS5 assigns positions in the record for the attributes.
- Working memory objects are a complete record:
- if one attribute is modified, the whole WME is deleted and a new record is added.

[Record layout: health | weapon | ammo | heading]
112. Example Initial Working Memory
- (self ^health 1000 ^x 230 ^y 34 ^weapon blaster ^ammo full)
- (enemy ^visible true ^name joe ^range 400)
- (command)
- Each WME has a unique timetag associated with it.
113. Prototype OPS5 Rule
- (p name
-    (condition)
-    ...
-    -->
-    (actions))

[Figure: production memory contains many rules of this form.]
114. OPS5 Rule Syntax
- (p attack
-    (enemy ^visible true ^name <name>)
-    (command)
-    -->
-    (modify 2 ^action attack ^arg1 <name>))
115. (No Transcript)
116. (No Transcript)
117. (No Transcript)
118. Even More Rules
- If ammunition is low and you see a weapon in the room that is the same as your current weapon, go get it:
- (p get-weapon-ammo-low
-    (self ^ammo low ^weapon <weapon>)
-    (item ^visible true ^type <weapon>)
-    (command)
-    -->
-    (modify 3 ^action get-item ^arg1 <weapon>))
- Both occurrences of <weapon> must match the same value.
119. Even More Rules 2
- If ammunition is high and you see a weapon, only pick it up if it is not the same as your current weapon:
- (p get-weapon-ammo-high
-    (self ^ammo high ^weapon <weapon>)
-    (item ^visible true ^type <nweapon> - ^type <weapon>)
-    (command)
-    -->
-    (modify 3 ^action get-item ^arg1 <nweapon>))
120. Even More Rules 3
- The same behavior, written with a conjunctive test:
- (p get-weapon-ammo-high
-    (self ^ammo high ^weapon <weapon>)
-    (item ^visible true ^type { <> <weapon> <nweapon> })
-    (command)
-    -->
-    (modify 3 ^action get-item ^arg1 <nweapon>))
121. Summary of OPS5 Syntax
- Conditions:
- Variables: <x>
- Negation: -(object ^type monkey)
- Predicates: > <x>
- Conjunctive tests: { <> <y> <x> }
- Disjunctive tests: << monkey ape >>
- Many languages allow function calls in conditions.
- Actions:
- Add, delete, and modify working memory.
- Function calls.
123Conflict Resolution Filters
- Select between instantiations based on filters
- Refractory Inhibition
- Don't fire the same instantiation that has already
fired
- Data Recency
- Select instantiations that match most recent data
- Specificity
- Select instantiations that match more working
memory elements
- Random
- Select randomly between the remaining
instantiations
124Data Recency Lexicographic Ordering
- R1 1, 5, 8
- R2 1, 5, 6, 7
- R3 2, 5, 9
- R4 1, 6, 7, 8
- R4 1, 5, 5, 7
- R5 2, 5, 9
- R6 5, 9
Sorted most-recent-first
R3 9, 5, 2
R5 9, 5, 2
R6 9, 5
R4 8, 7, 6, 1
R1 8, 5, 1
R2 7, 6, 5, 1
R4 7, 5, 5, 1
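The ordering above can be sketched in a few lines: each rule instantiation carries the timetags of the WMEs it matched; sort each instantiation's timetags most-recent-first and compare lexicographically. The dictionary below reuses three instantiations from the slide; the function name is illustrative.

```python
# Sketch of OPS5-style data-recency conflict resolution: sort each
# instantiation's timetags in descending order, then pick the
# lexicographically largest list.

def recency_key(timetags):
    """Timetags sorted most-recent-first, used for lexicographic comparison."""
    return sorted(timetags, reverse=True)

instantiations = {
    "R1": [1, 5, 8],
    "R2": [1, 5, 6, 7],
    "R3": [2, 5, 9],
}

winner = max(instantiations, key=lambda r: recency_key(instantiations[r]))
print(winner)  # R3: its most recent timetag (9) beats every other instantiation
```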
125Specificity
- Pick the rule instantiation based on more tests
- Only invoked when two rules match the same WMEs
- More reasons to do this action
- (p get-weapon-ammo-high
- (self ^ammo high ^weapon <weapon>)
- (item ^visible true ^type { <> <weapon>
<nweapon> })
- (command)
- -->
- (modify 3 ^action get-item ^arg1 <nweapon>))
- (p get-item
- (self ^name <name>)
- (item ^visible true ^type <type>)
- (command)
- -->
- (modify 3 ^action get-item
- ^arg1 <type>))
126Other Conflict Resolution Strategies (not in OPS5)
- Rule order pick the first rule that matches
- In original PSG
- Makes order of loading important, which is not good for
big systems
- Rule importance pick rule with highest priority
- When a rule is defined, give it a priority number
- Forces a total order on the rules that is right 80%
of the time
- Decide Rule 4 (priority 80) is better than Rule 7 (priority 70)
- Decide Rule 6 (priority 85) is better than Rule 5 (priority 75)
- Now have an ordering between all of them, even if it is
wrong
128Basic Idea of Efficient Matching
- Only process the changes to working memory
- Save intermediate match information (RETE)
- Compile rules into discrimination network
- Share intermediate match information between
rules
- Recompute intermediate information for changes
- Requires extra memory for intermediate match
information
- Scales well to large rule sets
- Recompute match for rules affected by change
(TREAT)
- Check changes against rules in conflict set
- Less memory than Rete
- Doesn't scale as well to large rule sets
- Both make extensive use of hashing
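A much-simplified sketch of the "only process the changes" idea (closer in spirit to TREAT than to full Rete): index rules by the WME classes their conditions test, so a change to one WME re-matches only the rules that mention that class rather than the whole rule set. The rule names reuse earlier slides; the structure itself is illustrative, not real OPS5 internals.

```python
from collections import defaultdict

# Which WME classes each rule's conditions test (from the rules above)
rules = {
    "attack":   {"enemy", "command"},
    "get-item": {"self", "item", "command"},
}

# Inverted index: class name -> rules that test that class
by_class = defaultdict(set)
for name, classes in rules.items():
    for cls in classes:
        by_class[cls].add(name)

def affected_rules(changed_class):
    """Rules whose match must be recomputed after a WME of this class changes."""
    return by_class[changed_class]

print(sorted(affected_rules("enemy")))    # ['attack']
print(sorted(affected_rules("command")))  # ['attack', 'get-item']
```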
132Rule-based System Evaluation
- Advantages
- Corresponds to the way people often think of
knowledge
- Very expressive
- Modular knowledge
- Easy to write and debug compared to decision
trees
- More concise than FSMs
- Disadvantages
- Can be memory intensive
- Can be computationally intensive
- Sometimes difficult to debug
133References
- RETE
- Forgy, C. L. Rete: A fast algorithm for the many
pattern/many object pattern match problem.
Artificial Intelligence, 19(1), 1982, pp. 17-37.
- TREAT
- Miranker, D. TREAT: A New and Efficient Match
Algorithm for AI Production Systems.
Pitman/Morgan Kaufmann, 1989.
134Fuzzy Logic
- John Laird and Michael van Lent
- University of Michigan
- AI Tactical Decision Making Techniques
135Fuzzy Logic
- Philosophical approach
- Ontological commitment based on degree of truth
- Is not a method for reasoning under uncertainty
- See probability theory and Bayesian inference
- Crisp Facts distinct boundaries
- Fuzzy Facts imprecise boundaries
- Example Scout reporting an enemy
- Two to three tanks at grid NV 123456 (Crisp)
- A few tanks at grid NV 123456 (Fuzzy)
- The water is warm. (Fuzzy)
- There might be 2 tanks at grid NV 54
(Probabilistic)
136Fuzzy Rules
- If the water temperature is cold and water flow
is low, then make a positive bold adjustment to
the hot water valve.
- If position is unobservable, threat is somewhat
low, and visibility is high, then risk is low.
Fuzzy Variable
Fuzzy Value represented as a fuzzy set
Fuzzy Modifier or Hedge
137Fuzzy Sets
- Classical set theory
- An object is either in or not in the set.
- Sets with smooth boundary
- Not completely in or out somebody 6' tall is 80% in
the set "tall"
- Fuzzy set theory
- An object is in a set by matter of degree
- 1.0 => in the set
- 0.0 => not in the set
- 0.0 < degree < 1.0 => partially in the set
- Provides a way to write symbolic rules but add
numbers in a principled way
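The degrees above can be written as an ordinary membership function. This is a minimal sketch: the breakpoints (5'0" fully outside the set, 6'4" fully inside) are invented for illustration.

```python
# A fuzzy set as a membership function: degree of membership is a
# value in [0, 1] rather than a crisp in/out answer.

def tall(height_inches):
    """Piecewise-linear membership in the fuzzy set 'tall'."""
    lo, hi = 60.0, 76.0   # below lo -> 0.0, above hi -> 1.0 (assumed breakpoints)
    if height_inches <= lo:
        return 0.0
    if height_inches >= hi:
        return 1.0
    return (height_inches - lo) / (hi - lo)

print(tall(72))  # 6 feet tall: 0.75, i.e. partially in the set
```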
138Apply to Computer Game
- Can have different characteristics of entities
- Strength strong, medium, weak
- Aggressiveness meek, medium, nasty
- If meek and attacked, run away fast.
- If medium and attacked, run away slowly.
- If nasty and strong and attacked, attack back.
- Control of a vehicle
- Should slow down when close to car in front
- Should speed up when far behind car in front
- Provides smoother transitions not a sharp
boundary
142Fuzzy Inference
- Fuzzy Matching
- Calculate the degree to which given facts (WMEs)
match rules; pick the rule with the best match
- Inference
- Calculate the rule's conclusion based on its
matching degree.
- Combination
- Combine conclusions inferred by all fuzzy rules
into a final conclusion
- Defuzzification
- Convert the fuzzy conclusion into a crisp consequence.
145Fuzzy Combination
- Why?
- More than one rule may match because of
overlapping fuzzy sets
- May trigger multiple fuzzy rules
- Example of Multiple Rule Match
- If shower is cold, then turn hot valve up boldly.
- If shower is somewhat warm, then turn hot valve up
slightly.
- Union of all the calculated fuzzy consequences in
Working Memory.
146Defuzzification
- Why?
- May require a crisp value for output to the
environment.
- May trigger multiple fuzzy rules
- Two Methods
- Mean of Maximum (MOM)
- Center of Area (COA)
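The two methods can be sketched over a fuzzy conclusion sampled at discrete points (value -> membership degree); the valve positions and degrees below are invented for illustration.

```python
# Sketch of the two defuzzification methods named above.

def mean_of_maximum(fuzzy):
    """MOM: average of the values where membership is highest."""
    peak = max(fuzzy.values())
    maxima = [v for v, m in fuzzy.items() if m == peak]
    return sum(maxima) / len(maxima)

def center_of_area(fuzzy):
    """COA: membership-weighted average (discrete centroid)."""
    total = sum(fuzzy.values())
    return sum(v * m for v, m in fuzzy.items()) / total

# Fuzzy conclusion for a hot-water-valve position in [0, 1]
valve = {0.0: 0.1, 0.25: 0.4, 0.5: 0.8, 0.75: 0.8, 1.0: 0.2}
print(mean_of_maximum(valve))  # 0.625: midpoint of the two peaks
print(center_of_area(valve))   # centroid, pulled toward the high-membership region
```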
148Evaluation of Fuzzy Logic
- Does not necessarily lead to non-determinism
- Advantages
- Allows use of numbers while still writing crisp
rules
- Allows use of fuzzy concepts such as "medium"
- Biggest impact is for control problems
- Helps avoid discontinuities in behavior
- Disadvantages
- Sometimes results are unexpected and hard to
debug
- Additional computational overhead
- Change in behavior may or may not be significant
149References
- Nguyen, H. T. and Walker, E. A. A First Course in
Fuzzy Logic, CRC Press, 1999.
- Rao, V. B. and Rao, H. V. C++ Neural Networks and
Fuzzy Logic, IDG Books Worldwide, 1995.
- McCuskey, M. Fuzzy Logic for Video Games, in Game
Programming Gems, Ed. DeLoura, Charles River
Media, 2000, Section 3, pp. 319-329.
150Planning
- John Laird and Michael van Lent
- University of Michigan
- AI Tactical Decision Making Techniques
151What is Planning?
- Plan a sequence of actions to get from the current
situation to a goal situation
- Higher-level mission planning
- Path planning
- Planning generate a plan
- Initial state the state the agent starts in or
is currently in
- Goal test is this state a goal state?
- Operators every action the agent can perform
- Also need to know how the action changes the
current state
- Note at this level planning doesn't take
opposition into account
152Two Approaches
- State-space search
- Search through the possible future states that
can be reached by applying different sequences of
operators
- Initial state current state of the world
- Operators actions that modify the world state
- Goal test is this state a goal state?
- Plan-space search
- Search through possible plans by applying
operators that modify plans
- Initial state empty plan (do nothing)
- Operators add an action, remove an action,
rearrange actions
- Goal test does this plan achieve the goal?
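The state-space approach can be sketched as a breadth-first search from the initial state, applying operators with explicit preconditions and effects until the goal test passes. The keys-and-doors domain below is invented for illustration.

```python
from collections import deque

# Each operator: (name, preconditions, add effects, delete effects)
operators = [
    ("pick-up-key", {"key-in-room"}, {"have-key"},  {"key-in-room"}),
    ("open-door",   {"have-key"},    {"door-open"}, set()),
    ("go-through",  {"door-open"},   {"at-goal"},   set()),
]

def plan(initial, goal):
    """BFS over states (frozensets of facts); returns a list of operator names."""
    frontier = deque([(frozenset(initial), [])])
    seen = {frozenset(initial)}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:                       # goal test
            return steps
        for name, pre, add, delete in operators:
            if pre <= state:                    # preconditions satisfied
                nxt = frozenset((state - delete) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None

print(plan({"key-in-room"}, {"at-goal"}))
# ['pick-up-key', 'open-door', 'go-through']
```

A plan-space planner would instead search over edits to the plan itself (add, remove, reorder actions), which this sketch does not attempt.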
153Two Approaches
Plan-space Search
State-space Search
154Traditional planning and combat simulations
- Combat simulations are difficult for traditional
AI planning
- Opponent messes up the plan
- Environment changes, messing up the plan
- Goal state is hard to define and subject to
change
- Lots of necessary information is unavailable
- Too many steps between start and finish of the
mission
- Some applications of traditional AI planning
- Path planning
- State-space search algorithms like A*
- Game-theoretic search
- State-space search algorithms with opponents, like
minimax and alpha-beta
155What concepts are useful?
- Look-ahead search
- Internal state representation
- Internal action representation
- State evaluation function
- Opponent model
- Means-ends analysis
156What should I do?
Shoot?
Pickup?
Pickup?
157Look-ahead search
- Try out everything I could do and see what works
best
- Looking ahead into the future
- As opposed to hard-coded behavior rules
- Can't look ahead in the real world
- Don't have time to try everything
- Can't undo actions
- Look-ahead in an internal version of the world
- Internal state representation
- Internal action representation
- State evaluation function
158Internal State Representation
- Store a model of the world inside your head
- Simplified, abstracted version
- Experiment with different actions internally
- Simple planning
- Additional uses of internal state
- Notice changes
- My health is dropping, I must be getting shot in
the back
- Remember recent events
- There was a weak enemy ahead, I should chase
through that door
- Remember less recent events
- I picked up that health pack 30 seconds ago, it
should respawn soon
159Internal State for Quake II
- Self
- Current-health
- Last-health
- Current-weapon
- Ammo-left
- Current-room
- Last-room
- Current-armor
- Last-armor
- Available-weapons
- Enemy
- Current-weapon
- Current-room
- Last-seen-time
- Estimated-health
- Powerup
- Type
- Room
- Available
Parameters Full-health, Health-powerup-amount, Ammo-powerup-amount, Respawn-rate
160Internal Action Representation
- How will each action change the internal state?
- Simplified, abstracted also
- Necessary for internal experiments
- Experiments are only as accurate as the internal
representation
- Internal actions are called operators
(STRIPS-style)
- Pre-conditions what must be true so I can take
this action
- Effects how the action changes the internal state
- Additional uses of internal actions
- Update internal opponent model
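One way to write a STRIPS-style operator down as data, matching the precondition/effect split above. The attribute names echo the Quake II internal state slide; the structure itself is an illustrative sketch, not an engine API.

```python
from dataclasses import dataclass

@dataclass
class Operator:
    name: str
    preconditions: dict   # attribute -> required value
    effects: dict         # attribute -> new value after the action

pick_up_health = Operator(
    name="pick-up-health",
    preconditions={
        "powerup.type": "health",
        "powerup.available": True,
    },
    effects={
        "powerup.available": False,   # the item is consumed
    },
)

def applicable(op, state):
    """An operator applies when every precondition holds in the state."""
    return all(state.get(k) == v for k, v in op.preconditions.items())

state = {"powerup.type": "health", "powerup.available": True}
print(applicable(pick_up_health, state))  # True
```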
161Pick-up-health operator
- Preconditions
- Self.current-room x
- Self.current-health < full-health
- Powerup.current-room x
- Powerup.type health
- Powerup.available yes