Planning and Learning in Games - PowerPoint PPT Presentation

About This Presentation

Title:

Planning and Learning in Games

Description:

Title: Slide 1 Author: Michael van Lent Last modified by: Michael van Lent Created Date: 7/31/2005 8:26:13 AM Document presentation format: On-screen Show – PowerPoint PPT presentation

Slides: 31

Provided by: Michaelv155

Learn more at: http://la-acm.org

Transcript and Presenter's Notes

1
Planning and Learning in Games

Michael van Lent
Institute for Creative Technologies
University of Southern California

2
Business of Games

60% of Americans play video games
$25 billion industry worldwide (2004)
$11 billion in the US (2004)
$6.1 billion in 1999, $5.5 billion in 1998, $4.4
billion in 1997
One-day sales records
Halo 2: $125 million in a single day
Harry Potter (Half-Blood Prince): $140 million in
a single day
Consoles dominate the industry
90% of sales (Microsoft, Sony, Nintendo)
Average age of game players is 29
Average age of game buyers is 36
59% of game players are men

3
Game AI: A little context

History of game AI in 5 bullet points
Lots of work on path planning
Hand-coded AI
Finite state machines
Scripted AI
Embed hints in the environment
Things are starting to change
Game environments are getting more complex
Players are getting more sophisticated
Development costs are skyrocketing
Incremental improvements are required to get a
publisher
Game developers are adopting new techniques
Game AI is becoming more procedural and more
adaptive

4
Scripted AI: Example 1

The AI will attack once at 1100 seconds and
then again every 1400 seconds, provided it has
enough defense soldiers.

(defrule
    (game-time > 1100)
=>
    (attack-now)
    (enable-timer 7 1100))

(defrule
    (timer-triggered 7)
    (defend-soldier-count > 12)
=>
    (attack-now)
    (disable-timer 7)
    (enable-timer 7 1400))

Age of Kings (Microsoft)
5
Scripted AI: Example 2

(defrule
    (true)
=>
    (enable-timer 4 3600)
    (disable-self))

(defrule
    (timer-triggered 4)
=>
    (cc-add-resource food 700)
    (cc-add-resource wood 700)
    (cc-add-resource gold 700)
    (disable-timer 4)
    (enable-timer 4 2700))

Age of Kings (Microsoft)
6
Procedural AI: The Sims

The Sims (Maxis)
10
Two Adaptive AI Technologies

Criteria
First-hand experience
Support procedural and adaptive AI
Early stages of adoption by commercial developers

11
Two Adaptive AI Technologies

Criteria
Deliberative Planning
F.E.A.R. (Monolith/Vivendi Universal for PC)
Condemned (Monolith/Sega for Xbox 2)

12
Two Adaptive AI Technologies

Criteria
Deliberative Planning
Machine Learning
Long considered scary voodoo
Decision tree induction & neural nets in Black &
White
Drivatar in Forza Motorsport

13
Why Planning and Learning?

Improving current games
More variable & replayable
More immersive & engaging
More customized experience
More robust
More challenging
Improved profits
More sales
Marketing
Cheaper development
New elements of game play and whole new genres
Necessary as games advance

14
Why not Planning and Learning?

Costlier development
Is the expense worth the result?
Greater processor/memory load
AI typically gets 10-20% of the CPU
That time comes in frequent small slices
Harder to control the player's experience
Harder to do quality assurance
Double the cost of testing
Adds technical risk
Programmers need to spin up on new technologies
Designers need to understand what's possible
Designers create the AI; programmers implement it
Marketing backlash
Once the game is stable it's too late to add a
major feature

15
Why Planning and Learning?

Improving current games
More variable & replayable
More immersive & engaging
More customized experience
More robust
More challenging
Improved profits
More sales
Marketing
Cheaper development
New elements of game play and whole new genres
Necessary as games advance

16
Blah Blah blah Blah?

Blah blah blah
Blah blah blah
Blah blah blah
Blah blah blah
Blah blah
Blah blah
Improved profits
Blah blah
Blah
Blah blah
Blah blah blah blah blah blah blah blah blah
Blah blah blah blah

17
Deliberative Planning

What is deliberative planning?
If you know the current state of the world
and the goal state(s) of the world
and the actions available
When each can be done
How each changes the world
then search for a sequence of actions that
changes the current state into a goal state.
Deliberative planning is just a search problem
When to plan?
Off-line: Before/after each game session
Real-time: During the game session
During development: Not part of shipped product

18
Deliberative Planning

Domain independent planning engine
Abstract problem description
Goal world state (Mission objective)
secure(building1)
clear(building1) clear(building2)
clear(building3)
captured(OpforLeader) or killed(OpforLeader)

19
Deliberative Planning

Domain independent planning engine
Abstract problem description
Goal world state (Mission objective)
Operators

Team-Move (opfor, L?): (mobile opfor), (opfor at L?)
Checkpoint (u1): (mobile u1), (u1 at L?)
Checkpoint (u2): (mobile u2), (u2 at L?)
Checkpoint (u3): (mobile u3), (u3 at L?)
20
Deliberative Planning

Domain independent planning engine
Abstract problem description
Goal world state (Mission objective)
Operators

Secure-Base-Against-SW-Attack: (at-base u?,u?,u?), (base-secure)
Defend-Building (u?, b14): (u? at b14)
Secure-Perimeter-Against-SW-Attack (opfor): (at-base u?,u?), (perimeter-secure)
Patrol (u?, s-path): (u? at s-path)
Ambush (u?, sw-region): (u? at sw-region)
21
Deliberative Planning

Domain independent planning engine
Abstract problem description
Goal world state
Operators
Initial world state
Deliberative Planning: Find a sequence of
operators that change the initial world state
into a goal world state.
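
The "planning is just a search problem" idea can be sketched as a breadth-first search over world states. This is a minimal illustration, not the planning engine from the talk. The operator names are borrowed from the slides, but their preconditions and effects, the flat (non-hierarchical) treatment of the Secure-* operators, and the helper names plan, make_op, and OPERATORS are all assumptions made for this example.

```python
from collections import deque
from typing import NamedTuple


class Operator(NamedTuple):
    name: str
    preconditions: frozenset  # facts that must hold before the operator applies
    add_effects: frozenset    # facts the operator makes true
    del_effects: frozenset    # facts the operator makes false


def make_op(name, pre, add, delete=()):
    """Hypothetical helper that builds an Operator from plain sets."""
    return Operator(name, frozenset(pre), frozenset(add), frozenset(delete))


def plan(initial, goal, operators):
    """Breadth-first search for a shortest operator sequence reaching the goal."""
    start = frozenset(initial)
    goal = frozenset(goal)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:  # every goal fact holds in this state
            return steps
        for op in operators:
            if op.preconditions <= state:
                successor = (state - op.del_effects) | op.add_effects
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, steps + [op.name]))
    return None  # search space exhausted: no plan exists


# Assumed preconditions and effects, loosely following the slides.
OPERATORS = [
    make_op("Team-Move (opfor, base)", {"mobile opfor"}, {"opfor at base"}),
    make_op("Defend-Building (u1, b14)", {"opfor at base"}, {"u1 at b14"}),
    make_op("Patrol (u2, s-path)", {"opfor at base"}, {"u2 at s-path"}),
    make_op("Ambush (u3, sw-region)", {"opfor at base"}, {"u3 at sw-region"}),
    make_op("Secure-Perimeter-Against-SW-Attack (opfor)",
            {"u2 at s-path", "u3 at sw-region"}, {"perimeter-secure"}),
    make_op("Secure-Base-Against-SW-Attack",
            {"u1 at b14", "perimeter-secure"}, {"base-secure"}),
]

example_plan = plan({"mobile opfor"}, {"base-secure"}, OPERATORS)
if __name__ == "__main__":
    for step in example_plan:
        print(step)
```

Breadth-first search is used here only because it is short and returns a shortest plan; a shipped planner would more likely use a heuristic search to fit within the CPU budget discussed earlier.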

22
Strategic Planning Example
Goal
Init
(mobile opfor)
Team-Move (opfor)
Secure-Base-Against-SW-Attack
(opfor at base)
(base-secure)
Checkpoint (u1)
Checkpoint (u3)
Defend-Building (u1, b14)
Checkpoint (u2)
(u1 at b14)
Secure-Perimeter-Against-SW-Attack (opfor)
Patrol (u2, s-path)
(u2 at s-path)
Ambush (u3, sw-region)
(u3 at sw-region)
23
Plan Execution

Execute atomic actions from plan
Move from abstract planning world to real world
Real-time interaction with environment
10 sense/think/act cycles per second

Ambush (u3, sw-region)
Select-ambush-loc
Move-to-ambush-loc
Wait-to-ambush
Ambush-attack
Report-success
Defend
Abandon-ambush
Report-failure
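
One way to picture the step from an abstract operator to atomic actions is a small sense/think/act loop. The sketch below is an assumption-heavy illustration: the slide lists the atomic actions but not how they are sequenced, so the success and failure branches, the percept names discovered and enemy_in_range, and the function execute_ambush are all hypothetical.

```python
TICKS_PER_SECOND = 10  # from the slide: 10 sense/think/act cycles per second

# Assumed ordering of the atomic actions listed on the slide.
SUCCESS_BRANCH = [
    "Select-ambush-loc",
    "Move-to-ambush-loc",
    "Wait-to-ambush",
    "Ambush-attack",
    "Report-success",
]
FAILURE_BRANCH = ["Defend", "Abandon-ambush", "Report-failure"]


def execute_ambush(sense, max_ticks=60 * TICKS_PER_SECOND):
    """Execute Ambush (u3, sw-region) one atomic action per tick.

    sense(tick) returns a dict of percepts for that tick.
    Returns the list of atomic actions performed, in order.
    """
    log = []
    branch, index = SUCCESS_BRANCH, 0
    for tick in range(max_ticks):
        percepts = sense(tick)                       # sense
        action = branch[index]                       # think
        exposed = action in ("Move-to-ambush-loc", "Wait-to-ambush")
        if branch is SUCCESS_BRANCH and exposed and percepts.get("discovered"):
            # The unit was spotted before springing the ambush: switch branches.
            branch, index = FAILURE_BRANCH, 0
            action = branch[index]
        log.append(action)                           # act
        still_waiting = (action == "Wait-to-ambush"
                         and not percepts.get("enemy_in_range"))
        if not still_waiting:
            index += 1
        if index == len(branch):
            break
    return log


if __name__ == "__main__":
    # The enemy walks into range on tick 4, so the unit waits for three ticks.
    print(execute_ambush(lambda tick: {"enemy_in_range": tick >= 4}))
```

The loop does not sleep between ticks; in a game the engine's update loop would call one iteration per AI time slice.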
24
Machine Learning: Behavior Capture

Also called
Behavioral Cloning
Learning by Observation
Learning by Imitation
A form of Knowledge Capture
Learn by watching an expert
Experts are good at performing the task
Experts aren't always good at teaching/explaining
the task
Learn believable, human-like behavior
Mimic the styles of different players
When to learn?
During development
Off-line

25
Drivatar

"Check out the revolutionary A.I. Drivatar
technology: Train your own A.I. 'Drivatars' to
use the same racing techniques you do, so they
can race for you in competitions or train new
drivers on your team. Drivatar technology is the
foundation of the human-like A.I. in Forza
Motorsport."
Collaboration between Microsoft Games and
Microsoft Research

26
Learning to Fly

Learn a flight sim autopilot from observing human
pilots
30 observations each from 3 experts
20 features (elevation, airspeed, twist, fuel,
thrust)
4 controls (elevators, rollers, thrust, flaps)
Take off, level out, fly towards a mountain,
return and land
Key idea: Experts react to the same situation in
different ways depending on their current goals
Divide a flight sim task into 7 phases
Learn four decision trees for each phase (one per
control)
Second key idea: Don't combine data from multiple
experts
Sammut, C., Hurst, S., Kedzier, D., and Michie, D.
Learning to fly. In Proceedings of the Ninth
International Conference on Machine Learning,
pp. 385-393, 1992.
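
The "one tree per phase and per control, one expert at a time" structure can be sketched as below. The tiny threshold-splitting learner is a stand-in chosen for brevity, not the induction algorithm used in the cited paper. The function names, the log format, and the sample feature values are assumptions for illustration. Control settings are assumed to be discrete, non-tuple values such as strings.

```python
from collections import Counter


def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def build_tree(rows, labels, depth=3):
    """Greedy decision tree: a leaf is a label, a node is
    (feature, threshold, low_subtree, high_subtree)."""
    if depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    best = None
    for feature in rows[0]:
        for threshold in sorted({row[feature] for row in rows}):
            low = [l for r, l in zip(rows, labels) if r[feature] <= threshold]
            high = [l for r, l in zip(rows, labels) if r[feature] > threshold]
            if not low or not high:
                continue
            # Misclassifications if each side predicts its majority label.
            errors = (len(low) - Counter(low).most_common(1)[0][1]
                      + len(high) - Counter(high).most_common(1)[0][1])
            if best is None or errors < best[0]:
                best = (errors, feature, threshold)
    if best is None:
        return majority(labels)
    _, feature, threshold = best
    low = [(r, l) for r, l in zip(rows, labels) if r[feature] <= threshold]
    high = [(r, l) for r, l in zip(rows, labels) if r[feature] > threshold]
    return (feature, threshold,
            build_tree([r for r, _ in low], [l for _, l in low], depth - 1),
            build_tree([r for r, _ in high], [l for _, l in high], depth - 1))


def predict(tree, row):
    """Walk the tree until a leaf (a plain label) is reached."""
    while isinstance(tree, tuple):
        feature, threshold, low, high = tree
        tree = low if row[feature] <= threshold else high
    return tree


def learn_autopilot(log):
    """log: (phase, features, controls) records from ONE expert only.
    Returns a dict mapping (phase, control) to its own decision tree."""
    grouped = {}
    for phase, features, controls in log:
        for control, setting in controls.items():
            rows, labels = grouped.setdefault((phase, control), ([], []))
            rows.append(features)
            labels.append(setting)
    return {key: build_tree(rows, labels) for key, (rows, labels) in grouped.items()}


# Hypothetical observations from a single expert.
expert_log = [
    ("takeoff", {"airspeed": 50, "elevation": 0}, {"elevators": "neutral", "thrust": "full"}),
    ("takeoff", {"airspeed": 120, "elevation": 0}, {"elevators": "up", "thrust": "full"}),
    ("takeoff", {"airspeed": 150, "elevation": 200}, {"elevators": "up", "thrust": "full"}),
    ("level", {"airspeed": 150, "elevation": 2000}, {"elevators": "neutral", "thrust": "half"}),
]
autopilot = learn_autopilot(expert_log)
```

With 7 phases and 4 controls, the same grouping would yield 28 trees per expert; at run time the autopilot looks up the tree for the current phase and control and calls predict on the current features.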

27
KnoMic (Knowledge Mimic)

Learn air combat in a flight sim and a deathmatch
bot in Quake II
Dynamic behavior against opponents
Can't divide the task into fixed phases
Key idea: Experts dynamically select which
operator they're working on based on opponent and
environment
Also learn when to select operators
(pre-conditions)
and what those operators do (effects)
Second key idea: Experts annotate observations
with their operator selections
van Lent, M. and Laird, J. E., Learning Procedural
Knowledge by Observation. Proceedings of the
First International Conference on Knowledge
Capture (K-CAP 2001), October 21-23, 2001,
Victoria, BC, Canada, ACM, pp. 179-186.
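
To make "learn when to select operators" concrete, here is a deliberately simplified sketch: for each operator, keep only the feature values that were present every time the expert switched to it. This is not KnoMic's actual learning algorithm. The trace format, the feature names, and learn_preconditions are assumptions for illustration.

```python
def learn_preconditions(trace):
    """Estimate operator pre-conditions from an annotated trace.

    trace: list of (state, operator) pairs in time order, where state is a
    dict of feature -> value and operator is the expert's annotation.
    Returns a dict mapping each operator to the set of (feature, value)
    pairs observed at every moment the expert newly selected it.
    """
    conditions = {}
    previous = None
    for state, operator in trace:
        if operator != previous:  # the expert just switched to this operator
            observed = set(state.items())
            if operator in conditions:
                conditions[operator] &= observed  # keep only what always held
            else:
                conditions[operator] = observed
        previous = operator
    return conditions


# Hypothetical annotated observations from a deathmatch session.
annotated_trace = [
    ({"enemy_visible": False, "health": "high"}, "explore"),
    ({"enemy_visible": True, "health": "high"}, "attack"),
    ({"enemy_visible": True, "health": "low"}, "retreat"),
    ({"enemy_visible": False, "health": "low"}, "explore"),
    ({"enemy_visible": True, "health": "high"}, "attack"),
    ({"enemy_visible": True, "health": "high"}, "attack"),
]
learned = learn_preconditions(annotated_trace)
```

In this toy trace, health drops out of the conditions for explore because the expert chose it at both high and low health. Effects could be estimated the same way by comparing the state when an operator starts with the state when it ends.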

28
The Future
29
Where to learn more

AI and Interactive Digital Entertainment
Conference
Marina del Rey, June 2006
Journal of Game Development
Charles River Media
Game Developer Magazine
August special issue on AI
Game Developers Conference
AI Game Programming Wisdom book series
Historical
2005 IJCAI workshop on Reasoning, Representation
and Learning in Computer Games
AAAI Spring Symposiums 1999 2003
2004 AAAI Workshop

30
Interesting observations

A few of my own
The most challenging opponent isn't the most fun.
"Never stupid" is better than "sometimes
brilliant."
Never underestimate the player's ability to see
intelligence where there is none.
Game companies aren't a source of research funds.
A few of Will Wright's
Maximize the ratio of internal complexity to
perceived intelligence.
The player will build an internal model of your
system. If you don't help them build it, they'll
probably build the wrong one.
The flow of information about a system has a huge
impact on the player's perception of its
intelligence.
From the player's point of view there is a fine
line between complex behavior and random behavior.