Title: A Learning Process Architecture for Continuous Strategic Games
1. A Learning Process Architecture for Continuous Strategic Games
- By Jonathan Gibbs
- Mentor: Richard Murray
- Co-Mentor: Ling Shi
2. Artificial Intelligence Overview
- "It is the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable." (John McCarthy, Stanford University)
- "To obtain a scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines." (American Association for Artificial Intelligence)
3. Artificial Intelligence in Games
4. The RoboFlag Game
- Up to 6-on-6 capture-the-flag game
- Limited sensing and communication capability
- Simulator and hardware testbed
- Each robot operates as a separate entity
Image courtesy of Richard Murray
5. Objectives
- Create a learning process architecture that does not rely on predefined strategies
- Implement the architecture so that a simple strategy can be defeated in a small number of tries
- Make the process cooperative
6. Typical Learning Processes
- State Definition
- Reward Scheme
- Mathematical Model
- Strategy Database
- Probabilistic decision maker (a minimal sketch follows the diagram below)
- Solve the game as a math problem
- Solve a probabilistic graph
[Diagram: two decision loops. Database-driven: Current State -> Game -> Database -> Next Action. Model-driven: Current State -> Game -> Model -> Next Action.]
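The slides do not show how the decision maker works internally; as a hedged sketch (the function and constant names are assumptions, not the actual RoboFlag code), one common way to realize the probabilistic decision step is roulette-wheel sampling over learned action weights:

#include <cstdlib>

// Hypothetical sketch: pick the next action in proportion to its learned
// weight. NUM_ACTIONS mirrors the eight probability fields in the state
// definition shown later in the deck.
const int NUM_ACTIONS = 8;

int chooseAction(const float prob[NUM_ACTIONS]) {
    float total = 0.0f;
    for (int i = 0; i < NUM_ACTIONS; ++i) total += prob[i];
    float r = total * (float)std::rand() / (float)RAND_MAX;
    for (int i = 0; i < NUM_ACTIONS; ++i) {
        r -= prob[i];
        if (r <= 0.0f) return i;   // action i drawn with probability prob[i]/total
    }
    return NUM_ACTIONS - 1;        // guard against floating-point rounding
}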
7. Challenges with RoboFlag
- RoboFlag is a dynamic game, NOT a board game
- Limited model detail
- Limited database size
- Limited computation time
- Small amount of useful information available
- The limited state definition must be efficient and effective
- Limited sharing capability
- The reward system must be aggressive
[Diagram: the decision loops under RoboFlag's constraints: Current State -> Game -> Next Action.]
8. State Definition
struct JRobotStatus {
    float radius;         // radius from the flag
    float theta;          // theta (angle) from the flag
    BOOL  myside;         // which side of the field we are on
    BOOL  enemy_present;  // is there an enemy in front of us?
    BOOL  gotflag;        // do we have the flag?
    float prob1;          // probabilities of the assigned actions
    float prob2;
    float prob3;
    float prob4;
    float prob5;
    float prob6;
    float prob7;
    float prob8;
};
- Contain relevant information
- Easy to interpret
- Small
- Computationally efficient (a sketch of filling the pose fields follows this list)
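As a minimal sketch of how the pose fields above might be filled in (the Vec2 type, the updatePose name, and the midline-at-zero convention are assumptions, not the actual RoboFlag interfaces):

#include <cmath>

struct Vec2 { float x, y; };  // assumed 2-D position type

// Fill the polar pose fields of JRobotStatus from robot and flag positions.
void updatePose(JRobotStatus& s, Vec2 robot, Vec2 flag) {
    float dx = robot.x - flag.x;
    float dy = robot.y - flag.y;
    s.radius = std::sqrt(dx * dx + dy * dy);  // distance from the flag
    s.theta  = std::atan2(dy, dx);            // bearing relative to the flag
    s.myside = (robot.y > 0.0f);              // assumes the midline is y = 0
}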
9. Reward Scheme
- Aggressive
- Robust
- Efficient
enum JReward { Tagged = -5, Ambig = 0, MovedCloser = 2, InZone = 10, GotFlag = 10 };
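The slides list the reward values but not how they feed back into the action probabilities; a minimal sketch, assuming a simple additive update with an invented learning rate and a floor that keeps every action selectable:

// Hypothetical update: nudge the weight of the action just taken by the
// reward it earned. The learning rate and the 0.01 floor are assumptions.
void applyReward(float prob[8], int action, JReward r) {
    const float alpha = 0.05f;            // assumed learning rate
    prob[action] += alpha * (float)r;     // Tagged lowers the weight, GotFlag raises it
    if (prob[action] < 0.01f) prob[action] = 0.01f;  // keep the action selectable
}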
10. The Architecture (Good)
[Diagram: the learning architecture in closed loop with the RoboFlag game.]
11. The Opposition (Evil)
- Man-to-man strategy
- Feasible for one robot to beat
- Spiral approach
- Change directions
12. Results
- Very little movement
- No reaction based on enemy location
- Many inconclusive events
- Flag was never captured
13. Changes
- Changed default probabilities
- Replaced two Boolean variables with enemy location information
- Cosmetic changes to the update function
- Added the ability to read an old log file
14. Results
- More movement toward the flag
- New probability weights made enemy information insignificant
- Did capture the flag
- Logger failed
15. Conclusions
- The architecture did not achieve its original objective but showed potential
- No matter how much learning the computer does, the mechanisms by which it learns must be continuously tweaked
- Trial and error is easy to implement but is probably not the best approach
- There are severe limitations when using mathematics to model reasoning
16. Future Work
- Increase the state definition size until it becomes computationally too expensive
- Implement a mechanism for cooperation with other robots
- Perfect the architecture so that it can learn defensive and offensive strategy at the same time
17. Acknowledgments
- Richard Murray
- Ling Shi
- Brian Beck and Jing Xiong
- CDS Staff
- MURF 2004