Title: Motocross And Artificial Neural Networks
1. Motocross And Artificial Neural Networks
Benoit Chaperot, Colin Fyfe
School of Computing, University of Paisley, Paisley, PA1 2BE, Scotland.
2. Why use Artificial Neural Networks
- Riding a motorbike involves behaviours which are difficult to express as a set of procedural rules
- ANNs are expected to behave in a human or animal-like manner
- Capable of extrapolating when presented with new and different sets of inputs
- Capable of evolving
3. The ANN
- The inputs
  - Bike position, orientation and velocity relative to the track
  - Terrain height information
  - Track path information
- The outputs
  - Accelerate, brake
  - Turn left, right
  - Lean forward, backward
4. Inputs to ANN
- Output from ANN
  - Accelerate, brake
  - Turn left, right
  - Lean forward, backward
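As a minimal sketch of the mapping described above, the following shows a small feed-forward network turning an input vector into the three control outputs. The layer sizes, the tanh activation, the example input values and the helper names (`make_network`, `apply_layer`, `forward`) are assumptions for illustration; the slides do not specify the architecture.

```python
import math
import random

def make_network(n_inputs, n_hidden, n_outputs, seed=0):
    """Create random weights for a one-hidden-layer network (sizes are assumed)."""
    rng = random.Random(seed)

    def layer(n_in, n_out):
        # One row per neuron; the last weight in each row is the bias.
        return [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)]
                for _ in range(n_out)]

    return {"hidden": layer(n_inputs, n_hidden),
            "output": layer(n_hidden, n_outputs)}

def apply_layer(weights, values):
    """Weighted sum plus bias, squashed by tanh into the range -1..1."""
    return [math.tanh(row[-1] + sum(w * v for w, v in zip(row, values)))
            for row in weights]

def forward(network, inputs):
    hidden = apply_layer(network["hidden"], inputs)
    accelerate, turn, lean = apply_layer(network["output"], hidden)
    # Sign convention (assumed): positive = accelerate / right / forward,
    # negative = brake / left / backward.
    return {"accelerate": accelerate, "turn": turn, "lean": lean}

# Hypothetical input vector: bike state, terrain heights, track direction.
inputs = [0.1, -0.3, 0.8, 0.0, 0.25, -0.5]
network = make_network(n_inputs=len(inputs), n_hidden=8, n_outputs=3)
controls = forward(network, inputs)
```

Each output is a signed value, so one output neuron can cover a pair of opposite actions such as accelerate and brake.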
5. Two forms of training
- Evolutionary algorithms: training is treated as an optimisation problem, solved using genetic algorithms
- Back-propagation algorithm: ANNs are trained using training data made from a recording of the game being played by a good human player
6. Evolutionary Algorithm
- Training considered as optimisation
- ANNs initialised with random weights
- 80 ANNs per generation
- Each ANN evaluated for 150 seconds using a score function
- Fittest ANNs are given more chance to reproduce; crossover and mutation techniques are used
- The whole population converges to a satisfactory solution to the problem after approximately 100 generations
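The loop described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: tournament selection, uniform crossover, Gaussian mutation, elitism and the mutation parameters are all choices made here, and `toy_fitness` is a stand-in for the 150-second in-game evaluation.

```python
import random

POPULATION_SIZE = 80   # from the slide
GENERATIONS = 100      # approximate figure from the slide

def evolve(fitness, n_weights, generations=GENERATIONS, seed=0,
           mutation_rate=0.1, mutation_scale=0.3):
    """Evolve flat weight vectors; returns the fittest one found."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_weights)]
                  for _ in range(POPULATION_SIZE)]

    def pick_parent(scored):
        # Tournament selection (assumed): the fitter of two random
        # individuals wins, so fit ANNs reproduce more often.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in population]
        best = max(scored, key=lambda pair: pair[0])[1]
        children = [best[:]]  # elitism (assumed): keep the current best
        while len(children) < POPULATION_SIZE:
            mum, dad = pick_parent(scored), pick_parent(scored)
            # Uniform crossover: each weight comes from either parent.
            child = [m if rng.random() < 0.5 else d
                     for m, d in zip(mum, dad)]
            # Gaussian mutation on a fraction of the weights.
            child = [w + rng.gauss(0.0, mutation_scale)
                     if rng.random() < mutation_rate else w
                     for w in child]
            children.append(child)
        population = children
    return max(population, key=fitness)

def toy_fitness(weights):
    """Stand-in score: higher is better, best when every weight is 0.5."""
    return -sum((w - 0.5) ** 2 for w in weights)

best_weights = evolve(toy_fitness, n_weights=5)
```

In the game, `fitness` would run the bike with the given weights for the evaluation period and return the accumulated score.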
7. Fitness Function
- One way point every metre along the track; bonus for passing through a way point
- Bonus/penalty (i.e. normally negative) for missing a way point
- Bonus/penalty (i.e. normally negative) for crashing
- Bonus/penalty (i.e. normally negative) for every metre away from the centre of the next way point
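A score function built from these four terms might look like the sketch below. The event names, the `ScoreWeights` class and every numeric value are placeholders chosen for illustration; the slides do not give the values used in the game.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScoreWeights:
    # Placeholder values, not the ones used in the game.
    passed_way_point: float = 1.0
    missed_way_point: float = -1.0
    crash: float = -10.0
    per_metre_off_centre: float = -0.1

def score_run(events, weights=ScoreWeights()):
    """Sum bonuses and penalties over the events of one evaluation run.

    events: iterable of (kind, value) pairs; value is only used for
    "distance" events, where it is the distance in metres from the
    centre of the next way point.
    """
    total = 0.0
    for kind, value in events:
        if kind == "passed":
            total += weights.passed_way_point
        elif kind == "missed":
            total += weights.missed_way_point
        elif kind == "crash":
            total += weights.crash
        elif kind == "distance":
            total += weights.per_metre_off_centre * value
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return total

run = [("passed", 0), ("passed", 0), ("missed", 0),
       ("crash", 0), ("distance", 2.5)]
score = score_run(run)
```

Keeping the weights in one place makes it easy to experiment, which matters because the evaluation function determines the behaviour that evolves.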
8. Problems with EA
- It may be difficult to find a good evaluation function: this function determines the final behaviour of the evolved ANNs
- It may be difficult to maintain diversity in the population. The population may quickly converge towards a local solution, so the right evolution parameters need to be found
- It takes time to evaluate each and every individual
- Crossover between two fit ANNs is likely to produce unfit ANNs, due to ANN architecture and operation
- Results are not consistent
9. Back-propagation algorithm
- Training data made by the first author playing the game on many different tracks
- Each sample of training data contains a situation (bike position, orientation on a track) and the first author's solution to the situation (turn left, right, accelerate, brake)
- ANNs trained to reproduce the player's solution given a situation
- Training data composed of approximately 120,000 samples
- Good solution to the problem after only 20,000 iterations
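The supervised set-up above can be sketched with plain stochastic back-propagation on (situation, solution) pairs. Everything specific here is an assumption for illustration: the one-hidden-layer tanh network, the squared-error loss, the fixed learning rate, and the synthetic samples, which stand in for the recorded player data.

```python
import math
import random

def train_backprop(samples, n_hidden=6, iterations=20000,
                   learning_rate=0.05, seed=0):
    """Train a small tanh network on (inputs, targets) pairs; returns a policy."""
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    # Last weight in each row is the bias.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
          for _ in range(n_hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
          for _ in range(n_out)]

    def forward(x):
        h = [math.tanh(r[-1] + sum(w * v for w, v in zip(r, x))) for r in w1]
        y = [math.tanh(r[-1] + sum(w * v for w, v in zip(r, h))) for r in w2]
        return h, y

    for _ in range(iterations):
        x, target = rng.choice(samples)   # one sample per iteration
        h, y = forward(x)
        # Output deltas: squared-error gradient through tanh.
        d_out = [(y[k] - target[k]) * (1.0 - y[k] ** 2) for k in range(n_out)]
        # Hidden deltas: propagate the output deltas back through w2.
        d_hid = [(1.0 - h[j] ** 2) * sum(d_out[k] * w2[k][j]
                                         for k in range(n_out))
                 for j in range(n_hidden)]
        for k in range(n_out):
            for j in range(n_hidden):
                w2[k][j] -= learning_rate * d_out[k] * h[j]
            w2[k][-1] -= learning_rate * d_out[k]
        for j in range(n_hidden):
            for i in range(n_in):
                w1[j][i] -= learning_rate * d_hid[j] * x[i]
            w1[j][-1] -= learning_rate * d_hid[j]

    return lambda x: forward(x)[1]

# Synthetic stand-in for recorded play: steer against the lateral offset
# and heading error (a made-up rule, used only to have something to learn).
data_rng = random.Random(1)
samples = []
for _ in range(200):
    offset, heading = data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)
    samples.append(([offset, heading], [-0.5 * offset - 0.4 * heading]))

policy = train_backprop(samples)
mean_error = sum(abs(policy(x)[0] - t[0]) for x, t in samples) / len(samples)
```

The returned `policy` maps a situation to control outputs, which is the role the trained ANN plays in the game.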
10. Results
- ANNs learn and perform in a human-like manner
- Average lap time
  - Good human player: 2 min 10 sec
  - ANN trained using GA: 2 min 50 sec
  - ANN trained using BP: 2 min 20 sec
- The ANN trained using GA is slower, but better than the one trained using BP at adapting to new situations
11. Bagging and Boosting
- These are called ensemble methods and are used to improve AI performance
- Bagging
  - Create N different bags of training data
  - Train one ANN on each bag
  - Present all ANNs with the same problem
  - The combined solution of all ANNs is expected to be better than any individual ANN solution
  - Many combination functions are possible: average, vote, winner takes all
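The bagging steps and the three combination functions named above can be sketched as follows. The bootstrap construction in `make_bags` is the textbook form of bagging and is an assumption here, as are the definitions of "vote" (majority sign per control) and "winner" (the network with the largest absolute output); the slides do not define them.

```python
import random

def make_bags(samples, n_bags, seed=0):
    """Textbook bootstrap: each bag draws len(samples) items with replacement."""
    rng = random.Random(seed)
    return [[rng.choice(samples) for _ in samples] for _ in range(n_bags)]

def combine_average(outputs):
    """Mean of each control across all networks."""
    return [sum(column) / len(outputs) for column in zip(*outputs)]

def combine_vote(outputs):
    """Majority vote on the sign of each control (-1, 0 or +1)."""
    def sign(v):
        return (v > 0) - (v < 0)
    return [float(sign(sum(sign(v) for v in column)))
            for column in zip(*outputs)]

def combine_winner(outputs):
    """Winner takes all: use the network with the strongest single output."""
    return list(max(outputs, key=lambda out: max(abs(v) for v in out)))

# Outputs of three hypothetical networks for the same situation:
# [accelerate/brake, turn, lean].
outputs = [
    [0.8, -0.2, 0.1],
    [0.6, 0.1, 0.0],
    [-0.1, -0.4, 0.3],
]
averaged = combine_average(outputs)
voted = combine_vote(outputs)
winner = combine_winner(outputs)
bags = make_bags(list(range(10)), n_bags=3)
```

Every combination function requires one forward pass per network, so the cost grows linearly with the number of bags.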
12. Bagging, results
The combined solution on track m16 of NN 0 to 9, trained on different tracks, is better than any of the individual solutions, but still not as good as that of a single NN trained on all tracks.
13. Bagging, conclusion
- Bagging is not working well for the motocross problem: the combined solution of ANNs trained on different tracks is not as good as that of a single ANN trained on all tracks
- Bagging is processing intensive: the input has to be propagated through more than one ANN, and the outputs combined
14. Boosting
- Boosting puts more emphasis on data which machines trained on early bags find difficult
- The physics in the game has improved
- With the early physics model and an alternative back-propagation technique, anti-boosting worked well but took a long time to perform
- With the improved physics model and the traditional back-propagation technique, no boosting or anti-boosting seems to be necessary, and training takes a short time to perform
15. Back-Propagation, more
- The learning rate is a very important parameter in the BP algorithm
- Too high: the ANN overfits the data, forgets about most of the data and is not able to generalise
- Too low: the ANN can take a long time to train, and sometimes does not train well at all due to floating-point precision limitations
- A good technique has been to reduce the learning rate logarithmically from 1e-2 to 1e-5 over 1 million iterations. This takes a few seconds to perform and gives the best results
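One way to read "reduce the learning rate logarithmically" is to interpolate linearly in log space between the two end values, which gives a smooth geometric decay. That reading, and the function name `learning_rate_at`, are assumptions; the end points and the iteration count come from the slide.

```python
import math

def learning_rate_at(iteration, total=1_000_000, start=1e-2, end=1e-5):
    """Learning rate at a given iteration, interpolated in log space."""
    fraction = min(max(iteration / total, 0.0), 1.0)
    log_rate = math.log(start) + fraction * (math.log(end) - math.log(start))
    return math.exp(log_rate)

first = learning_rate_at(0)            # start of training
middle = learning_rate_at(500_000)     # geometric mean of the end points
last = learning_rate_at(1_000_000)     # end of training
```

Under this schedule the rate falls by a factor of ten every third of the run, so training spends equal time at each order of magnitude.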
16. Further use of ANNs
Physics camera
A. The camera follows the bike and points toward the bike: the human player does not have a good view of the track ahead.
B. The camera follows the bike but points toward the predicted future position of the bike: the human player now has a better view of the track ahead. The predicted position of the bike 1.5 seconds ahead in time is given by an ANN.
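The difference between the two camera modes can be sketched as a change of look-at target. In the slides the future position comes from an ANN; the constant-velocity `extrapolate` function below is only a stand-in so the example runs, and the coordinates are made up.

```python
import math

PREDICTION_HORIZON = 1.5  # seconds, from the slide

def extrapolate(position, velocity, horizon=PREDICTION_HORIZON):
    """Stand-in predictor: constant-velocity extrapolation (not the ANN)."""
    return tuple(p + v * horizon for p, v in zip(position, velocity))

def look_direction(camera_position, target):
    """Unit vector from the camera to the target point."""
    delta = [t - c for t, c in zip(target, camera_position)]
    length = math.sqrt(sum(d * d for d in delta))
    if length == 0.0:
        return (0.0, 0.0, 0.0)
    return tuple(d / length for d in delta)

# Hypothetical world-space values.
bike_position = (10.0, 0.0, 5.0)
bike_velocity = (4.0, 0.0, 2.0)
camera_position = (4.0, 3.0, 5.0)

predicted = extrapolate(bike_position, bike_velocity)
mode_a = look_direction(camera_position, bike_position)  # A: look at the bike
mode_b = look_direction(camera_position, predicted)      # B: look ahead
```

Swapping `extrapolate` for a trained network's prediction leaves the camera code unchanged.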
17. Future work
- Work on online learning: have the artificial intelligence evolve, improve and adapt to new situations as you play the game
- Obstacle avoidance: this needs a redefinition of what the ANNs can see
- Modular architecture for ANNs: the signal would propagate through a variable number of nodes, for tuneable performance and processing requirements, and easier modular training
18. Questions?