Title: Bayesian Ranking: From Xbox Live to Computer Go
1Bayesian Ranking: From Xbox Live to Computer Go
- Ralf Herbrich and Thore Graepel
2Overview
- Motivation: Ranking in Video Games
- Bayesian Player Ranking: TrueSkill
- Skill Belief, Likelihood, Update Equation
- Applications in Online Gaming
- Numerical Results on Halo 2 data
- Ranking Moves: Computer Go
- Conclusion
3Motivation
- Microsoft is the leader in online video gaming (Xbox Live).
- Centralised game-independent service.
- Gamercard, Achievements, TrueSkill, etc.
- What makes playing online games fun?
- Good network connection (Broadband).
- Seamless setup (Xbox Live).
- Competitive matches (Ranking!).
4Ranking in Video Games
- Problem Setting
- $k$ teams of $n_1, \dots, n_k$ players compete.
- The outcome is a ranking between the teams (including draws).
- Questions
- Skill $s_i$ of each player such that [condition not preserved in this transcript]
- Global ranking between all players.
- High quality of match between k teams.
6The Bayesian Approach
- Classical logic deals with certain statements.
- In the real world, uncertainty is abundant.
- Degree of Belief (logic: 0% or 100%).
"Player i's skill is between 35 and 40 and Player j's skill is between 30 and 35"
"Player i won against Player j"
Bayesian Approach: Probability for Logic under Uncertainty
$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$
- Prior: $P(A)$
- Posterior: $P(A \mid B)$
- Likelihood: $P(B \mid A)$
- Evidence: $P(B)$
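As a worked illustration of Bayes' rule on this slide, the minimal sketch below updates a discrete belief about which skill bracket Player i falls into after observing a win against Player j. The two brackets, the prior weights, and the win likelihoods are made-up numbers chosen for illustration; none of them come from the talk.

```python
def bayes_posterior(prior, likelihood):
    """Return P(A|B) for every hypothesis A, given P(A) and P(B|A)."""
    # Evidence P(B): sum of P(B|A) * P(A) over all hypotheses A.
    evidence = sum(likelihood[a] * prior[a] for a in prior)
    return {a: likelihood[a] * prior[a] / evidence for a in prior}


# Hypothetical prior belief over Player i's skill bracket.
prior = {"30-35": 0.5, "35-40": 0.5}
# Hypothetical P(Player i beats Player j | bracket).
likelihood = {"30-35": 0.4, "35-40": 0.8}

posterior = bayes_posterior(prior, likelihood)
for bracket, p in posterior.items():
    print(f"P(skill in {bracket} | i won) = {p:.3f}")
```

With these assumed numbers the win shifts belief towards the higher bracket, which is the qualitative behaviour the slide describes.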
8TrueSkill: Skill Belief
- Track two numbers per player
- µ: Average skill of player
- σ: Uncertainty about skill of player
- Benefits
- Faster Skill Learning
- Better Matchmaking
- Accurate Prediction of Game Outcomes
[Figure: belief in skill level plotted against skill level (axis from 10 to 40); the curve is annotated with its centre µ and its width σ.]
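The two numbers per player can be sketched as a small data structure. The example below assumes the belief is a Gaussian with mean µ and standard deviation σ (the transcript does not state the distribution explicitly) and computes the belief that a skill lies in a given range. The class name `SkillBelief` and the example values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class SkillBelief:
    mu: float     # average skill of the player
    sigma: float  # uncertainty about the skill of the player

    def prob_between(self, lo: float, hi: float) -> float:
        """Belief that the skill lies in [lo, hi], assuming N(mu, sigma^2)."""
        def cdf(x: float) -> float:
            z = (x - self.mu) / (self.sigma * math.sqrt(2.0))
            return 0.5 * (1.0 + math.erf(z))
        return cdf(hi) - cdf(lo)


# Hypothetical players with the same average skill but different uncertainty.
newcomer = SkillBelief(mu=25.0, sigma=8.0)  # few games observed
veteran = SkillBelief(mu=25.0, sigma=2.0)   # many games observed

print(newcomer.prob_between(20.0, 30.0))
print(veteran.prob_between(20.0, 30.0))
```

Keeping σ alongside µ is what lets a system distinguish a player whose level is well established from one who has simply not played enough games yet.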
9TrueSkill: Likelihood
- Likelihood
- Game outcomes are all permutations, including draws between pairs of teams.
- Latent performance model for players
$$P(\text{game outcome} \mid s_1, \dots, s_n) = P(t_i\text{'s are in game outcome order} \mid s_1, \dots, s_n)$$
- Team performance, $t_i$, is the sum of the performances of the players in the team.
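The likelihood above can be made concrete for two teams. This sketch assumes each player's performance is Gaussian around their skill with a common standard deviation `beta`, that team performance is the sum of player performances, and that a draw is declared when the two team performances differ by at most `eps`. The names `beta` and `eps` and the specific Gaussian form are assumptions; the transcript does not preserve the slide's formula for the performance model.

```python
import math


def cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def outcome_probabilities(team1_skills, team2_skills, beta, eps):
    """Return (P(team 1 wins), P(draw), P(team 2 wins)) for known skills.

    Assumed model: performance_i ~ N(skill_i, beta^2), team performance is
    the sum over its players, and |t1 - t2| <= eps counts as a draw.
    """
    diff = sum(team1_skills) - sum(team2_skills)
    n_players = len(team1_skills) + len(team2_skills)
    c = beta * math.sqrt(n_players)        # std. dev. of t1 - t2
    p_win1 = 1.0 - cdf((eps - diff) / c)   # P(t1 - t2 > eps)
    p_win2 = cdf((-eps - diff) / c)        # P(t1 - t2 < -eps)
    p_draw = 1.0 - p_win1 - p_win2
    return p_win1, p_draw, p_win2


# Hypothetical skills: a 2-vs-2 game where team 1 is slightly stronger.
print(outcome_probabilities([30.0, 25.0], [27.0, 24.0], beta=4.0, eps=1.0))
```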
10Likelihood Example: Two Players
[Figure: (a) the plane of Player 1's performance against Player 2's performance, partitioned into regions labelled "Player 1 wins", "Players 1 and 2 draw", and "Player 2 wins"; (b) probability density of the performance difference $x_1 - x_2$, with the same three regions marked.]
11TrueSkill: Skill Updates
[Figure: belief in skill level (axis from 0 to 50) being updated by a game outcome.]
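12Two-Team Update Algorithm
[Update equations for the "Draw" and "Win/Loss" cases are not preserved in this transcript.]
- $b_i = 1$ for players in the winning team
- $b_i = -1$ for players in the losing team
- $b_i = 0$ for all other players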
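The update equations for this slide did not survive extraction. As a stand-in, the sketch below follows the win/loss update from the published description of TrueSkill, written for two teams using the $b_i$ coefficients listed above. It assumes Gaussian beliefs, a performance standard deviation `beta`, and a draw margin `eps`; these names, the default values in the usage lines, and the omission of both the draw case and any skill dynamics are choices made for this illustration, not details taken from the slide.

```python
import math


def pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def v_win(t: float, eps: float) -> float:
    """Additive correction to the mean for a win/loss outcome."""
    return pdf(t - eps) / cdf(t - eps)


def w_win(t: float, eps: float) -> float:
    """Multiplicative correction to the variance for a win/loss outcome."""
    v = v_win(t, eps)
    return v * (v + t - eps)


def update_two_teams(winners, losers, beta, eps=0.0):
    """Update (mu, sigma) beliefs after `winners` beat `losers`.

    b_i = +1 for winners and -1 for losers; players not in the game
    (b_i = 0) are simply left out and keep their beliefs.
    """
    players = [(mu, s, +1) for mu, s in winners] + [(mu, s, -1) for mu, s in losers]
    c2 = sum(beta ** 2 + s ** 2 for _, s, _ in players)
    c = math.sqrt(c2)
    t = sum(b * mu for mu, _, b in players) / c
    v, w = v_win(t, eps / c), w_win(t, eps / c)
    updated = [
        (mu + b * (s ** 2 / c) * v, s * math.sqrt(1.0 - (s ** 2 / c2) * w))
        for mu, s, b in players
    ]
    return updated[: len(winners)], updated[len(winners):]


# Two hypothetical new players with identical beliefs; the first one wins.
new_winners, new_losers = update_two_teams(
    [(25.0, 25.0 / 3.0)], [(25.0, 25.0 / 3.0)], beta=25.0 / 6.0
)
print(new_winners, new_losers)
```

Note that the step applied to each mean is scaled by that player's own variance, so uncertain players move further than well-established ones after the same result.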
14The TrueSkill System: Applications
- Leaderboard
- (Conservative) skill estimate: µ − 3σ
- Matchmaking
- Competitive game = Fun game!
- Match quality = Probability of a draw
- Team Balancing
- Maximise match quality by greedy search.
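The three applications above can be sketched together. The code below assumes Gaussian beliefs `(mu, sigma)`, a performance standard deviation `beta`, and a draw margin `eps` (all names are assumptions), and uses one simple greedy strategy for team balancing: players are placed one at a time on whichever team gives the higher draw probability. The talk does not specify which greedy search is used, so this is only one plausible reading.

```python
import math


def cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def conservative_skill(mu: float, sigma: float) -> float:
    """Leaderboard value: mu - 3 * sigma."""
    return mu - 3.0 * sigma


def draw_probability(team1, team2, beta, eps):
    """P(|t1 - t2| <= eps) under Gaussian beliefs; teams are lists of (mu, sigma)."""
    diff = sum(mu for mu, _ in team1) - sum(mu for mu, _ in team2)
    c = math.sqrt(sum(beta ** 2 + s ** 2 for _, s in team1 + team2))
    return cdf((eps - diff) / c) - cdf((-eps - diff) / c)


def greedy_balance(players, beta, eps):
    """Split players into two teams, greedily maximising draw probability."""
    team1, team2 = [], []
    half = (len(players) + 1) // 2  # cap on team size
    for p in sorted(players, key=lambda q: q[0], reverse=True):
        options = []
        if len(team1) < half:
            options.append((draw_probability(team1 + [p], team2, beta, eps), 0))
        if len(team2) < half:
            options.append((draw_probability(team1, team2 + [p], beta, eps), 1))
        _, pick = max(options)
        (team1 if pick == 0 else team2).append(p)
    return team1, team2


# Hypothetical lobby of four players given as (mu, sigma).
lobby = [(30.0, 2.0), (28.0, 2.0), (22.0, 2.0), (20.0, 2.0)]
team_a, team_b = greedy_balance(lobby, beta=4.0, eps=1.0)
print(team_a, team_b, draw_probability(team_a, team_b, beta=4.0, eps=1.0))
print(sorted(lobby, key=lambda q: conservative_skill(*q), reverse=True))
```

Ranking by µ − 3σ rather than µ means a player only climbs the leaderboard once the system is fairly sure of their skill.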
15TrueSkill: Matchmaking
[Figure: players waiting in a lobby and the possible matches among them.]
16Alternative Ranking Systems: ELO
- Only 2 players and disregard draws.
- Assumptions
- Use a moving average estimator for $p_{i,j}$.
- Assume a constant average skill ⇒ each game is a zero-sum game.
- Make a linear approximation of $\Phi$.
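For comparison, an ELO-style update can be sketched in a few lines. The version below predicts the win probability with the cumulative Gaussian and moves both ratings by a fixed step size `k` times the prediction error, so the total rating is conserved (the zero-sum assumption). The parameters `k` and `beta` and their default values are hypothetical, and the slide's own derivation is not preserved, so this shows the general shape of the update rather than the exact approximation the slide refers to.

```python
import math


def cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def elo_update(s_i, s_j, i_won, k=1.0, beta=1.0):
    """Return updated ratings (s_i, s_j) after one game between i and j."""
    # Predicted probability that player i beats player j.
    p_ij = cdf((s_i - s_j) / (math.sqrt(2.0) * beta))
    y = 1.0 if i_won else 0.0
    step = k * (y - p_ij)  # fixed step size k, scaled by the surprise
    # Zero-sum: whatever i gains, j loses.
    return s_i + step, s_j - step


print(elo_update(10.0, 10.0, i_won=True))
print(elo_update(12.0, 10.0, i_won=False))
```

Unlike the belief-based update, the step here does not depend on how many games either player has already played.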
17ELO: Properties
- ELO is an approximation of the TrueSkill system.
- ELO deteriorates without matchmaking!
- ELO only maintains one point: the mode.
- ELO has a fixed step-size per update.
- ELO cannot deal with team games or games with more than 2 players.
- People need a provisional ranking in ELO, starting at the mid-point of the scale.
19Results: Halo 2 Multiplayer Beta
- 5 different hoppers
- Free-For-All: 60261 games (5946 players)
- 1 vs. 1: 6240 games (1672 players)
- 5 maps, 3 different game variants.
- Matchmaking was relaxed to level gap 9.
- Parameters in all experiments
- Performance variation factor: 60%
- Draw probability: 5%
- Dynamics variation factor: 2%
20Free-For-All: char vs. SQLwildman?
[Figure: level (0 to 40) against number of games played (0 to 400), with four curves: char (TrueSkill), SQLwildman (TrueSkill), char (Halo 2), and SQLwildman (Halo 2).]
21Free-For-All: char vs. SQLwildman?
[Figure: winning probability (0 to 100%) against number of games played (0 to 500), with three curves: "char wins", "SQLwildman wins", and "Both players draw". Annotation: 5/8 games won by char.]
22Near-Normal Level Distribution: 650,000 Players
- TrueSkill analysis of Halo 2 (Nov. to Dec. 2004)
[Figure: level occupancy (0 to 3 × 10^4 players) for each level from 1 to 50.]
23TrueSkill
- Skill-based ranking instead of experience-based ranking for better matchmaking.
- TrueSkill system
- is a generalisation of ELO
- tracks a belief distribution
- can deal with multiple teams, multiple players, and draws
- Every Xbox 360 Live game uses TrueSkill ranking and matchmaking!
25Liberty: Fast Pattern-Based Computer Go
- Go: simple rules, yet a very complex game.
- Computer Go
- Infancy (best programs at weak amateur level).
- Problems: evaluation and branching factor (250).
- New grand challenge of AI (replacing Chess).
- Idea: learning good moves from expert play.
- Applications
- Reduce branching factor for search.
- Fast pattern-based Go engine.
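26From Local Patterns to Probability of Moves
[Figure: a board position with Black to move and local patterns around candidate moves. One pattern is annotated with two values, 25 and 5.2 (their symbols are not preserved); another is annotated "Not in database!".]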
27Harvesting and Learning
- Two processes
- Harvesting patterns.
- Ranking patterns.
- We learn the value of these patterns using a modified Bayesian TrueSkill ranking system.
- Partial ranking from every expert move
- Move made wins over any other move available on the board.
- Nothing is known about the ranking of the un-played moves.
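The partial ranking described above can be sketched as a list of pairwise results: the pattern of the move the expert played beats the pattern of every other available move, and no pair is generated between two un-played moves. The function name and the pattern identifiers are hypothetical, and the modifications made to the TrueSkill update for this setting are not described in the transcript.

```python
def partial_ranking(played, available):
    """Return (winner, loser) pairs implied by one expert move.

    The played move wins against every other available move; nothing is
    asserted about the relative order of the un-played moves.
    """
    return [(played, other) for other in available if other != played]


# Hypothetical position: pattern identifiers of the legal moves on the board.
available = ["p17", "p3", "p42", "p8"]
pairs = partial_ranking("p3", available)
print(pairs)
```

Each pair could then be passed to a two-player win/loss update, so that frequently chosen patterns accumulate a high value.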
28Better than State-of-the-Art
[Figure: cumulative probability (0 to 1) against expert move rank (1 to 30). Legend: Liberty (120K games), Liberty (20K games), Liberty (20K games), Werf et al. (2002). Annotations: "55% in top 5" and "32% top".]
29Better than State-of-the-Art
[Figure: cumulative probability (0 to 1) against expert move rank (1 to 30). Legend: Liberty (120K games), Liberty (20K games), Liberty (20K games), Werf et al. (2002). Annotations: "10 alternative moves", "50 alternative moves", and "max. 50 alternative moves".]
30Better Prediction Early
31Bigger Patterns Early
32Bigger Patterns ⇒ Better Predictions
33Bayesian Ranking for Go
- Bayesian ranking makes full use of the information available from expert moves.
- Simple features used in the approach already beat state-of-the-art prediction methods.
- Approach is ideal for server-side Go AI
- Very fast at move selection time.
- Large memory footprint.
- Planned extension to 1,000,000 game records and context-aware patterns.
34Conclusions
- Bayesian Ranking is a powerful technique
- TrueSkill generalises ELO and has a large influence on the online gaming experience of Xbox gamers.
- Provides a principled and efficient way for learning the value of local patterns in the game of Go.