Title: Automated Testing: Better, Cheaper, Faster, For Everything
1. Automated Testing: Better, Cheaper, Faster, For Everything
- Larry Mellon, Steve Keller
- Austin Game Conference
- September 2004
2. About This Talk
- Highly visual slides are often followed by a "Key Points" text slide that provides additional details. For smoother flow, such slides are hidden in presentation mode.
- Some animations are not compatible with older versions of PowerPoint.
3. What Is An MMP Automated Testing System?
- Push-button ability to run large-scale, repeatable tests
- Cost
  - Hardware / software
  - Human resources
  - Process changes
- Benefit
  - Accurate, repeatable, measurable tests during development and operations
  - Stable software, faster, measurable progress
  - Base key decisions on fact, not opinion
4. Key Points
- Comfort and confidence level
  - Managers/producers can easily judge how development is progressing
  - Just like bug-count reports, test reports indicate the overall quality of the current state of the game
  - Frequent, repeatable tests show progress and backsliding
- Investing developers in the test process helps prevent QA-vs.-development shouting matches
  - Smart developers like numbers and metrics just as much as producers do
- Making your goals: you will ship cheaper, better, sooner
  - Cheaper: even though initial costs may be higher, issues get exposed when it is cheaper to fix them (and developer efficiency increases)
  - Better: robust code
  - Sooner: "it's OK to ship now" is based on real data, not supposition
5. MMP Requires A Strong Commitment To Testing
- System complexity, non-determinism, scale
  - Tests provide hard data in a confusing sea of possibilities
  - Increase the comfort and confidence of the entire team
- Tools augment your team's ability to do their jobs
  - Find problems faster
  - Measure / change / measure; repeat as necessary
- Production / exec teams come to depend on this data to a high degree
6. How To Get There
- Plan for testing early
  - Non-trivial system
  - Architectural implications
- Make sure the entire team is on board
- Be willing to devote time and money
7. Automation Architecture
[Diagram: startup, control, collection, analysis. A Test Manager handles test selection/setup and controls N clients via real-time probes; repeatable, synced test inputs drive scripted test clients (emulated user play sessions with multi-client synchronization) against the systems under test; Report Managers handle raw data collection, aggregation/summarization, and alarm triggers.]
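As a minimal sketch of the Test Manager's role, the snippet below launches N scripted clients and turns their exit codes into pass/fail data plus an alarm. The `test_client.py` entry point and its flags are invented for illustration, not taken from the talk.

```python
import subprocess
import sys

def run_test(script_path, num_clients=4):
    """Launch N scripted test clients against the system under test,
    wait for them, and treat non-zero exit codes as failures."""
    procs = [
        subprocess.Popen([sys.executable, "test_client.py",
                          "--script", script_path, "--id", str(i)])
        for i in range(num_clients)
    ]
    results = [p.wait() for p in procs]            # raw data collection
    failures = sum(1 for code in results if code != 0)
    if failures:                                   # alarm trigger
        print(f"ALARM: {failures}/{num_clients} clients failed {script_path}")
    return failures == 0

if __name__ == "__main__":
    run_test("scripts/enter_lot.script", num_clients=8)
```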
8. Key Points
- Scriptable test clients
  - Lightweight subset of the shipping client
  - Instrumented: spits out lots of useful information
  - Repeatable
  - Bots help you understand the test results
- Log both server and client output (common format), with timestamps!
- Automated metrics collection and aggregation
  - High-level at-a-glance reports with detail drill-down
- Push-button application for both running and analyzing a test
9. Outline
- Overview: Automated Testing
  - Definition, Value, High-Level Approach
- Applying Automated Testing
  - Mechanics, Applications
  - Process Shifts: Stability, Scale & Metrics
- Implementation: Key Risks
- Summary & Questions
10. Scripted Test Clients
- Scripts are emulated play sessions, just like somebody playing the game
- Command steps: what the player does to the game
- Validation steps: what the game should do in response (see the sketch below)
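To make that concrete, here is a hypothetical fragment of such a script and a driver for it; the `game` object and the step names are invented for illustration, not taken from the talk.

```python
# Command steps say what the player does; validation steps say what
# the game should do in response.
SCRIPT = [
    ("command",  "login",         {"user": "test_01"}),
    ("validate", "connected",     {"expect": True}),
    ("command",  "use_object",    {"object": "chair"}),
    ("validate", "avatar_state",  {"expect": "sitting"}),
]

def run(script, game):
    """Replay one emulated play session against a game-client API."""
    for kind, step, args in script:
        if kind == "command":
            game.execute(step, **args)                 # drive the game
        else:
            assert game.check(step, **args), \
                f"validation failed: {step} {args}"    # check the response
```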
11. Scripts Tailored To Each Test Application
- Unit testing: 1 feature, 1 script
- Load testing: representative play session
  - The average Joe, times thousands
- Shipping quality: corner cases, feature completeness
- Integration: test code changes for catastrophic failures
12. Bread Crumbs: Aggregated Instrumentation Flags Trouble Spots
[Chart: aggregated instrumentation trail leading up to a server crash.]
13. Quickly Find Trouble Spots
[Chart: DB byte count oscillates out of control.]
14. Drill Down For Details
[Chart: a single DB request is clearly at fault.]
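The aggregate-then-drill-down flow these charts show can be sketched roughly as below; the record fields and the alarm threshold are invented for illustration.

```python
from collections import defaultdict

def summarize(records, threshold_ms=500):
    """Aggregate per-request timings from client/server logs, print an
    at-a-glance summary, and drill down on anything over threshold."""
    by_kind = defaultdict(list)
    for rec in records:
        by_kind[rec["kind"]].append(rec)
    for kind, recs in sorted(by_kind.items()):
        avg = sum(r["ms"] for r in recs) / len(recs)
        print(f"{kind:12s} n={len(recs):4d} avg={avg:8.1f} ms")
        if avg > threshold_ms:                      # alarm: drill down
            worst = max(recs, key=lambda r: r["ms"])
            print(f"  ALARM -> worst request: {worst}")

summarize([
    {"kind": "db_request", "ms": 90},
    {"kind": "db_request", "ms": 4200},   # the one request at fault
    {"kind": "login",      "ms": 120},
])
```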
15. Process Shift: Applying Automation To Development
[Chart: earlier tools investment equals more gain; investing late is not good enough.]
16. Process Shifts: Automated Testing Can Change The Shape Of The Development Progress Curve
- Stability: keep developers moving forward, not bailing water
- Scale: focus developers on key, measurable roadblocks
17. Process Shift: Measurable Targets, Projected Trend Lines
[Chart: core functionality tests passing for any feature (e.g. clients), plotted against time, with a projected trend line toward "target complete" at any milestone (e.g. Alpha).]
Actionable progress metrics, early enough to react.
18. Stability Analysis: What Brings Down The Team?
- Test case: can an avatar sit in a chair?
- Critical path: login() → create_avatar() → buy_house() → enter_house() → buy_object() → use_object()
- Failures on the critical path block access to much of the game. Worse: unreliable failures. (A sketch of a chained critical-path test follows below.)
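That critical path reads naturally as a chained test: each step depends on every step before it, so one flaky failure blocks everything downstream. A minimal sketch, assuming a `game` object that exposes these calls:

```python
CRITICAL_PATH = [
    "login", "create_avatar", "buy_house",
    "enter_house", "buy_object", "use_object",
]

def run_critical_path(game):
    """Walk the critical path in play order and report where it breaks."""
    for step in CRITICAL_PATH:
        if not getattr(game, step)():    # e.g. game.login()
            # Everything after this point is untestable this run.
            return f"BLOCKED at {step}()"
    return "PASS: avatar can sit in a chair"
```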
19. Impact On Others
20. (No Transcript)
21. Key Points
- Build instability slowed forward progress (especially on the critical path)
  - People were blocked from getting work done
  - Uncertainty: "did I break that, or did it just happen?"
  - A lot of developers just didn't get non-determinism
  - Backsliding: things kept breaking
- Monkey tests: an always-current baseline for developers
  - A common measuring stick across builds & deployments is extremely valuable
22. Monkey Test: EnterLot
23. Non-Deterministic Failures
24. Key Points
- 30 test runs, 4 behaviours (see the tallying sketch below):
  - Successful entry
  - Hang or crash
  - Owner evicted, all possessions stolen
- Random results observed in all major features
- Critical path: random failures outside of unit tests are very difficult to track
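Flaky behaviour like this only becomes visible when the same test runs many times and the outcomes are tallied, along these lines (a sketch; `enter_lot_test` is a stand-in name):

```python
from collections import Counter

def characterize(test_fn, runs=30):
    """Run one test repeatedly and tally the distinct behaviours seen:
    a deterministic test yields one outcome, a flaky one several."""
    outcomes = Counter()
    for _ in range(runs):
        try:
            outcomes[test_fn()] += 1       # e.g. "entered", "evicted"
        except Exception as exc:
            outcomes[f"crash: {type(exc).__name__}"] += 1
    return outcomes

# e.g. characterize(enter_lot_test)
#   -> Counter({'entered': 26, 'hang': 2, 'crash: Timeout': 1, 'evicted': 1})
```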
25. Stability Via Monkey Tests
Continual repetition of critical-path unit tests.
26. Key Points
- Hourly stability checkers
  - Aging (dirty processes, growing datasets, leaking memory)
  - Moving parts (race conditions)
  - Stability measure: what works, right now?
  - Flares go off, etc. (a minimal loop is sketched below)
- Unit tests (against features)
  - Minimal noise / side effects
  - Reference point: what should work?
  - Clarity in reporting / triaging
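An hourly stability checker can be as simple as the loop below, a sketch assuming each test is a callable that raises on failure; it is the long-running repetition that exposes aging effects a single run never sees.

```python
import time
import traceback

def monkey_loop(tests, period_s=3600):
    """Run the critical-path unit tests every hour, forever."""
    while True:
        for name, test in tests.items():
            try:
                test()
                print(f"{time.strftime('%H:%M')} {name}: PASS")
            except Exception:
                print(f"{time.strftime('%H:%M')} {name}: FAIL")  # the flare
                traceback.print_exc()
        time.sleep(period_s)
```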
27. Process Shift: Comb Filter Testing
[Diagram: a three-stage filter. Sniff Test & Monkey Tests (fast to run, catch major errors, keep coders working) gate whether new code is ready for check-in; Smoke Test & Server Sniff (is the game playable? are the servers stable under a light load? do all key features work?) gate whether a full system build is promotable to full testing; Full Feature Regression & Full Load Test (do all test suites pass? are the servers stable under peak load conditions?) gate whether a build is promotable to paying customers.]
- Cheap tests catch gross errors early in the pipeline
- More expensive tests run only on known-functional builds
(A pipeline sketch follows below.)
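The comb filter is staged gating: cheap suites run first, and an expensive suite never sees a build that failed a cheaper one. A sketch, with stage names following the slide and `run_suite` an assumed callable returning pass/fail:

```python
STAGES = [
    ("sniff/monkey",    ["sniff_test", "monkey_tests"]),        # gates check-in
    ("smoke",           ["smoke_test", "server_sniff"]),        # gates full testing
    ("full regression", ["feature_regression", "full_load"]),   # gates going live
]

def comb_filter(build, run_suite):
    """Promote a build stage by stage; reject at the first failure."""
    for stage, suites in STAGES:
        for suite in suites:
            if not run_suite(build, suite):
                return f"rejected at stage '{stage}' ({suite})"
    return "promotable to paying customers"
```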
28. Key Points
- Much faster progress after the stability checkers were added
- Sniff
  - Hourly reference tests (sniff + monkey, unit + monkey)
- Comb filters kept the manpower overhead low on both sides and gave quick feedback: fewer redos for engineers, fewer bugs for QA to find
- Extra post-check-in testing story (optional)
  - The size of the team gives a high broken-build cost
  - Fewer redos
  - Fewer side-effect bugs
29. Process Shift: Who Tests What?
- Automation: simple tasks (repetitive or large-scale)
  - Load @ scale
  - Workflow (information management)
  - Full weapon damage assessment; broad, shallow feature coverage
- Manual: judgment / innovative tasks
  - Visuals, playability, creative bug hunting
- Combined
  - Tier 1 / Tier 2: automation flags potential errors, manual investigates
  - Within a single test: automation snapshots key game states, manual evaluates results
  - Augmented / accelerated complex build steps
30. Process Shift: Load Testing (Before Paying Customers Show Up)
- Expose issues that only occur at scale
- Establish hardware requirements
- Establish that play is acceptable @ scale
31. (No Transcript)
32. Client-Server Comparison
33. Highly Accurate Load Testing: Monkey See / Monkey Do
[Diagram: sim actions recorded from player-controlled sessions are replayed as script-controlled sim actions.]
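Monkey see / monkey do is essentially record-and-replay: capture the sim actions of real player-controlled sessions, then feed the same stream back under script control. A rough sketch, where the log format and the `sim` interface are invented:

```python
import json

def record(session_actions, out_path):
    """Persist timestamped sim actions from a player-controlled session."""
    with open(out_path, "w") as f:
        for t, action, args in session_actions:
            f.write(json.dumps({"t": t, "action": action, "args": args}) + "\n")

def replay(path, sim):
    """Feed the recorded actions back in, script-controlled."""
    for line in open(path):
        step = json.loads(line)
        sim.execute(step["action"], **step["args"])  # same code path as a player
```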
34. Outline
- Overview: Automated Testing
  - Definition, Value, High-Level Approach
- Applying Automated Testing
  - Mechanics, Applications
  - Process Shifts: Stability, Scale & Metrics
- Implementation: Key Risks
- Summary & Questions
35. Data-Driven Test Client
[Diagram: reusable scripts and data drive a single test client through a single API; the same client serves both regression and load testing, and reports key game states, pass/fail, responsiveness, and script-specific logs & metrics.]
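The single-API idea can be sketched like this: one data-driven client, where the same script either runs as a validated regression pass or is multiplied into unvalidated load (all names invented; a real load test fans out to processes or machines rather than looping serially):

```python
def run_client(script, api, validate=True):
    """One client, two uses: regression (validate each step) or load
    (skip validation, just generate realistic traffic)."""
    for kind, step, args in script:
        if kind == "command":
            api.execute(step, **args)
        elif validate:
            assert api.check(step, **args), f"regression failure at {step}"

def load_test(script, make_api, n_clients=1000):
    """Reuse the same script at scale with validation turned off."""
    for i in range(n_clients):
        run_client(script, make_api(i), validate=False)
```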
36. Scripted Players: Implementation
[Diagram: script commands enter the game through the Presentation Layer.]
37. What Level To Test At?
[Diagram: game client stack (View, Presentation Layer, Logic) with test input injected as mouse clicks at the View level.]
- Regression: too brittle (UI pixel shift)
- Load: too bulky
38. What Level To Test At?
[Diagram: test input injected as internal events between the View and the Presentation Layer.]
- Regression & Load: too brittle (churn rate vs. logic data)
39. Automation Scripts & QA Tester Scripts
- Basic gameplay changes less frequently than UI or protocol implementations.
[Diagram: NullView client — the View is removed and scripts drive the Presentation Layer and Logic directly.]
(A sketch of this boundary follows below.)
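One way to picture the boundary: the shipping client puts a graphical View on top of the presentation layer, while the NullView client drives the very same layer headless. A minimal sketch, with all class names assumed:

```python
class PresentationLayer:
    """Single entry point for gameplay commands, shared by UI and scripts."""
    def __init__(self, logic):
        self.logic = logic
    def execute(self, command, **args):
        return getattr(self.logic, command)(**args)

class NullView:
    """Script-driven stand-in for the graphical View: no pixels and no
    wire protocol, just gameplay commands against the logic layer."""
    def __init__(self, presentation):
        self.presentation = presentation
    def play(self, script):
        return [self.presentation.execute(cmd, **args) for cmd, args in script]
```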
40. Key Points
- Support costs: one (data-driven) client is better than N clients
- Tailorable validation output turned out to be a very powerful construct
  - Each test script contains its required validation steps (flexible, tunable, …)
- Minimize the state to regress against: fewer false positives
41. Common Gotchas
- Setting the test bar too high, too early
  - Feature drift means expensive test maintenance
  - Code is built incrementally; reporting failures nobody is prepared to deal with wastes everybody's time
- Non-determinism
  - Race conditions, dirty buffers/process state, …
  - Developers test with a single client against a single server: no chance to expose race conditions
- Not designing for testability
  - Testability is an end requirement
  - Retrofitting is expensive
- No senior engineering committed to the testing problem
42. Outline
- Overview: Automated Testing
  - Definition, Value, High-Level Approach
- Applying Automated Testing
  - Mechanics, Applications
  - Process Shifts: Stability & Scale
- Implementation: Key Risks
- Summary & Questions
43. Summary: Mechanics & Implications
- Scripted test clients and instrumented code rock!
- Collection, aggregation, and display of test data are vital to making day-to-day decisions
  - Lessens the panic
- Scale & break is a very clarifying experience
- Stable code & servers in development greatly ease the pain of building an MMP game
- Hard data (not opinion) is both illuminating and calming
- Long term: operations testing is a recurring cost
44. Summary: Process
- Integrate automated testing at all levels
  - Don't just throw testing over the wall to QA monsters
- Use automation to speed & focus development
  - Stability: Sniff Test, Monkey Tests
  - Scale: Load Test
45. Summary: Key Points
- Ship a better game
  - Lessen the panic
  - Constant testing for stability prevents backsliding during development and operations, keeps the team moving forward roadblock-free, and keeps the player experience smooth
  - Early load testing exposes critical server costs and failures in time to be addressed
  - Everybody knows what works, every day
- Testing: it's not just for QA anymore
  - Continual content extensions while keeping previous features stable, over years of operations
  - Stable systems keep customers happy and developers working on new features, not fire-fighting
  - A recurring cost is an excellent fit for tool investment
46. Tabula Rasa
- Pre-Checkin Sniff Test: keep mainline working
- Hourly Monkey Tests: baseline for developers
- Dedicated Tools Group: easy to use = used
- Executive Support: radical shifts in process
- Load Test Early & Often: break it before live
- Distribute test development & ownership across the full team
47. Cautionary Tales
- Flexible game development requires flexible tests
- Signal-to-noise ratio
- Defects: variance in the testing system
48. Key Points
- Initial development phase: game design in constant flux
  - Tests usually start by not working
  - Noise makes it hard to find results
  - "Boy who cried wolf" syndrome
- Business decisions get made off testing results; make sure they're accurate (load-testing inputs, report generators, probing system, script errors, …)
  - Team trust is another factor
- A complex system with a high degree of flex requires:
  - Senior engineers, full time
  - Team & management commitment
49. Questions (15 Minutes)
- Overview: Automated Testing
  - Definition, Value, High-Level Approach
- Applying Automated Testing
  - Mechanics, Applications
  - Process Shifts: Stability, Scale & Metrics
- Implementation: Key Risks
Slides online @ www.maggotranch.com/MMP