1. Deconstructing SPECweb99
IBM T.J. Watson Research Center
www.research.ibm.com/people/n/nahum
nahum_at_us.ibm.com
2. Talk Overview
- Workload Generators
- SPECweb99
- Methodology
- Results
- Summary and Conclusions
3. Why Workload Generators?
- Allows stress-testing and bug-finding
- Gives us some idea of server capacity
- Allows us a scientific process to compare approaches
  - e.g., server models, gigabit adapters, OS implementations
- Assumption is that a difference in the testbed translates to some difference in the real world
- Allows the performance debugging cycle
[Diagram: the Performance Debugging Cycle, with stages Find Problem, Reproduce, Measure, and Fix and/or improve]
4. How Does Workload Generation Work?
- Many clients, one server
  - matches the asymmetry of the Internet
- Server is populated with some kind of synthetic content
- Simulated clients produce requests for the server
- Master process controls clients and aggregates results
- Goal is to measure the server
  - not the client or network
- Must be robust to conditions
  - e.g., if the server keeps sending 404 Not Found, will clients notice?
[Diagram: clients send requests to the server, which returns responses]
5. Problems with Workload Generators
- Only as good as our understanding of the traffic
- Traffic may change over time
  - generators must, too
- May not be representative
  - e.g., are file size distributions from IBM.com similar to mine?
- May be ignoring important factors
  - e.g., browser behavior, WAN conditions, modem connectivity
- Still, useful for diagnosing and treating problems
6. What Server Workload Generators Exist?
- Many. In order of publication:
  - WebStone (SGI)
  - SPECweb96 (SPEC)
  - Scalable Client (Rice Univ.)
  - SURGE (Boston Univ.)
  - httperf (HP Labs)
  - SPECweb99 (SPEC)
  - TPC-W (TPC)
  - WaspClient (IBM)
  - WAGON (IBM)
- Not to mention those for proxies (e.g., Polygraph)
- Focus of this talk: SPECweb99
7. Why SPECweb99?
- Has become the de facto standard used in industry
  - 141 submissions in 3 years on the SPEC web site
  - Hardware: Compaq, Dell, Fujitsu, HP, IBM, Sun
  - OSes: AIX, HP-UX, Linux, Solaris, Windows NT
  - Servers: Apache, IIS, Netscape, Tux, Zeus
- Used within corporations for performance, testing, and marketing
  - e.g., within IBM, used by the AIX, Linux, and 390 groups
- Begs the question: how realistic is it?
8. Server Workload Characterization
- Over the years, many observations have been made about Web server behavior:
  - Request methods
  - Response codes
  - Document popularity
  - Document sizes
  - Transfer sizes
  - Protocol use
  - Inter-arrival times
- How well does SPECweb99 capture these characteristics?
9. History: SPECweb96
- SPEC: the Standard Performance Evaluation Corporation
  - Non-profit group with many benchmarks (CPU, FS)
  - Pay for membership, get source code
- First attempt to be somewhat representative
  - Based on logs from NCSA, HP, Hal Computers
- 4 classes of file sizes
- Poisson distribution within each class
10. SPECweb96 (cont.)
- Notion of scaling versus load
  - number of directories in the data set doubles as expected throughput quadruples: directories = 10 * sqrt(throughput / 5)
  - requests spread evenly across all application directories
- Process-based workload generator
- Clients talk to master via RPCs
- Does only GETs, no keep-alive
- www.spec.org/osg/web96
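The scaling rule above can be sketched numerically. `specweb96_directories` is a hypothetical helper name; the formula is the one quoted on this slide (10 * sqrt(throughput / 5)), not the official benchmark source.

```python
import math

def specweb96_directories(target_throughput: float) -> int:
    """Fileset scaling as described on the slide: the directory count
    grows with the square root of target throughput, so the data set
    doubles when the expected throughput quadruples."""
    return int(math.ceil(10 * math.sqrt(target_throughput / 5.0)))

print(specweb96_directories(5))   # 10 directories
print(specweb96_directories(20))  # 20: 4x the load, 2x the data set
```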
11. Evolution: SPECweb99
- In response to people "gaming" the benchmark, it now includes rules:
  - IP maximum segment lifetime (MSL) must be at least 60 seconds
  - Link-layer maximum transmission unit (MTU) must not be larger than 1460 bytes (the Ethernet TCP segment size)
  - Dynamic content may not be cached
    - not clear that this is followed
  - Servers must log requests
    - W3C common log format is sufficient but not mandatory
  - Resulting workload must be within 10% of target
  - Error rate must be below 1%
- Metric has changed
  - now "number of simultaneous conforming connections": the rate of a connection must be greater than 320 Kbps
12. SPECweb99 (cont.)
- Directory size has changed
  - (25 + (400000/122000) * simultaneous connections) / 5.0
- Improved HTTP 1.0/1.1 support
  - Keep-alive requests (client closes after N requests)
  - Cookies
- Back-end notion of user demographics
  - Used for ad rotation
  - Request includes user_id and last_ad
- Request breakdown:
  - 70.00% static GET
  - 12.45% dynamic GET
  - 12.60% dynamic GET with custom ad rotation
  - 04.80% dynamic POST
  - 00.15% dynamic GET calling CGI code
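As a sketch, the request breakdown above can be turned into a weighted chooser. `pick_request` and `REQUEST_MIX` are illustrative names; only the percentages come from the slide.

```python
import random

# Percentages are the SPECweb99 request breakdown from the slide.
REQUEST_MIX = [
    ("static GET",                    70.00),
    ("dynamic GET",                   12.45),
    ("dynamic GET with ad rotation",  12.60),
    ("dynamic POST",                   4.80),
    ("dynamic GET calling CGI code",   0.15),
]

def pick_request(rng: random.Random) -> str:
    """Weighted random choice over the benchmark's request types."""
    r = rng.uniform(0.0, 100.0)
    cumulative = 0.0
    for name, pct in REQUEST_MIX:
        cumulative += pct
        if r < cumulative:
            return name
    return REQUEST_MIX[-1][0]  # guard against floating-point rounding
```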
13. SPECweb99 (cont.)
- Other breakdowns:
  - 30% HTTP 1.0 with no keep-alive or persistence
  - 70% HTTP 1.1 with keep-alive to "model" persistence
- Still has 4 classes of file sizes with Poisson distribution within each
- Supports Zipf popularity
- Client implementation details:
  - Master-client communication uses sockets
  - Code includes sample Perl code for CGI
  - Client configurable to use threads or processes
- Much more info on setup, debugging, tuning
- All results posted to the SPEC web page
  - including configuration and back-end code
- www.spec.org/osg/web99
14. Methodology
- Take a log from a large-scale SPECweb99 run
- Take a number of available server logs
- For each characteristic discussed in the literature:
  - Show what SPECweb99 does
  - Compare to results from the literature
  - Compare to results from a set of sample server logs
  - Render judgment on how well SPECweb99 does
15. Sample Logs for Illustration
We'll use statistics generated from these logs as examples.
16. Talk Overview
- Workload Generators
- SPECweb99
- Methodology
- Results
- Summary and Conclusions
17. Request Methods
- AW96, AW00, PQ00, KR01: the majority are GETs, few POSTs
- SPECweb99: no HEAD requests, too many POSTs
18. Response Codes
- AW96, AW00, PQ00, KR01: most are 200s, next most common are 304s
- SPECweb99 doesn't capture anything but 200 OK
19. Resource Popularity
- p(r) = C / r^alpha (alpha = 1 is true Zipf; other alphas are "Zipf-like")
- Consistent with CBC95, AW96, CB96, PQ00, KR01
- SPECweb99 does a good job here, with alpha = 1
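The Zipf-like popularity rule can be made concrete. `zipf_weights` is a hypothetical helper; the normalizing constant C is computed so the probabilities sum to 1.

```python
def zipf_weights(n_docs: int, alpha: float = 1.0) -> list:
    """Popularity of the document at each rank r: p(r) = C / r**alpha."""
    raw = [1.0 / (rank ** alpha) for rank in range(1, n_docs + 1)]
    c = 1.0 / sum(raw)  # normalizing constant C
    return [c * w for w in raw]

weights = zipf_weights(1000)    # alpha = 1, as in SPECweb99
print(weights[0] / weights[1])  # rank 1 is twice as popular as rank 2: 2.0
```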
20. Resource (File) Sizes
- Lognormal body, consistent with results from AW96, CB96, KR01
- SPECweb99 curve is sparse, with 4 distinct regions
21. Tails of the File Size Distribution
- AW96, CB96: sizes have a Pareto tail; Downey01: sizes are lognormal
- SPECweb99 tail only goes to 900 KB (vs. 10 MB for the others)
22. Response (Transfer) Sizes
- Lognormal body, consistent with CBC95, AW96, CB96, KR01
- SPECweb99 doesn't capture zero-byte transfers (304s)
23. Transfer Sizes Without 304s
- When 304s are removed, SPECweb99 is much closer
24. Tails of the Transfer Size Distribution
- SPECweb99 tail is neither lognormal nor Pareto
- Again, the maximum transfer is only 900 KB
25. Inter-Arrival Times
- Literature gives an exponential distribution for session arrivals
- KR01: request inter-arrivals are Pareto
- Here we look at request inter-arrivals
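As a sketch of the exponential model for session arrivals (hypothetical names, using inverse-transform sampling; the Pareto alternative KR01 reports for requests would just swap the formula):

```python
import math, random

def exponential_interarrivals(rate_per_sec: float, n: int, seed: int = 1):
    """Draw n inter-arrival gaps via inverse-transform sampling:
    gap = -ln(U) / rate, where U is uniform on (0, 1]."""
    rng = random.Random(seed)
    return [-math.log(1.0 - rng.random()) / rate_per_sec for _ in range(n)]

gaps = exponential_interarrivals(rate_per_sec=100.0, n=100000)
print(sum(gaps) / len(gaps))  # mean gap is close to 1/rate = 0.01 s
```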
26. Tails of Inter-Arrival Times
- SPECweb99 has a Pareto tail
- Not all the others do, but that may be due to truncation
  - (e.g., log duration of only one day)
27. HTTP Version
- Over time, more and more requests are served using 1.1
- But SPECweb99's 1.1 share is much higher than in any other log
- The literature doesn't look at this, so no judgments
28. Summary and Conclusions
- SPECweb99 has a mixed record, depending on the characteristic:
  - Methods: OK
  - Response codes: bad
  - Document popularity: good
  - File sizes: OK to bad
  - Transfer sizes: bad
  - Inter-arrival times: good
- Main problems:
  - Needs to capture conditional GETs with If-Modified-Since (IMS) for 304s
  - Needs a better file size distribution (smoother, larger)
29. Future Work
- Several possibilities for future work:
  - Compare logs with SURGE
  - More detail on HTTP 1.1 (requires better workload characterization, e.g., packet traces)
  - Dynamic content (e.g., TPC-W) (again, requires workload characterization)
- The latter two will not be easy due to privacy and competitive concerns
30. Probability
- Graph shows 3 distributions with average 2
- Note: average != median in some cases!
- Different distributions have different weight in the tail
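The mean/median gap on this slide is easy to reproduce. The snippet below is an illustration (the distribution parameters are chosen here, not taken from the talk): two distributions sharing mean 2 but with smaller medians and different tail weight.

```python
import math, random, statistics

rng = random.Random(42)
n = 200000

# Exponential with mean 2 (its median is 2*ln 2, about 1.39)
exponential = [rng.expovariate(1.0 / 2.0) for _ in range(n)]

# Lognormal with mean 2: mean = exp(mu + sigma^2 / 2), so pick mu accordingly
sigma = 1.0
mu = math.log(2.0) - sigma ** 2 / 2.0
lognormal = [rng.lognormvariate(mu, sigma) for _ in range(n)]

for name, xs in (("exponential", exponential), ("lognormal", lognormal)):
    print(name, statistics.fmean(xs), statistics.median(xs))
```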
31. Important Distributions
- Some frequently seen distributions:
  - Normal (average mu, variance sigma^2)
  - Lognormal (x > 0, sigma > 0)
  - Exponential (x > 0)
  - Pareto (x > k; shape a, scale k)
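For reference, the density functions for these distributions (the formulas themselves did not survive the slide-to-text conversion) are the standard ones:

```latex
\begin{align*}
\text{Normal:} \quad & f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)} \\
\text{Lognormal:} \quad & f(x) = \frac{1}{x\,\sigma\sqrt{2\pi}}\, e^{-(\ln x-\mu)^2/(2\sigma^2)}, && x > 0,\ \sigma > 0 \\
\text{Exponential:} \quad & f(x) = \lambda\, e^{-\lambda x}, && x > 0 \\
\text{Pareto:} \quad & f(x) = a\, k^a\, x^{-(a+1)}, && x > k
\end{align*}
```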
32. Probability Refresher
- Lots of variability in workloads
  - Use probability distributions to express it
  - Want to consider many factors
- Some terminology/jargon:
  - Mean: average of samples
  - Median: half are bigger, half are smaller
  - Percentiles: dump samples into N bins (median is the 50th-percentile number)
  - Heavy-tailed: P[X > x] ~ x^(-a) as x -> infinity
33. Session Inter-Arrivals
- Inter-arrival time between successive requests
  - "think time"
  - differs between user requests and ALL requests
  - partly depends on the definition of a session boundary
- CB96: variability across multiple timescales, "self-similarity"; average load very different from peak or heavy load
- SCJO01: lognormal, 90% less than 1 minute
- AW96: independent and exponentially distributed
- KR01: session arrivals follow a Poisson distribution, but requests follow a Pareto with a = 1.5
34. Protocol Support
- IBM.com 2001 logs
  - show roughly 53% of client requests are 1.1
- KA01 study:
  - 92% of servers claim to support 1.1 (as of Sep 00)
  - only 31% actually do; most fail to comply with the spec
- SCJO01 show:
  - avg. 6.5 requests per persistent connection
  - 65% have 2 connections per page; the rest have more
  - 40-50% of objects are downloaded over persistent connections
- Appears that we are in the middle of a slow transition to 1.1
35. WebStone
- The original workload generator, from SGI in 1995
- Process-based workload generator, implemented in C
- Clients talk to master via sockets
- Configurable: client machines, client processes, run time
- Measures several metrics: avg/max connect time, response time, throughput rate (bits/sec), pages, files
- 1.0 only does GETs; CGI support added in 2.0
- Static requests, 5 different file sizes
- www.mindcraft.com/webstone
36. SURGE
- Scalable URL Reference GEnerator
- Barford and Crovella at Boston University CS Dept.
- Much more worried about representativeness; captures:
  - server file size distributions
  - request size distribution
  - relative file popularity
  - embedded file references
  - temporal locality of reference
  - idle periods ("think times") of users
- Process/thread-based workload generator
37. SURGE (cont.)
- Notion of a user-equivalent
  - statistical model of a user
  - active off time (between URLs)
  - inactive off time (between pages)
- Captures various levels of burstiness
- Not validated; shows that the load generated differs from SPECweb96's and has more burstiness in terms of CPU and active connections
- www.cs.wisc.edu/pb
38. S-Client
- Almost all workload generators are closed-loop
  - client submits a request, waits for the server, maybe thinks for some time, repeats as necessary
- Problems with the closed-loop approach:
  - client can't generate requests faster than the server can respond
  - limits the generated load to the capacity of the server
  - in the real world, arrivals don't depend on server state
    - i.e., real users have no idea about the load on a server when they click on a site, although successive clicks may have this property
  - in particular, can't overload the server
- S-Client tries to be open-loop
  - by generating connections at a particular rate
  - independent of server load/capacity
39. S-Client (cont.)
- How is S-Client open-loop?
  - connects asynchronously at a particular rate
  - uses the non-blocking connect() socket call
- Did the connect complete within a particular time?
  - if yes, continue normally
  - if not, the socket is closed and a new connect is initiated
- Other details:
  - uses a single-address-space event-driven model, like Flash
  - calls select() on large numbers of file descriptors
  - can generate large loads
- Problems:
  - client capacity is still limited by active FDs
  - an arrival is a TCP connect, not an HTTP request
- www.cs.rice.edu/CS/Systems/Web-measurement
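The non-blocking connect-with-deadline trick can be sketched in Python (S-Client itself is C; `timed_connect` and its behavior here are an illustrative reconstruction, not the Rice code):

```python
import errno, select, socket

def timed_connect(host: str, port: int, timeout_s: float) -> bool:
    """Start a non-blocking connect() and abandon it if it has not
    completed by the deadline, so request generation never blocks on
    a slow or overloaded server."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(False)
    err = s.connect_ex((host, port))  # returns immediately
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        s.close()
        return False
    # select() reports the socket writable once the handshake finishes
    _, writable, _ = select.select([], [s], [], timeout_s)
    ok = bool(writable) and s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0
    s.close()  # S-Client would close and immediately start a new connect
    return ok
```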
40. TPC-W
- From the Transaction Processing Performance Council (TPC)
  - better known for database workloads like TPC-D
  - metrics include dollars/transaction (unlike SPEC)
  - provides a specification, not source code
- Meant to capture a large e-commerce site
- Models an online bookstore:
  - web serving, searching, browsing, shopping carts
  - online transaction processing (OLTP)
  - decision support (DSS)
  - secure purchasing (SSL), best sellers, new products
  - customer registration, administrative updates
- Has a notion of scaling per user:
  - 5 MB of DB tables per user
  - 1 KB per shopping item, 25 KB per item in static images
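The per-user scaling figures imply simple back-of-the-envelope sizing. `tpcw_db_size_mb` is a hypothetical helper using only the 5 MB/user number quoted above.

```python
def tpcw_db_size_mb(n_users: int, mb_per_user: float = 5.0) -> float:
    """Database size implied by TPC-W's per-user scaling rule."""
    return n_users * mb_per_user

print(tpcw_db_size_mb(10000))  # 50000.0 MB, i.e. about 50 GB for 10k users
```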
41. TPC-W (cont.)
- Remote browser emulator (RBE)
  - emulates a single user
  - sends an HTTP request, parses the response, waits ("thinks"), repeats
- Metrics (web interactions per second):
  - WIPS: shopping
  - WIPSb: browsing
  - WIPSo: ordering
- Setups tend to be very large
  - multiple image servers, application servers, load balancer
  - DB back end (typically SMP)
  - Example: IBM 12-way SMP w/DB2 plus 9 PCs w/IIS, roughly $1M
- www.tpc.org/tpcw