Title: Comparing Squid Filesystem Performance with Web Polygraph
1 Comparing Squid Filesystem Performance with Web Polygraph
Duane Wessels wessels_at_squid-cache.org
- O'Reilly Open Source Convention
- July 24, 2002
2 Motivation
- Squid performance depends greatly on the filesystem.
- Want to compare different operating systems, filesystems, filesystem options, and Squid storage schemes.
- Need a good benchmark with minimal input parameters.
3 Squid Filesystem Options
- OSes: Linux, Free/Net/OpenBSD, Solaris, ...
- Filesystems: UFS, ext2fs, ext3fs, xfs, reiserfs.
- Filesystem options: noatime, softdep, async.
- Squid storage schemes: ufs, aufs, diskd.
- Other parameters we won't discuss.
4 Web Polygraph
- Powerful, flexible benchmarking tool for HTTP intermediaries.
- PolyMix-4: standardized workload for client-side caching proxies.
- DUT (Device Under Test): Squid in this case.
5 Best Effort Workloads
- N agents submit requests as fast as the DUT allows.
- Each agent is always busy.
- Many benchmarks work this way because it is easy to implement.
- The DUT response time can affect throughput.
- Easy to make mistakes.
- Difficult to trust results.
- Number of agents becomes an arbitrary input parameter.
- Difficult to compare different devices.
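Why response time affects throughput in a best-effort workload follows from Little's law: with N always-busy agents and no think time, throughput is roughly N divided by the mean response time. The sketch below is only an illustration; the function name and the numbers are hypothetical and do not come from the talk.

```python
def best_effort_throughput(num_agents, mean_response_time_s):
    """Approximate requests/second for a closed-loop (best-effort) workload.

    Assumes each agent sends its next request as soon as the previous one
    completes (no think time), so Little's law gives
    throughput = agents / mean response time.
    """
    if num_agents <= 0 or mean_response_time_s <= 0:
        raise ValueError("agents and response time must be positive")
    return num_agents / mean_response_time_s


# Hypothetical numbers. Response time is held fixed here for illustration;
# on a real device it would grow as the number of agents increases.
for agents in (10, 100, 1000):
    print(agents, best_effort_throughput(agents, mean_response_time_s=0.5))
```

The measured throughput is a function of the agent count chosen by the tester, which is why that count becomes an arbitrary input parameter.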
6 A Best-Effort Result
7 Constant Mean Throughput Workloads
- Agents submit requests at a constant mean rate.
- Agents spend some time being idle.
- Results are more believable.
- Test may fail after running for a long time.
- Hard to find the ideal, peak throughput.
- Throughput is an input parameter.
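A constant mean rate workload can be sketched as an open-loop schedule: submission times are fixed in advance and do not depend on how quickly the DUT responds. The example assumes exponentially distributed gaps between requests (a Poisson process); the talk does not say which distribution PolyMix-4 uses, and the function name is hypothetical.

```python
import random


def arrival_times(mean_rate, duration_s, seed=0):
    """Return request submission times (seconds) for an open-loop workload.

    Assumption: inter-arrival gaps are exponentially distributed, which
    yields a constant mean rate of `mean_rate` requests per second.
    """
    rng = random.Random(seed)
    times = []
    now = rng.expovariate(mean_rate)
    while now < duration_s:
        times.append(now)
        now += rng.expovariate(mean_rate)
    return times


schedule = arrival_times(mean_rate=100.0, duration_s=60.0)
print(len(schedule) / 60.0)  # close to 100 requests/second on average
```

If the DUT cannot keep up with the schedule, requests pile up, which is one way a test can fail after running for a long time.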
8 Some PolyMix-4 Results
9 Constant Response Time Workloads
- Offered load varies depending on measured response time.
- Otherwise just like constant throughput.
- Tests are less likely to fail halfway through.
- Removes throughput as an input parameter.
- Selecting the response time window is tricky.
- Sometimes observe response time spikes.
- Requires a constant/predictable hit ratio.
- Cannot compare to a no-caching workload.
10 The rptmstat Workload
- Response time window: 1.4 to 1.5 seconds.
- Load delta: 1%.
- Down sample rate: 1000 transactions.
- Up sample rate: 2000 transactions.
- Workload goal: fill the cache twice.
- Adjusts the populous factor: the number of active agents.
11 Sample rptmstat Console Output
398.25 i-rptmstat 2715719 89.02 1446 60.38 0 822
fyi rptmstatDn 1.47sec rptm requires no load adjustment
fyi rptmstatUp 1.41sec rptm requires no load adjustment
fyi rptmstatDn 1.67sec rptm changes load by -1.00
fyi rptmstatDn 1.40sec rptm requires no load adjustment
fyi rptmstatUp 1.54sec rptm requires no load adjustment
fyi rptmstatDn 1.26sec rptm requires no load adjustment
fyi rptmstatDn 1.51sec rptm changes load by -1.00
fyi rptmstatUp 1.39sec rptm changes load by 1.00
fyi rptmstatDn 1.52sec rptm changes load by -1.00
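The adjustments in the console output above are consistent with a simple rule: the "Dn" sampler lowers the load when response time exceeds the top of the window, and the "Up" sampler raises it when response time falls below the bottom. The sketch below is inferred from that sample output only; it is not taken from Polygraph's source, and the names are hypothetical.

```python
RPTM_MIN = 1.4    # lower edge of the response time window, seconds
RPTM_MAX = 1.5    # upper edge of the response time window, seconds
LOAD_DELTA = 1.0  # size of one load step, as printed in the console output


def load_change(sampler, rptm_s):
    """Return the load adjustment for one response time sample.

    sampler is "Dn" or "Up", matching the rptmstatDn / rptmstatUp labels.
    This rule is an assumption inferred from the sample console output.
    """
    if sampler == "Dn" and rptm_s > RPTM_MAX:
        return -LOAD_DELTA
    if sampler == "Up" and rptm_s < RPTM_MIN:
        return LOAD_DELTA
    return 0.0


# Samples copied from the console output above.
samples = [("Dn", 1.47), ("Up", 1.41), ("Dn", 1.67), ("Dn", 1.40),
           ("Up", 1.54), ("Dn", 1.26), ("Dn", 1.51), ("Up", 1.39),
           ("Dn", 1.52)]
for sampler, rptm in samples:
    print(sampler, rptm, load_change(sampler, rptm))
```

Note that a "Dn" sample of 1.26 seconds leaves the load unchanged: under this reading, only the "Up" sampler is allowed to increase it.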
12 Sample rptmstat Result
13 The Squid Filesystem Tests
- Five different operating systems.
- Identical hardware.
- Identical Squid source code.
- Nearly identical Squid configuration.
- Different filesystems, options, storage schemes.
14 Hardware for Squid
- IBM Netfinity 4000R.
- 500 MHz Pentium 3.
- 1GB RAM.
- 3 x 18GB SCSI disk (one external).
- Integrated Intel 10/100 NIC.
15 Squid Configuration
- Squid 2.4.STABLE5.
- 3 x 7500 MB cache_dir (L1=16, L2=256).
- Logging disabled.
- Default cache_mem (8MB).
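For reference, the cache_dir settings above could be rendered as squid.conf lines with a small helper like the one below. It assumes the `cache_dir <scheme> <directory> <Mbytes> <L1> <L2>` form used by the ufs-style storage schemes; the mount points are hypothetical, since the talk does not name them, and diskd may take additional options not shown here.

```python
def cache_dir_lines(scheme, mount_points, size_mb=7500, l1=16, l2=256):
    """Render one squid.conf cache_dir line per disk.

    scheme is a Squid storage scheme name such as "ufs", "aufs" or "diskd".
    """
    return [
        "cache_dir {} {} {} {} {}".format(scheme, path, size_mb, l1, l2)
        for path in mount_points
    ]


# Hypothetical mount points, one per disk.
for line in cache_dir_lines("aufs", ["/cache0", "/cache1", "/cache2"]):
    print(line)
```

With L1=16 and L2=256, each cache_dir holds 16 x 256 = 4096 second-level directories.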
16 The Linux Box
- Linux 2.4.9-13, with SGI XFS_1.0.2 patches
- 8192 file descriptors
- ./configure --with-aio-threads=32
- Xfsprogs 1.3.13
- Reiserfsprogs 3.x.0j
17 Linux Results
18 The NetBSD Box
- NetBSD 1.5.3_RC1
- MAXFILES=8192
- NMBCLUSTERS=32768
19 NetBSD Results
20 The OpenBSD Box
- OpenBSD 3.0
- MAXFILES=8192
  - Only 4096 per process, however.
  - Not an issue: usage is well below this limit.
- NMBCLUSTERS=32768
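A per-process descriptor limit like the one noted above can be checked from a running process. The sketch below uses Python's standard `resource` module (Unix only); the peak descriptor count passed to `has_headroom` is a hypothetical value, not a measurement from these tests.

```python
import resource


def fd_limits():
    """Return the (soft, hard) per-process open file descriptor limits."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def has_headroom(expected_peak_fds, soft_limit):
    """True if the expected peak descriptor usage stays below the limit."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return expected_peak_fds < soft_limit


soft, hard = fd_limits()
print("soft limit:", soft, "hard limit:", hard)
# Hypothetical peak usage of 1000 descriptors against a 4096 limit.
print(has_headroom(1000, 4096))
```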
21 OpenBSD Results
22 The FreeBSD Box
- FreeBSD 4.5-STABLE
- MAXFILES=8192
- NMBCLUSTERS=32768
23 FreeBSD Results
24 The Solaris Box
- Solaris 5.8 for Intel (generic 108592-09)
- /etc/system:
  - rlim_fd_max=8192
- /etc/nsswitch.conf:
  - hosts: dns files
- /etc/nscd.conf:
  - enable-cache hosts no
- newfs -b 4096 -i 6144
- tunefs -o space
25 Solaris Results
26 Validating rptmstat
- Can rptmstat predict PolyMix-4 performance?
- Start with the mean throughput during the last ¼ of the rptmstat run.
- Increase or decrease throughput in 10% increments.
- Always use the same fill rate, however.
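The validation procedure above can be sketched as two small steps: average the throughput over the last quarter of the rptmstat run, then derive candidate PolyMix-4 rates around that baseline. The sketch assumes each step is a multiple of 10% of the baseline (rather than compounding), which the slides do not specify; the function names and sample values are hypothetical.

```python
def mean_of_last_quarter(samples):
    """Mean of the final 25% of throughput samples (at least one sample)."""
    if not samples:
        raise ValueError("need at least one throughput sample")
    tail = samples[-max(1, len(samples) // 4):]
    return sum(tail) / len(tail)


def candidate_rates(baseline, steps=2, increment=0.10):
    """Rates at baseline * (1 + k * increment) for k in -steps..steps.

    Assumption: increments are relative to the baseline, not compounded.
    """
    return [baseline * (1 + k * increment) for k in range(-steps, steps + 1)]


# Hypothetical throughput samples (requests/second) from one run.
run = [60, 70, 80, 85, 88, 90, 89, 91]
baseline = mean_of_last_quarter(run)
print(baseline)
print(candidate_rates(baseline, steps=1))
```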
27 PolyMix-4 on Linux
28 Modified PolyMix-4 on Linux
29 PolyMix-4 on FreeBSD
30 Modified PolyMix-4 on FreeBSD
31 Conclusions
- For Squid, the best performing filesystems are:
  - FreeBSD softupdates with the diskd storage scheme
  - Linux ext2fs with the aufs storage scheme
- The rptmstat workload does not accurately predict PolyMix-4 peak throughput.
- rptmstat is a more difficult workload because DHR is always 60%.
- rptmstat predicts modified PolyMix-4 performance to within 10%.
32 More Information
- Squid: www.squid-cache.org
- Web Polygraph: www.web-polygraph.org
- Duane Wessels: wessels_at_squid-cache.org