DotSlash: Handling Web Hotspots at Dynamic Content Web Sites

1
DotSlash: Handling Web Hotspots at Dynamic
Content Web Sites
  • Weibin Zhao
  • Henning Schulzrinne
  • {zwb,hgs}@cs.columbia.edu
  • Department of Computer Science
  • Columbia University
  • Global Internet 2005
  • March 19, 2005

2
Web Hotspots
[Figure labels: Web Server, Internet]
  • A well-identified problem
      • Flash crowds, the Slashdot effect
      • 15 minutes of fame
  • Examples
      • Slashdotting, featured Google search, special
        events, breaking news

3
The Challenge
  • Short-term dramatic surge of request rate
      • Large, quick increase
      • Lasts for a short period
  • Existing mechanisms are not sufficient
      • Capacity planning, CDNs
          • Good for long term, not cost-effective for
            hotspots
      • Caching
          • Not fully controlled by origin server
      • Service degradation, admission control
          • Last resort, not user friendly

4
Dynamic Content Web Sites
  • More vulnerable to hotspots
      • CPU-bound, request rate supported is low
      • Hard to cache dynamic content
  • A much harder problem
      • Different bottlenecks
          • Database server: on-line bookstore (Amazon)
          • Web server: auction (eBay), bulletin board
            (Slashdot)
      • Caching: consistency control

5
Our Approach
  • DotSlash: counteract the Slashdot effect
  • Rescue system
      • Triggered automatically when load spikes
      • Mutual-aid model for different web sites
      • Cost effective for rare events
  • Automated rescue process
      • Self-configuring: build an adaptive distributed
        web server system on the fly
      • Techniques: service discovery, dynamic virtual
        hosting, adaptive overload control, dynamic
        script replication

6
Rescue Relationship
[Figure: rescue relationships ("rescuing") among servers S1 to S8]
  • Can provide rescue to multiple servers: S3
  • Can get rescue from multiple servers: S1
  • Cannot provide/get rescue simultaneously
  • Origin servers: S1, S2
  • Rescue servers: S3, S4, S5, S6

7
Service Discovery
  • DotSlash directory services
      • Enable web servers from different sites to learn
        about each other: register/query
      • Built upon mSLP (Mesh-enhanced Service Location
        Protocol): replicated Directory Agents (DAs)
  • Discover mSLP DAs
      • dot-slash.net DNS domain
      • DNS SRV for dot-slash.net
      • query_name=_slpda._tcp.dot-slash.net,
        query_type=srv
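
As a rough illustration of the Directory Agent discovery step above, the sketch below builds the SRV query name and orders a set of SRV answers. The record values, host names, and helper names are hypothetical, and the actual DNS lookup is left out because it would need a resolver library. The ordering is a simplification: RFC 2782 calls for weighted random selection within a priority level, which is replaced here by a deterministic sort.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def srv_query_name(service: str, proto: str, domain: str) -> str:
    """Build an SRV owner name such as _slpda._tcp.dot-slash.net."""
    return f"_{service}._{proto}.{domain}"


def order_candidates(records: list[SrvRecord]) -> list[SrvRecord]:
    """Lowest priority value first; higher weight first within a priority."""
    return sorted(records, key=lambda r: (r.priority, -r.weight))


# Hypothetical answers to the Directory Agent query (427 is the SLP port).
answers = [
    SrvRecord(10, 20, 427, "da2.dot-slash.net"),
    SrvRecord(0, 50, 427, "da1.dot-slash.net"),
    SrvRecord(10, 80, 427, "da3.dot-slash.net"),
]
name = srv_query_name("slpda", "tcp", "dot-slash.net")
ordered = [r.target for r in order_candidates(answers)]
```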

8
Workload Monitoring
  • Bottlenecks & Metrics
      • Network (static content): outbound HTTP traffic
      • CPU (dynamic content): /proc/stat
      • Moving average filter
  • Load regions
      • Desired
          • Configurable: 40%, 60%
      • Trigger rescue actions

[Figure: load regions: heavily loaded, desired load, lightly loaded]
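
One way to picture the monitoring step is the sketch below: a CPU utilization estimate from two samples of the /proc/stat "cpu" line, a moving average over recent samples, and a mapping to the three load regions. It assumes the configurable values 40 and 60 are the lower and upper bounds of the desired region in percent. The class name, the window size, and the choice to count only the idle field as idle time are illustrative assumptions, not details from the slides.

```python
from collections import deque


def cpu_utilization(prev: list[int], curr: list[int]) -> float:
    """Percent CPU busy between two samples of the /proc/stat "cpu" line.

    Each sample is the list of jiffy counters (user, nice, system, idle, ...).
    Only the fourth field (idle) is treated as idle time here.
    """
    total = sum(curr) - sum(prev)
    idle = curr[3] - prev[3]
    return 100.0 * (total - idle) / total if total > 0 else 0.0


class LoadMonitor:
    """Moving average over recent samples, mapped to a load region."""

    def __init__(self, lower: float = 40.0, upper: float = 60.0, window: int = 5):
        self.lower, self.upper = lower, upper
        self.samples = deque(maxlen=window)

    def add(self, utilization: float) -> float:
        self.samples.append(utilization)
        return self.average()

    def average(self) -> float:
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def region(self) -> str:
        avg = self.average()
        if avg > self.upper:
            return "heavy"    # candidate for requesting rescue
        if avg < self.lower:
            return "light"    # spare capacity
        return "desired"


monitor = LoadMonitor(window=3)
for sample in (30.0, 50.0, 100.0):
    monitor.add(sample)
first_region = monitor.region()    # average of 30, 50, 100 is 60
monitor.add(90.0)                  # window now holds 50, 100, 90
second_region = monitor.region()
```

The moving average keeps a single noisy sample from pushing the server across a region boundary.
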
9
DotSlash Rescue Protocol
  • Application-level request-response
  • Requests: SOS, RATE, SHUTDOWN
      • SOS: initiate a rescue
          • origin → rescue
      • RATE: adjust allowed redirect (data) rate
          • rescue → origin
      • SHUTDOWN: end a rescue
          • origin ↔ rescue
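
The three request types and their directions can be captured in a small table, sketched below. It assumes SOS is sent by the origin server, RATE by the rescue server, and SHUTDOWN by either side. The role strings and the `is_valid` helper are hypothetical, and nothing here reflects the actual wire format of the protocol.

```python
from enum import Enum


class Request(Enum):
    SOS = "SOS"            # initiate a rescue
    RATE = "RATE"          # adjust the allowed redirect (data) rate
    SHUTDOWN = "SHUTDOWN"  # end a rescue


# Which role may send each request (assumed from the directions above).
ALLOWED_SENDERS = {
    Request.SOS: {"origin"},
    Request.RATE: {"rescue"},
    Request.SHUTDOWN: {"origin", "rescue"},
}


def is_valid(request: Request, sender_role: str) -> bool:
    """Return True if a server in the given role may send this request."""
    return sender_role in ALLOWED_SENDERS[request]
```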

10
Rescue Control
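[State diagram: three states (SOS, Normal, Rescue)]
  • Normal → SOS: get rescue
  • SOS → Normal: release rescue
  • Within SOS: request more rescue, increase Pr, decrease Pr
  • Normal → Rescue: provide rescue
  • Rescue → Normal: shutdown last rescue
  • Within Rescue: provide more rescue, shutdown some rescue,
    increase Rr, decrease Rr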
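
The diagram labels suggest a three-state machine, which could be sketched as a transition table. The pairing of labels with transitions is inferred from their wording, the event names are hypothetical identifiers, and the Pr and Rr adjustments (which do not change state) are omitted.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("normal", "get_rescue"): "sos",
    ("sos", "request_more_rescue"): "sos",
    ("sos", "release_rescue"): "normal",
    ("normal", "provide_rescue"): "rescue",
    ("rescue", "provide_more_rescue"): "rescue",
    ("rescue", "shutdown_some_rescue"): "rescue",
    ("rescue", "shutdown_last_rescue"): "normal",
}


def step(state: str, event: str) -> str:
    """Apply one event; reject events that are not allowed in this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state!r}")


state = "normal"
for event in ("get_rescue", "request_more_rescue", "release_rescue"):
    state = step(state, event)
```

Because there is no direct transition between the SOS and Rescue states, a server in this sketch can never get and provide rescue at the same time.
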
11
Request Redirection
  • Origin server
      • Offload client requests to rescue servers
  • Two-level redirection
      • DNS-RR
          • Add/remove rescue server IP addresses via dynamic
            DNS update
      • HTTP redirect
          • Use rescue server aliases
          • Don't redirect requests from rescue servers
  • Redirect policies
      • WRR based on rescue server capacity
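
A weighted round-robin (WRR) policy can be sketched as follows. This uses the "smooth" WRR variant, which is one common way to implement WRR and not necessarily the one used in DotSlash. The server names and weights are hypothetical stand-ins for rescue server aliases and capacities.

```python
from typing import Callable


def make_wrr(weights: dict[str, int]) -> Callable[[], str]:
    """Return a function that yields servers in proportion to their weights."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())

    def next_server() -> str:
        # Every server gains its weight; the leader is picked and pays
        # back the total, which spreads picks evenly over time.
        for name, weight in weights.items():
            current[name] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        return chosen

    return next_server


pick = make_wrr({"rescue-a": 2, "rescue-b": 1})
picks = [pick() for _ in range(6)]
```

Over any six consecutive picks, `rescue-a` is chosen four times and `rescue-b` twice, matching the 2:1 weights.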

12
Dynamic Virtual Hosting
  • Rescue server
      • Serve new content (origin server) on the fly
  • Alias
      • Generate dynamically, and register via dynamic
        DNS update
  • Mapping: request → itself / origin server
      • Based on the Host header in the request
      • Three cases
          • Its configured name: www.rescue.com → itself
          • An alias: www-vh1.rescue.com (HTTP redirect) →
            origin
          • An origin server name: www.origin.com (DNS-RR) →
            origin
      • Handle expired mapping
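
The three Host header cases can be sketched as a lookup function. The function name, the two lookup tables, and the "unknown" result for a name with no current mapping (for example, an expired alias) are assumptions for illustration; the slides do not say how an expired mapping is handled.

```python
from typing import Optional


def resolve_host(host_header: str,
                 configured_name: str,
                 alias_to_origin: dict[str, str],
                 dns_rr_origins: set[str]) -> tuple[str, Optional[str]]:
    """Decide whether a request is for this server or for a rescued origin."""
    host = host_header.split(":", 1)[0].strip().lower()  # drop any port
    if host == configured_name:
        return ("self", configured_name)
    if host in alias_to_origin:          # reached via HTTP redirect
        return ("origin", alias_to_origin[host])
    if host in dns_rr_origins:           # reached via DNS round robin
        return ("origin", host)
    return ("unknown", None)             # no current mapping


aliases = {"www-vh1.rescue.com": "www.origin.com"}
origins = {"www.origin.com"}
case_self = resolve_host("www.rescue.com", "www.rescue.com", aliases, origins)
case_alias = resolve_host("www-vh1.rescue.com:80", "www.rescue.com", aliases, origins)
case_dns = resolve_host("WWW.ORIGIN.COM", "www.rescue.com", aliases, origins)
case_none = resolve_host("www-vh9.rescue.com", "www.rescue.com", aliases, origins)
```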

13
DotSlash for Dynamic Content
  • Remove the web server bottleneck
  • Dynamic Script Replication
  • LAMP configuration

[Figure: Client, origin server, rescue server, Apache, PHP, and MySQL database, with numbered steps (1) to (8)]
14
Dynamic Script Replication
  • Rescue server
      • Map a redirected URI to a script file
      • Trigger 404 handler if the script file is not found
      • Retrieve the script file
      • Handle file inclusions
      • Set query variables
      • Run the script by invoking native include
  • Origin server
      • If a request is from a rescue server and for
        dynamic content, return the script file
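
The rescue server's fetch-on-miss behavior can be sketched in a few lines. This is a Python analogy, not the Apache/PHP implementation: `local_script`, `fake_fetch`, and the cache directory layout are hypothetical, the URI is assumed to carry no query string, and the path check is an added safety assumption.

```python
import tempfile
from pathlib import Path
from typing import Callable


def local_script(uri: str, cache_dir: Path,
                 fetch: Callable[[str], bytes]) -> Path:
    """Map a redirected URI to a local script file, retrieving it on a miss."""
    root = cache_dir.resolve()
    path = (root / uri.lstrip("/")).resolve()
    if root not in path.parents:
        raise ValueError(f"URI escapes the cache directory: {uri}")
    if not path.exists():                  # plays the role of the 404 handler
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch(uri))       # retrieve from the origin server
    return path


fetched = []


def fake_fetch(uri: str) -> bytes:
    """Stand-in for an HTTP request to the origin server."""
    fetched.append(uri)
    return b"<?php echo 'hello'; ?>"


with tempfile.TemporaryDirectory() as tmp:
    first = local_script("/board/view.php", Path(tmp), fake_fetch)
    second = local_script("/board/view.php", Path(tmp), fake_fetch)
    same_path = first == second
    content = first.read_bytes()
```

The second call finds the file already in place, so the origin server is contacted only once per script.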

15
Handle File Inclusions
  • The problem
      • A replicated script may include files that are
        located at the origin server
      • Assume included files are under DocumentRoot
  • Approaches
      • Renaming inclusion statements
          • Need to parse scripts
      • Customized error handler
          • Catch inclusion errors
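
The error-handler approach can be illustrated with a catch, fetch, and retry loop. This is only a Python analogy for what a PHP error handler might do: the in-memory dictionaries stand in for the rescue server's disk and the origin server, and all names are hypothetical.

```python
import errno

local_files = {"view.php": "main"}                       # already replicated
origin_files = {"header.inc": "H", "footer.inc": "F"}    # still at the origin


def read_local(name: str) -> str:
    if name not in local_files:
        raise FileNotFoundError(errno.ENOENT, "No such file", name)
    return local_files[name]


def run_view() -> str:
    """Stand-in for a script that includes two other files."""
    return read_local("header.inc") + read_local("view.php") + read_local("footer.inc")


def fetch_include(name: str) -> None:
    """Stand-in for retrieving an included file from the origin server."""
    local_files[name] = origin_files[name]


def run_with_includes(run, fetch, max_attempts: int = 10):
    """Run the script; on a missing include, fetch it and try again."""
    for _ in range(max_attempts):
        try:
            return run()
        except FileNotFoundError as err:
            fetch(err.filename)
    raise RuntimeError("too many missing includes")


output = run_with_includes(run_view, fetch_include)
```

Unlike renaming inclusion statements, this approach needs no script parsing, but the sketch re-runs the script from the start after each fetch, which is only safe if the script has no side effects before its includes.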

16
Implementation
  • Apache module, PHP extension
  • Dynamic DNS: dot-slash.net
  • Service discovery: enhanced SLP

[Figure: implementation architecture. Labels: Client, Apache, Mod_dots, Dotsd, Other Dotsd, SHM, BIND, mSLP; protocols: HTTP, DSRP, DNS, SLP]
17
Evaluation
  • Experimental Setup
      • Linux machines: Redhat 9.0
          • HC: 2 GHz CPU, 1 GB memory
          • LC: 1 GHz CPU, 512 MB memory
      • Apache 2.0.48, DotSlash module
      • PHP 4.3.6, DotSlash extension
      • MySQL 4.0.18
  • Benchmark
      • RUBBoS (Rice U.) bulletin board
      • 19 scripts: 1 KB to 7 KB
      • 439 MB database

18
Increasing Max Request Rate R
[Figure: configuration with one origin server (HC), one database server (HC), and nine rescue servers (LC)]
  • No rescue: R = 118
      • CPU: Origin = 100%, DB = 45%
  • With rescue: R = 245
      • Number of rescue servers: 9
      • CPU: Origin = 55%, DB = 100%
  • 245/118 > 2
19
Effectiveness
[Figure: another configuration with one origin server (LC), one database server (HC), and ten rescue servers (LC)]
  • No rescue: R = 49
  • With rescue: R = 245
      • Number of rescue servers: 10
  • Comparison: 245/49 = 5
  • Conclusion: remove web server bottleneck
20
CPU Utilization Control
21
Workload Migration
22
Conclusions
  • DotSlash framework, prototype, evaluation
      • Fully automated rescue system, transparent to
        clients
      • Scalable
          • Get 10-fold improvement (static content)
          • Remove web server bottleneck (dynamic content)
  • Future work
      • Remove database server bottleneck
  • For further information
      • http://www.cs.columbia.edu/zwb/project/dotslash
      • WCW04