Request Distribution in Server Clusters

Provided by: helent8

1
Request Distribution in Server Clusters
2
Web site infrastructure
  • Clustered, multi-tiered architectures

e-Shopping example:
  • Open the portal home page
  • Login
  • View items, prices, availability
  • Select an item type
  • Specify the no. of items
  • Confirm by entering the credit card number
  • Logout
3
WS vs. AS
  • Web servers
  • Do well-defined and quantifiable local work
  • e.g., processing HTTP headers, serving static
    content
  • Application servers
  • Run multi-layer programs
  • e.g., scripts involving calls to backends

4
ReDAL
  • In clustered, multi-tiered architectures, there are two
    request distribution points
  • Web Server Request Distribution (WSRD)
  • Web switch distributes requests to the
    web server cluster
  • Application Server Request Distribution (ASRD)
  • Web server distributes requests
    requiring business logic to the application
    server cluster

ReDAL: Request Distribution for the Application
Layer. An approach for efficient distribution of
requests across a cluster of application servers.
5
Web Server Request Distribution
  • Many policies: Random, Round Robin (RR),
    Weighted Round Robin (WRR), Least
    Connections
  • Several of these policies are commercially
    implemented
    (e.g., Cisco's Local Director and F5's
    BIG/IP)
  • Two improvements
  • Session Affinity
  • Locality-Aware Request Distribution (LARD)
  • attempts to exploit locality of working sets on
    different servers
  • not applicable to dynamically generated
    content

Session Affinity: Consecutive requests in a given
user session will be served faster if they are
handled by the same server.
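
The basic policies named on this slide can be sketched in a few lines. This is only an illustration, not how any commercial switch implements them; the server names, weights, and connection counts are made up for the example.

```python
import itertools


def round_robin(servers):
    """Yield servers in a fixed rotating order."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Yield servers in proportion to integer weights, e.g. {"a": 2, "b": 1}."""
    expanded = [name for name, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def least_connections(active):
    """Pick the server with the fewest open connections (ties: first listed)."""
    return min(active, key=active.get)


# Hypothetical three-server cluster.
rr = round_robin(["a", "b", "c"])
rr_picks = [next(rr) for _ in range(5)]

# Server "a" is assumed to have twice the capacity of "b".
wrr = weighted_round_robin({"a": 2, "b": 1})
wrr_picks = [next(wrr) for _ in range(6)]

# Made-up open-connection counts per server.
lc_pick = least_connections({"a": 4, "b": 1, "c": 3})

print(rr_picks, wrr_picks, lc_pick)
```

None of these policies looks at which session a request belongs to, which is the gap Session Affinity fills.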
6
Application Server Request Distribution
  • Dynamic scheduling techniques usually presuppose
    some knowledge of the task
    (e.g., duration, weight) and/or the resource (e.g.,
    queue sizes, service times)
  • In ASRD, both tasks and resources are highly
    dynamic
  • So, techniques are adaptations of WSRD techniques
  • Most common technique: combination of RR and
    Session Affinity
  • Requests starting new sessions are dispatched
    according to RR
  • Subsequent requests in a session are routed to
    the server where the session's previous request
    was served, i.e., where the session object
    resides
  • This frequently results in load imbalances
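
A minimal sketch of the RR plus Session Affinity combination described above. The class name, session ids, and server names are hypothetical; a real dispatcher would also need to expire finished sessions.

```python
import itertools


class RRAffinityDispatcher:
    """Round Robin for new sessions, sticky routing for existing ones."""

    def __init__(self, servers):
        self._rr = itertools.cycle(servers)
        self._affinity = {}  # session id -> server holding the session object

    def dispatch(self, session_id):
        # A request that starts a new session takes the next server in RR order.
        if session_id not in self._affinity:
            self._affinity[session_id] = next(self._rr)
        # Every later request in the session goes back to the same server.
        return self._affinity[session_id]


d = RRAffinityDispatcher(["as1", "as2"])
picks = [d.dispatch(s) for s in ["u1", "u2", "u1", "u3", "u2"]]
print(picks)
```

Note that the dispatcher never consults server load, so a server that happens to collect long sessions stays busy while others sit idle.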

7
ReDAL Motivation
  • Request distribution combining
    RR and Session Affinity
  • Short (S) and long (L) sessions arrive at one-minute
    intervals
  • S S L S S L S L L S
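
To see why this arrival sequence is a problem, the sketch below assigns the ten sessions by Round Robin and counts the long sessions each server receives. The cluster size of three is an assumption for illustration; the slide text does not state it.

```python
from collections import Counter

# Arrival order from the slide: S = short session, L = long session.
arrivals = "S S L S S L S L L S".split()
NUM_SERVERS = 3  # assumption, not stated in the transcript

long_per_server = Counter()
for i, kind in enumerate(arrivals):
    server = i % NUM_SERVERS  # plain Round Robin over new sessions
    if kind == "L":
        long_per_server[server] += 1

counts = [long_per_server[s] for s in range(NUM_SERVERS)]
print(counts)
```

Under that assumption one server ends up holding three of the four long sessions, and with Session Affinity all their follow-up requests stay there.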

8
ReDAL Objective
  • Distribute requests across a cluster of
    application servers such that
  • Load on each application server is kept below a
    certain threshold
  • Session affinity is preserved where possible

9
ReDAL Components
  • Application Analyzer: characterizes the behavior
    of the application server
  • Runs in an offline phase to record peak throughput/load
    values, which are used at runtime by the Request Dispatcher
  • Request Dispatcher: routes requests to a set of
    application servers
  • Monitors expected and actual load on each application server
  • Routes a given request to the affined server if it is
    lightly loaded, else to the application server
    having the lowest expected load
10
ReDAL Algorithm
  • Based on a key observation:
  • think-time or view-time on a page is predictable
    based on past behavior

Jeffrey Heer and Ed H. Chi (Xerox Palo Alto
Research Center), "Mining the Structure of User
Activity using Cluster Stability," Proceedings of
the Web Analytics Workshop, SIAM Conference on
Data Mining (2002)
11
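ReDAL Capacity Reservation
  • Consider a finite lookahead period partitioned
    into discrete time periods or slices

[Figure: timeline starting at the current time and divided into time slices (Slice 0, Slice 1, Slice 2), showing requests r1 and r2 at times t1 and t2, separated by the think time]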
  • Load metrics
  • Actual Load: number of requests in a time slice
  • Expected Load: number of requests expected in a
    time slice based on think time, i.e., time
    between subsequent requests in a session
  • e.g., capacity is reserved for request r2 on this
    application server during time slice 2
  • Modified Load = Actual Load + α × Expected Load
    (0 ≤ α ≤ 1)
  • α accounts for prediction errors
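
The load bookkeeping above can be sketched as follows. This reads Modified Load as actual load plus a weighted share of expected load, where the weight (called `alpha` here) is the 0-to-1 factor from the Modified Load bullet. The class name, the 10-second slice length, and the timings are assumptions for illustration.

```python
from collections import defaultdict


class ServerLoad:
    """Per-slice load counters for one application server (illustrative)."""

    def __init__(self, slice_seconds, alpha):
        self.slice_seconds = slice_seconds
        self.alpha = alpha                 # weight on predictions, 0 <= alpha <= 1
        self.actual = defaultdict(int)     # slice index -> requests seen
        self.expected = defaultdict(int)   # slice index -> requests reserved

    def slice_of(self, t):
        return int(t // self.slice_seconds)

    def record_request(self, now, think_time):
        # Count the request that just arrived in the current slice ...
        self.actual[self.slice_of(now)] += 1
        # ... and reserve capacity in the slice where the session's next
        # request is expected, one think time later.
        self.expected[self.slice_of(now + think_time)] += 1

    def modified_load(self, slice_index):
        return self.actual[slice_index] + self.alpha * self.expected[slice_index]


load = ServerLoad(slice_seconds=10.0, alpha=0.9)
load.record_request(now=3.0, think_time=20.0)  # r1 in slice 0, r2 expected in slice 2
print(load.modified_load(0), load.modified_load(2))
```

With these made-up numbers, request r1 counts fully in slice 0, while the reservation for r2 contributes only 0.9 of a request to slice 2, reflecting that the prediction may be wrong.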

12
ReDAL Algorithm Overview
  • Inputs
  • Request in a session, Think time, Time slice
    duration, α
  • Output
  • Assignment of request to application server A
  • A = NULL
  • A = SessionAffinity()
  • If A is NULL, A = LeastLoaded()
  • UpdateLoadMetrics()
  • AdvanceTimeSlice()
  • Return A

SessionAffinity: If ActualLoad() < PeakLoad(),
Return AffinedServer()
LeastLoaded: If request is part of a new session, A =
LeastLoaded(modified); Else A =
LeastLoaded(actual); Return A
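
A rough Python sketch of the steps on this slide, under several assumptions: the names `AppServer` and `Dispatcher` are hypothetical, slices are indexed by absolute time (so advancing the time slice is implicit), the affinity check yields nothing for a session with no affined server yet, and a session that gets rerouted becomes affined to its new server. It is not the authors' implementation.

```python
from collections import defaultdict


class AppServer:
    def __init__(self, name, peak_load):
        self.name = name
        self.peak_load = peak_load         # assumed to come from the offline analyzer
        self.actual = defaultdict(int)     # slice index -> requests seen
        self.expected = defaultdict(int)   # slice index -> requests reserved

    def modified(self, s, alpha):
        return self.actual[s] + alpha * self.expected[s]


class Dispatcher:
    def __init__(self, servers, slice_seconds, alpha):
        self.servers = servers
        self.slice_seconds = slice_seconds
        self.alpha = alpha
        self.affinity = {}                 # session id -> AppServer

    def assign(self, session_id, now, think_time):
        s = int(now // self.slice_seconds)
        affined = self.affinity.get(session_id)
        if affined is not None and affined.actual[s] < affined.peak_load:
            # SessionAffinity: keep the session where it is while below peak.
            chosen = affined
        elif affined is None:
            # New session: least loaded by modified (actual + weighted expected) load.
            chosen = min(self.servers, key=lambda a: a.modified(s, self.alpha))
        else:
            # Existing session on an overloaded server: least loaded by actual load.
            chosen = min(self.servers, key=lambda a: a.actual[s])
        # UpdateLoadMetrics: count this request, reserve a slot for the next one.
        chosen.actual[s] += 1
        chosen.expected[int((now + think_time) // self.slice_seconds)] += 1
        self.affinity[session_id] = chosen
        return chosen.name


a, b = AppServer("a", peak_load=2), AppServer("b", peak_load=2)
d = Dispatcher([a, b], slice_seconds=10, alpha=0.9)
picks = [
    d.assign("u1", now=0, think_time=5),
    d.assign("u2", now=1, think_time=5),
    d.assign("u1", now=5, think_time=5),
    d.assign("u1", now=6, think_time=5),
]
print(picks)
```

In this made-up run, session u1 stays on server a until a reaches its peak of two requests in the slice, and its next request is then moved to b.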
13
Consistent global view of metadata
  • Multicasting of changed load info by
    WS request dispatcher
  • Session objects virtualized in a shared db
  • Web server records time of response in a cookie
  • useful for estimating think times in web server
    clusters
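
One way the cookie idea could work is sketched below with Python's standard `http.cookies` module. The cookie name `last_response_ts` and the timestamps are hypothetical; the slides do not say how the value is encoded.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "last_response_ts"  # hypothetical cookie name


def stamp_response(response_time):
    """Build the cookie string a web server would attach to its response."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = f"{response_time:.3f}"
    return cookie[COOKIE_NAME].OutputString()


def observed_think_time(cookie_header, request_time):
    """Seconds between the previous response and this request, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if COOKIE_NAME not in cookie:
        return None  # first request of a session: nothing to measure yet
    return request_time - float(cookie[COOKIE_NAME].value)


header = stamp_response(100.0)
think = observed_think_time(header, 112.5)
print(header, think)
```

Because the timestamp travels with the client, any web server in the cluster can compute the think time without asking the server that sent the previous response.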

14
ReDAL Evaluation
HJ (Hwang and Jung, 2002) uses a least-active-requests
routing policy; it is not applicable to stateful
applications
  • ReDAL, RR, and HJ implemented as
    Apache Web Server plug-ins
  • Load generator simulates a varying number of
    simultaneous user sessions, each session
    submitting a stream of requests
  • Each request chosen from a uniform distribution
    across the high and low load transaction requests
  • Load generator (LoadRunner 6), Web server
    (Apache), 10 application server instances
    (WebLogic 7.1), and session repository (Oracle
    8), each running on separate hardware
  • Machine configuration: single-CPU (900 MHz), 1 GB
    RAM, 20 GB disk, running Windows 2000 Advanced
    Server (SP3)

15
ReDAL Experimental Results
  • Performance Metrics
  • Average Throughput per Application Server (ATAS):
    average number of transactions per second an
    application server in the cluster provides
  • Average Response Time (ART): average response
    time provided by the application servers,
    measured from the end-user perspective
  • Web Server CPU Utilization (WSCU): percentage CPU
    utilization on the web server, measured by OS
    utilities
  • Peak CPU on the Application Servers: peak
    percentage CPU usage among a cluster of
    application servers, measured by OS utilities
  • Scaling with Application Servers: percentage CPU
    usage on the web server for various numbers of
    application servers in the application server
    cluster
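
The first two metrics reduce to simple averages. The sketch below shows one plausible way to compute them; the function names and the input numbers are invented for illustration and are not measurements from the evaluation.

```python
def atas(total_transactions, duration_seconds, num_app_servers):
    """Average transactions per second provided by one application server."""
    return total_transactions / duration_seconds / num_app_servers


def art(response_times):
    """Mean end-user response time over a set of completed requests."""
    return sum(response_times) / len(response_times)


# Illustrative inputs only: 6000 transactions in 60 s across 10 servers.
example_atas = atas(6000, 60, 10)
example_art = art([0.25, 0.5, 0.75])
print(example_atas, example_art)
```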

16
Throughput Performance
  • ReDAL (0.9) is the ReDAL algorithm with α = 0.9
  • ReDAL (0.5) is the ReDAL algorithm with α = 0.5

ReDAL with α = 0.9 has the highest throughput
17
Response Time Performance
ReDAL with α = 0.9 has the best response time
18
CPU Overhead on the Web Server
Additional overhead of the ReDAL algorithm is 1.5% or
less
19
Peak CPU Utilization on Application Servers
Highest in the RR case and lowest in the ReDAL
(α = 0.9) case
20
Scaling with Application Servers
Overhead of the ReDAL algorithm is at or below 15%
for 100 concurrent sessions
21
Real World Evaluation
  • Online credit card application
  • 30 WebLogic application servers on Red Hat Linux
    9.0
  • Apache Web Server on Red Hat Linux 9.0
  • Machine hardware configuration: 1 GB RAM, 2.2
    GHz dual processors
  • Load was simulated by re-tracing web logs
    collected at various times over a day

At a peak load of 1000 simultaneous sessions,
ReDAL improved the response time of RR by 100%.
22
Summary
ReDAL: Application server load distribution
  • Maximizes affinity
  • Exploits application characteristics
  • Practical and scalable