Request Distribution in Server Clusters

Slides: 23

Provided by: helent8

1
Request Distribution in Server Clusters
2
Web site infrastructure

Clustered, multi-tiered architectures

e-Shopping:
Open the portal home page
Log in
View items, prices, availability
Select an item type
Specify the no. of items
Confirm by entering the credit card number
Log out
3
WS vs. AS

Web servers
Do well-defined and quantifiable local work,
e.g., processing HTTP headers, serving static
content
Application servers
Run multi-layer programs
e.g., scripts involving calls to backends

4
ReDal

In clustered, multi-tiered architectures, there are two
request distribution points:
Web Server Request Distribution (WSRD): the web switch distributes requests to the web server cluster
Application Server Request Distribution (ASRD): the web server distributes requests requiring business logic to the application server cluster

ReDAL: Request Distribution for the Application Layer
An approach for efficient distribution of requests across a cluster of application servers
5
Web Server Request Distribution

Many policies: Random, Round Robin (RR), Weighted Round Robin (WRR), Least Connections
Several of these policies are commercially implemented (e.g., Cisco's LocalDirector and F5's BIG-IP)
Two improvements:
Session Affinity: consecutive requests in a given user session will be served faster if they are handled by the same server
Locality-Aware Request Distribution (LARD): attempts to exploit locality of working sets on different servers; not applicable to dynamically generated content
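
The basic policies named on this slide can be sketched in a few lines. This is a minimal illustration only, not how any commercial web switch implements them; the server names, weights, and connection counts are hypothetical.

```python
import itertools

def round_robin(servers):
    """Yield servers in strict rotation."""
    return itertools.cycle(servers)

def weighted_round_robin(servers, weights):
    """Yield servers in rotation, repeating each one `weights[s]` times per cycle."""
    expanded = [s for s in servers for _ in range(weights[s])]
    return itertools.cycle(expanded)

def least_connections(servers, open_connections):
    """Pick the server that currently has the fewest open connections."""
    return min(servers, key=lambda s: open_connections[s])

servers = ["ws1", "ws2", "ws3"]  # hypothetical web servers
rr = round_robin(servers)
wrr = weighted_round_robin(servers, {"ws1": 2, "ws2": 1, "ws3": 1})
rr_order = [next(rr) for _ in range(4)]
wrr_order = [next(wrr) for _ in range(4)]
lc_choice = least_connections(servers, {"ws1": 5, "ws2": 2, "ws3": 7})
```

None of these policies looks at which server already holds state for a session, which is the gap that Session Affinity addresses.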
6
Application Server Request Distribution

Dynamic scheduling techniques usually presuppose some knowledge of the task (e.g., duration, weight) and/or the resource (e.g., queue sizes, service times)
In ASRD, both tasks and resources are highly dynamic
So, ASRD techniques are adaptations of WSRD techniques
Most common technique: a combination of RR and Session Affinity
Requests starting new sessions are dispatched according to RR
Subsequent requests in a session are routed to the server where the session's previous request was served, i.e., where the session object resides
This frequently results in load imbalances
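
The RR plus Session Affinity combination described on this slide might look like the following sketch. The class name, method name, and server identifiers are assumptions made for illustration.

```python
import itertools

class RRAffinityDispatcher:
    """Round robin for new sessions, sticky routing for existing ones."""

    def __init__(self, servers):
        self._rotation = itertools.cycle(servers)
        self._affinity = {}  # session id -> server holding the session object

    def dispatch(self, session_id):
        # A request that starts a new session takes the next server in rotation.
        if session_id not in self._affinity:
            self._affinity[session_id] = next(self._rotation)
        # Every later request in the session goes back to the same server.
        return self._affinity[session_id]

dispatcher = RRAffinityDispatcher(["as1", "as2"])  # hypothetical servers
routes = [dispatcher.dispatch(sid) for sid in ["a", "b", "a", "c"]]
```

Because the dispatcher never consults load, a server that happens to receive several long sessions stays busy while the rotation keeps sending it new ones.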

7
ReDal Motivation

Request distribution combining RR and Session Affinity
Short (S) and long (L) sessions arrive at one-minute intervals:
S S L S S L S L L S
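
To see how this arrival pattern can unbalance a cluster, the sketch below places the ten sessions in strict rotation. The cluster size of three servers is an assumption, since the transcript does not preserve the slide's figure.

```python
def place_round_robin(arrivals, servers):
    """Assign each arriving session to a server in strict rotation."""
    placement = {s: [] for s in servers}
    for i, kind in enumerate(arrivals):
        placement[servers[i % len(servers)]].append(kind)
    return placement

arrivals = "SSLSSLSLLS"          # arrival order from the slide
servers = ["as1", "as2", "as3"]  # assumed cluster size, not from the slide
placement = place_round_robin(arrivals, servers)
long_per_server = {s: kinds.count("L") for s, kinds in placement.items()}
```

Under these assumptions as1 receives no long sessions while as3 receives three, and session affinity then pins all of their subsequent requests to as3.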

8
ReDAL Objective

Distribute requests across a cluster of
application servers such that
Load on each application server is kept below a
certain threshold
Session affinity is preserved where possible

9
ReDAL Components
Application Analyzer: characterizes the behavior of the application server
Runs in an offline phase to record peak throughput/load values, which are used at runtime by the Request Dispatcher
Request Dispatcher: routes requests to a set of application servers
Monitors expected and actual load on each application server
Routes a given request to the affined server if it is lightly loaded, else to the application server having the lowest expected load
10
ReDAL Algorithm

Based on a key observation: think time (or view time) on a page is predictable from past behavior

Jeffrey Heer and Ed H. Chi (Xerox Palo Alto Research Center), Mining the Structure of User Activity using Cluster Stability, Proceedings of the Web Analytics Workshop, SIAM Conference on Data Mining (2002)
11
ReDal Capacity Reservation

Consider a finite lookahead period partitioned
into discrete time periods or slices

[Figure: timeline marking the current time, requests r1 and r2 at times t1 and t2 separated by the think time, and time slices 0, 1 and 2]

Load metrics
Actual Load: number of requests in a time slice
Expected Load: number of requests expected in a time slice based on think time, i.e., the time between subsequent requests in a session
e.g., capacity is reserved for request r2 on this application server during time slice 2
Modified Load = Actual Load + α × Expected Load (0 ≤ α ≤ 1)
α accounts for prediction errors
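
The bookkeeping behind these metrics can be sketched as follows. The transcript lost the operators in the Modified Load formula, so this sketch assumes the reading "actual load plus a weight in [0, 1] times expected load"; the class and method names are hypothetical.

```python
from collections import defaultdict

class ServerLoad:
    """Per-server load bookkeeping over discrete time slices."""

    def __init__(self, slice_seconds, alpha):
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must lie in [0, 1]")
        self.slice_seconds = slice_seconds
        self.alpha = alpha                # weight given to predicted requests
        self.actual = defaultdict(int)    # slice index -> requests observed
        self.expected = defaultdict(int)  # slice index -> requests reserved

    def slice_of(self, t):
        """Map a timestamp in seconds to its time slice index."""
        return int(t // self.slice_seconds)

    def record_request(self, t, think_time):
        """Count a request now and reserve capacity for the session's next one."""
        self.actual[self.slice_of(t)] += 1
        self.expected[self.slice_of(t + think_time)] += 1

    def modified_load(self, t):
        """Assumed formula: actual + alpha * expected for the slice containing t."""
        k = self.slice_of(t)
        return self.actual[k] + self.alpha * self.expected[k]

load = ServerLoad(slice_seconds=10, alpha=0.5)
load.record_request(t=3, think_time=20)  # r1 in slice 0, r2 expected in slice 2
reserved_only = load.modified_load(25)   # slice 2 holds one reservation so far
load.record_request(t=21, think_time=5)  # another request lands in slice 2
combined = load.modified_load(25)
```

A smaller weight discounts reservations more heavily, which matters when think-time predictions are unreliable.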

12
ReDal Algorithm Overview

Inputs: request in a session, think time, time slice duration, α
Output: assignment of request to application server A

A = NULL
A = SessionAffinity()
If A is NULL
    A = LeastLoaded()
UpdateLoadMetrics()
AdvanceTimeSlice()
Return A

SessionAffinity:
If ActualLoad() < PeakLoad()
    Return AffinedServer()

LeastLoaded:
If request is part of a new session
    A = LeastLoaded(modified)
Else
    A = LeastLoaded(actual)
Return A
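
The pseudocode above can be turned into a small runnable sketch. It is one possible reading, not the authors' implementation, and it rests on several assumptions: loads are indexed by absolute time slice (which stands in for AdvanceTimeSlice), modified load is actual plus the weight `alpha` times expected, a reservation is released when the predicted request arrives, and all names are hypothetical.

```python
class ReDALSketch:
    def __init__(self, servers, peak_load, slice_seconds, alpha):
        self.servers = list(servers)
        self.peak_load = peak_load            # per-slice threshold from the offline phase
        self.slice_seconds = slice_seconds
        self.alpha = alpha
        self.affinity = {}                    # session id -> affined server
        self.reserved = {}                    # session id -> (server, slice) reservation
        self.actual = {s: {} for s in self.servers}
        self.expected = {s: {} for s in self.servers}

    def _slice(self, t):
        return int(t // self.slice_seconds)

    def _actual(self, server, k):
        return self.actual[server].get(k, 0)

    def _modified(self, server, k):
        return self._actual(server, k) + self.alpha * self.expected[server].get(k, 0)

    def route(self, session_id, now, think_time):
        k = self._slice(now)
        # Assumption: the arriving request consumes the session's reservation.
        if session_id in self.reserved:
            server, slot = self.reserved.pop(session_id)
            self.expected[server][slot] -= 1
        affined = self.affinity.get(session_id)
        if affined is not None and self._actual(affined, k) < self.peak_load:
            chosen = affined                  # SessionAffinity
        elif affined is None:                 # new session: LeastLoaded(modified)
            chosen = min(self.servers, key=lambda s: self._modified(s, k))
        else:                                 # overloaded: LeastLoaded(actual)
            chosen = min(self.servers, key=lambda s: self._actual(s, k))
        # UpdateLoadMetrics: count this request, reserve capacity for the next.
        self.actual[chosen][k] = self._actual(chosen, k) + 1
        nxt = self._slice(now + think_time)
        self.expected[chosen][nxt] = self.expected[chosen].get(nxt, 0) + 1
        self.reserved[session_id] = (chosen, nxt)
        self.affinity[session_id] = chosen
        return chosen

d = ReDALSketch(["as1", "as2"], peak_load=2, slice_seconds=10, alpha=0.9)
routes = [
    d.route("a", now=0, think_time=5),  # new session, both servers idle
    d.route("b", now=1, think_time=5),  # new session, as1 already carries load
    d.route("a", now=5, think_time=5),  # affined server still below peak
    d.route("a", now=6, think_time=5),  # affined server at peak, session moves
]
```

In this run the last request leaves as1 because that server has reached the assumed peak of two requests in the slice, so affinity is given up only when the threshold would be exceeded.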
13
Consistent global view of metadata

Multicasting of changed load info by the WS request dispatcher
Session objects virtualized in a shared database
Web server records the time of response in a cookie: useful for estimating think times in web server clusters
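
Recording the response time in a cookie lets whichever web server receives the next request compute the think time without shared state. A minimal sketch using Python's standard `http.cookies` module follows; the cookie name and both helper functions are hypothetical.

```python
from http.cookies import SimpleCookie

COOKIE_NAME = "last_response_ts"  # hypothetical cookie name

def stamp_response(now):
    """Build the cookie pair that records when the response was sent."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = f"{now:.3f}"
    return cookie[COOKIE_NAME].OutputString()

def think_time(cookie_header, now):
    """Return seconds since the previous response, or None for a first request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if COOKIE_NAME not in cookie:
        return None
    return now - float(cookie[COOKIE_NAME].value)

set_cookie_value = stamp_response(100.0)
observed = think_time(set_cookie_value, now=112.5)
first_visit = think_time("other=1", now=5.0)
```

Observed think times collected this way could feed the per-page predictions that the capacity reservation relies on.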

14
ReDal Evaluation
HJ (Hwang and Jung, 2002) uses a least-active-requests routing policy; not applicable to stateful applications

ReDAL, RR and HJ implemented as Apache Web Server plug-ins
Load generator simulates a varying number of simultaneous user sessions, each session submitting a stream of requests
Each request is chosen from a uniform distribution across the high- and low-load transaction requests
Load generator (LoadRunner 6), Web server (Apache), 10 application server instances (WebLogic 7.1), and session repository (Oracle 8), each running on separate hardware
Machine configuration: single CPU (900 MHz), 1 GB RAM, 20 GB disk, running Windows 2000 Advanced Server (SP3)

15
ReDal Experimental Results

Performance Metrics
Average Throughput per Application Server (ATAS): average number of transactions per second an application server in the cluster provides
Average Response Time (ART): average response time provided by the application servers, measured from the end-user perspective
Web Server CPU Utilization (WSCU): percentage CPU utilization on the web server, measured by OS utilities
Peak CPU on the Application Servers: peak percentage CPU usage among a cluster of application servers, measured by OS utilities
Scaling with Application Servers: percentage CPU usage on the web server for various numbers of application servers in the application server cluster
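
As a rough illustration of the first two metrics, the sketch below computes ATAS and ART from hypothetical measurements. The input shapes and sample numbers are assumptions and are not taken from the experiments.

```python
from statistics import mean

def atas(completed_per_server, duration_seconds):
    """Average transactions per second across the application servers."""
    per_server_tps = [n / duration_seconds for n in completed_per_server.values()]
    return mean(per_server_tps)

def art(response_times_seconds):
    """Average response time as observed by the end user."""
    return mean(response_times_seconds)

# Hypothetical one-minute run on two application servers.
throughput = atas({"as1": 600, "as2": 1200}, duration_seconds=60)
response = art([0.25, 0.75])
```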

16
Throughput Performance

ReDAL (0.9) is the ReDAL algorithm with α = 0.9
ReDAL (0.5) is the ReDAL algorithm with α = 0.5

ReDAL with α = 0.9 has the highest throughput
17
Response Time Performance
ReDAL with α = 0.9 has the best response time
18
CPU Overhead on the Web Server
Additional overhead of the ReDAL algorithm is 1.5% or less
19
Peak CPU Utilization on Application Servers
Highest in the RR case and lowest in the ReDAL (α = 0.9) case
20
Scaling with Application Servers
Overhead of the ReDAL algorithm is at or below 15% for 100 concurrent sessions
21
Real World Evaluation

Online credit card application
30 WebLogic application servers on Red Hat Linux 9.0
Apache Web Server on Red Hat Linux 9.0
Machine hardware configuration: 1 GB RAM, 2.2 GHz dual processors
Load was simulated by replaying web logs collected at various times over a day

At a peak load of 1000 simultaneous sessions,
ReDAL improved the response time of RR by 100.
22
Summary
ReDAL: Application server load distribution
Maximizes affinity
Exploits application characteristics
Practical and scalable
