
Title: Scalable Web Server Clustering Technologies

1
Scalable Web Server Clustering Technologies

Trevor Schroeder, Steve Goddard, and Byrav Ramamurthy
University of Nebraska-Lincoln
IEEE Network, Volume 14, Issue 3, May-June 2000

2
Outline

Introduction
Terminology
L4/2 Clustering
L4/3 Clustering
L7 Clustering
Conclusions

3
Introduction

Background: Growth of the Internet, dynamic content, and an increasing number of users force us to find faster Web servers.
In the past, we replaced the Web server with a faster machine (processor).
Drawbacks:
Short-term: by Moore's Law, the number of transistors per integrated circuit doubles every 18 months, so a faster machine is soon outpaced.
Expensive: we need to replace almost the whole machine.
Solution: Add more processors or machines to the Web server. A cluster uses commodity hardware and software, so we can keep the past investment.

4
Introduction (cont.)

Any server application may be clustered as long as it fulfills the following two properties:
The application must maintain no state on the server.
Client/server transactions should be relatively short and high in frequency.

5
Terminology

L4/2: Layer 4 switching with layer 2 packet forwarding. The servers have an identical layer 3 (network) address but unique MAC addresses.
L4/3: Layer 4 switching with layer 3 packet forwarding. The servers offer identical layer 4 (transport) services but have unique network addresses.
Layer 7 switch: Makes forwarding decisions based on the content of client requests. It can employ L4/2 or L4/3 forwarding.

6
Terminology (cont.)

Client-side transparency: Because of the dispatcher, the whole cluster appears to clients as a single host.
Server-side transparency: Each cluster server runs a standard Web server designed for a standalone machine. It serves requests forwarded by the dispatcher exactly as if they came directly from clients.
Performance index: Connections per second or bits per second (at maximum cluster utilization).

7
L4/2 Clustering

The cluster's IP address (A) is shared by the dispatcher and the servers through the use of primary and secondary IP addresses. (Note: each host can have several IP addresses.)
The dispatcher's primary IP address is A.
The servers use A as a secondary address.
All packets destined for A are forwarded to the dispatcher through the use of the Address Resolution Protocol (ARP) at the nearest gateway/router.

8
Technology Specification

Load-sharing algorithm: Round-robin or other policies.
Session map: If an incoming packet belongs to an established connection in the map, forward it to the previously selected server. If it is a connection initiation (it contains a SYN), select a server and save the connection in the map. If it matches no connection and does not contain a SYN, it may or may not be discarded.
Backup method: Guards against failure of the dispatcher and of the servers.
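
The session map logic above can be sketched in a few lines of Python. This is a minimal illustration, not the mechanism of any particular product: the class name `Dispatcher`, the `(client_ip, client_port)` key, and the choice to discard unknown non-SYN packets are all assumptions made for the example.

```python
from itertools import cycle


class Dispatcher:
    """Toy layer 4 dispatcher: round-robin selection plus a session map."""

    def __init__(self, servers):
        self._next_server = cycle(servers)  # round-robin load sharing
        self.sessions = {}                  # (client_ip, client_port) -> server

    def route(self, client_ip, client_port, syn):
        key = (client_ip, client_port)
        if key in self.sessions:            # established connection: reuse server
            return self.sessions[key]
        if syn:                             # connection initiation: pick and remember
            server = next(self._next_server)
            self.sessions[key] = server
            return server
        return None                         # unknown non-SYN packet: discarded here

    def close(self, client_ip, client_port):
        # Forget the connection once it ends (assumed cleanup step).
        self.sessions.pop((client_ip, client_port), None)
```

A real dispatcher would also expire idle entries; that is left out to keep the sketch short.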

9
L4/2 Traffic Flow

10
L4/2 Traffic Flow (cont.)

A client sends a request to A.
The router sends the request to the dispatcher.
Based on the load-sharing algorithm, the dispatcher selects an actual server (here, server 2) to serve the client.
Server 2 replies to the client directly.
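
As a rough illustration of the forwarding step in this flow, the sketch below models layer 2 forwarding as a change to the frame's destination MAC address only. The `Frame` type and `forward_l2` function are hypothetical names for this example, and the model assumes, as the terminology slide suggests, that every server accepts packets for the shared address A.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    dst_mac: str    # layer 2 destination
    dst_ip: str     # layer 3 destination (the cluster address A)
    payload: bytes


def forward_l2(frame: Frame, server_mac: str) -> Frame:
    """Re-address the frame to the chosen server; the IP packet is untouched."""
    return replace(frame, dst_mac=server_mac)
```

Because only the link-layer address changes, nothing inside the IP packet has to be recomputed, and the server can answer the client with A as its source address.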

11
Advantage vs. Disadvantage

Advantages:
Servers reply to clients directly, which keeps the dispatcher from becoming a bottleneck.
Checksums need not be recalculated because forwarding operates at layer 2.
Disadvantage:
There must be a direct physical connection between the dispatcher and all servers.

12
ONE-IP (Bell Labs, 1996)

Load-sharing algorithm
Routing-based dispatching: Hash the incoming client's address to get a number that indicates which server services the request.
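
A minimal sketch of hashing a client address to a server index follows. The slides do not say which hash function ONE-IP uses, so the modulo over the numeric address is an assumption chosen for simplicity, and `pick_server` is a hypothetical name.

```python
import ipaddress


def pick_server(client_ip: str, n_servers: int) -> int:
    """Map a client address to a server index in [0, n_servers)."""
    # Assumed hash: the address as an integer, reduced modulo the server count.
    return int(ipaddress.ip_address(client_ip)) % n_servers
```

The same client always lands on the same server, so no per-connection state is needed for this policy.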

13
ONE-IP (cont.)

Broadcast-based dispatching: Each server is responsible for a fixed and disjoint portion of the address space.
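
One way to picture this is a filter that every server applies on its own: each packet reaches all servers, and exactly one of them accepts it. The sketch below assumes the address space is split by a modulo over the client address; the actual partitioning used by ONE-IP is not given in the slides, and `accepts` is a hypothetical name.

```python
import ipaddress


def accepts(server_id: int, n_servers: int, client_ip: str) -> bool:
    """Return True if this server owns the client's slice of the address space."""
    # Assumed partition: client address as an integer, modulo the server count.
    return int(ipaddress.ip_address(client_ip)) % n_servers == server_id
```

Since the slices are disjoint and cover every address, no two servers answer the same client and no client goes unanswered.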

14
ONE-IP (cont.)

Drawback: Cannot adapt when client requests are disproportionately distributed.
Backup: Watchdog daemon, watchd.
Dispatcher failure: The backup dispatcher notices the missing heartbeat of the primary dispatcher and takes over.
Server failure: Reconfigure the hash table or the address filters on the other servers.
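
The heartbeat-based takeover can be sketched as below. This is an illustration of the general idea only, not watchd itself: the class name, the timeout value, and passing the clock in as an argument are assumptions made to keep the example deterministic.

```python
class BackupDispatcher:
    """Takes over when the primary's heartbeat has been missing too long."""

    def __init__(self, timeout: float):
        self.timeout = timeout        # seconds of silence tolerated (assumed value)
        self.last_heartbeat = 0.0
        self.active = False

    def on_heartbeat(self, now: float) -> None:
        # Called whenever a heartbeat from the primary arrives.
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        # Called periodically; promotes this node once the primary is silent.
        if now - self.last_heartbeat > self.timeout:
            self.active = True
        return self.active
```

For example, with `timeout=3.0` and a last heartbeat at time 10.0, a check at 12.0 leaves the backup passive, while a check at 13.5 makes it take over.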

15
L4/3 Clustering

The dispatcher appears to clients as a single host (IP address A) and acts as a gateway for the servers.
Each server has its own IP address, which can be globally or locally unique (IP addresses B1, B2, ..., Bn).
Load-sharing algorithm: Round-robin or other algorithms.
The dispatcher keeps a session map table.

16
L4/3 Clustering (cont.)

17
L4/3 Clustering (cont.)

A client sends a request with A as the destination.
The packet arrives at the dispatcher.
Based on the load-sharing algorithm and the session table, the dispatcher selects a server, rewrites the destination IP address, recalculates the checksums, and forwards the packet to the server.
The server sends its reply through the dispatcher, which is its gateway.
The dispatcher rewrites the source IP address of the reply to A, recalculates the checksums, and forwards it to the client.
Disadvantages:
Checksums (IP and TCP) must be recalculated on every rewritten packet, in both directions.
All traffic flows through the dispatcher (bottleneck).
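
To make the rewrite-and-recalculate step concrete, here is a minimal sketch that changes the destination address of a 20-byte IPv4 header (no options assumed) and recomputes the header checksum. The function names are hypothetical. The TCP checksum, which also covers the IP addresses through its pseudo-header, would need the same treatment and is omitted here.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as used by the IPv4 header."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_destination(header: bytes, server_ip: str) -> bytes:
    """Replace the destination address and recompute the header checksum."""
    fields = bytearray(header)
    fields[16:20] = socket.inet_aton(server_ip)   # bytes 16-19: destination IP
    fields[10:12] = b"\x00\x00"                   # zero the checksum before summing
    struct.pack_into("!H", fields, 10, internet_checksum(bytes(fields)))
    return bytes(fields)
```

The reply path is symmetric: the dispatcher overwrites the source address (bytes 12-15) with A and recomputes the checksums again, which is the per-packet cost the slide lists as a disadvantage.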

18
Magicrouter

University of California at Berkeley, 1996
Fast Packet Interposing, implemented through kernel modifications
Load-sharing algorithms:
Round robin
Random
Incremental load
Backup:
Dispatcher: primary/backup model.
Server: ARP is used to map server IP addresses to MAC addresses, which also detects server failures.
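
The slide does not define "incremental load". The sketch below assumes one plausible reading: the dispatcher keeps the load each server last reported and adds a fixed increment for every connection it assigns in the meantime. The class name, method names, and increment value are all hypothetical.

```python
class IncrementalLoad:
    """Pick the server with the lowest estimated load (assumed policy)."""

    def __init__(self, servers, increment=1.0):
        self.load = {s: 0.0 for s in servers}   # estimated load per server
        self.increment = increment              # assumed cost of one new connection

    def report(self, server, measured_load):
        # A fresh measurement replaces the running estimate.
        self.load[server] = measured_load

    def pick(self):
        server = min(self.load, key=self.load.get)
        self.load[server] += self.increment     # account for the new connection
        return server
```

Round robin and random selection need no state beyond a counter or a random number generator, so they are not shown.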

19
LocalDirector (Cisco, 1996)

Load-sharing algorithms:
Least connections: choose the server with the fewest connections.
Fastest response: choose the server that responds to the request first.
Round-robin: strict round-robin policy.
Backup:
Dispatcher: an extra LocalDirector unit linked to the primary one with a special failover cable.
Server: The dispatcher contacts the servers periodically. A server that fails is removed from the pool but is still contacted; when it comes back up, it is added to the pool again.
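
The least-connections policy and the server pool maintenance described above can be sketched together. This is an illustrative model, not LocalDirector's implementation: `ServerPool` and its methods are hypothetical, and the health probe result is passed in as a boolean rather than measured.

```python
class ServerPool:
    """Live servers with connection counts, plus simple failure handling."""

    def __init__(self, servers):
        self.connections = {s: 0 for s in servers}  # live server -> open connections
        self.down = set()                           # failed servers still being probed

    def pick_least_connections(self):
        # Raises ValueError if no server is currently live.
        server = min(self.connections, key=self.connections.get)
        self.connections[server] += 1
        return server

    def health_check(self, server, alive: bool):
        if not alive:
            self.connections.pop(server, None)      # remove from the pool
            self.down.add(server)
        elif server in self.down:
            self.down.discard(server)               # recovered: add it back
            self.connections[server] = 0
```

A fuller version would also decrement the count when a connection closes.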

20
LSNAT

University of Nebraska-Lincoln
RFC 2391: Load Sharing using IP Network Address Translation (LSNAT)
Backup:
Dispatcher: One server is selected as the new dispatcher. A Distributed State Reconstruction Mechanism rebuilds the map of existing connections.
Server: A failed server is excluded from the pool of active servers. When it comes back up, it is included again.

21
L7 Clustering

Makes dispatch decisions based on the content of the request (application layer).
Content-based dispatching
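
A minimal sketch of content-based dispatching is a routing table keyed on the request path. The prefixes and server names below are invented for the example and do not come from the presentation.

```python
# Hypothetical routing table: URL path prefix -> server group.
ROUTES = [
    ("/images/", "image-server"),
    ("/cgi-bin/", "app-server"),
]
DEFAULT_SERVER = "static-server"


def dispatch(path: str) -> str:
    """Choose a server by inspecting the requested URL path."""
    for prefix, server in ROUTES:
        if path.startswith(prefix):
            return server
    return DEFAULT_SERVER
```

Unlike a layer 4 dispatcher, this decision can only be made after the connection is established and the request has been read.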

22
LARD

Locality-Aware Request Distribution, Rice University
It uses a TCP handoff protocol implemented in a modified kernel.
Different servers process different kinds of requests, which makes it possible to use specialized servers.
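
The idea of sending the same kind of request to the same server can be sketched as follows. This is a simplified illustration under stated assumptions, not the published LARD algorithm: a target is bound to the least-loaded server the first time it is seen and stays there, and any rebalancing rules the full scheme applies are left out. All names are hypothetical.

```python
class LocalityAwareDispatcher:
    """Bind each request target to one server so its cache stays warm."""

    def __init__(self, servers):
        self.load = {s: 0 for s in servers}  # requests sent to each server
        self.assigned = {}                   # target (e.g. URL) -> server

    def dispatch(self, target):
        if target not in self.assigned:
            # First request for this target: choose the least-loaded server.
            self.assigned[target] = min(self.load, key=self.load.get)
        server = self.assigned[target]
        self.load[server] += 1
        return server
```

Repeated requests for one target then hit a server that has likely already served it.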

23
Web Accelerator (IBM)

The accelerator performs content-based routing: it decides where to route each request based on the URL.
L7 built on L4/2
Web page caching
The dispatcher serves as a gateway/router.
All traffic flows through the dispatcher.

24
ArrowPoint

Content-based dispatching policy
Caching mechanism similar to Web Accelerator
Sticky connections
Hot standby for the dispatcher, plus a mechanism for detecting server node failures.

25
Conclusion (1/4)

L4/2 Clustering
Bottleneck: the dispatcher's capacity to process incoming requests.
Advantage: high sustainable request rate.
L4/3 Clustering
Bottleneck: recalculation of checksums.
L7 Clustering
Bottleneck: complexity of the content-based dispatching algorithm.
Advantage: localizing the request space and caching request results.

26
Conclusion (2/4)

Qualitative comparison
Client-based approach
Advantage: Reduces the load on the Web server by implementing the routing service on the client side.
Disadvantage: Not generally applicable, and it needs server-side cooperation.
Dispatcher-based approach
Advantage: Full control over client requests, which gives good load balancing. Easy to implement.
Disadvantage: Risk of a dispatcher bottleneck.

27
Conclusion (3/4)

DNS-based approach
Advantage: High scalability. No risk of bottleneck.
Disadvantages:
Because of address caching mechanisms, sophisticated algorithms are needed to achieve load balancing.
Fewer than 32 Web servers for each public URL because of the UDP packet size limit.
Server-based approach
Advantage: No risk of a single point of failure or bottleneck.
Disadvantage: Redirection increases latency for clients.
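
The limit of fewer than 32 servers in the DNS-based approach can be checked with rough arithmetic. The sketch assumes the classic 512-byte limit on DNS messages over UDP, a 12-byte header, one question, and address records whose names are compressed to 2-byte pointers (16 bytes per record). The hostname is only an example, and `max_a_records` is a hypothetical helper.

```python
UDP_DNS_LIMIT = 512   # classic maximum DNS message size over UDP, in bytes
HEADER_SIZE = 12      # fixed DNS header


def max_a_records(hostname: str) -> int:
    """Estimate how many IPv4 address records fit in one UDP response."""
    # Encoded name: one length byte per label, plus the terminating zero byte.
    qname = sum(len(label) + 1 for label in hostname.split(".")) + 1
    question = qname + 4                 # plus query type and class
    a_record = 2 + 2 + 2 + 4 + 2 + 4     # pointer, type, class, TTL, length, address
    return (UDP_DNS_LIMIT - HEADER_SIZE - question) // a_record
```

For `www.example.com` this gives 29 addresses, consistent with the slide's bound; longer hostnames leave room for slightly fewer.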

28
Conclusion (4/4)
