This paper presents several key techniques for - PowerPoint PPT Presentation

Slides: 24
Provided by: eceEng
Transcript and Presenter's Notes

2
Introduction
  • This paper presents several key techniques for
    designing Web sites that need to handle large
    request volumes and provide high availability.
  • It also gives an overview of how many of these
    techniques were deployed at the official Web
    site for the 1998 Olympic Winter Games in
    Nagano, Japan.

3
Topics of Discussion
  • Redundant Hardware and Load Balancing
  • Web Server Acceleration
  • Efficient Management of Dynamic Data
  • 1998 Olympic Games Site

4
Topic One Redundant Hardware and Load Balancing
  • Redundant hardware: multiple servers running on
    different computers (143 processors for the 1998
    Olympic Games site)
  • Load balancing
  • Round-robin Domain Name Server (RR-DNS) approach:
    a single domain name is associated with multiple
    IP addresses. Client requests specifying the
    domain name are mapped to servers in round-robin
    fashion
  • Problems
  • Server-side caching: load imbalance even with a
    specified TTL
  • Client-side caching: load imbalance, lower mean
    loads
  • Node failures: difficult to provide availability
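
The round-robin mapping and the caching problem listed above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class `RoundRobinDNS`, the function `simulate`, the three addresses, and the per-client request counts are all hypothetical, and "caching" is modeled simply as each client reusing the first answer it receives.

```python
from collections import Counter
from itertools import cycle


class RoundRobinDNS:
    """Hands out the IP addresses registered for one domain name in rotation."""

    def __init__(self, addresses):
        self._rotation = cycle(addresses)

    def resolve(self):
        return next(self._rotation)


def simulate(requests_per_client, cache_answers):
    """Return requests received per address for the given clients.

    With cache_answers=True each client resolves the name once and reuses
    the answer, which models a cached name-to-address mapping.
    """
    dns = RoundRobinDNS(["10.0.0.1", "10.0.0.2", "10.0.0.3"])  # hypothetical
    load = Counter()
    for n_requests in requests_per_client:
        cached = None
        for _ in range(n_requests):
            if cached is None or not cache_answers:
                cached = dns.resolve()
            load[cached] += 1
    return dict(load)


clients = [100, 5, 5]  # one busy client and two light ones (made-up numbers)
no_cache = simulate(clients, cache_answers=False)
with_cache = simulate(clients, cache_answers=True)
print(no_cache)    # requests are spread almost evenly
print(with_cache)  # the busy client's server receives most of the load
```

Without caching the three servers receive nearly equal shares; with caching, every request from the busy client lands on the one address it cached, which is the kind of imbalance the slide refers to.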

5
  • TCP routing: a node of the server cluster acts as
    a router, forwarding client requests to server
    nodes in the cluster in round-robin order
  • Advantages
  • Can use different load-based algorithms
  • Detects Web server node failures
  • Achieves good scalability (when combined with
    RR-DNS)
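
A minimal sketch of round-robin routing with failure detection, assuming the router keeps a list of nodes and a set of nodes currently believed to be alive. The class `TCPRouter`, its method names, and the node names are hypothetical; a real TCP router forwards packets rather than returning names.

```python
class TCPRouter:
    """Chooses a server node for each request in round-robin order,
    skipping nodes that have been marked as failed."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.alive = set(self.nodes)
        self._next = 0

    def mark_down(self, node):
        self.alive.discard(node)

    def mark_up(self, node):
        if node in self.nodes:
            self.alive.add(node)

    def route(self):
        # Try each node at most once per request.
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next]
            self._next = (self._next + 1) % len(self.nodes)
            if node in self.alive:
                return node
        raise RuntimeError("no server nodes available")


router = TCPRouter(["web1", "web2", "web3"])
before = [router.route() for _ in range(3)]
router.mark_down("web2")  # e.g. a health check failed
after = [router.route() for _ in range(4)]
print(before)  # ['web1', 'web2', 'web3']
print(after)   # ['web1', 'web3', 'web1', 'web3']
```

Because the decision is made per request at the router, a failed node stops receiving traffic immediately, which RR-DNS alone cannot guarantee once clients have cached an address.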

6
Topic One Redundant Hardware and Load Balancing
  • Commercially available TCP routers
  • IBM, Radware, Resonate, and Cisco
  • Cisco's LocalDirector: the packets returned from
    the servers go through the router
  • IBM's Network Dispatcher (ND)
  • Running under an embedded OS, it can route 10,000
    HTTP requests/sec

7
Topic One Redundant Hardware and Load Balancing
  • An embedded OS improves router performance by
    optimizing the TCP communications stack and
    eliminating the scheduler and interrupt
    processing overheads of a general-purpose OS
  • Requests can be routed with an affinity toward a
    specific server. This avoids generating multiple
    session keys for requests encrypted using Secure
    Sockets Layer (SSL)
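
The affinity idea can be sketched as a router that remembers which server each client was first sent to. The slides do not describe how Network Dispatcher implements affinity, so the class `AffinityRouter`, the use of the client IP address as the key, and the server names are assumptions made for illustration only.

```python
from itertools import cycle


class AffinityRouter:
    """Round-robin for new clients, sticky for clients seen before."""

    def __init__(self, servers):
        self._rotation = cycle(servers)
        self._affinity = {}  # client address -> server chosen earlier

    def route(self, client_ip):
        if client_ip not in self._affinity:
            self._affinity[client_ip] = next(self._rotation)
        return self._affinity[client_ip]


router = AffinityRouter(["web1", "web2"])
first = router.route("192.0.2.10")
other = router.route("192.0.2.20")
again = router.route("192.0.2.10")
print(first, other, again)  # web1 web2 web1
```

Since the returning client reaches the same server, a session key negotiated there can be reused instead of being generated again on a different node.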

8
Topic Two Web server Accelerators
  • Description: a Web site cache that serves
    frequently requested pages

9
Topic Two Web server Accelerators
  • IBM's Web server accelerator
  • Runs under an embedded operating system
  • Serves pages at a high request rate (5,000
    pages/sec)
  • An API allows caching of dynamic pages
  • To reduce cache miss overhead, persistent TCP
    connections need to be kept between the cache and
    the server
  • Operates in one of two modes: transparent or
    dynamic

10
Topic Two Web server Accelerators
  • Performance: cache on a uniprocessor 200-MHz
    PowerPC, using least recently used (LRU) for
    cache replacement.
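
LRU replacement, mentioned above, can be sketched with Python's `collections.OrderedDict`. This is an illustration of the policy only, not the accelerator's implementation; the class `LRUPageCache`, the capacity of two pages, and the URLs are hypothetical.

```python
from collections import OrderedDict


class LRUPageCache:
    """Fixed-capacity page cache that evicts the least recently used page."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._pages = OrderedDict()  # oldest entry first, newest last

    def get(self, url):
        if url not in self._pages:
            return None  # cache miss: the caller would fetch from the server
        self._pages.move_to_end(url)  # mark as most recently used
        return self._pages[url]

    def put(self, url, body):
        self._pages[url] = body
        self._pages.move_to_end(url)
        if len(self._pages) > self.capacity:
            self._pages.popitem(last=False)  # drop the least recently used


cache = LRUPageCache(capacity=2)
cache.put("/home", "H")
cache.put("/medals", "M")
cache.get("/home")          # /home is now more recent than /medals
cache.put("/results", "R")  # evicts /medals
print(cache.get("/medals"))  # None
```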

11
Topic Three Efficient Dynamic Data Serving
  • Dynamic pages are essential at sites that provide
    frequently changing data, but the CPU overhead
    associated with repeatedly generating them can
    cause a performance bottleneck.
  • Caching can improve performance, but it raises
    the problem of determining which pages to cache
    and when they become obsolete.

12
Topic Three Efficient Dynamic Data Serving
  • Data Update Propagation (DUP) was developed for
    cache management.
  • DUP maintains dependencies between cached objects
    and underlying data
  • A trigger monitor program detects changes to the
    data, and the system invalidates or updates
    cached objects that are obsolete

13
Topic Three Efficient Dynamic Data Serving
  • Dependencies are represented by a directed graph,
    the object dependence graph (ODG), in which a
    vertex usually represents an object or underlying
    data. An edge from a vertex v to another vertex u
    indicates that a change to v also affects u.
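
The ODG described above can be sketched as an adjacency map plus a graph traversal that finds everything reachable from a changed vertex. This is a simplified illustration, not the DUP algorithm as deployed: the names `ObjectDependenceGraph` and `propagate_update`, the vertex labels, and the choice to invalidate rather than regenerate stale pages are all assumptions.

```python
from collections import defaultdict, deque


class ObjectDependenceGraph:
    """Directed graph in which an edge v -> u means a change to v affects u."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add_dependency(self, source, dependent):
        self._edges[source].add(dependent)

    def affected_by(self, changed):
        """Return every vertex reachable from `changed` (breadth-first)."""
        seen = set()
        queue = deque([changed])
        while queue:
            vertex = queue.popleft()
            for dependent in self._edges.get(vertex, ()):
                if dependent not in seen:
                    seen.add(dependent)
                    queue.append(dependent)
        return seen


def propagate_update(graph, cache, changed):
    """Invalidate the cached objects affected by a change; return their names."""
    stale = {name for name in graph.affected_by(changed) if name in cache}
    for name in stale:
        del cache[name]
    return stale


graph = ObjectDependenceGraph()
graph.add_dependency("db:ski_jump_results", "fragment:ski_jump_table")
graph.add_dependency("fragment:ski_jump_table", "page:/ski_jump.html")
graph.add_dependency("fragment:ski_jump_table", "page:/home.html")
graph.add_dependency("db:hockey_results", "page:/hockey.html")

cache = {
    "page:/ski_jump.html": "<html>ski jump</html>",
    "page:/home.html": "<html>home</html>",
    "page:/hockey.html": "<html>hockey</html>",
}
stale = propagate_update(graph, cache, "db:ski_jump_results")
print(sorted(stale))  # ['page:/home.html', 'page:/ski_jump.html']
print(sorted(cache))  # ['page:/hockey.html']
```

Only pages that actually depend on the changed data are removed, so unrelated pages such as the hockey page stay cached.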

14
Topic Three Efficient Dynamic Data Serving
  • Interfaces for creating dynamic data
  • The interface for invoking server programs that
    create dynamic pages has a significant effect on
    performance
  • Common Gateway Interface (CGI) creates a new
    process to handle each request, which incurs
    considerable overhead
  • FastCGI establishes long-running processes to
    which the Web server passes requests, but incurs
    some communication overhead between the Web
    server and the processes

15
Topic Three Efficient Dynamic Data Serving
  • IBM's GWAPI, Netscape's NSAPI, and Microsoft's
    ISAPI, as well as Apache's modules, all run server
    tasks in separate threads. Unfortunately, these
    interfaces can be tricky to use in practice, with
    issues such as portability, thread safety, and
    memory management.
  • More recent approaches, such as IBM's JSP,
    Microsoft's ASP, Java Servlets, and Apache's
    mod_perl, hide those interfaces and the issues of
    thread safety and also provide built-in garbage
    collection, thus easing program creation,
    maintenance, and portability.

16
Topic Four 1998 Olympic Games Site
  • The 1998 Winter Games Web site's architecture was
    an outgrowth of experience with the 1996 Summer
    Games Web site.
  • A key objective for the 1998 site was to reduce
    hits by giving clients the information on the
    home page for the current day. Redesign of the
    pages led to at least a three-fold decrease in
    the hit rate. The 1998 server logs suggest that
    more than 25% of the users found the information
    with a single hit.
17
Topic Four 1998 Olympic Games Site
  • Site Architecture
  • Utilized 13 IBM Scalable PowerParallel (SP2)
    systems at four complexes scattered around the
    world, containing 143 processors, 78 Gbytes of
    memory, and more than 2.4 Tbytes of disk space,
    for high performance and availability. 100%
    availability was achieved by using replicated
    information and redundant hardware.
  • Dynamic pages were created via the FastCGI
    interface and cached using the DUP algorithm,
    achieving cache hit rates of better than 97%. The
    1996 Web site did not employ DUP: many current
    pages were invalidated in the process of ensuring
    that all stale pages were removed, and the hit
    rates were around 80%.
  • Prefetching is another key component in
    achieving near-100% hit rates

18
Topic Four 1998 Olympic Games Site
  • System Architecture
  • Web pages were served from four locations:
    Schaumburg, Illinois (4 SP2s); Columbus, Ohio
    (3 SP2s); Bethesda, Maryland (3 SP2s); and Tokyo,
    Japan (3 SP2s).
  • Each SP2 was composed of 10 RISC/6000
    uniprocessors (UPs) and 1 RISC/6000 8-way SMP.
    Each UP had 512 Mbytes of memory and approximately
    18 Gbytes of disk space. Each SMP had 1 Gbyte of
    memory and approximately 6 Gbytes of disk space.
    Numerous machines at each location were also
    dedicated to maintenance, support, file serving,
    networking, routing, and various other functions
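
The per-SP2 figures above can be checked against the site-wide totals quoted earlier (13 SP2 systems, 143 processors, 78 Gbytes of memory, more than 2.4 Tbytes of disk). The short calculation below is a worked check, not part of the original slides. It assumes 512 Mbytes equals 0.5 Gbyte and counts each machine once; the 143 figure matches only under that counting, with each 8-way SMP counted as a single unit.

```python
# SP2 systems per location, as listed on the slide.
SP2_PER_SITE = {"Schaumburg": 4, "Columbus": 3, "Bethesda": 3, "Tokyo": 3}

UPS_PER_SP2 = 10       # uniprocessor machines per SP2
SMPS_PER_SP2 = 1       # 8-way SMP machines per SP2
UP_MEMORY_GB = 0.5     # 512 Mbytes, assuming 1 Gbyte = 1024 Mbytes
SMP_MEMORY_GB = 1.0
UP_DISK_GB = 18        # approximate
SMP_DISK_GB = 6        # approximate

total_sp2 = sum(SP2_PER_SITE.values())
total_machines = total_sp2 * (UPS_PER_SP2 + SMPS_PER_SP2)
total_memory_gb = total_sp2 * (
    UPS_PER_SP2 * UP_MEMORY_GB + SMPS_PER_SP2 * SMP_MEMORY_GB
)
total_disk_gb = total_sp2 * (
    UPS_PER_SP2 * UP_DISK_GB + SMPS_PER_SP2 * SMP_DISK_GB
)

print(total_sp2)        # 13
print(total_machines)   # 143
print(total_memory_gb)  # 78.0
print(total_disk_gb)    # 2418, i.e. a little over 2.4 Tbytes
```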

19
Topic Four 1998 Olympic Games Site
  • Data flow from Nagano to the internet and the
    scoring

20
Topic Four 1998 Olympic Games Site
  • Data flow from the master database to the
    internet server

21
Topic Four 1998 Olympic Games Site
  • Local load balancing and high availability
  • IBM's Network Dispatchers were used as load
    balancers (LBs)
  • LB servers ran the gated routing daemon, which
    was configured to advertise IP addresses as
    routes to the routers via a dynamic routing
    protocol. Each LB was assigned a different cost
    based on whether it was the primary or secondary
    server for an IP address. The secondary server
    had the higher cost.
  • The routers redistributed these routes into the
    network. Incoming requests went to the LB that
    was the primary source for the requested address
    at the closest complex. Only if the primary LB
    was down would the request go to the secondary
    LB.
  • Each LB server was connected to a pool of
    front-end Web servers dispersed among the SP2s at
    each site. Traffic was distributed among the Web
    servers based on information from advisors.
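
The primary/secondary arrangement can be sketched as lowest-cost route selection. This models only the decision, not the routing protocol itself: the `Route` record, the function `choose_lb`, the LB names, the address, and the cost values are hypothetical, and a failed LB is modeled by passing its name in `down` rather than by withdrawing its route advertisement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    lb: str        # load balancer advertising the route
    address: str   # IP address being advertised
    cost: int      # lower cost is preferred


def choose_lb(routes, address, down=frozenset()):
    """Return the reachable LB with the lowest cost for `address`, or None."""
    candidates = [
        route for route in routes
        if route.address == address and route.lb not in down
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda route: route.cost).lb


routes = [
    Route(lb="lb-primary", address="192.0.2.1", cost=10),
    Route(lb="lb-secondary", address="192.0.2.1", cost=20),
]
normal = choose_lb(routes, "192.0.2.1")
failover = choose_lb(routes, "192.0.2.1", down={"lb-primary"})
print(normal)    # lb-primary
print(failover)  # lb-secondary
```

Giving the secondary a higher cost means it receives traffic only when the primary's route is unavailable, which matches the failover behavior described on the slide.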

23
Summary
  • Since the 1998 Olympics, these technologies have
    been deployed at highly accessed sites, including
    the Web sites for the 1998 and 1999 Wimbledon
    tennis tournaments
  • The 1999 Wimbledon site made extensive use of Web
    server acceleration technology that was not ready
    for the 1998 Olympic site
  • The 1999 Wimbledon site received 942 million hits
    over 14 days, with peak hit rates of 430,000/min
    and 125 million/day. The 1998 Olympic site
    received 643.7 million requests over 16 days,
    with peak hit rates of 110,000/min and 57
    million/day.