Internet-Scale Research at Universities - PowerPoint PPT Presentation

1
Internet-Scale Research at Universities
  • Panel Session
  • SAHARA Retreat, Jan 2002
  • Prof. Randy H. Katz,
  • Bhaskaran Raman,
  • Z. Morley Mao,
  • Yan Chen

2
Problem Statement
  • Overlay network for service composition
  • Want to study recovery algorithms
  • Lots of client sessions
  • Methodology for evaluation of design?
  • Simulation?
  • Slow; does not scale with the number of nodes or
    client sessions
  • Does not bring out processing bottlenecks
  • Real testbed?
  • Cannot be large: setup and management problems
  • Non-repeatable, not good for controlled design
    study

3
Our approach so far
  • Emulation platform
  • Real implementation of software, but emulation of
    n/w parameters
  • Inspired by NistNET
  • Developed our own user-level implementation
  • Gave us better control
  • Runs on the Millennium cluster of workstations
  • Central bottleneck: 20,000 pkts/sec

4
Parameters modeled
  • Overlay topology
  • Generate 6,510-node physical network using GT-ITM
  • Choose subset of nodes for overlay network
  • Latency modeling
  • Base latency according to edge weight
  • Variation modeled so that RTT spikes are
    isolated
  • Outage period
  • Using traces
  • Collected UDP-based measurements across 12 host
    pairs
  • Berkeley, Stanford, UNSW (Australia), UIUC,
    TU-Berlin (Germany), CMU
  • CDF of outage periods, used to model outage
    periods
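
As a rough illustration of the latency and outage modeling above, the following Python sketch draws outage periods from an empirical CDF by inverse-transform sampling and adds independent (isolated) spikes to a base latency. The outage durations, `spike_prob`, and `spike_factor` are made-up placeholders rather than values from the traces, and all function names are hypothetical.

```python
import bisect
import random


def build_empirical_cdf(samples):
    """Return (sorted values, cumulative probabilities) for observed samples."""
    values = sorted(samples)
    n = len(values)
    probs = [(i + 1) / n for i in range(n)]
    return values, probs


def sample_outage(values, probs, rng):
    """Inverse-transform sampling: pick the smallest value whose CDF >= u."""
    u = rng.random()
    idx = bisect.bisect_left(probs, u)
    return values[min(idx, len(values) - 1)]


def link_latency(base_ms, rng, spike_prob=0.01, spike_factor=5.0):
    """Base latency (the edge weight) with an occasional spike.

    Each call draws independently, so spikes do not come in bursts.
    spike_prob and spike_factor are assumptions for illustration only.
    """
    if rng.random() < spike_prob:
        return base_ms * spike_factor
    return base_ms


# Hypothetical outage durations in seconds (not measured data).
observed_outages = [1.2, 1.8, 2.0, 3.5, 4.1, 6.0, 9.7, 15.0, 31.0, 62.0]
values, probs = build_empirical_cdf(observed_outages)
rng = random.Random(42)
simulated = [sample_outage(values, probs, rng) for _ in range(5)]
```

Sampling directly from the measured CDF avoids fitting a parametric distribution, at the cost of never producing an outage longer than the longest one observed.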

5
My experience in Internet measurement
  • Goal
  • Collect client-to-local-DNS-server associations
  • to evaluate DNS-based server selection
  • Built a measurement infrastructure
  • Three components
  • 1x1 pixel embedded transparent GIF image
  • <img src="http://xxx.rd.example.com/tr.gif"
    height=1 width=1>
  • A specialized authoritative DNS server
  • Allows hostnames to be wild-carded
  • An HTTP redirector
  • Always responds with 302 Moved Temporarily
  • Redirect to a URL with client IP address embedded
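
The redirector component could be sketched roughly as below, using Python's standard `http.server`. This is not the original implementation: the `ip-a-b-c-d` hostname encoding, the `MEASUREMENT_DOMAIN` constant, and the function names are assumptions chosen for illustration, and only IPv4 clients are handled. The idea is that the redirect target is a fresh hostname under the wildcarded domain, so the specialized DNS server sees which local DNS server resolves the name that carries the client's address.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical domain answered by the wildcard-capable authoritative DNS server.
MEASUREMENT_DOMAIN = "rd.example.com"


def redirect_url(client_ip, path="/tr.gif"):
    """Build a URL whose hostname embeds the client's IPv4 address."""
    label = "ip-" + client_ip.replace(".", "-")
    return f"http://{label}.{MEASUREMENT_DOMAIN}{path}"


class Redirector(BaseHTTPRequestHandler):
    """Answers every GET with a 302 pointing at the per-client hostname."""

    def do_GET(self):
        client_ip = self.client_address[0]
        self.send_response(302, "Moved Temporarily")
        self.send_header("Location", redirect_url(client_ip, self.path))
        self.send_header("Content-Length", "0")
        self.end_headers()


def serve(port=8080):
    """Run the redirector until interrupted (port is an arbitrary choice)."""
    HTTPServer(("", port), Redirector).serve_forever()
```

For example, `redirect_url("192.0.2.7")` yields `http://ip-192-0-2-7.rd.example.com/tr.gif`.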

6
My experience in Internet measurement

7
My lessons
  • Common myths about Internet measurements
  • Measurements done from University sites are
    representative of the Internet
  • The following are good proximity metrics
  • AS hop count
  • Router hop count
  • I can just quote some measurement results from
    previous papers
  • Without carefully considering their applicability
  • A scalable measurement methodology helps ease of
    adoption

8
Content Distribution Network (CDN)
  • Dynamic clustering for efficient Web content
    replication
  • Use greedy algorithm for replica placement to
    reduce the response latency of end users
  • Trace-driven simulation to find optimal
    granularity of replication
  • Network Topology
  • Pure-random and transit-stub models from GT-ITM
  • A real AS-level topology from 7 widely-dispersed
    BGP peers
  • Real world traces
  • Cluster MSNBC Web clients by BGP prefix
  • BGP tables from a BBNPlanet router on
    01/24/2001
  • 10K clusters left; choose the top 10 covering
    >70% of requests
  • Cluster NASA Web clients by domain name
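
Clustering clients by BGP prefix amounts to a longest-prefix match of each client address against the routing table. A minimal sketch using Python's standard `ipaddress` module follows; the prefixes and client addresses are documentation-range placeholders, not entries from the BBNPlanet table, and the function names are hypothetical. A linear scan is used for clarity, whereas a full BGP table would call for a trie.

```python
import ipaddress
from collections import Counter


def longest_prefix(client_ip, prefixes):
    """Return the most specific network containing client_ip, or None."""
    addr = ipaddress.ip_address(client_ip)
    best = None
    for net in prefixes:
        if addr in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    return best


def cluster_requests(client_ips, prefix_strings):
    """Count requests per BGP prefix (one list entry per request)."""
    prefixes = [ipaddress.ip_network(p) for p in prefix_strings]
    counts = Counter()
    for ip in client_ips:
        match = longest_prefix(ip, prefixes)
        if match is not None:
            counts[str(match)] += 1
    return counts


def top_clusters(counts, k):
    """Return the k busiest clusters and the fraction of requests they cover."""
    total = sum(counts.values())
    top = counts.most_common(k)
    covered = sum(n for _, n in top)
    return [p for p, _ in top], (covered / total if total else 0.0)


# Placeholder data from the documentation address ranges.
table = ["192.0.2.0/24", "198.51.100.0/24", "198.51.100.128/25", "203.0.113.0/24"]
requests = ["192.0.2.1", "192.0.2.9", "198.51.100.200",
            "198.51.100.3", "203.0.113.5", "192.0.2.77"]
counts = cluster_requests(requests, table)
names, coverage = top_clusters(counts, 1)
```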

Web Site   Period      Duration   Total Requests   Requests/day
MSNBC      8-10/1999   10-11 am   10,284,735       1,469,248 (1 hr)
NASA       7/1995      All day    3,461,612        56,748
WorldCup   5-7/1998    All day    1,352,804,107    15,372,774
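
The slide does not spell out the greedy replica placement algorithm. One common formulation, assumed here, adds one replica site at a time, each time picking the site that most reduces the total request-weighted latency when every client cluster is served by its nearest replica. The latency matrix, demand figures, and names below are invented for illustration.

```python
def total_latency(replicas, latency, demand):
    """Request-weighted latency with each cluster served by its closest replica."""
    return sum(
        demand[c] * min(latency[c][r] for r in replicas)
        for c in demand
    )


def greedy_placement(candidates, latency, demand, k):
    """Choose up to k replica sites, one per round, minimizing total latency."""
    chosen = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        best = min(
            remaining,
            key=lambda s: total_latency(chosen + [s], latency, demand),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical latencies (ms) from client clusters A-C to candidate sites.
latency = {
    "A": {"s1": 10, "s2": 50, "s3": 40},
    "B": {"s1": 60, "s2": 15, "s3": 40},
    "C": {"s1": 70, "s2": 20, "s3": 30},
}
demand = {"A": 100, "B": 300, "C": 200}  # requests per cluster (made up)
placement = greedy_placement(["s1", "s2", "s3"], latency, demand, 2)
```

With this toy input the first round picks `s2` (it serves the two busiest clusters well) and the second picks `s1`, which covers cluster A.
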
9
Wide-area Network Distance Estimation
  • Problem formulation
  • Given N end hosts that belong to different
    administrative domains, how to select a subset of
    them to be probes and build an overlay distance
    estimation service without knowing the underlying
    topology?
  • Solution: Internet Iso-bar
  • Cluster hosts that perceive similar performance
    to the Internet; select a monitor for each
    cluster for active and continuous probing
  • Clustering with congestion/path outage
    correlation
  • Evaluate the prediction accuracy and stability
  • Evaluation Methodology (I)
  • NLANR AMP data set
  • 119 sites in the US (106 after filtering out
    sites that were mostly off)
  • Traceroute between every pair of hosts every
    minute
  • Clustering uses daily geometric mean of
    round-trip time (RTT)
  • Raw data: 6/24/00 to 12/3/01
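
To make the clustering step more concrete, here is a hedged Python sketch: it computes a daily geometric mean of RTT samples, groups hosts whose RTT vectors (to a common set of targets) are close, and picks one monitor per cluster. The threshold-based grouping, the Euclidean dissimilarity, and the medoid rule for choosing the monitor are assumptions for illustration; the slides do not specify these details, and the congestion/path outage correlation variant is not shown.

```python
import math


def geometric_mean(rtts_ms):
    """Geometric mean of positive RTT samples; less sensitive to spikes than the mean."""
    return math.exp(sum(math.log(r) for r in rtts_ms) / len(rtts_ms))


def dissimilarity(a, b):
    """Euclidean distance between two hosts' RTT vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_hosts(vectors, threshold):
    """A host joins the first cluster whose members are all within threshold."""
    clusters = []
    for host in sorted(vectors):
        for cluster in clusters:
            if all(dissimilarity(vectors[host], vectors[m]) <= threshold
                   for m in cluster):
                cluster.append(host)
                break
        else:
            clusters.append([host])
    return clusters


def pick_monitor(cluster, vectors):
    """Choose the medoid: the member closest, in total, to the other members."""
    return min(
        cluster,
        key=lambda h: sum(dissimilarity(vectors[h], vectors[m])
                          for m in cluster if m != h),
    )


# Hypothetical daily geometric-mean RTTs (ms) from each host to three targets.
vectors = {
    "h1": [20.0, 80.0, 150.0],
    "h2": [22.0, 78.0, 155.0],
    "h3": [120.0, 30.0, 60.0],
    "h4": [118.0, 33.0, 58.0],
    "h5": [21.0, 82.0, 149.0],
}
clusters = cluster_hosts(vectors, threshold=10.0)
monitors = [pick_monitor(c, vectors) for c in clusters]
```

Only the monitors would then probe actively and continuously, and their measurements stand in for the other members of their clusters.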

10
Evaluation Methodology (II)
  • Keynote Website Perspective benchmarking
  • Measure Web site performance from more than 100
    agents
  • Heterogeneous core network: various ISPs
  • Heterogeneous access network
  • Dial-up 56K, DSL, and high-bandwidth business
    connections
  • Agent locations
  • America (including Canada, Mexico): 67 agents in
    29 cities from 15 ISPs
  • Europe: 25 agents in 12 cities from 16 ISPs
  • Asia: 8 agents in 6 cities from 8 ISPs
  • Australia: 3 agents in 3 cities from 3 ISPs
  • 40 most popular Web servers for benchmarking
  • Side problem: how to reduce the number of agents
    and/or servers while still representing the
    majority of end-user performance over a
    reasonably long period?

11
Discussion: Difficulties of Internet measurement
  • Results vary greatly depending on your
    measurement methodology
  • The number and identity of sites you measure
  • Commercial vs. educational sites
  • Your measurement location
  • Well-connected site vs. dialup site
  • Backbone vs. access network, server vs. client
  • Time when measurement is taken
  • Time of day, day of year
  • Transient effects
  • E.g., Network congestion, flash crowd
  • Frequency of measurements (for correlation
    studies)
  • Intrusiveness of the measurement
  • Does the measurement affect what you are
    measuring?

12
Discussion: Issues with Emulation
  • Emulation platform: modeling correlations in n/w
    behavior
  • What happens in one part of the Internet may have
    non-zero correlation with behavior of another
    part
  • Scale of topology
  • We have O(100) machines in department
  • O(1500) machines on campus
  • Is this believable?