Title: Data Center Networks
1. Data Center Networks
- Jennifer Rexford
- COS 461 Computer Networks
- Lectures: MW 10-10:50am in Architecture N101
- http://www.cs.princeton.edu/courses/archive/spr12/cos461/
2. Networking Case Studies
Data Center
Backbone
Enterprise
Cellular
Wireless
3. Cloud Computing
4. Cloud Computing
- Elastic resources
- Expand and contract resources
- Pay-per-use
- Infrastructure on demand
- Multi-tenancy
- Multiple independent users
- Security and resource isolation
- Amortize the cost of the (shared) infrastructure
- Flexible service management
5. Cloud Service Models
- Software as a Service
- Provider licenses applications to users as a service
- E.g., customer relationship management, e-mail, ...
- Avoid costs of installation, maintenance, patches, ...
- Platform as a Service
- Provider offers platform for building applications
- E.g., Google's App Engine
- Avoid worrying about scalability of platform
6. Cloud Service Models
- Infrastructure as a Service
- Provider offers raw computing, storage, and network
- E.g., Amazon's Elastic Compute Cloud (EC2)
- Avoid buying servers and estimating resource needs
7. Enabling Technology: Virtualization
- Multiple virtual machines on one physical machine
- Applications run unmodified as on real machine
- VM can migrate from one computer to another
8. Multi-Tier Applications
- Applications consist of tasks
- Many separate components
- Running on different machines
- Commodity computers
- Many general-purpose computers
- Not one big mainframe
- Easier scaling
9. Multi-Tier Applications
[Figure: a front-end server fans out to several aggregators, and each aggregator fans out to several workers]
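The front end / aggregator / worker structure can be sketched as a scatter-gather search. This is a minimal illustration, not code from the lecture: the function names, the substring-matching "query", and the use of threads in place of separate machines are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def worker(shard, query):
    # Each worker looks only at its own shard of the data.
    return [item for item in shard if query in item]

def aggregator(shards, query):
    # An aggregator sends the query to all of its workers in parallel
    # and merges their partial results.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda s: worker(s, query), shards))
    return [hit for part in partials for hit in part]

def front_end(aggregator_shards, query):
    # The front end fans out to every aggregator and merges once more.
    results = []
    for shards in aggregator_shards:
        results.extend(aggregator(shards, query))
    return sorted(results)

# Two aggregators, each with two workers (hypothetical data).
data = [[["apple pie", "banana"], ["grape"]],
        [["apple tart"], ["cherry"]]]
print(front_end(data, "apple"))
```

Because every request touches many workers, the slowest worker tends to set the response time, which is why per-connection performance problems matter so much in this design.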
10. Data Center Network
11. Virtual Switch in Server
12. Top-of-Rack Architecture
- Rack of servers
- Commodity servers
- And top-of-rack switch
- Modular design
- Preconfigured racks
- Power, network, and storage cabling
13. Aggregate to the Next Level
14. Modularity, Modularity, Modularity
- Containers
- Many containers
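15. Data Center Network Topology
[Figure: tree topology with the Internet at the top, then core routers (CR), access routers (AR), Ethernet switches (S), and racks of application servers (A) at the bottom]
- Key
- CR: Core Router
- AR: Access Router
- S: Ethernet Switch
- A: Rack of app. servers
- 1,000 servers/pod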
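16. Capacity Mismatch
[Figure: the same tree topology (CR, AR, S, A) annotated with capacity ratios that grow toward the core: 200:1 near the core routers, 40:1 at the switch layer, and 5:1 just above the racks]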
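One way to read a capacity mismatch is as an oversubscription ratio: the total capacity entering a layer from below divided by the capacity leaving it toward the core. The sketch below works through that arithmetic. The link speeds and per-layer factors are hypothetical numbers chosen for illustration, not values from the lecture.

```python
from fractions import Fraction
from math import prod

def oversubscription(downlink_gbps, uplink_gbps):
    """Ratio of total downstream capacity to total upstream capacity."""
    return Fraction(downlink_gbps) / Fraction(uplink_gbps)

def cumulative(per_layer_ratios):
    """Per-layer ratios multiply on the way up the tree."""
    return prod(per_layer_ratios)

# Hypothetical rack: 40 servers at 1 Gbps each sharing 8 Gbps of uplink.
rack = oversubscription(40 * 1, 8)
print(f"rack level: {rack.numerator}:{rack.denominator}")

# Hypothetical per-layer factors from the rack up to the core.
total = cumulative([5, 8, 5])
print(f"end to end: {total}:1")

# Worst case, a 1 Gbps server gets 1/total of its line rate across the core.
print(f"worst-case share: {1000 / total} Mbps")
```

The point of the arithmetic is that modest ratios at each layer compound, so traffic that crosses the core can see a small fraction of the bandwidth available inside a rack.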
17. Data-Center Routing
[Figure: the same tree topology split into two domains: the core routers (CR) and access routers (AR) form DC-Layer 3, and the Ethernet switches (S) and racks (A) below them form DC-Layer 2]
- Key
- CR: Core Router (L3)
- AR: Access Router (L3)
- S: Ethernet Switch (L2)
- A: Rack of app. servers
- 1,000 servers/pod, one IP subnet
18. Reminder: Layer 2 vs. Layer 3
- Ethernet switching (layer 2)
- Cheaper switch equipment
- Fixed addresses and auto-configuration
- Seamless mobility, migration, and failover
- IP routing (layer 3)
- Scalability through hierarchical addressing
- Efficiency through shortest-path routing
- Multipath routing through equal-cost multipath
- So, like in enterprises
- Connect layer-2 islands by IP routers
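Equal-cost multipath can be sketched as hashing a flow's 5-tuple to choose among next hops of equal cost, so packets of one flow stay on one path while different flows spread out. This is an illustrative sketch: the function name, the SHA-256 hash (a stand-in for the cheaper hashes real routers use), and the addresses are assumptions.

```python
import hashlib

def ecmp_next_hop(flow, next_hops):
    """Pick one of several equal-cost next hops by hashing the flow's 5-tuple."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

# Hypothetical flow: (src IP, dst IP, protocol, src port, dst port).
flow = ("10.0.1.5", "10.0.9.7", 6, 43512, 80)
paths = ["AR1", "AR2", "AR3", "AR4"]

# The same flow always maps to the same next hop, which avoids reordering.
print(ecmp_next_hop(flow, paths))
```

Hashing per flow rather than per packet trades perfect balance for in-order delivery within each TCP connection.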
19. Case Study: Performance Diagnosis in Data Centers
- http://www.eecs.berkeley.edu/minlanyu/writeup/nsdi11.pdf
20. Applications Inside Data Centers
[Figure: a front-end server connected to aggregators, which are connected to workers]
21. Challenges of Datacenter Diagnosis
- Multi-tier applications
- Hundreds of application components
- Tens of thousands of servers
- Evolving applications
- Add new features, fix bugs
- Change components while app is still in operation
- Human factors
- Developers may not understand network well
- Nagle's algorithm, delayed ACK, etc.
22. Diagnosing in Today's Data Center
App logs: reqs/sec, response time, 1 req. > 200 ms delay
Packet trace: filter out trace for long-delay req.
Switch logs: bytes/pkts per minute
SNAP: diagnose net-app interactions
[Figure: host stack showing App, Packet sniffer, and OS]
23. Problems of Different Logs
App logs: application-specific
Packet trace: too expensive
Switch logs: too coarse-grained
SNAP: generic, fine-grained, and lightweight; runs everywhere, all the time
[Figure: host stack showing App, Packet sniffer, and OS]
24. TCP Statistics
- Instantaneous snapshots
- Bytes in the send buffer
- Congestion window size, receiver window size
- Snapshots based on random sampling
- Cumulative counters
- FastRetrans, Timeout
- RTT estimation: SampleRTT, SumRTT
- RwinLimitTime
- Calculate difference between two polls
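Cumulative counters only become useful once two polls are subtracted. The sketch below shows that step under stated assumptions: the class and field names are invented for the example, SampleRTT is treated as a count of RTT samples and SumRTT as their running sum in milliseconds, and the values are made up.

```python
from dataclasses import dataclass, astuple

@dataclass
class TcpCounters:
    # Cumulative per-connection counters (hypothetical field names).
    fast_retrans: int
    timeout: int
    sample_rtt: int          # number of RTT samples so far (assumed meaning)
    sum_rtt_ms: int          # running sum of RTT samples, in ms (assumed unit)
    rwin_limit_time_ms: int  # time spent limited by the receiver window

def diff(prev, curr):
    """Field-by-field difference between two polls of the same connection."""
    return TcpCounters(*(c - p for p, c in zip(astuple(prev), astuple(curr))))

prev = TcpCounters(2, 0, 100, 500, 0)
curr = TcpCounters(5, 1, 120, 900, 30)
delta = diff(prev, curr)

# Mean RTT over the polling interval, from the two RTT counters.
mean_rtt_ms = delta.sum_rtt_ms / delta.sample_rtt
print(delta, mean_rtt_ms)
```

Snapshots such as bytes in the send buffer are different: they describe one instant, so they are sampled rather than differenced.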
25. Identifying Performance Problems
- Sender App: not any other problems
- Send Buffer: send buffer is almost full
- Network: fast retransmission, timeout
- Receiver: RwinLimitTime; delayed ACK, flagged when diff(SumRTT)/diff(SampleRTT) > MaxDelay
- Detection methods: sampling, direct measure, inference
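The per-stage rules can be sketched as a small classifier over one polling interval. This is a simplified illustration, not SNAP's implementation: the function signature, the 0.9 "almost full" fraction, and the 100 ms default for MaxDelay are assumptions.

```python
def classify(send_buf_bytes, send_buf_size, d_fast_retrans, d_timeout,
             d_rwin_limit_time, d_sum_rtt, d_sample_rtt,
             max_delay_ms=100.0, full_fraction=0.9):
    """Return the stages that limited a connection during one interval.

    The d_* arguments are differences of cumulative counters between two polls.
    """
    problems = []
    if send_buf_bytes >= full_fraction * send_buf_size:
        problems.append("send buffer")          # snapshot, sampled
    if d_fast_retrans > 0 or d_timeout > 0:
        problems.append("network")              # measured directly
    if d_rwin_limit_time > 0:
        problems.append("receiver window")      # measured directly
    if d_sample_rtt > 0 and d_sum_rtt / d_sample_rtt > max_delay_ms:
        problems.append("delayed ACK")          # inferred from mean RTT
    # If nothing else explains the slowdown, attribute it to the sender app.
    return problems or ["sender app"]

print(classify(1000, 65536, 0, 0, 0, d_sum_rtt=4000, d_sample_rtt=20))
print(classify(60000, 65536, 1, 0, 0, d_sum_rtt=0, d_sample_rtt=0))
print(classify(0, 65536, 0, 0, 0, d_sum_rtt=100, d_sample_rtt=20))
```

The sender-app case is the fallback, which matches the "not any other problems" rule: it is inferred by elimination rather than observed.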
26. SNAP Architecture
At each host for every connection
Collect data
- Direct access to OS
- Polling per-connection statistics
- Snapshots (bytes in send buffer)
- Cumulative counters (FastRetrans)
- Adaptive tuning of polling rate
27. SNAP Architecture
At each host for every connection
Collect data
Performance Classifier
- Classifying based on the life of data transfer
- Algorithms for detecting performance problems
- Based on direct measurement in the OS
28. SNAP Architecture
At each host for every connection
Cross-connection correlation
Collect data
Performance Classifier
- Direct access to data center configurations
- Input
- Topology, routing information
- Mapping from connections to processes/apps
- Correlate problems across connections
- Sharing the same switch/link, app code
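Cross-connection correlation can be approximated by counting how many problem connections share each resource. The sketch below uses simple counting as a stand-in for whatever statistical test the real system applies; the function name, connection IDs, and resource labels are hypothetical.

```python
from collections import Counter

def correlate(problem_connections, conn_to_resources):
    """Rank shared resources by how many problem connections use them."""
    counts = Counter()
    for conn in problem_connections:
        counts.update(conn_to_resources.get(conn, ()))
    return counts.most_common()

# Hypothetical mapping built from topology, routing, and process information.
conn_to_resources = {
    "c1": ("switch:S3", "app:search"),
    "c2": ("switch:S3", "app:ads"),
    "c3": ("switch:S3", "app:search"),
    "c4": ("switch:S7", "app:mail"),
}
ranking = correlate(["c1", "c2", "c3"], conn_to_resources)
print(ranking[0])
```

A resource that appears in most of the problem connections, such as one switch or one application's code, is a natural first suspect.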
29. SNAP Deployment
- Production data center
- 8K machines, 700 applications
- Ran SNAP for a week, collected petabytes of data
- Identified 15 major performance problems
- Operators: characterize key problems in data center
- Developers: quickly pinpoint problems in app software, network stack, and their interactions
30. Characterizing Perf. Limitations
Apps that are limited for > 50% of the time
- Sender App (551 apps): bottlenecked by CPU, disk, etc.; slow due to app design (small writes)
- Send Buffer (1 app): send buffer not large enough
- Network (6 apps): fast retransmission, timeout
- Receiver: not reading fast enough (CPU, disk, etc.) (8 apps); not ACKing fast enough (delayed ACK) (144 apps)
31. Delayed ACK
- Delayed ACK caused significant problems
- Delayed ACK was used to reduce bandwidth usage and server interruption
- Delayed ACK should be disabled in data centers
[Figure: timeline between hosts A and B. When B has data to send, the ACK rides on B's data (Data+ACK). When B doesn't have data to send, B sends the ACK after 200 ms.]
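The cost of delayed ACK can be sketched with a toy timing model. It is deliberately simplified and rests on assumptions: the sender waits for each ACK before sending the next segment, the RTT is 1 ms, and the receiver either piggybacks the ACK on its own data or waits for the full 200 ms timer. Real TCP stacks also acknowledge every second segment, which this model ignores.

```python
def ack_delay_ms(receiver_has_data, delayed_ack=True, timer_ms=200):
    """Extra wait before the ACK leaves the receiver."""
    if not delayed_ack or receiver_has_data:
        return 0           # ACK sent at once or piggybacked on data
    return timer_ms        # nothing to piggyback on: wait for the timer

def transfer_time_ms(receiver_has_data_per_segment, rtt_ms=1, delayed_ack=True):
    """Total time when the sender waits for each ACK before the next segment."""
    return sum(rtt_ms + ack_delay_ms(has_data, delayed_ack)
               for has_data in receiver_has_data_per_segment)

# Four segments; the receiver has data to send back for only two of them.
pattern = [True, False, True, False]
print(transfer_time_ms(pattern, delayed_ack=True))   # 404
print(transfer_time_ms(pattern, delayed_ack=False))  # 4
```

With data center RTTs far below the timer, a single delayed ACK dominates the transfer time, which is the argument for disabling it there.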
32. Diagnosing Delayed ACK with SNAP
- Monitor at the right place
- Scalable, low overhead data collection at all hosts
- Algorithms to identify performance problems
- Identify delayed ACK with OS information
- Correlate problems across connections
- Identify the apps with significant delayed ACK issues
- Fix the problem with operators and developers
- Disable delayed ACK in data centers
33. Conclusion
- Cloud computing
- Major trend in IT industry
- Today's equivalent of factories
- Data center networking
- Regular topologies interconnecting VMs
- Mix of Ethernet and IP networking
- Modular, multi-tier applications
- New ways of building applications
- New performance challenges
34. Load Balancing
35. Load Balancers
- Spread load over server replicas
- Present a single public address (VIP) for a service
- Direct each request to a server replica
[Figure: Virtual IP (VIP) 192.121.10.1 in front of server replicas 10.10.10.1, 10.10.10.2, and 10.10.10.3]
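The VIP-to-replica mapping can be sketched as a small dispatcher. The addresses come from the slide, but the class, its method names, and the round-robin policy are assumptions: the lecture does not say which selection policy a load balancer uses.

```python
from itertools import cycle

class LoadBalancer:
    """Accepts requests for one virtual IP and hands each to a replica in turn."""

    def __init__(self, vip, replicas):
        self.vip = vip
        self._next_replica = cycle(replicas)

    def route(self, dst_ip):
        # Only traffic addressed to the VIP is handled here.
        if dst_ip != self.vip:
            raise ValueError(f"{dst_ip} is not served by this load balancer")
        return next(self._next_replica)

lb = LoadBalancer("192.121.10.1", ["10.10.10.1", "10.10.10.2", "10.10.10.3"])
chosen = [lb.route("192.121.10.1") for _ in range(4)]
print(chosen)
```

Round-robin ignores connection affinity; a real deployment would typically keep all packets of one connection on the same replica, for example by hashing the client address and port.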
36. Wide-Area Network
37. Wide-Area Network: Ingress Proxies