Title: Berkeley RAD Lab: Research in Internet-scale Computing Systems
1. Berkeley RAD Lab: Research in Internet-scale Computing Systems
- Randy H. Katz
- randy@cs.berkeley.edu
- 28 March 2007
2. Five Year Mission
- Observation: Internet systems are complex, fragile, manually managed, and evolving rapidly
- To scale eBay, you must build an eBay-sized company
- To scale YouTube, get acquired by a Google-sized company
- Mission: Enable a single person to create, evolve, and operate the next-generation IT service
- The "Fortune 1 Million," by enabling rapid innovation
- Approach: Create core technology spanning systems, networking, and machine learning
- Focus: Make the datacenter easier to manage, enabling one person to Analyze, Deploy, and Operate a scalable IT service
3. Jan 07 Announcements by Microsoft and Google
- Microsoft and Google race to build next-gen DCs
- Microsoft announces a $550 million DC in TX
- Google confirms plans for a $600 million site in NC
- Google: two more DCs in SC may cost another $950 million -- about 150,000 computers each
- Internet DCs are the next computing platform
- Power availability drives deployment decisions
4. Datacenter is the Computer
- Google "program": Web search, Gmail, ...
- Google "computer": the datacenter
- "Warehouse-sized facilities and workloads likely more common" (Luiz Barroso's talk at RAD Lab, 12/11/06)
Sun Project Blackbox (10/17/06)
- Compose a datacenter from 20 ft. shipping containers!
- Power/cooling for 200 kW
- External taps for electricity, network, cold water
- 250 servers, 7 TB DRAM, or 1.5 PB disk in 2006
- 20% energy savings
- 1/10th? the cost of a building
5. Datacenter Programming System
- Ruby on Rails: open-source Web framework "optimized for programmer happiness and sustainable productivity"
- Convention over configuration (model sketch below)
- Scaffolding: automatic, Web-based UI to stored data
- Program the client: write browser-side code in Ruby, compile to Javascript
- Duck Typing / Mix-Ins
- Proven expressiveness
- Lines of code, Java vs. RoR: 3:1
- Lines of configuration, Java vs. RoR: 10:1
- More than a fad: Java on Rails, Python on Rails, ...
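To make "convention over configuration" concrete, here is a minimal ActiveRecord-style model sketch as it would sit inside a Rails application (illustrative only, not code from the RAD Lab prototypes): the class name alone determines the database table, and one-line declarations replace configuration files.

```ruby
# app/models/order.rb -- minimal ActiveRecord-style model sketch.
# By convention, class Order maps to an "orders" table and its columns
# become attributes automatically; no XML or separate config files.
class Order < ActiveRecord::Base
  belongs_to :customer        # convention: expects an orders.customer_id column
  has_many   :line_items      # convention: expects a line_items.order_id column

  validates_presence_of :customer_id   # one-line declarative validation
end
```

Scaffolding then generates a basic Web UI (listings and forms) over such a model, which is where the code and configuration savings over a comparable Java stack come from.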
6. Datacenter Synthesis OS
- Synthesis: change the DC via a written specification
- DC Spec Language compiled to a logical configuration (hypothetical sketch below)
- OS: allocate, monitor, and adjust during operation
- Director uses machine learning; Drivers send commands
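The slide does not show the DC Spec Language itself. Purely as a hypothetical illustration of "change the DC via a written specification," a spec could be written as plain data and lowered by a trivial compiler pass into an initial logical configuration for the Director to adjust at runtime; every name and field below is invented for illustration.

```ruby
# Hypothetical sketch only -- not the actual DC Spec Language.
SPEC = {
  service: "web_service",
  tiers: {
    frontend: { instances: 4..40, sla_latency_ms: 200 },  # Director may scale within this range
    storage:  { replicas: 3, may_spin_down: true }        # energy-saving actions allowed
  }
}

# Trivial "compiler" pass: lower the spec to an initial logical
# configuration, starting each tier at its minimum size.
def compile(spec)
  spec[:tiers].map do |name, tier|
    count = tier[:instances] ? tier[:instances].first : tier[:replicas]
    { tier: name, instances: count }
  end
end

p compile(SPEC)   # minimum-size starting configuration for each tier
```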
7. System + Statistical Machine Learning
- S2ML strengths
- Handles SW churn: train rather than hand-write the logic (toy example below)
- Beyond queuing models: learns how to handle/make policy between steady states
- Beyond control theory: copes with complex cost functions
- Discovery: finding trends, needles in the data haystack
- Exploits cheap processing: advances fast enough to run online
- S2ML as an integral component of the DC OS
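A toy illustration of "train rather than hand-write the logic" (plain Ruby, invented numbers): instead of hard-coding a scale-up rule, a threshold is derived from observed utilization and SLA data, so it can simply be re-learned when the software underneath churns.

```ruby
# Toy example: learn a scale-up threshold from observations instead of
# hand-writing "add a server when CPU > 70%".
observations = [   # [cpu_utilization, sla_violated?] -- invented data
  [0.55, false], [0.62, false], [0.71, false],
  [0.78, true],  [0.83, true],  [0.90, true]
]

ok_max  = observations.reject { |_, bad| bad }.map(&:first).max  # highest utilization that met the SLA
bad_min = observations.select { |_, bad| bad }.map(&:first).min  # lowest utilization that violated it
threshold = (ok_max + bad_min) / 2.0

puts "learned scale-up threshold: #{threshold}"
```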
8. Datacenter Monitoring
- S2ML needs data to analyze
- DC components already come with sensors
- CPUs (performance counters)
- Disks (SMART interface)
- Add sensors to software
- Log files
- DTrace for Solaris, Mac OS
- Trace 10K nodes within and between DCs
- Trace: app-oriented path recording framework
- X-Trace: cross-layer/cross-domain, including the network layer
9. Middleboxes in Today's DC
- Middleboxes inserted on the physical path
- Policy via plumbing
- Weakest link: single point of failure and bottleneck
- Expensive to upgrade and to introduce new functionality
- Identity-based Routing Layer: policy, not plumbing, to route classified packets to the appropriate middlebox services
[Figure: high-speed network with intrusion detector, load balancer, and firewall middleboxes on the path]
10. First Milestone: DC Energy Conservation
- DCs are limited by power
- For each dollar spent on servers, add $0.48 (2005) / $0.71 (2010) for power and cooling
- $26B spent to power and cool servers in 2005, growing to $45B in 2010
- Attractive application of S2ML
- Bringing processor resources on/off-line: dynamic environment, complex cost function, measurement-driven decisions (decision-rule sketch below)
- Preserve 100% of Service Level Agreements
- Don't hurt hardware reliability
- Then conserve energy
- Conserve energy and improve reliability
- MTTF: stress of on/off cycles vs. benefits of off-hours
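The milestone's ordering of concerns can be written as an explicit decision rule. The sketch below is illustrative only (invented structures and numbers, not the RAD Lab controller): a node may be powered down only if the SLA is still met without it and its power-cycle budget is not exhausted.

```ruby
# Illustrative sketch: preserve SLAs first, protect hardware second,
# and only then conserve energy.
Node = Struct.new(:id, :cycles_today)

CYCLE_BUDGET_PER_DAY = 24   # assumed per-node reliability limit (illustrative)

def may_power_down?(node, active_nodes, predicted_demand, capacity_per_node)
  remaining_capacity = (active_nodes - [node]).size * capacity_per_node
  return false if remaining_capacity < predicted_demand        # would violate the SLA
  return false if node.cycles_today >= CYCLE_BUDGET_PER_DAY    # would stress the hardware
  true                                                         # safe to save energy
end

nodes = (1..10).map { |i| Node.new(i, rand(0..30)) }
candidates = nodes.select { |n| may_power_down?(n, nodes, 600, 100) }
puts "could power down #{candidates.size} of #{nodes.size} nodes"
```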
11. DC Networking and Power
- Within DC racks, network equipment is often the hottest component in the hot spot
- Network opportunities for power reduction
- Transition to higher-speed interconnects (10 Gb/s) at DC scales and densities
- High-function/high-power assists embedded in network elements (e.g., TCAMs)
12. Thermal Image of Typical Cluster Rack
M. K. Patterson, A. Pratt, P. Kumar, "From UPS to Silicon: An End-to-End Evaluation of Datacenter Efficiency," Intel Corporation
13. DC Networking and Power
- Selectively power down ports/portions of network elements (hypothetical sketch below)
- Enhanced power-awareness in the network stack
- Power-aware routing and support for system virtualization
- Support for datacenter slice power-down and restart
- Application- and power-aware media access/control
- Dynamic selection of full/half duplex
- Directional asymmetry to save power, e.g., 10 Gb/s send, 100 Mb/s receive
- Power-awareness in applications and protocols
- Hard state (proxying), soft state (caching), protocol/data streamlining for power as well as bandwidth reduction
- Power implications for topology design
- Tradeoffs in redundancy/high availability vs. power consumption
- VLAN support for power-aware system virtualization
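Purely as a hypothetical sketch of the first idea above, with the redundancy-vs-power tradeoff made explicit (this is not a real switch API; all names are invented):

```ruby
# Hypothetical sketch: sleep idle ports, but never drop below a minimum
# number of active ports and never sleep a port whose redundant pair is down.
Port = Struct.new(:name, :utilization, :redundant_pair_up)

def ports_to_sleep(ports, min_active)
  idle = ports.select { |p| p.utilization < 0.05 && p.redundant_pair_up }
  idle.take([ports.size - min_active, 0].max)   # keep at least min_active ports up
end

ports = [Port.new("ge-0/0/1", 0.01, true),
         Port.new("ge-0/0/2", 0.40, true),
         Port.new("ge-0/0/3", 0.02, false)]
p ports_to_sleep(ports, 2).map(&:name)          # => ["ge-0/0/1"]
```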
14. Why University Research?
- Imperative that future technical leaders learn to deal with scale in modern computing systems
- Draws on talented but inexperienced people
- Pick from a worldwide talent pool for students and faculty
- They don't know what they can't do
- Inexpensive -- allows focus on speculative ideas
- Mostly grad student salaries
- Faculty part time
- Tech transfer engine
- Success: train students to go forth and replicate
- Promiscuous publication, including source code
- Ideal launching point for startups
15. Why a New Funding Model?
- DARPA has exited long-term research in experimental computing systems
- NSF is swamped with proposals, yielding even more conservative decisions
- Community emphasis on theoretical over experimental, systems-building research
- Alternative: turn to industry for funding
- Opportunity to shape the research agenda
16. New Funding Model
- 30 grad students + 5 undergrads + 6 faculty + 4 staff
- Foundation companies: $500K/yr for 5 years
- Google, Microsoft, Sun Microsystems
- Prefer founding partners' technology in prototypes
- Many people from each company attend retreats, advise on directions, and get a head start on research results
- IP placed in the public domain, so partners can use it without being sued
- Large affiliates ($100K/yr): Fujitsu, HP, IBM, Siemens
- Small affiliates ($50K/yr): Nortel, Oracle
- State matching programs (MICRO, Discovery) add $1M/year
17. Summary
- DC is the Computer
- OS: ML + VM; Net: Identity-based Routing; FS: Web Storage
- Prog Sys: RoR; Libraries: Web Services
- Development environment: RAMP (simulator), AWE (tester), Web 2.0 apps (benchmarks)
- Debugging environment: Trace + X-Trace
- Milestones
- DC energy conservation + reliability enhancement
- Web 2.0 apps in RoR
18. Conclusions
- Develop, Analyze, Deploy, and Operate modern systems at Internet scale
- Ruby on Rails for rapid application development
- Declarative datacenter for correct-by-construction system configuration and operation
- Resource management by System + Statistical Machine Learning
- Virtual machines and network storage for flexible resource allocation
- Power reduction and reliability enhancement by fast power-down/restart of processing nodes
- Pervasive monitoring, tracing, simulation, and workload generation for runtime analysis/operation
19. Discussion Points
- Jointly designed datacenter testbed
- Mini-DC consisting of clusters, middleboxes, and network equipment
- Representative network topology
- Power-aware networking
- Evaluation of existing network elements
- Platform for investigating power-reduction schemes in network elements
- Mutual information exchange
- Network storage architecture
- System + Statistical Machine Learning
20. Ruby on Rails: DC PL
- Reasons to love Ruby on Rails
- Convention over configuration
- Rails framework features enabled by Ruby language features (meta-object programming)
- Scaffolding: automatic, Web-based (pedestrian) user interface to stored data
- Program the client (v 1.1): write browser-side code in Ruby, then compile to Javascript
- Duck Typing / Mix-Ins (sketch below)
- Looks like a string, responds like a string: it's a string!
- Mix-ins: an improvement over multiple inheritance
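A minimal plain-Ruby sketch of the two language features named above (illustrative only):

```ruby
# Duck typing: an object that responds like a string can be used as one.
class LogLine
  def initialize(text); @text = text; end
  def to_str; @text; end                # String#+ and friends will accept it
end
puts "entry: " + LogLine.new("GET /index 200")

# Mix-in: share behavior through a module instead of multiple inheritance.
module Monitorable
  def report; "#{self.class.name}: #{status}"; end
end

class Server
  include Monitorable
  def status; "up"; end
end
puts Server.new.report                   # => "Server: up"
```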
21. DC Monitoring
- Imagine a world where path information is always passed along, so that user requests can always be tracked throughout the system
- Across apps, OS, network components and layers, different computers on the LAN, ...
- Unique request ID
- Components touched
- Time of day
- Parent of this request
22. Trace: The 1% Solution
- Trace goal: make path-based analysis low-overhead enough that it can be always on inside the datacenter
- Baseline path-info collection with 1% overhead
- Selectively add more local detail for specific requests
- Trace: an end-to-end path recording framework (sketch below)
- Capture a timestamp and a unique request ID across all system components
- Top-level log contains path traces
- Local logs contain additional detail, correlated to the path ID
- Built on X-Trace
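An illustrative sketch (not the actual Trace/X-Trace implementation) of what each component would record: every event carries the unique request ID, its parent, the component name, and a timestamp into the always-on top-level log, while extra local detail is kept only for a small sampled fraction of requests.

```ruby
require 'securerandom'

TraceEvent = Struct.new(:request_id, :parent, :component, :timestamp)

DETAIL_SAMPLE_RATE = 0.01    # assumed: extra local detail for ~1% of requests
TOP_LEVEL_LOG = []           # always-on path traces
LOCAL_LOG     = []           # additional detail, correlated by request ID

def record(request_id, parent, component, local_detail = nil)
  TOP_LEVEL_LOG << TraceEvent.new(request_id, parent, component, Time.now)
  if local_detail && rand < DETAIL_SAMPLE_RATE
    LOCAL_LOG << [request_id, local_detail]
  end
  component   # returned so the caller can pass it on as the next hop's parent
end

req = SecureRandom.uuid                       # unique request ID
hop = record(req, nil, "web-frontend")
hop = record(req, hop, "app-server")
hop = record(req, hop, "database", "full query plan")
```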
23. X-Trace: Comprehensive Tracing Through Layers, Networks, and Apps
- Trace connectivity of distributed components
- Capture causal connections between requests/responses
- Cross-layer
- Includes network and middleware services such as IP and LDAP
- Cross-domain
- Multiple datacenters, composed services, overlays, mash-ups
- Control rests with individual administrative domains
- Network path sensor
- Puts individual requests/responses, at different network layers, in the context of an end-to-end request
24. Actuator: Policy-based Routing Layer
- Assign an ID to incoming packets (hash-table lookup)
- Route based on IDs, not locations (i.e., not IP addresses); sketch below
- Sets up logical paths without changing the network topology
- A set of common middleboxes gets a single ID
- No single weakest link: robust, scalable throughput
[Figure: Load Balancer (ID_LB), Intrusion Detection (ID_ID), Service (ID_S), and Firewall (ID_F) attached to the Identity-based Routing Layer]
- So simple it could be done in an FPGA?
- More general than MPLS
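An illustrative sketch of routing on identities rather than IP addresses (invented classifier and tables, not the RAD Lab prototype): a lookup assigns an ID to each packet, and a set of equivalent middleboxes shares one ID so any instance can serve the packet.

```ruby
# Classify a packet to an identity (hash-table style lookup).
def classify(packet)
  return :ID_F  if packet[:dst_port] == 22              # firewall service
  return :ID_ID if packet[:flags].include?(:suspicious) # intrusion detection
  :ID_LB                                                # default: load balancer
end

# Identity -> set of middlebox instances sharing that ID.
ROUTES = {
  ID_F:  ["firewall-1", "firewall-2"],
  ID_ID: ["ids-1"],
  ID_LB: ["lb-1", "lb-2", "lb-3"]
}

def next_hop(packet)
  ROUTES[classify(packet)].sample    # any instance carrying the ID will do
end

puts next_hop({ dst_port: 80, flags: [] })   # one of lb-1, lb-2, lb-3
```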
25. Other RAD Lab Projects
- Research Accelerator for Multiple Processors (RAMP): DC simulator
- Automatic Workload Evaluator (AWE): DC tester
- Web Storage (GFS, Bigtable, Amazon S3): DC file system
- Web Services (MapReduce, Chubby): DC libraries
26. 1st Milestone: DC Energy Conservation
- Good match to machine learning
- An optimization, so imperfection is not catastrophic
- Lots of data to measure, a dynamically changing workload, a complex cost function
- Not steady state, so not queuing theory
- PG&E is trying to change the behavior of datacenters
- Properly stated, the problem is:
- Preserve 100% of Service Level Agreements
- Don't hurt hardware reliability
- Then conserve energy
- Radical idea: can conserving energy improve hardware reliability?
27. 1st Milestone: Conserve Energy + Improve Reliability
- Improve component reliability?
- Disks: lifetimes are measured in powered-on hours, but limited to 50,000 start/stop cycles
- Idea: if disks are turned off 50% of the time, the annual failure rate is roughly halved, as long as the 50,000 start/stop cycles are not exceeded (about once per hour; arithmetic check below)
- Integrated circuits: lifetimes affected by thermal cycling (fast change is bad), electromigration (turning off helps), and dielectric breakdown (turning off helps)
- Idea: if the number of thermal cycles is limited, could IC failure rates due to EM and DB be cut by 30%?
See "A Case for Adaptive Datacenters to Conserve Energy and Improve Reliability," Peter Bodik, Michael Armbrust, Kevin Canini, Armando Fox, Michael Jordan, and David Patterson, 2007.
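A back-of-envelope check of the disk numbers above (the five-year service life is an assumption for illustration, not from the slide):

```ruby
start_stop_budget  = 50_000          # lifetime start/stop cycles (from the slide)
service_life_hours = 5 * 365 * 24    # assumed 5-year service life = 43,800 hours
cycles_per_hour    = start_stop_budget.to_f / service_life_hours
puts format("allowed start/stop cycles per hour: %.2f", cycles_per_hour)  # ~1.14, i.e. roughly once per hour
```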
28. RAD Lab 2.0, 2nd Milestone: Killer Web 2.0 Apps
- Demonstrate the RAD Lab vision of one person creating the next great service and scaling it up
- Where to get example great apps, given that grad students are creating the technology?
- Use undergraduate computing clubs to create exciting apps in RoR using RAD Lab equipment and technology
- Armando Fox is the RoR club leader
- Recruited a real-world RoR programmer to develop code and advise the RoR computing club
- 30 students joined the club in Jan 2007
- Hire the best undergrads to build RoR apps in the RAD Lab
29. Miracle of University Research
- Talented (if inexperienced) people
- Pick from a worldwide talent pool for students and faculty
- They don't know what they can't do
- Inexpensive
- Mostly grad student salaries ($50K-75K/yr with overhead)
- Faculty part time ($75K-100K/yr including overhead)
- Berkeley and Stanford swing for the fences (R, not r or D)
- Even if we hit a single, we train the next generation of leaders
- Technology transfer engine
- Success: train students to go forth and multiply
- Publish everything, including source code
- Ideal launching point for startups
30. Chance to Partner with a Great University
- Chance to work on the Next Great Thing
- US News & World Report ranking of CS systems programs: #1 Berkeley, #2 CMU, #2 MIT, #4 Stanford
- Berkeley and Stanford are among the top suppliers of systems students to industry (and academia)
- A National Academy study mentions Berkeley in 7 of 19 $1B industries arising from IT research, Stanford 4 times
- Timesharing (SDS 940), Client-Server Computing (BSD Unix), Graphics, Entertainment, Internet, LANs, Workstations (SUN), GUI, VLSI Design (Spice), RISC (ARM, MIPS, SPARC), Relational DB (Ingres/Postgres), Parallel DB, Data Mining, Parallel Computing, RAID, Portable Communication (BWRC), WWW, Speech Recognition, Broadband last mile (DSL)
31. Years to a >$1B IT Industry from Research Start
[Chart from the National Research Council Computer Science and Telecommunications Board, 2003]
32. Physical RAD Lab: Radical Collocation
- Innovation comes from spontaneous meetings of people with different areas of expertise
- Communication is inversely proportional to distance
- Almost never happens if > 100 feet apart or on a different floor
- Everyone (including faculty) in open offices
- Great meeting rooms, ubiquitous whiteboards
- Technology to concentrate: cell phone, iPod, laptop
- Google "Physical RAD Lab" to learn more
33. Example of the Next Great Thing
- Berkeley Reliable Adaptive Distributed systems Laboratory (RAD Lab)
- Founded 12/2005 with Google, Microsoft, and Sun as founding partners
- Armando Fox, Randy Katz, Mike Jordan, Anthony Joseph, Dave Patterson, Scott Shenker, Ion Stoica
- Google "RAD Lab" to learn more
34. RAD Lab Goal: Enable the Next eBay
- Create technology that enables the next great Internet service to grow rapidly without growing the organization rapidly
- Machine Learning + Systems is the secret sauce
- Position: the datacenter is the computer
- The leverage point is simplifying datacenter management
- What is the programming language of the datacenter?
- What is CAD for the datacenter?
- What is the OS for the datacenter?