Title: Enabling The Fortune One Million
1. Enabling The Fortune One Million
- Armando Fox, Stanford University
- With Randy Katz, Michael Jordan, Dave Patterson, Scott Shenker, Ion Stoica
- IBM P3AD, April 2006
2. Enabling The Fortune One Million
- Armando Fox, Berkeley RAD Lab
- With Randy Katz, Michael Jordan, Dave Patterson, Scott Shenker, Ion Stoica
- IBM P3AD, April 2006
3. This talk
- Untainted by results, proofs, theorems, etc.
- BUT an attempt to formulate an operational vision statement for self-* systems and a strategy for attaining it
- Presumed achievable based on bona fide previous results
4. Steps vs. Process
[Figure: pedestal with cap, dado (the section of a pedestal between cap and base), and base]
- Steps: traditional, static handoff model, N groups
- Process: support DADO evolution, 1 group
5.
- 1995: Pierre Omidyar develops and deploys eBay 1.0 over a long weekend
- Has been rewritten twice since then
- 2005: HousingMaps.com connects Google Maps and Craigslist apartment listings
- Spawned a whole bunch of map mashups (GoogleMapsMania blog)
- 2006: bug in Spell With Flickr mashup results in author slapped for driving too much traffic
- Prototyping is getting easier; deployment at scale is getting harder at least as fast
6. RAD Lab 5-year Mission
- Provide tools and a technology platform to allow a single person to Develop, Assess, Deploy, and Operate the next-generation IT service
- Enables "The Fortune 1 Million"
- Major partnership with industry
- Rest of this talk
- Early progress on DADO
- Reflections on upcoming technical challenges and opportunities
- Lab organization, tie-in to course plans, etc.
RAD Lab: Robust, Adaptive, Distributed systems
7. RAD Lab Challenges Center
- The Challenges
- Develop: a new Service using tools that facilitate rapid prototyping
- Assess: Measuring, Testing, and Debugging the new Service in a realistic distributed environment
- Deploy: Scaling up a new, geographically distributed Service
- Operate: a service that could quickly scale to millions of users
- The Vehicle
- Interdisciplinary Center creates core technical competency to demo 10X to 100X
- Researchers are leaders in machine learning, networking, and systems
- Industrial Participants: leading companies in HW, systems SW, and online services
8. Science is a way to create knowledge
- But science is also about understanding complex artifacts
- What is the science in services science?
- Going from raw observations of (complex) system behavior to actionable interpretations
- Improving and creating systematic methodologies for the DADO steps
- Creating systematic connections among the DADO process stages
Jim Stohrer at IBM University Day, Almaden,
April 2006
9. DADO - Develop: Joy of Middleware
- Dominant way to deploy services
- Innovate below the abstraction
- Unmodified/proprietary apps can benefit (e.g., from instrumentation)
- Modern middleware tends to make apps more declarative
- CORBA -> J2EE -> Ruby on Rails
- Can get things running immediately
- But usually end up being rewritten
- NCSA httpd -> Apache, eBay 0.9 -> eBay 2.0, Google.stanford.edu -> Google.com
- Challenge: can we get the best of both worlds?
10. Examples: Understanding the curse of success
11. DADO - Assess example: Packet Annotations
- Create new ways to collect information over distributed network: Annotation Layer
- Incrementally deployable on existing infrastructure
- iBoxes label packets at annotation layer but do not change original packet payloads
- Expose annotations to application layer
Application
Presentation
Session
Annotations
Transport
Network
Link
Phy
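A minimal sketch of the annotation idea, not the RAD Lab design itself. The names `AnnotatedPacket`, `IBox`, and `expose_to_application` are hypothetical. The point it illustrates is the one on the slide: each iBox adds a label beside the packet, the payload bytes stay untouched, and the application can read the labels.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class AnnotatedPacket:
    """Hypothetical wrapper: original payload plus a separate annotation list."""
    payload: bytes
    annotations: List[Tuple[str, str]] = field(default_factory=list)


class IBox:
    """Hypothetical iBox: labels packets passing through, never edits payloads."""

    def __init__(self, name: str):
        self.name = name

    def observe(self, pkt: AnnotatedPacket, label: str) -> AnnotatedPacket:
        # Annotations live beside the payload, so the original bytes are preserved.
        pkt.annotations.append((self.name, label))
        return pkt


def expose_to_application(pkt: AnnotatedPacket) -> dict:
    """What the application layer would see: the payload plus the path labels."""
    return {
        "payload": pkt.payload,
        "path": [f"{box}:{label}" for box, label in pkt.annotations],
    }


# Illustrative traffic only; box names and labels are made up.
pkt = AnnotatedPacket(payload=b"GET /listings HTTP/1.1")
pkt = IBox("edge-in").observe(pkt, "entered")
pkt = IBox("edge-out").observe(pkt, "left")
view = expose_to_application(pkt)
```

Keeping annotations in a separate field is what would make the scheme incrementally deployable: hosts that ignore the annotation layer still see the same payload.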
12. iBox Placement for Observation and Action
iBoxes strategically placed near entry/exit points within the Enterprise network
13. DADO - Assess: Distributed Debugging
- Allows inspection of snapshots of distributed app state
- Faithful replay of distributed apps
- Virtual (Lamport) clocks allow consistent replay
- Has found bugs in Chord/I3 and itself
- Works with existing toolchains
- Transparently intercepts libc calls
- Extends gdb UI
> replay 132.239.6.225
... running
> break update_state()
... 1 set line 75
> advance 10000000
... done
> fix bug for me
user
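To make the "virtual (Lamport) clocks" bullet concrete, here is a small sketch of a Lamport logical clock and of why sorting by its timestamps gives a consistent replay order. It is an illustration of the general technique only; the class name, the log format, and the two-node scenario are assumptions, not the debugger's actual implementation.

```python
class LamportClock:
    """Logical clock: a counter that orders events without wall-clock time."""

    def __init__(self):
        self.time = 0

    def tick(self) -> int:
        # Local event or send: advance the counter.
        self.time += 1
        return self.time

    def receive(self, sender_time: int) -> int:
        # On receive, jump past the sender's timestamp so the receive
        # is always ordered after the matching send.
        self.time = max(self.time, sender_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
log = []  # hypothetical record format: (timestamp, node, event)

log.append((a.tick(), "A", "local update"))
t_send = a.tick()
log.append((t_send, "A", "send to B"))
log.append((b.tick(), "B", "local update"))
log.append((b.receive(t_send), "B", "recv from A"))

# Replaying in (timestamp, node) order keeps every send before its receive.
replay_order = sorted(log)
```

Events with equal timestamps are concurrent; breaking ties by node name, as the sort does here, is one arbitrary but deterministic choice.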
14. DADO - Deploy: RAMP
- How can academics experiment with systems of 1000 nodes?
- RAMP (Research Accelerator for Multiple Processors) for parallel HW and SW research
- Single FPGA: 25 CPUs plus caches in 2005
- $100k: 4 FPGAs / board, 4 DIMMs / FPGA, 10-20 boards, low-cost Storage Server over Ethernet
- ~1000 CPUs, 256 MB DRAM/CPU, 20 GB disk storage/CPU
- Parts of RAMP-1 prototype already running
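The sizing on this slide can be checked with a few lines of arithmetic. The per-unit figures below are taken from the slide; the function name and the choice of unit conversions (1024 MB per GB of DRAM, 1000 GB per TB of disk) are assumptions made for the sketch.

```python
CPUS_PER_FPGA = 25      # from the slide
FPGAS_PER_BOARD = 4     # from the slide
DRAM_MB_PER_CPU = 256   # from the slide
DISK_GB_PER_CPU = 20    # from the slide


def ramp_totals(boards: int) -> dict:
    """Aggregate capacity for a given number of boards (hypothetical helper)."""
    cpus = CPUS_PER_FPGA * FPGAS_PER_BOARD * boards
    return {
        "cpus": cpus,
        "dram_gb": cpus * DRAM_MB_PER_CPU / 1024,
        "disk_tb": cpus * DISK_GB_PER_CPU / 1000,
    }


# The slide quotes a range of 10-20 boards.
low, high = ramp_totals(10), ramp_totals(20)
```

With 10 boards the CPU count comes to 1000, which matches the figure on the slide; 20 boards would double it.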
15. Using RAMP
- Current status
- Smaller-size board, 4 machines
- 4 MicroBlaze cores, Micro-C/Linux, TCP/IP, NFS, Telnet, httpd+CGI; Python and Ruby coming soon
- Short term plans
- Instrumentation plane: think OpenView, but we can instrument whatever we want
- Simulate and run simple Web apps on many, many 100MHz CPUs
- Longer term plans
- Simulate wide-area networks at scales impractical on PlanetLab
- Understand Datacenter-in-a-Box model
16. DADO - Operate
- Apply statistical machine learning (SML) to find patterns in behavior of complex software
- Example: correlate high-level site health metrics with low-level fingerprints associated with bad health -> info retrieval
- Example: sample annotated software features (language-level constructs) and correlate feature sets with failed runs to help pinpoint bugs
- Combine SML with visualization so operator sees and understands significance of anomalies
- Promising early results, but just the tip of the iceberg
- State of the art for visualization in operator tools is very immature
17. Some results so far
18. Signatures - example
- Metric has value 1 if it is attributed with the violation, -1 if it is not attributed, 0 if it is not relevant
[Figure: attribution]
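A sketch of how such signatures could be compared, under stated assumptions: each signature is a vector over metrics with entries 1, -1, or 0 as defined above, and similarity is measured with cosine similarity. The choice of cosine similarity, the metric names, and the example vectors are all illustrative and not taken from the talk.

```python
import math
from typing import Dict, List


def cosine(u: List[int], v: List[int]) -> float:
    """Cosine similarity between two signature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical metrics, in order: cpu_util, db_latency, queue_len, net_errors
past: Dict[str, List[int]] = {
    "db-overload": [-1, 1, 1, 0],
    "net-flap": [0, -1, -1, 1],
}
current = [-1, 1, 1, -1]

# Retrieve the most similar previously seen problem.
best = max(past, key=lambda name: cosine(current, past[name]))
```

Treating signatures as vectors is what turns diagnosis into a retrieval problem: a new violation is matched against an index of labeled past ones.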
19. DADO - Operate: Open source-like database of traces/logs
- Goal: large trace-like database of failure logs and other relevant failure data for research use
- So far
- Complete source and sanitized logs of 3 Flickr mashup front-ends
- Working with affiliates to make public a sanitized version of data used in our early results papers
- Access to Microsoft desktop crash data collected via BOINC (paper submission forthcoming)
20. Reflections: A good time to be using SML
- Technology supports use of SML
- Even building blocks of systems are sufficiently complex, instrumentable, and have large user bases
- Advances in online algorithms research make good fit for long-running systems
- Moore's Law: nontrivial models can be induced and evaluated in soft real time (seconds) for many of these systems
- Domain expertise in systems still needed
- We will develop a corps of researchers whose strength is SML/Systems crossover
21. Reflections: What's new in Services Science? (or, SOAs becoming real... uh-oh!)
- Workload challenges
- AJAX and mashups change workload seen by back-end servers
- Nonlinear dynamics of changing workloads will make spike provisioning more challenging (e.g., Flickr mashups)
- Long tail management challenges
- For every Amazon or Google, 1000s of smaller services
- This ratio will increase rapidly as ease of deploying a meta-service increases
- Managing 1000s of different services sharing resources will be harder than managing one mega-service
- Decoupled control loops (Jeff Mogul)
- Even if each service is well-regulated, can we say anything about the meta-services (like mashups)?
22. Reflections: An interesting opportunity
- Difficult-to-scale functionalities increasingly being offered as utility services
- This is mostly why SOA is taking off!
- Storage: Google Base, OpenDHT, Amazon S3 (e.g., new client-side Wiki software that uses S3 and no other server!), Salesforce.com
- Mapping/GIS: Google and Yahoo Maps
- Build customized searches using search engine APIs
- Future: functions like MapReduce?
- Indeed, mashups are often not much more than front ends of computation + soft state
- This should be easy to scale!! The rubber meets the road here.
- Experiments to be done soon on RAMP, which I'll talk about shortly
23. RAD Lab Organization
- $2.5M/year: 70% industry, 20% State, 10% Fed. govt (NSF)
- 30 grad students, 10 undergrads, 6 faculty, 2 staff
- Founding Companies: Sun, Google, Microsoft
- Affiliate Members include Verisign, IBM, Hewlett-Packard, NTT, Oracle, Nortel
- Mid-project review after 3 years by founding partners
- Benefits to Affiliates and RAD Lab
- Prefer founding partner technology in prototypes
- Designate employees to act as consultants/liaisons
- Real-life training for next generation of IT researchers
- Research based on real systems data (logs, forensics, etc.)
24. Industrial collaboration
- Intellectual property policy
- Nonexclusive, royalty-free IP license so partners are not sued: BSD license (text available at opensource.org)
- Head start on research results for affiliates (6-month embargo)
- Impact from previous projects
- RAID, RISC, NOW: multibillion-dollar industries
- Berkeley regularly ranked in top 3 for systems research (#1 this year, tied with MIT)
25. Education/Course plans
- We're not teaching students to think in terms of a hosted-service development model
- Inheriting, understanding, extending other people's long-lived code
- How you do testing and upgrades for an online 24x7 service
- Technologies we used to teach in detail are now encapsulated as open source, running code
- Symmetric multithreaded I/O-intensive apps (Apache)
- Transactions and concurrency control (MySQL)
- What's important is understanding these as system building blocks
- What are their interactions?
- What tradeoffs are involved in composing these in a system? What price do I pay (in performance, robustness, or whatever) for selecting a given behavior?
26. Course plans
- Year 1: Graduate project courses
- Improve the RAD Lab platform, infrastructure, technologies
- Year 2: Undergrad courses
- Develop, assess, deploy, operate new apps on RAD Lab hosting service
- Improve other people's existing services, all in a hosted environment
- Year 3: Joint courses between CS and Business/Management
- Design business model along with app
- Understand how business concerns affect DADO process
- Consistent with IBM's SSME vision of creating a multidisciplinary corps of service scientists
27. Summary
- Technology bets: SML, visualization, FPGAs
- Will help us better understand the behaviors of these complex distributed systems
- Will let us run credible experiments at scale
- Will improve the tools available for operators
- Eventual goal: Fortune 1 Million
- 1 person can design, deploy and operate next eBay without building an eBay-sized organization
- Strong ties to industry
- Integrated with course offerings/curriculum
- http://radlab.cs.berkeley.edu
28. Acknowledgments
- RAD Lab sponsors and founders
- Co-PIs: Patterson, Katz, Stoica, Shenker, Jordan
- Students whose work was mentioned in this talk: Archana Ganapathi, Peter Bodik, Dennis Geels, Wei Xu, ...
29. BACKUP SLIDES
30. DADO - Develop: Back-end building block services
- Servers serve client programs, not people
- Ideas like user-based anomaly detection don't work
- Workloads: higher volume, different profile (e.g., prefetching for Google Maps)
- Aggregational services multiply workloads (1 HousingMaps hit = N Craigslist hits + N Google Maps hits)
- Distributed debugging is tough because the other sites are not under your control
- Large sophisticated sites already deal with this internally
- But must now deal with less-predictable workload, evil, etc.
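The workload multiplication described for aggregational services can be written down as simple arithmetic. This sketch assumes each mashup hit issues the same number N of requests to each provider; the function name and the sample numbers are hypothetical.

```python
def backend_hits(mashup_hits: int, n_per_hit: int) -> dict:
    """Back-end load when each mashup hit issues n_per_hit calls to each provider."""
    return {
        "craigslist": mashup_hits * n_per_hit,
        "google_maps": mashup_hits * n_per_hit,
    }


# Illustrative numbers only: 1,000 mashup hits, N = 20 requests per provider.
load = backend_hits(mashup_hits=1_000, n_per_hit=20)
total = sum(load.values())
```

The back ends see the multiplied traffic from a single client program, which is part of why per-user anomaly detection fits this workload poorly.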
31. DADO - Operate: Capturing Operator Actions
- Systematically capture, index, retrieve operator actions during incident response
- Operator's role largely ignored in most current work
- Goal: try to capture semantics of how operators are thinking when they react to a problem
- Gradually increase trust by suggesting actions based on past history
- Auditable recommendations let operator explore how the recommendation was made
- Various techniques possible: reinforcement learning, expert systems, collaborative filtering...
32. Capturing operator actions
- Monitoring operators
- Web-based tools: look at web server access logs
- Unix command line: sudo logs or history
- Stand-alone GUI tools: instrument them?
- Trouble ticket DB: operators involved, start/end, type of problem
- Challenge: extract sufficient semantics to allow cross-analysis of sources (timestamps, intent of an action, etc.)
- Similarity metric for failures
- E.g., compare signatures of failures/problems
- Clickstream analysis and data mining
- E-commerce sites already do this, for their customers
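One way to picture the cross-analysis challenge is to normalize actions from every source into a common record and line them up by timestamp against a trouble ticket's start/end window. The sketch below assumes such a record; the `Action` type, its fields, and the sample events are invented for illustration and do not describe an existing tool.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Action:
    """Hypothetical common record for an operator action from any source."""
    when: datetime
    source: str    # e.g. "sudo", "web", "ticket"
    operator: str
    detail: str


def actions_during(actions: List[Action], start: datetime, end: datetime) -> List[Action]:
    """Return actions from all sources inside a ticket's window, oldest first."""
    return sorted((a for a in actions if start <= a.when <= end), key=lambda a: a.when)


# Made-up events standing in for parsed sudo logs and web access logs.
events = [
    Action(datetime(2006, 4, 1, 10, 5), "sudo", "alice", "restart httpd"),
    Action(datetime(2006, 4, 1, 9, 58), "web", "alice", "GET /admin/queue-stats"),
    Action(datetime(2006, 4, 1, 14, 0), "sudo", "bob", "rotate logs"),
]
timeline = actions_during(events, datetime(2006, 4, 1, 9, 50), datetime(2006, 4, 1, 10, 30))
```

Timestamps are the easy part of the join; the intent of an action is not captured by this record and would need extra annotation.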
33. RAD Lab Opportunity: New Research Model
- Chance to partner with the top university in Computer Systems on the Next Great Thing
- National Academy of Engineering mentions Berkeley in 7 of 19 $1B industries that came from IT research
- NAE mentions Berkeley 7 times, Stanford 5 times, MIT 5, CMU 3: Timesharing (SDS 940), Client-Server Computing (BSD Unix), Graphics, Entertainment, Internet, LANs, Workstations, GUI, VLSI Design (Spice) [ECAD $5B?/yr], RISC [$10B?/yr], Relational DB (Ingres/Postgres) [RDB $15B?/yr], Parallel DB, Data Mining, Parallel Computing, RAID [$15B?/yr], Portable Communication (BWRC), WWW, Speech Recognition, Broadband
- Berkeley is one of the top suppliers of systems students to industry and academia
- US News & World Report ranking of CS Systems universities: 1 Berkeley, 2 CMU, 2 MIT, 4 Stanford, 5 Washington
- For example, Quanta (Taiwan PC laptop clone manufacturer) funds MIT CSAIL at $4M/year for 5 years to reinvent the PC, April 2005 (Tparty)
- RAID project (4 faculty, 20 grads, 10 undergrads) helped create a $15B industry, but would not be fundable today at DARPA, NSF
34. RAD Lab: Interdisciplinary Center for Reliable, Adaptive, Distributed Systems
- Working with different industries on long-range, pre-competitive technology
- Training of dozens of future leaders of IT, plus their recruitment
- Working with researchers with track records of successful technology transfer
35. RAD Lab Timeline
- 2005: Launch RAD Lab
- 2006: Collect workloads, Internet in a Box
- 2007: SLT/CT distributed architectures, iBoxes, annotative layer, class testing
- 2008: Development toolkit 1.0, tuple space, class testing; Mid-Project Review
- 2009: RAD Lab software suite 1.0, class testing
- 2010: End of Project Party
36. DADO - Operate
- Other ideas
- Fast recovery means we can afford false positives, enabling automated recovery mechanisms for servers via SLT algorithms
- Microreboot exemplifies "repair as local adaptation"
- Safety achieved by state separation
- Linear Control Theory places constraints on SW architectures
- Will restricting systems to be "controllable" make them easier to operate by humans as well as by simple controllers?
- Will cost-performance still be good enough for controllable systems?
- SLT helpful in diagnosing failed components
37. DADO - Develop: Control-Theory-Friendly Systems
- Problem: server-like system consisting of stages separated by queues
- Lack of balance across stages results in performance hiccups
- Straightforward application of LTI control theory to regulate queue lengths via combination of admission control and filtering
- Insight: build systems to allow the use of simple linear controllers
- Example: Farsite Scalability TR identifies Farsite properties that prevent it from being a good candidate for CT
- Could Farsite be architected to avoid those properties? At what cost?
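As a rough illustration of regulating a queue length with a simple linear controller, the sketch below simulates one stage with a fixed service rate and a proportional admission-control law. The discrete-time model, the gain `kp`, and all numbers are assumptions chosen for the example; they are not results or designs from the talk.

```python
def simulate(arrivals, service_rate, target, kp=0.5):
    """Proportional admission control for one queue; returns queue length per step."""
    q, history = 0.0, []
    for offered in arrivals:
        # Linear control law: admit what the stage can serve, plus a
        # correction proportional to the queue-length error.
        desired = service_rate + kp * (target - q)
        # Saturation: cannot admit a negative amount or more than was offered.
        admitted = min(max(desired, 0.0), offered)
        q = max(q + admitted - service_rate, 0.0)
        history.append(q)
    return history


# Offered load is twice what the stage can serve; the excess is rejected or filtered.
lengths = simulate(arrivals=[20.0] * 50, service_rate=10.0, target=8.0)
```

With these numbers the error halves at every step, so the queue settles at the target without overshoot. Without admission control the same offered load would grow the queue by 10 per step.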