Title: TACC Retrospective: Contributions, Non-Contributions, and What We Really Learned
1TACC Retrospective: Contributions, Non-Contributions, and What We Really Learned
- Armando Fox, University of California, Berkeley
- fox_at_cs.berkeley.edu
2Vision: The Content You Want
- What do above apps have in common?
- Adapt (collect, filter, transform) existing content
- according to client constraints
- respecting network limitations
- according to per-user preferences
- But: lack of a unified framework for designing apps that exploit this observation
3Contributions
- TACC, a model for structuring services
- Transformation, Aggregation, Caching, Customization of Internet content
- Scalable TACC server
- Based on clusters of commodity PCs
- Easy to author industrial strength services
- Scalable Network Service (SNS) platform maps app semantics onto cluster-based availability mechanisms
- Experience with real users
- 15,000 today at UCB
4What's TACC?
- Transformation (local, one-to-one)
- TranSend, Anonymizer
- Aggregation (nonlocal, many-to-one)
- Search engines, crawlers, newswatchers
- Caching
- Both original and locally-generated content
- Customization
- Per user for content generation
- Per device data delivery, content packaging
5TACC Example: TranSend
- Transparent HTTP proxy
- On-the-fly, lossy compression of specific MIME types (GIF, JPG...)
- Cache both original and transformed
- User specifies aggressiveness and refinement UI
- Parameters to HTML and image transformers
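The caching and parameterization points above can be illustrated with a minimal sketch. It is not TranSend's implementation: `lossy_compress`, `TransformingProxy`, and the integer `aggressiveness` parameter are hypothetical names chosen for illustration, and the "compression" is a stand-in that simply truncates bytes. The sketch only shows the idea of caching the original once and each transformed variant per parameter setting.

```python
from typing import Callable, Dict, Tuple


def lossy_compress(data: bytes, aggressiveness: int) -> bytes:
    """Hypothetical stand-in for a lossy transformer.

    Keeps only a fraction of the bytes, so a higher aggressiveness
    yields a smaller result. A real worker would re-encode the image.
    """
    keep = max(1, len(data) // (aggressiveness + 1))
    return data[:keep]


class TransformingProxy:
    """Caches originals by URL and transformed variants by (URL, parameter)."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self.fetch = fetch  # assumed origin-fetch function
        self.originals: Dict[str, bytes] = {}
        self.transformed: Dict[Tuple[str, int], bytes] = {}
        self.origin_fetches = 0  # counts trips to the origin server

    def get(self, url: str, aggressiveness: int) -> bytes:
        key = (url, aggressiveness)
        if key in self.transformed:
            return self.transformed[key]  # transformed variant already cached
        if url not in self.originals:
            self.originals[url] = self.fetch(url)
            self.origin_fetches += 1
        result = lossy_compress(self.originals[url], aggressiveness)
        self.transformed[key] = result
        return result
```

With this layout, a user who changes the aggressiveness setting triggers a new transformation but not a new origin fetch, because the original stays in the cache.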
6Top Gun Wingman
- PalmPilot web browser
- Intermediate-form page layout
- Image scaling and transcoding
- Controlled by layout engine
- Device-specific ADU marshalling
- Including client versioning
- Originals and device-specific pages cached
7Application Partitioning
- Client competence
- Styled text, images, widgets are fine
- Bitmaps unnecessary
- Client responsiveness
- Scrolling, etc. shouldn't require roundtrip to server
- Client independence
- Very late conversion to client-specific format
8TACC Conceptual Data Flow
[Diagram labels: User request, FE, To Internet]
- Front end accepts RPC-like user requests
- User's customization profile retrieved
- Original data fetched from cache or Internet
- Aggregation/transformation workers operate on
data according to customization profile
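The four steps above can be sketched as a single request handler. This is an assumption-laden illustration, not the TACC server's API: `handle_request`, `drop_images`, `truncate`, and the profile keys `images` and `max_chars` are all hypothetical, and pages are modeled as plain strings.

```python
from typing import Callable, Dict, List

Profile = Dict[str, str]
Worker = Callable[[str, Profile], str]


def drop_images(page: str, profile: Profile) -> str:
    """Hypothetical worker: removes lines marked 'IMG ' for text-only users."""
    if profile.get("images") == "off":
        lines = [ln for ln in page.split("\n") if not ln.startswith("IMG ")]
        return "\n".join(lines)
    return page


def truncate(page: str, profile: Profile) -> str:
    """Hypothetical worker: caps page length at the user's 'max_chars'."""
    limit = int(profile.get("max_chars", str(len(page))))
    return page[:limit]


def handle_request(
    user: str,
    url: str,
    profiles: Dict[str, Profile],
    cache: Dict[str, str],
    fetch_origin: Callable[[str], str],
    pipeline: List[Worker],
) -> str:
    profile = profiles.get(user, {})      # retrieve customization profile
    if url not in cache:                  # fetch from cache or Internet
        cache[url] = fetch_origin(url)
    data = cache[url]
    for worker in pipeline:               # workers operate per the profile
        data = worker(data, profile)
    return data
```

A user with no stored profile gets the original content unchanged, since each worker here falls back to a no-op when its preference is absent.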
9TACC Model Summary
- Mostly stateless, composable workers
- Unifies previously ad hoc applications under one framework
- Encourages re-use through modularization
- Composition enables both new services and new clients
- TACC breakdown provides unified way to think about app structure
10Services Should Be Easy To Write
- Rapid prototyping
- Insulate workers from mundane details
- Easy to incorporate existing/legacy code
- Few assumptions about code structure
- Must support variety of languages
- May be fragile
- Composition to leverage existing code
11Building a TACC Server
- Challenge: Scalable Network Service (SNS) requirements
- Scalability to 100Ks of users with high availability
- Cost effective to deploy and administer
- But, services should remain easy to write
- Server provides some bug robustness
- Server provides availability
- Server handles load balancing and scaling
- Preserve modularity (componentwise upgradability) when deploying
12Layered Model of Internet Services
[Diagram labels: httpd, etc.; TACC; Scalable Network Svc]
- TACC Layer
- Programming model based on composable building blocks
- SNS Layer: large virtual server
- Implements SNS requirements
- Cluster computing for hardware F/T and incremental scaling
- Exploit TACC model semantics for software F/T
- SNS layer is reusable and isolated from TACC
- Application content orthogonal to SNS mechanisms
- Key to making apps easy to write
13Why Use a Cluster?
- Incremental scalability, low cost components
- High availability through hardware redundancy
- Goals
- Demonstrate that clusters and TACC fit well together
- Separate SNS from TACC
14Cluster-Based TACC Server
- Component replication for scaling and availability
- High-bandwidth, low-latency interconnect
- Incremental scaling: commodity PCs
[Diagram labels: User Profile Database, Caches, Front Ends, Workers, Load Balancing and Fault Tolerance, Administration Interface]
15Starfish Availability: LB Death
- FE detects via broken pipe/timeout, restarts LB
[Diagram labels: C, FE, FE, FE, LB/FT]
16Starfish Availability: LB Death
- FE detects via broken pipe/timeout, restarts LB
- New LB announces itself (multicast), contacted by workers, gradually rebuilds load tables
- If partition heals, extra LBs commit suicide
- FEs operate using cached LB info during failure
[Diagram labels: C, FE, FE, FE, LB/FT]
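The front-end side of this recovery path can be sketched as follows. The names are hypothetical: `LoadBalancerDown` stands in for a broken pipe or timeout, and `query_lb` and `restart_lb` are assumed callbacks rather than anything from the actual system. The sketch shows only the two behaviors named above: the FE restarts the LB on failure, and it keeps routing from its cached (possibly stale) load table in the meantime.

```python
from typing import Callable, Dict


class LoadBalancerDown(Exception):
    """Raised by the hypothetical LB stub on a broken pipe or timeout."""


class FrontEnd:
    def __init__(
        self,
        query_lb: Callable[[], Dict[str, int]],
        restart_lb: Callable[[], None],
    ) -> None:
        self.query_lb = query_lb        # returns worker -> queue length
        self.restart_lb = restart_lb    # starts a replacement LB
        self.cached_loads: Dict[str, int] = {}

    def pick_worker(self) -> str:
        try:
            self.cached_loads = dict(self.query_lb())
        except LoadBalancerDown:
            # Start a replacement, but keep serving from the stale table.
            self.restart_lb()
        if not self.cached_loads:
            raise RuntimeError("no load information available yet")
        # Choose the worker with the shortest reported queue.
        return min(self.cached_loads, key=lambda w: self.cached_loads[w])
```

The design choice illustrated here is that an LB failure degrades the quality of load information rather than interrupting request handling.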
18Fault Recovery Latency
[Plot; axis label: Task queue length]
19Behavior in the Large
- TranSend: 160 image transformations/sec on 10 Ultra-1 servers
- Peak seen during UCB traces on 700-modem bank: 15/sec
- Amortized hardware cost <$0.35/user/month (one $5K PC serving 15,000 subscribers)
- Wingman: factor of 6-8 worse
- Administration: one undergraduate part-time
20Building a Big System
- Restartable, atomic workers
- Read-only data from other origin server(s)
- Orthogonal separation of scalability/availability from application content
- Multiple lines of defense
- App modules agree to obey semantics compatible with these mechanisms
- Common-case failure behavior compatible with users' Internet experience
- Enables reuse of whole workers, however diverse
21Availability and Scalability Summary
- Pervasive strategy: timeout, retry, restart
- Transient failures usually invisible to user
- Process peers watch each other
- Mostly stateless workers, xact support possible
- Simplicity from exploiting soft state
- Piggyback status info on multicast beacons
- Use of stale LB info fine in practice
- Starfish availability works in practice
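The soft-state idea above can be illustrated with a small table that is rebuilt purely from periodic beacons and forgets any worker that stops announcing itself. This is a sketch under stated assumptions: `SoftStateTable`, `on_beacon`, and `live` are hypothetical names, time is passed in explicitly instead of read from a clock, and real beacons would arrive over multicast rather than by method call.

```python
from typing import Dict, Tuple


class SoftStateTable:
    """Load table kept alive only by refreshes; entries expire after `ttl`."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        # worker -> (queue length piggybacked on the beacon, time last heard)
        self.entries: Dict[str, Tuple[int, float]] = {}

    def on_beacon(self, worker: str, queue_length: int, now: float) -> None:
        # Each beacon both refreshes liveness and updates the status info.
        self.entries[worker] = (queue_length, now)

    def live(self, now: float) -> Dict[str, int]:
        # Drop workers not heard from within ttl; no explicit removal message
        # or recovery protocol is needed.
        self.entries = {
            w: (q, t) for w, (q, t) in self.entries.items() if now - t <= self.ttl
        }
        return {w: q for w, (q, _t) in self.entries.items()}
```

Because the table holds nothing that cannot be regenerated from the next round of beacons, a restarted component can start empty and converge on its own, which is the simplicity the slide attributes to soft state.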
22Service Authoring
- Keyword highlighting: < 1 day
- Wingman: 2-3 weeks
- Various apps from graduate seminar projects
- Safe worker upload
- Annotate the Web
- Channel aggregators
23New Services By Composition
- Compose existing services to create a new one
- 2.5 hours to implement
- Composes with TranSend or Wingman
[Diagram labels: Internet, TranSend, Metasearch]
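Composition of this kind can be sketched as an aggregation worker chained with a transformation worker. Everything here is illustrative: `metasearch`, `compose`, and the toy engines are hypothetical names, and search results are modeled as lists of strings rather than real pages.

```python
from typing import Any, Callable, Iterable, List

SearchEngine = Callable[[str], List[str]]


def metasearch(query: str, engines: Iterable[SearchEngine]) -> List[str]:
    """Aggregation worker: merge hits from several engines, dropping duplicates."""
    seen = set()
    merged: List[str] = []
    for engine in engines:
        for hit in engine(query):
            if hit not in seen:
                seen.add(hit)
                merged.append(hit)
    return merged


def compose(*stages: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Chain workers so the output of one becomes the input of the next."""
    def run(value: Any) -> Any:
        for stage in stages:
            value = stage(value)
        return value
    return run


# Toy engines and a stand-in for a device-specific transformer.
engine_a = lambda q: [f"{q}-1", f"{q}-2"]
engine_b = lambda q: [f"{q}-2", f"{q}-3"]
shorten = lambda hits: hits[:2]

service = compose(lambda q: metasearch(q, [engine_a, engine_b]), shorten)
```

Swapping `shorten` for a different final stage yields a service for a different client without touching the aggregator, which is the reuse the slide describes.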
24Experience With Real Users
- Transparent enhancements
- Minimal downtime
- Low administration cost
- Multicast-based administration GUI
- Virtually no dedicated resources at UCB
- Overflow pool of 100 UltraSPARC servers
- Users don't mind relying on middleware proxy
25Why Now?
- Internet's critical mass
- Commercial push for many device types (transistor curves)
- Cluster computing economically viable
- A good time for infrastructural services
26Related Work
- Transformational proxy services: WBI, Strands
- Application partitioning: Wit, InfoPad, PARC Ubiquitous Computing
- Computing in the infrastructure: Active Networks
- Soft state for simplicity and robustness: Microsoft Tiger, multicast routing protocols
27Summary of Contributions
- TACC, a composition-based Internet services programming model
- captures rich variety of apps
- one view of customization
- No-hassle deployment on a cluster
- Automatic and robust partial-failure handling
- Availability and scaling strategies work in practice
- New apps are easy to write, deploy, debug
- SNS behaviors are free
- Compose existing services to enable new clients
28Non-Contributions (a/k/a Future Work)
- Accidental contributions
- Legacy code glue
- Cheap test rig for next project (prototyping path discovery, a bare-bones cluster OS)
- Non-contributions
- Fair resource allocation over cluster
- Built-in security abstractions
- Rich state management abstractions
29What We Really Learned
- Design for failure
- It will fail anyway
- End-to-end argument applied to availability
- Orthogonality is even better than layering
- Narrow interface vs. no interface
- A great way to manage system complexity
- The price of orthogonality
- Techniques: refreshable soft state, watchdogs/timeouts, sandboxing
30How About State Management?
- Transactional apps?
- APIs are there, but you have to roll your own consistency
- Groupware apps with group state?
- One way: distributed, F/T group state like SRM!
- Keeps state management orthogonal to SNS layer
The Moral: Consistency, Availability, Partition-resilience: pick at most 2
31Future Work
- TACC as test rig for Ninja
- Taxonomy of app structure and platforms
- What is the big picture of different types of Internet services, and where does TACC fit in?
- Joint work with Dr. Murray Mazer at the Open Group Research Institute
- Apply lessons to reliable distributed systems
- Formalize programming model