Computers for the Post-PC Era

1
Computers for the Post-PC Era
  • David Patterson
  • University of California at Berkeley
  • Patterson@cs.berkeley.edu
  • UC Berkeley IRAM Group
  • UC Berkeley ISTORE Group
  • istore-group@cs.berkeley.edu
  • February 2000

2
Perspective on Post-PC Era
  • PostPC Era will be driven by 2 technologies:
  • 1) Gadgets: tiny embedded or mobile devices
  • ubiquitous in everything
  • e.g., successor to PDA, cell phone, wearable
    computers
  • 2) Infrastructure to Support such Devices
  • e.g., successor to Big Fat Web Servers, Database
    Servers

3
Outline
  • 1) Example microprocessor for PostPC gadgets
  • 2) Motivation and the ISTORE project vision
  • AME: Availability, Maintainability, Evolutionary
    growth
  • ISTORE's research principles
  • Proposed techniques for achieving AME
  • Benchmarks for AME
  • Conclusions and future work

4
New Architecture Directions
  • "media processing will become the dominant force
    in computer arch. and microprocessor design."
  • "...new media-rich applications ... involve
    significant real-time processing of continuous
    media streams, and make heavy use of vectors of
    packed 8-, 16-, 32-bit integer and Fl. Pt."
  • Needs include real-time response, continuous
    media data types (no temporal locality), fine
    grain parallelism, coarse grain parallelism,
    memory bandwidth
  • "How Multimedia Workloads Will Change Processor
    Design," Diefendorff & Dubey, IEEE Computer (9/97)

5
Intelligent RAM: IRAM
  • Microprocessor & DRAM on a single chip
  • 10X capacity vs. SRAM
  • on-chip memory latency 5-10X, bandwidth 50-100X
  • improve energy efficiency 2X-4X (no off-chip
    bus)
  • serial I/O 5-10X v. buses
  • smaller board area/volume
  • IRAM advantages extend to
  • a single chip system
  • a building block for larger systems

6
Revive Vector Architecture
  • Cost: $1M each? -> Single-chip CMOS MPU/IRAM
  • Low latency, high BW memory system? -> IRAM
  • Code density? -> Much smaller than VLIW
  • Compilers? -> For sale, mature (>20 years)
    (We retarget Cray compilers)
  • Performance? -> Easy scale speed with technology
  • Power/Energy? -> Parallel to save energy, keep
    performance
  • Limited to scientific applications? -> Multimedia
    apps vectorizable too: N*64b, 2N*32b, 4N*16b

7
V-IRAM1: Low Power v. High Perf.
[Block diagram. Labels: 2-way Superscalar Processor; Vector Instruction Queue; Vector Registers; Load/Store; 16K I cache; 16K D cache; 4 x 64 or 8 x 32 or 16 x 16; 4 x 64; Serial I/O; Memory Crossbar Switch; memory banks (M)]

8
VIRAM-1: System on a Chip
  • Prototype scheduled for tape-out mid 2000
  • 0.18 um EDL process
  • 16 MB DRAM, 8 banks
  • MIPS scalar core and caches @ 200 MHz
  • 4 64-bit vector unit pipelines @ 200 MHz
  • 4 100 MB parallel I/O lines
  • 17x17 mm, 2 Watts
  • 25.6 GB/s memory (6.4 GB/s per direction
    and per Xbar)
  • 1.6 Gflops (64-bit), 6.4 GOPs (16-bit)
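
The peak rates quoted above can be reproduced with simple arithmetic. The sketch below is only a plausibility check, not the project's own calculation: it assumes each pipeline completes two operations per element per cycle (for example a multiply-add, which the slide does not state) and that a 64-bit pipeline splits into four 16-bit lanes, in line with the "4 x 64 or 16 x 16" label on the V-IRAM1 diagram.

```python
# Peak-rate arithmetic for VIRAM-1, using figures from the slide.
CLOCK_HZ = 200e6      # vector unit clock (from the slide)
PIPELINES = 4         # 64-bit vector unit pipelines (from the slide)
OPS_PER_CYCLE = 2     # ASSUMPTION: e.g. multiply + add per element per cycle


def peak_ops_per_second(element_bits: int) -> float:
    """Peak operations per second when each 64-bit pipeline is split into lanes."""
    lanes_per_pipeline = 64 // element_bits
    return CLOCK_HZ * PIPELINES * lanes_per_pipeline * OPS_PER_CYCLE


gflops_64 = peak_ops_per_second(64) / 1e9   # 64-bit floating point
gops_16 = peak_ops_per_second(16) / 1e9     # 16-bit integer
print(f"{gflops_64:.1f} Gflops (64-bit), {gops_16:.1f} GOPs (16-bit)")
```

Under those assumptions the result matches the 1.6 Gflops and 6.4 GOPs on the slide.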

[Floorplan labels: Memory (64 Mbits / 8 MBytes); Xbar; I/O; Memory (64 Mbits / 8 MBytes)]

9
Media Kernel Performance

10
Base-line system comparison
  • All numbers in cycles/pixel
  • MMX and VIS results assume all data in L1 cache

11
IRAM Chip Challenges
  • Merged Logic-DRAM process cost: cost of wafer,
    impact on yield, testing cost of logic and DRAM
  • Price: on-chip DRAM v. separate DRAM chips?
  • Delay in transistor speeds, memory cell sizes in
    merged process vs. logic-only or DRAM-only
  • DRAM block: flexibility via DRAM compiler (vary
    size, width, no. subbanks) vs. fixed block
  • Apps: advantages in memory bandwidth, energy,
    system size to offset challenges?

12
Other examples: IBM Blue Gene
  • 1 PetaFLOPS in 2005 for $100M?
  • Application: Protein Folding
  • Blue Gene Chip
  • 32 multithreaded RISC processors + ??MB embedded
    DRAM + high-speed network interface on a single
    20 x 20 mm chip
  • 1 GFLOPS / processor
  • 2 x 2 Board: 64 chips (2K CPUs)
  • Rack: 8 boards (512 chips, 16K CPUs)
  • System: 64 racks (512 boards, 32K chips, 1M CPUs)
  • Total 1 million processors in just 2000 sq. ft.
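
The packaging numbers above multiply out as follows. This is a small check that uses only the figures on the slide; the constant names are illustrative.

```python
# Multiply out the Blue Gene packaging hierarchy given on the slide.
CPUS_PER_CHIP = 32
CHIPS_PER_BOARD = 64
BOARDS_PER_RACK = 8
RACKS_PER_SYSTEM = 64
GFLOPS_PER_CPU = 1

boards = BOARDS_PER_RACK * RACKS_PER_SYSTEM
chips = CHIPS_PER_BOARD * boards
cpus = CPUS_PER_CHIP * chips
peak_pflops = cpus * GFLOPS_PER_CPU / 1e6   # 1 PetaFLOPS = 1e6 GFLOPS

print(f"boards: {boards}")
print(f"chips:  {chips}")
print(f"CPUs:   {cpus}")
print(f"peak:   {peak_pflops:.2f} PetaFLOPS")
```

The totals agree with the slide's rounded figures: 512 boards, 32K chips, about 1M CPUs, and roughly 1 PetaFLOPS peak.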

13
Other examples: Sony PlayStation 2
  • Emotion Engine: 6.2 GFLOPS, 75 million polygons
    per second (Microprocessor Report, 13:5)
  • Superscalar MIPS core + vector coprocessor +
    graphics/DRAM
  • Claim: "Toy Story" realism brought to games

14
Outline
  • 1) Example microprocessor for PostPC gadgets
  • 2) Motivation and the ISTORE project vision
  • AME: Availability, Maintainability, Evolutionary
    growth
  • ISTORE's research principles
  • Proposed techniques for achieving AME
  • Benchmarks for AME
  • Conclusions and future work

15
The problem space: big data
  • Big demand for enormous amounts of data
  • today: high-end enterprise and Internet
    applications
  • enterprise: decision-support, data mining
    databases
  • online applications: e-commerce, mail, web,
    archives
  • future: infrastructure services, richer data
  • computational & storage back-ends for mobile
    devices
  • more multimedia content
  • more use of historical data to provide better
    services
  • Today's SMP server designs can't easily scale
  • Bigger scaling problems than performance!

16
Lampson: Systems Challenges
  • Systems that work
  • Meeting their specs
  • Always available
  • Adapting to changing environment
  • Evolving while they run
  • Made from unreliable components
  • Growing without practical limit
  • Credible simulations or analysis
  • Writing good specs
  • Testing
  • Performance
  • Understanding when it doesn't matter

"Computer Systems Research: Past and Future,"
keynote address, 17th SOSP, Dec. 1999, Butler
Lampson, Microsoft

17
Hennessy: What Should the "New World" Focus Be?
  • Availability
  • Both appliance & service
  • Maintainability
  • Two functions
  • Enhancing availability by preventing failure
  • Ease of SW and HW upgrades
  • Scalability
  • Especially of service
  • Cost
  • per device and per service transaction
  • Performance
  • Remains important, but it's not SPECint

"Back to the Future: Time to Return to
Longstanding Problems in Computer Systems?"
Keynote address, FCRC, May 1999, John
Hennessy, Stanford

18
The real scalability problems: AME
  • Availability
  • systems should continue to meet quality of
    service goals despite hardware and software
    failures
  • Maintainability
  • systems should require only minimal ongoing human
    administration, regardless of scale or complexity
  • Evolutionary Growth
  • systems should evolve gracefully in terms of
    performance, maintainability, and availability as
    they are grown/upgraded/expanded
  • These are problems at today's scales, and will
    only get worse as systems grow

19
The ISTORE project vision
  • Our goal
  • develop principles and investigate
    hardware/software techniques for building
    storage-based server systems that:
  • are highly available
  • require minimal maintenance
  • robustly handle evolutionary growth
  • are scalable to O(10000) nodes

20
Principles for achieving AME (1)
  • No single points of failure
  • Redundancy everywhere
  • Performance robustness is more important than
    peak performance
  • performance robustness implies that real-world
    performance is comparable to best-case
    performance
  • Performance can be sacrificed for improvements in
    AME
  • resources should be dedicated to AME
  • compare: biological systems spend > 50% of
    resources on maintenance
  • can make up performance by scaling system

21
Principles for achieving AME (2)
  • Introspection
  • reactive techniques to detect and adapt to
    failures, workload variations, and system
    evolution
  • proactive techniques to anticipate and avert
    problems before they happen

22
Outline
  • 1) Example microprocessor for PostPC gadgets
  • 2) Motivation and the ISTORE project vision
  • AME: Availability, Maintainability, Evolutionary
    growth
  • ISTORE's research principles
  • Proposed techniques for achieving AME
  • Benchmarks for AME
  • Conclusions and future work

23
Hardware techniques
  • Fully shared-nothing cluster organization
  • truly scalable architecture
  • architecture that tolerates partial failure
  • automatic hardware redundancy

24
Hardware techniques (2)
  • No Central Processor Unit: distribute processing
    with storage
  • Serial lines, switches also growing with Moore's
    Law; less need today to centralize vs. bus
    oriented systems
  • Most storage servers limited by speed of CPUs;
    why does this make sense?
  • Why not amortize sheet metal, power, cooling
    infrastructure for disk to add processor, memory,
    and network?
  • If AME is important, must provide resources to be
    used to help AME: local processors responsible
    for health and maintenance of their storage

25
Hardware techniques (3)
  • Heavily instrumented hardware
  • sensors for temp, vibration, humidity, power,
    intrusion
  • helps detect environmental problems before they
    can affect system integrity
  • Independent diagnostic processor on each node
  • provides remote control of power, remote console
    access to the node, selection of node boot code
  • collects, stores, processes environmental data
    for abnormalities
  • non-volatile flight recorder functionality
  • all diagnostic processors connected via
    independent diagnostic network

26
Hardware techniques (4)
  • On-demand network partitioning/isolation
  • Internet applications must remain available
    despite failures of components, therefore can
    isolate a subset for preventative maintenance
  • Allows testing, repair of online system
  • Managed by diagnostic processor and network
    switches via diagnostic network

27
Hardware techniques (5)
  • Built-in fault injection capabilities
  • Power control to individual node components
  • Injectable glitches into I/O and memory busses
  • Managed by diagnostic processor
  • Used for proactive hardware introspection
  • automated detection of flaky components
  • controlled testing of error-recovery mechanisms
  • Important for AME benchmarking (see next slide)

28
Hardware techniques (6)
  • Benchmarking
  • One reason for 1000X processor performance was
    ability to measure (vs. debate) which is better
  • e.g., which is most important to improve: clock
    rate, clocks per instruction, or instructions
    executed?
  • Need AME benchmarks
  • what gets measured gets done
  • benchmarks shape a field
  • quantification brings rigor
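
The three quantities named above combine in the standard CPU performance equation: time = instructions executed x clocks per instruction / clock rate. A minimal sketch of how measurement settles the "which is better" debate follows; all workload numbers are hypothetical and chosen only for illustration.

```python
def cpu_time(instructions: float, cpi: float, clock_hz: float) -> float:
    """Execution time in seconds: instructions x clocks per instruction / clock rate."""
    return instructions * cpi / clock_hz


# HYPOTHETICAL baseline and three candidate improvements (not from the talk).
base = cpu_time(1e9, 2.0, 500e6)
candidates = {
    "20% higher clock rate": cpu_time(1e9, 2.0, 600e6),
    "CPI 2.0 -> 1.5": cpu_time(1e9, 1.5, 500e6),
    "20% fewer instructions": cpu_time(0.8e9, 2.0, 500e6),
}

for name, seconds in candidates.items():
    print(f"{name}: {seconds:.2f} s, speedup {base / seconds:.2f}x")
```

Because every design change reduces to one measured number, alternatives can be ranked rather than argued about; the slide's point is that AME has no comparable metric yet.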

29
ISTORE-1 hardware platform
  • 80-node x86-based cluster, 1.4TB storage
  • cluster nodes are plug-and-play, intelligent,
    network-attached storage bricks
  • a single field-replaceable unit to simplify
    maintenance
  • each node is a full x86 PC w/256MB DRAM, 18GB
    disk
  • more CPU than NAS; fewer disks/node than cluster

Intelligent Disk "Brick": portable PC CPU (Pentium
II/266) + DRAM; redundant NICs (4 100 Mb/s
links); diagnostic processor
  • ISTORE Chassis
  • 80 nodes, 8 per tray
  • 2 levels of switches
  • 20 100 Mbit/s
  • 2 1 Gbit/s
  • Environment Monitoring
  • UPS, redundant PS,
  • fans, heat and vibration sensors...

30
A glimpse into the future?
  • System-on-a-chip enables computer, memory,
    redundant network interfaces without
    significantly increasing size of disk
  • ISTORE HW in 5-7 years
  • building block: 2006 MicroDrive integrated with
    IRAM
  • 9GB disk, 50 MB/sec from disk
  • connected via crossbar switch
  • 10,000 nodes fit into one rack!
  • O(10,000) scale is our ultimate design point

31
Software techniques
  • Fully-distributed, shared-nothing code
  • centralization breaks as systems scale up
    O(10000)
  • avoids single-point-of-failure front ends
  • Redundant data storage
  • required for high availability, simplifies
    self-testing
  • replication at the level of application objects
  • application can control consistency policy
  • more opportunity for data placement optimization

32
Software techniques (2)
  • River storage interfaces
  • NOW Sort experience: performance heterogeneity
    is the norm
  • e.g., disks: outer vs. inner track (1.5X),
    fragmentation
  • e.g., processors: load (1.5-5X)
  • So: demand-driven delivery of data to apps
  • via distributed queues and graduated declustering
  • for apps that can handle unordered data delivery
  • Automatically adapts to variations in performance
    of producers and consumers
  • Also helps with evolutionary growth of cluster
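
Why demand-driven delivery tolerates heterogeneity can be illustrated with a small deterministic simulation. This is a toy model, not River itself: it covers only the "consumers pull from a shared queue when idle" idea, omits graduated declustering, and uses hypothetical service times with one consumer running 5x slower than the rest.

```python
import heapq
from typing import List, Tuple


def static_partition_time(n_items: int, service_times: List[float]) -> float:
    """Finish time when items are split evenly up front; the slowest consumer dominates."""
    share, extra = divmod(n_items, len(service_times))
    return max((share + (1 if i < extra else 0)) * t
               for i, t in enumerate(service_times))


def demand_driven_time(n_items: int, service_times: List[float]) -> Tuple[float, List[int]]:
    """Finish time when each consumer takes the next item as soon as it is idle."""
    idle_at = [(0.0, i) for i in range(len(service_times))]  # (time idle, consumer)
    heapq.heapify(idle_at)
    taken = [0] * len(service_times)
    finish = 0.0
    for _ in range(n_items):
        now, i = heapq.heappop(idle_at)      # earliest idle consumer pulls an item
        done = now + service_times[i]
        taken[i] += 1
        finish = max(finish, done)
        heapq.heappush(idle_at, (done, i))
    return finish, taken


service = [1.0, 1.0, 1.0, 5.0]               # HYPOTHETICAL seconds per item
static = static_partition_time(400, service)
dynamic, taken = demand_driven_time(400, service)
print(f"static split:  {static:.0f} s")
print(f"demand-driven: {dynamic:.0f} s, items per consumer: {taken}")
```

In the static split the slow consumer sets the finish time for everyone; with pull-based delivery it simply takes fewer items, so total time tracks aggregate throughput instead.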

33
Software techniques (3)
  • Reactive introspection
  • Use statistical techniques to identify normal
    behavior and detect deviations from it
  • Policy-driven automatic adaptation to abnormal
    behavior once detected
  • initially, rely on human administrator to specify
    policy
  • eventually, system learns to solve problems on
    its own by experimenting on isolated subsets of
    the nodes
  • one candidate: reinforcement learning
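
One simple way to "identify normal behavior and detect deviations" is a threshold on standard deviations from a baseline, paired with an administrator-supplied policy table. The sketch below assumes that approach; the metric name, baseline samples, threshold, and action string are all hypothetical, and the talk does not commit to a particular statistical technique.

```python
import statistics
from typing import Callable, List, Optional


def build_detector(baseline: List[float], threshold: float = 3.0) -> Callable[[float], bool]:
    """Learn 'normal' from baseline samples; flag samples more than threshold stdevs away."""
    mean = statistics.fmean(baseline)
    stdev = statistics.stdev(baseline)

    def is_abnormal(sample: float) -> bool:
        return abs(sample - mean) > threshold * stdev

    return is_abnormal


# HYPOTHETICAL baseline data and administrator-specified policy.
BASELINE = {"disk_latency_ms": [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7]}
POLICY = {"disk_latency_ms": "isolate_node_and_run_diagnostics"}
DETECTORS = {name: build_detector(values) for name, values in BASELINE.items()}


def react(metric: str, sample: float) -> Optional[str]:
    """Return the policy action for an abnormal sample, or None if it looks normal."""
    if DETECTORS[metric](sample):
        return POLICY[metric]
    return None


print(react("disk_latency_ms", 10.4))   # within normal variation
print(react("disk_latency_ms", 25.0))   # far outside the baseline
```

The policy table stands in for the "human administrator specifies policy" stage; a learning system would replace the fixed mapping.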

34
Software techniques (4)
  • Proactive introspection
  • Continuous online self-testing of HW and SW
  • in deployed systems!
  • goal is to shake out Heisenbugs before they're
    encountered in normal operation
  • needs data redundancy, node isolation, fault
    injection
  • Techniques
  • fault injection: triggering hardware and software
    error handling paths to verify their
    integrity/existence
  • stress testing: push HW/SW to their limits
  • scrubbing: periodic restoration of potentially
    decaying hardware or software state
  • self-scrubbing data structures (like MVS)
  • ECC scrubbing for disks and memory
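
Scrubbing and fault injection fit together naturally: inject a fault, then confirm the scrubber finds and repairs it. The following is a minimal in-memory sketch under stated assumptions (replicated blocks protected by a CRC-32 recorded at write time); the class and function names are hypothetical and this is not ISTORE code.

```python
import zlib


class ReplicatedBlock:
    """A block kept as several replicas plus the checksum recorded at write time."""

    def __init__(self, data: bytes, copies: int = 3):
        self.replicas = [bytearray(data) for _ in range(copies)]
        self.crc = zlib.crc32(data)


def scrub(block: ReplicatedBlock) -> int:
    """Overwrite replicas whose checksum no longer matches; return how many were repaired."""
    intact = [r for r in block.replicas if zlib.crc32(bytes(r)) == block.crc]
    if not intact:
        raise RuntimeError("no intact replica left; cannot repair")
    good = bytes(intact[0])
    repaired = 0
    for i, replica in enumerate(block.replicas):
        if zlib.crc32(bytes(replica)) != block.crc:
            block.replicas[i] = bytearray(good)   # restore decayed state
            repaired += 1
    return repaired


block = ReplicatedBlock(b"customer record 42")
block.replicas[1][0] ^= 0x01            # fault injection: flip one bit in one replica
print("repaired on first pass: ", scrub(block))
print("repaired on second pass:", scrub(block))
```

Running the scrubber periodically in a deployed system exercises the repair path before a second failure makes the damage unrecoverable.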

35
Applications
  • ISTORE is not one super-system that demonstrates
    all these techniques!
  • Initially provide library to support AME goals
  • Initial application targets
  • cluster web/email servers
  • self-scrubbing data structures, online
    self-testing
  • statistical identification of normal behavior
  • decision-support database query execution system
  • River-based storage, replica management
  • information retrieval for multimedia data
  • self-scrubbing data structures, structuring
    performance-robust distributed computation

36
Outline
  • 1) Example microprocessor for PostPC gadgets
  • 2) Motivation and the ISTORE project vision
  • AME: Availability, Maintainability, Evolutionary
    growth
  • ISTORE's research principles
  • Proposed techniques for achieving AME
  • Benchmarks for AME
  • Conclusions and future work

37
Availability benchmark methodology
  • Goal: quantify variation in QoS metrics as events
    occur that affect system availability
  • Leverage existing performance benchmarks
  • to generate fair workloads
  • to measure & trace quality of service metrics
  • Use fault injection to compromise system
  • hardware faults (disk, memory, network, power)
  • software faults (corrupt input, driver error
    returns)
  • maintenance events (repairs, SW/HW upgrades)
  • Examine single-fault and multi-fault workloads
  • the availability analogues of performance micro-
    and macro-benchmarks

38
Methodology: reporting results
  • Results are most accessible graphically
  • plot change in QoS metrics over time
  • compare to normal behavior?
  • 99% confidence intervals calculated from no-fault
    runs
  • Graphs can be distilled into numbers?
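
One way a QoS-over-time graph might be distilled into numbers is sketched below. The sample data are hypothetical, the 99% interval uses a normal approximation on the mean of the no-fault runs (an assumption; the slides do not give the formula), and the 2-minute sampling interval follows the benchmark setup described in the backup slides.

```python
import math
import statistics
from typing import Dict, List, Tuple

Z_99 = 2.576  # two-sided 99% normal quantile (ASSUMES roughly normal data)


def normal_band(no_fault_runs: List[float]) -> Tuple[float, float]:
    """99% confidence interval for mean hits/second, from no-fault runs."""
    mean = statistics.fmean(no_fault_runs)
    half = Z_99 * statistics.stdev(no_fault_runs) / math.sqrt(len(no_fault_runs))
    return mean - half, mean + half


def summarize(trace: List[float], band: Tuple[float, float],
              interval_minutes: int = 2) -> Dict[str, float]:
    """Reduce a QoS trace to time spent below normal and the worst relative drop."""
    low, _ = band
    below = [x for x in trace if x < low]
    worst = 1 - min(below) / low if below else 0.0
    return {"minutes_below_normal": len(below) * interval_minutes,
            "worst_drop_fraction": worst}


# HYPOTHETICAL hits/second samples, one per 2-minute interval.
no_fault = [100, 102, 98, 101, 99, 100]
with_fault = [100, 99, 60, 70, 85, 99, 100]

band = normal_band(no_fault)
summary = summarize(with_fault, band)
print(f"normal band: {band[0]:.1f} to {band[1]:.1f} hits/s")
print(summary)
```

Two numbers like these (duration of degraded service and depth of the dip) make fault responses comparable across systems, at the cost of hiding the shape of the curve.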

39
Example results: software RAID-5
  • Test systems: Linux/Apache and Win2000/IIS
  • SpecWeb 99 to measure hits/second as QoS metric
  • fault injection at disks based on empirical fault
    data
  • transient, correctable, uncorrectable, timeout
    faults
  • 15 single-fault workloads injected per system
  • only 4 distinct behaviors observed
  • (A) no effect
  • (B) system hangs
  • (C) RAID enters degraded mode
  • (D) RAID enters degraded mode and starts
    reconstruction
  • both systems hung (B) on simulated disk hangs
  • Linux exhibited (D) on all other errors
  • Windows exhibited (A) on transient errors and (C)
    on uncorrectable, sticky errors

40
Example results: multiple faults
[Figure panels: Windows 2000/IIS; Linux/Apache]
  • Windows reconstructs 3x faster than Linux
  • Windows reconstruction noticeably affects
    application performance, while Linux
    reconstruction does not

41
Conclusions (1): Benchmarks
  • Linux and Windows take opposite approaches to
    managing benign and transient faults
  • Linux is paranoid and stops using a disk on any
    error
  • Windows ignores most benign/transient faults
  • Windows is more robust except when disk is truly
    failing
  • Linux and Windows have different reconstruction
    philosophies
  • Linux uses idle bandwidth for reconstruction
  • Windows steals app. bandwidth for reconstruction
  • Windows rebuilds fault-tolerance more quickly
  • Win2k favors fault-tolerance over performance;
    Linux favors performance over fault-tolerance

42
Conclusions (2): ISTORE
  • Availability, Maintainability, and Evolutionary
    growth are key challenges for server systems
  • more important even than performance
  • ISTORE is investigating ways to bring AME to
    large-scale, storage-intensive servers
  • via clusters of network-attached,
    computationally-enhanced storage nodes running
    distributed code
  • via hardware and software introspection
  • we are currently performing application studies
    to investigate and compare techniques
  • Availability benchmarks: a powerful tool?
  • revealed undocumented design decisions affecting
    SW RAID availability on Linux and Windows 2000

43
Conclusions (3)
  • IRAM attractive for two Post-PC applications
    because of low power, small size, high memory
    bandwidth
  • Gadgets: Embedded/Mobile devices
  • Infrastructure: Intelligent Storage and Networks
  • PostPC infrastructure requires
  • New Goals: Availability, Maintainability,
    Evolution
  • New Principles: Introspection, Performance
    Robustness
  • New Techniques: Isolation/fault insertion,
    Software scrubbing
  • New Benchmarks: measure, compare AME metrics

44
Berkeley Future work
  • IRAM: fab and test chip
  • ISTORE
  • implement AME-enhancing techniques in a variety
    of Internet, enterprise, and info retrieval
    applications
  • select the best techniques and integrate into a
    generic runtime system with AME API
  • add maintainability benchmarks
  • can we quantify administrative work needed to
    maintain a certain level of availability?
  • Perhaps look at data security via encryption?
  • Even consider denial of service?

45
The UC Berkeley IRAM/ISTORE Projects: Computers
for the PostPC Era
  • For more information
  • http://iram.cs.berkeley.edu/istore
  • istore-group@cs.berkeley.edu

46
Backup Slides
  • (mostly in the area of benchmarking)

47
Case study
  • Software RAID-5 plus web server
  • Linux/Apache vs. Windows 2000/IIS
  • Why software RAID?
  • well-defined availability guarantees
  • RAID-5 volume should tolerate a single disk
    failure
  • reduced performance (degraded mode) after failure
  • may automatically rebuild redundancy onto spare
    disk
  • simple system
  • easy to inject storage faults
  • Why web server?
  • an application with measurable QoS metrics that
    depend on RAID availability and performance
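
The single-failure guarantee comes from parity: in each stripe the parity block is the XOR of the data blocks, so any one missing block equals the XOR of the survivors. A minimal sketch for one stripe follows; the block contents are hypothetical, and real RAID-5 also rotates parity across disks, which is omitted here.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the disk that holds data[1]; rebuild it from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print("rebuilt block matches original:", rebuilt == data[1])
```

Degraded mode is slower because every read of the missing block needs this reconstruction, and rebuilding onto a spare repeats it for every stripe.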

48
Benchmark environment: metrics
  • QoS metrics measured
  • hits per second
  • roughly tracks response time in our experiments
  • degree of fault tolerance in storage system
  • Workload generator and data collector
  • SpecWeb99 web benchmark
  • simulates realistic high-volume user load
  • mostly static read-only workload; some dynamic
    content
  • modified to run continuously and to measure
    average hits per second over each 2-minute
    interval

49
Benchmark environment: faults
  • Focus on faults in the storage system (disks)
  • How do disks fail?
  • according to Tertiary Disk project, failures
    include
  • recovered media errors
  • uncorrectable write failures
  • hardware errors (e.g., diagnostic failures)
  • SCSI timeouts
  • SCSI parity errors
  • note: no head crashes, no fail-stop failures

50
Disk fault injection technique
  • To inject reproducible failures, we replaced one
    disk in the RAID with an emulated disk
  • a PC that appears as a disk on the SCSI bus
  • I/O requests processed in software, reflected to
    local disk
  • fault injection performed by altering SCSI
    command processing in the emulation software
  • Types of emulated faults
  • media errors (transient, correctable,
    uncorrectable)
  • hardware errors (firmware, mechanical)
  • parity errors
  • power failures
  • disk hangs/timeouts

51
System configuration
  • RAID-5 Volume: 3GB capacity, 1GB used per disk
  • 3 physical disks, 1 emulated disk, 1 emulated
    spare disk
  • 2 web clients connected via 100Mb switched
    Ethernet

52
Results: single-fault experiments
  • One experiment for each type of fault (15 total)
  • only one fault injected per experiment
  • no human intervention
  • system allowed to continue until stabilized or
    crashed
  • Four distinct system behaviors observed
  • (A) no effect: system ignores fault
  • (B) RAID system enters degraded mode
  • (C) RAID system begins reconstruction onto spare
    disk
  • (D) system failure (hang or crash)

53
State of the Art: Ultrastar 72ZX
  • 73.4 GB, 3.5 inch disk
  • 2/MB
  • 16 MB track buffer
  • 11 platters, 22 surfaces
  • 15,110 cylinders
  • 7 Gbit/sq. in. areal density
  • 17 watts (idle)
  • 0.1 ms controller time
  • 5.3 ms avg. seek (seek 1 track > 0.6 ms)
  • 3 ms 1/2 rotation
  • 37 to 22 MB/s to media
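
The timing figures above add up to an average access time in the usual way: controller time + average seek + half a rotation + transfer time. The sketch below uses only the numbers on the slide; the 8 KB request size is a hypothetical choice, and 1 MB is taken as 1024 KB.

```python
CONTROLLER_MS = 0.1
AVG_SEEK_MS = 5.3
HALF_ROTATION_MS = 3.0
OUTER_MB_PER_S = 37.0     # media rate on outer tracks
INNER_MB_PER_S = 22.0     # media rate on inner tracks


def access_time_ms(request_kb: float, media_mb_per_s: float) -> float:
    """Average time for one random request, in milliseconds."""
    transfer_ms = request_kb / 1024 / media_mb_per_s * 1000
    return CONTROLLER_MS + AVG_SEEK_MS + HALF_ROTATION_MS + transfer_ms


# Half a rotation in 3 ms implies a full rotation in 6 ms.
rpm = 60_000 / (2 * HALF_ROTATION_MS)

print(f"rotation speed: {rpm:.0f} RPM")
for label, rate in (("outer", OUTER_MB_PER_S), ("inner", INNER_MB_PER_S)):
    # 8 KB is a HYPOTHETICAL request size chosen for illustration.
    print(f"8 KB read, {label} tracks: {access_time_ms(8, rate):.2f} ms")
```

For small requests the mechanical terms dominate: seek and rotation account for over 8 ms, while the transfer itself takes a fraction of a millisecond.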

[Disk diagram labels: Embed. Proc.; Track; Sector; Cylinder; Track Buffer; Platter; Arm; Head]
source: www.ibm.com, www.pricewatch.com; 2/14/00