Multicore: Panic or Panacea

1
Multicore: Panic or Panacea?
  • Mikko H. Lipasti
  • Associate Professor
  • Electrical and Computer Engineering
  • University of Wisconsin-Madison

http://www.ece.wisc.edu/pharm
2
Multicore Mania
  • First, servers
  • IBM Power4, 2001
  • Then desktops
  • AMD Athlon X2, 2005
  • Then laptops
  • Intel Core Duo, 2006
  • Soon, your cellphone
  • ARM MPCore, prototypes for a while now

3
What is behind this trend?
  • Moore's Law
  • Chip power consumption
  • Single-thread performance trend
  • Source: Intel

4
Dynamic Power
  • Static CMOS: current flows when active
  • Combinational logic evaluates new inputs
  • Flip-flop or latch captures new value (clock edge)
  • Terms
  • C: capacitance of circuit (wire length, number and size of transistors)
  • V: supply voltage
  • A: activity factor
  • f: frequency
  • Future: fundamentally power-constrained
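
The equation from this slide did not survive in the transcript. The sketch below assumes the standard CMOS switching-power relation, P = A · C · V² · f, using the four terms listed above. The function name and all numeric values are hypothetical and chosen only for illustration.

```python
def dynamic_power(c_farads, v_volts, activity, f_hertz):
    """Switching power in watts, assuming P = A * C * V^2 * f."""
    return activity * c_farads * v_volts ** 2 * f_hertz


# Hypothetical baseline: 1 nF switched capacitance, 1.2 V, 10% activity, 3 GHz.
base = dynamic_power(c_farads=1e-9, v_volts=1.2, activity=0.1, f_hertz=3e9)

# Scale voltage and frequency down together by 20%.
scaled = dynamic_power(c_farads=1e-9, v_volts=0.96, activity=0.1, f_hertz=2.4e9)

# Power falls with the cube of the scaling factor (V^2 times f).
ratio = scaled / base
print(f"base={base:.3f} W, scaled={scaled:.3f} W, ratio={ratio:.3f}")
```

Because voltage enters squared, lowering voltage and frequency together cuts power much faster than it cuts clock rate, which is the lever that makes several slower cores fit in one power budget.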

5
Easy answer: Multicore
6
Amdahl's Law
[Figure: Amdahl's Law. Axes: CPUs (1 to n) vs. Time; regions labeled f and 1-f]
  • f: fraction that can run in parallel
  • 1-f: fraction that must run serially
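
A minimal sketch of Amdahl's Law with the two fractions defined above: speedup on n CPUs is 1 / ((1 - f) + f / n). The function name is an assumption made for this example.

```python
def amdahl_speedup(f, n):
    """Speedup on n CPUs when a fraction f of the work can run in parallel."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - f) + f / n)


# With f = 0.9 the speedup can never exceed 1 / (1 - f) = 10,
# no matter how many CPUs are added.
for n in (1, 2, 4, 8, 16):
    print(n, round(amdahl_speedup(0.9, n), 2))
```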

7
Fixed Chip Power Budget
[Figure: CPUs axis, up to n]
  • Amdahl's Law
  • Ignores (power) cost of n cores
  • Revised Amdahl's Law
  • More cores → each core is slower
  • Parallel speedup < n
  • Serial portion (1-f) takes longer
  • Also, interconnect and scaling overhead
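
The revised law can be sketched by giving each of the n cores a relative speed below 1. The transcript does not preserve the exact model used in the talk, so this example assumes a hypothetical one: a core that receives 1/n of the chip power runs at relative speed (1/n)^alpha, with alpha = 0.5 picked purely for illustration. Interconnect and scaling overhead are ignored.

```python
def fixed_power_speedup(f, n, alpha=0.5):
    """Speedup over one full-power core when n cores share a fixed power budget.

    Assumption (not from the talk): a core given 1/n of the chip power
    runs at relative speed (1/n) ** alpha.
    """
    core_speed = (1.0 / n) ** alpha
    serial_time = (1.0 - f) / core_speed        # serial part runs on one slow core
    parallel_time = f / (n * core_speed)        # parallel part uses all n slow cores
    return 1.0 / (serial_time + parallel_time)


for n in (1, 4, 16, 64):
    print(n, round(fixed_power_speedup(0.9, n), 2))
```

Under these assumptions the speedup for f = 0.9 peaks at a modest core count and then falls, because the serial portion keeps getting slower as each core gets less power.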

8
Fixed Power Scaling
  • Fixed power budget forces slow cores
  • Serial code quickly dominates

9
Predictions and Challenges
  • Parallel scaling limits many-core
  • >4 cores only for well-behaved programs
  • Optimistic about new applications
  • Interconnect overhead
  • Single-thread performance
  • Will degrade unless we innovate
  • Parallel programming
  • Express/extract parallelism in new ways
  • Retrain programming workforce

10
Research Agenda
  • Programming for parallelism
  • Sources of parallelism
  • New applications, tools, and approaches
  • Single-thread performance and power
  • Most attractive to programmer/user
  • Chip multiprocessor overheads
  • Interconnect, caches, coherence, fairness

11
Finding Parallelism
  • Functional parallelism
  • Car: engine, brakes, entertainment, nav, ...
  • Game: physics, logic, UI, render, ...
  • Automatic extraction [UW Multiscalar]
  • Decompose serial programs
  • Data parallelism
  • Vector, matrix, db table, pixels, ...
  • Request parallelism
  • Web, shared database, telephony, ...
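
As one illustration of data parallelism, the sketch below splits a list of pixel values into chunks and applies the same operation to each chunk on a pool of workers. The function names and the brightening operation are hypothetical. A thread pool is used only to keep the example simple; in CPython, threads do not speed up CPU-bound work, and `ProcessPoolExecutor` offers the same `map` interface for real multicore execution.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten_chunk(chunk, amount=10):
    """Apply the same operation to every pixel in one chunk."""
    return [min(255, p + amount) for p in chunk]


def parallel_brighten(pixels, workers=4):
    """Split the data into roughly equal chunks and process them concurrently."""
    size = max(1, -(-len(pixels) // workers))  # ceiling division
    chunks = [pixels[i:i + size] for i in range(0, len(pixels), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input chunks.
        results = list(pool.map(brighten_chunk, chunks))
    return [p for chunk in results for p in chunk]


print(parallel_brighten([0, 100, 250, 255]))
```

No chunk depends on any other, which is why this kind of work is comparatively easy to spread across cores.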

12
Balancing Work
  • Amdahl's parallel phase f: all cores busy
  • If not perfectly balanced
  • (1-f) term grows (f not fully parallel)
  • Performance scaling suffers
  • Manageable for data- and request-parallel apps
  • Very difficult problem for the other two
  • Functional parallelism
  • Automatically extracted
  • Scale power to mismatch [Multiscalar]
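
The cost of imbalance can be sketched by letting the parallel phase last as long as its most heavily loaded core. The function name and the share values are assumptions made for this example; `shares` is the fraction of the parallel work assigned to each core.

```python
def speedup_with_imbalance(f, shares):
    """Speedup when the parallel work f is divided unevenly across cores.

    The parallel phase finishes only when the busiest core finishes,
    so its duration is f * max(shares).
    """
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    parallel_time = f * max(shares)
    return 1.0 / ((1.0 - f) + parallel_time)


balanced = speedup_with_imbalance(0.9, [0.25, 0.25, 0.25, 0.25])
skewed = speedup_with_imbalance(0.9, [0.40, 0.20, 0.20, 0.20])
print(round(balanced, 2), round(skewed, 2))
```

With perfectly even shares this reduces to Amdahl's Law; any skew leaves cores idle while the busiest one finishes, which acts like a larger serial fraction.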

13
Coordinating Work
  • Synchronization
  • Some data somewhere is shared
  • Coordinate/order updates and reads
  • Otherwise → chaos
  • Traditionally: locks and mutual exclusion
  • Hard to get right, even harder to tune for performance
  • Research: Transactional Memory [UW Multifacet]
  • Programmer: declare potential conflict
  • Hardware and/or software: speculate and check
  • Commit or roll back and retry
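
The speculate, check, then commit-or-retry pattern can be illustrated with a toy, software-only example. This is not the UW Multifacet design or any real transactional memory system; the class and function names are hypothetical. Each update computes its new value without holding a lock, then commits only if no other thread has committed in the meantime.

```python
import threading


class VersionedCell:
    """A shared value with a version counter used to detect conflicts."""

    def __init__(self, value):
        self._value = value
        self._version = 0
        self._commit_lock = threading.Lock()  # held only briefly, at read/commit

    def read(self):
        with self._commit_lock:
            return self._value, self._version

    def try_commit(self, new_value, seen_version):
        """Commit if nobody else committed since seen_version; else report conflict."""
        with self._commit_lock:
            if self._version != seen_version:
                return False
            self._value = new_value
            self._version += 1
            return True


def atomically(cell, update):
    """Speculate on a snapshot, check at commit time, retry on conflict."""
    while True:
        value, version = cell.read()
        if cell.try_commit(update(value), version):
            return


def run_demo(threads=4, increments=1000):
    cell = VersionedCell(0)

    def worker():
        for _ in range(increments):
            atomically(cell, lambda v: v + 1)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return cell.read()[0]


print(run_demo())
```

The caller only states which update must appear atomic; conflict detection and retry are handled by the helper, which is the division of labor the bullets above describe.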

14
Single-thread Performance
  • Still most attractive source of performance
  • Speeds up parallel and serial phases
  • Can use it to buy back power
  • Must focus on power consumption
  • Performance benefit weighed against power cost

15
Single-thread Performance
  • Hardware accelerators and circuits
  • Domain-specific [UW MESA]
  • Reconfigurable [UW Compton]
  • VLSI and design automation [UW WISCAD, Kursun]
  • Increasing frequency
  • Seems prohibitive: clock power
  • Clever clocking schemes can help [UW Pharm]
  • Increasing instruction-level parallelism
  • [UW Multiscalar, UW Pharm, UW Smith]
  • Without blowing power budget
  • Alternatively, reduce power for same performance

16
Chip Multiprocessor Overheads
  • Core interconnect [UW Pharm]
  • 80% of chip power [Borkar, ISLPED '07 panel]
  • Need fundamentally different approach
  • Revisit circuit switching
  • Cache coherence [UW Multifacet, Pharm]
  • Match workload behavior
  • Optimize for on-chip communication

17
Chip Multiprocessor Overheads
  • Shared caches [UW Multifacet, Multiscalar, Smith]
  • On-chip memory can be shared
  • Optimize replacement, replication
  • Fairness [UW Smith]
  • Maintain performance isolation
  • Share resources fairly (memory, caches)

18
Research Groups at UW
19
Conclusion
  • Forecast
  • Limited multicore (4) is here to stay
  • Manycore (>4) will find its place
  • Hardware Challenges
  • Single-thread performance and power
  • Multicore overhead
  • Software Challenges
  • Finding application parallelism
  • Creating correct parallel programs
  • Creating scalable parallel programs

20
Questions?
  • http://www.ece.wisc.edu/pharm