Title: Instruction-based System-level Power Evaluation of System-on-a-chip Peripheral Cores
1 Instruction-based System-level Power Evaluation of System-on-a-chip Peripheral Cores
- Tony Givargis, Frank Vahid
  - Dept. of Computer Science & Engineering, University of California, Riverside
  - Also with the Center for Embedded Computer Systems, UC Irvine
- Joerg Henkel
  - NEC C&C Research, Princeton, New Jersey
This work was supported by the National Science Foundation under grant CCR-9876006, and by a Design Automation Conference graduate scholarship.
2 System-on-a-chip (SOC)
- Want to explore alternative cores, parameter settings, and applications
- Gate/RT-level simulation is too slow
[Figure: SOC block diagram with application, microprocessor, cache, memory, and bridge]
3 SOC System-level Power Estimation
- Microprocessor
  - Tiwari/Malik/Wolfe 94: instruction set simulator
  - Marculescu/Pedram 96: instruction trace reduction
- Microprocessor plus cache, memory bus
  - Simunic/Benini/DeMicheli 99: extended instruction simulator
  - Givargis/Vahid/Henkel 99: trace reductions
4 Core Providers Step 1: Instruction-based System-Level Model Creation
- System simulation model already commonly used, and required in VSIA standard
- Executes 1000x faster than gate-level model
[Figure: core database containing UART, JPEG decode, ...]
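As a rough illustration of what an instruction-based system-level model might look like, the sketch below models a UART as an object whose methods are its "instructions". The class name, fields, and behavior are hypothetical and chosen only for illustration; the instruction names mirror those listed for the UART in the back-annotation slide.

```python
class UartModel:
    """Hypothetical instruction-level UART model (illustrative only)."""

    def __init__(self):
        self.tx_enabled = False
        self.rx_enabled = False
        self.tx_log = []    # bytes "sent" on the line
        self.rx_queue = []  # bytes waiting to be received

    def reset(self):
        # Return the core to its initial state
        self.tx_enabled = False
        self.rx_enabled = False
        self.tx_log.clear()
        self.rx_queue.clear()

    def enable_tx(self):
        self.tx_enabled = True

    def enable_rx(self):
        self.rx_enabled = True

    def send(self, byte):
        if not self.tx_enabled:
            raise RuntimeError("transmitter disabled")
        self.tx_log.append(byte & 0xFF)

    def receive(self):
        if not self.rx_enabled:
            raise RuntimeError("receiver disabled")
        # Return the oldest pending byte, or None if nothing arrived
        return self.rx_queue.pop(0) if self.rx_queue else None


uart = UartModel()
uart.reset()
uart.enable_tx()
uart.send(0x41)
```

Each method call stands for one core instruction, which is what lets a model of this kind run far faster than a gate-level netlist.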
5 Core Providers Step 2: Low-level Per-instruction Power Evaluation
- Measure power of gate/layout model, per instruction
- Use unique testbench per instruction; may take hours/days
- Low-level model differentiates cores from other SOC modules, enabling accurate power estimation
- Must account for core parameters
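One way to picture the per-instruction characterization is as an integration of a sampled power trace from each testbench run. The sketch below assumes fixed-step power samples in mW and keys the result by instruction and by one core parameter; the function names, the `buffer_size` parameter, and all sample values are made up for illustration.

```python
def energy_uj(power_mw, dt_us):
    """Integrate fixed-step power samples (mW) over dt_us (us) into energy (uJ)."""
    # mW * us = nJ, so divide by 1000 to express the result in uJ
    return sum(power_mw) * dt_us / 1000.0


def characterize(traces, dt_us):
    """Map each (instruction, parameter) testbench trace to an energy value."""
    return {key: energy_uj(samples, dt_us) for key, samples in traces.items()}


# Made-up traces: one testbench per instruction and per parameter setting.
# Keys are (instruction, buffer_size); buffer_size is a hypothetical parameter.
traces = {
    ("reset", 8): [2.0, 2.0, 1.0],
    ("send", 8): [5.0, 6.0, 5.0, 4.0],
    ("send", 16): [6.0, 7.0, 6.0, 5.0],
}

table = characterize(traces, dt_us=100.0)
for key, uj in sorted(table.items()):
    print(key, uj)
```

Keying the table by parameter setting is one simple way to account for core parameters: the same instruction gets a separate entry per configuration.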
6 Core Providers Step 3: Back Annotation of System Model
[Figure: UART system model in the core database (alongside JPEG decode, ...), back-annotated with per-instruction values]
Instruction    uJtot
Reset()        13
Enable_tx()    23
Enable_rx()    18
Send()         76
Receive()      44
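The back-annotation step can be sketched as wrapping each instruction of the system model so that a call adds that instruction's measured value to a running total. The sketch below uses the five UART values from this slide and reads `uJtot` as a total in microjoules, which is an assumption; the decorator, class name, and method bodies are hypothetical.

```python
import functools

# Per-instruction values from the slide's UART listing
# (interpreted here as microjoules, which is an assumption).
ENERGY_UJ = {"reset": 13, "enable_tx": 23, "enable_rx": 18,
             "send": 76, "receive": 44}


def back_annotate(method):
    """Wrap a model method so each call adds its energy to the running total."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        self.uj_tot += ENERGY_UJ[method.__name__]
        return method(self, *args, **kwargs)
    return wrapper


class AnnotatedUart:
    """Hypothetical UART model; only the energy bookkeeping matters here."""

    def __init__(self):
        self.uj_tot = 0
        self.sent = []

    @back_annotate
    def reset(self):
        self.sent.clear()

    @back_annotate
    def enable_tx(self):
        pass

    @back_annotate
    def enable_rx(self):
        pass

    @back_annotate
    def send(self, byte):
        self.sent.append(byte)

    @back_annotate
    def receive(self):
        return None


uart = AnnotatedUart()
uart.reset()
uart.enable_tx()
uart.send(0x48)
uart.send(0x69)
print(uart.uj_tot)  # 13 + 23 + 76 + 76 = 188
```

Keeping the energy table separate from the functional code means the same model can be re-annotated when the core is re-characterized.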
7 Core Power Modes: Requires Extra Effort by Core Provider
- Unlike microprocessor instructions, certain peripheral core instructions can greatly modify the power consumption of other instructions
- Must create a power mode transition function, and measure power per instruction per mode
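A power mode transition function together with a per-instruction, per-mode energy table could be sketched as two lookup tables, as below. The mode names (`idle`, `tx_on`), the transitions, and every energy value are hypothetical; the slides do not list the actual modes.

```python
# Hypothetical transition function: (current mode, instruction) -> next mode.
# Pairs that are not listed leave the mode unchanged.
TRANSITION = {
    ("idle", "enable_tx"): "tx_on",
    ("tx_on", "reset"): "idle",
}

# Hypothetical energy per (mode, instruction), in arbitrary energy units.
ENERGY = {
    ("idle", "reset"): 13,
    ("idle", "enable_tx"): 23,
    ("idle", "send"): 5,
    ("tx_on", "reset"): 13,
    ("tx_on", "enable_tx"): 2,
    ("tx_on", "send"): 76,
}


def run(instructions, mode="idle"):
    """Accumulate energy for an instruction sequence, tracking the power mode."""
    total = 0
    for instr in instructions:
        total += ENERGY[(mode, instr)]                 # cost depends on mode
        mode = TRANSITION.get((mode, instr), mode)     # then update the mode
    return total, mode


total, final_mode = run(["send", "enable_tx", "send", "reset"])
print(total, final_mode)  # 5 + 23 + 76 + 13 = 117, ends in "idle"
```

The same `send` instruction costs a different amount before and after `enable_tx`, which is the effect that a single per-instruction value cannot capture.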
8 User Performs System Simulation, Which Yields Power Data
- Simulation takes only seconds or minutes
[Figure: SOC with application, microprocessor, cache, memory, bridge, and peripherals drawn from the core database (UART, JPEG decode, ...)]
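From the user's side, the power data falls out of an ordinary system simulation: as the application issues instructions to peripheral cores, the annotated values are summed per core. The sketch below assumes an application trace of (core, instruction) pairs; the UART values come from the back-annotation slide, while the JPEG entries and the trace itself are made up.

```python
from collections import defaultdict

# Core database: per-instruction energy for each core.
CORE_DB = {
    "uart": {"reset": 13, "enable_tx": 23, "send": 76},  # from the UART slide
    "jpeg": {"reset": 10, "decode_block": 500},          # made-up values
}


def simulate(app_trace):
    """Sum annotated energy per core for a trace of (core, instruction) pairs."""
    energy = defaultdict(int)
    for core, instr in app_trace:
        energy[core] += CORE_DB[core][instr]
    return dict(energy)


# Hypothetical application behavior
trace = [
    ("uart", "reset"),
    ("jpeg", "reset"),
    ("jpeg", "decode_block"),
    ("uart", "enable_tx"),
    ("uart", "send"),
]

report = simulate(trace)
print(report)  # uart: 13 + 23 + 76 = 112, jpeg: 10 + 500 = 510
```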
9 Results: Image-decode Accelerator
- Examined 3 peripheral cores: UART, DMA, JPEG
- Compared our instruction-based system-level method with
  - Gate-level simulation: slow but accurate
  - Databook: RT-level cycle-accurate simulation, used databook average-power values
[Chart: Energy (mJ), axis from 0 to 2000, for the UART, DMA, and JPEG cores]
10 Results: Importance of Power Modes
- Proper power-mode selection is critical for peripheral cores
- Too few modes or wrong modes can lead to large error
[Figure: UART example]
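To see why too few modes can hurt, the sketch below compares a two-mode estimate against one that collapses both modes into a single average value per instruction. The mode names, energy values, and workload are invented for illustration and are not the UART data from the slide.

```python
# Made-up energy of one "send" in each of two power modes
SEND_ENERGY = {"low_power": 5.0, "active": 76.0}


def mode_aware(mode_sequence):
    """Estimate that uses the correct per-mode energy for every send."""
    return sum(SEND_ENERGY[m] for m in mode_sequence)


def mode_blind(mode_sequence):
    """Estimate that ignores modes and uses one average energy per send."""
    avg = sum(SEND_ENERGY.values()) / len(SEND_ENERGY)
    return avg * len(mode_sequence)


def percent_error(estimate, reference):
    return 100.0 * abs(estimate - reference) / reference


# Hypothetical workload: nine sends in active mode, one in low-power mode
workload = ["active"] * 9 + ["low_power"]

reference = mode_aware(workload)   # 9 * 76 + 5 = 689
estimate = mode_blind(workload)    # 40.5 * 10 = 405
print(round(percent_error(estimate, reference), 1))  # about 41.2
```

With these made-up numbers the single-mode estimate is off by roughly 41%, because the workload spends most of its time in the expensive mode.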
11 Conclusions
- Introduced instruction-based method that is
  - Accurate (less than 5% error)
  - Fast (1000x speedup over gate-level)
  - Compatible with current core-based methodology
- Concept of power modes is necessary for accuracy
- Future work includes
  - Trace-simulator-based approach (10x speedup)
  - Trace-analysis-based approach (100x speedup)