Instruction-Level Parallelism for Low-Power Embedded Processors

1
Instruction-Level Parallelism for Low-Power
Embedded Processors
Ph.D. Thesis, Jean-Michel Puiatti, EPFL

January 23, 2001
Presented By
Anup Gangwar

2
Introduction

Need for high-performance, low-power processors
Synergistic hardware-compiler design for EPIC or
VLIW-like architectures
A new variable-length instruction scheme
Full predication support in hardware

3
Outline

Instruction-Level Parallelism
Power Consumption in VLSI Circuits
A Look at Available Mobile and DSP Processors
High-Level Evaluation of A Low-Power VLIW
Processor
The DEVIL Low-Power Processor
A Step Towards Predicated Execution
Conclusion

4
ILP Concepts and Limitations

Data Dependences
Flow Dependence or RAW
Anti Dependence or WAR
Output Dependence or WAW
Reduction of critical path
Control Dependences
Resource Conflicts
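
The three data dependences listed above can be told apart by comparing the registers that two instructions read and write. The sketch below is only an illustration: the `Instr` record, the register names, and the `dependences` helper are hypothetical and do not come from the thesis.

```python
from typing import FrozenSet, List, NamedTuple


class Instr(NamedTuple):
    """Hypothetical instruction record: the registers it writes and reads."""
    text: str
    writes: FrozenSet[str]
    reads: FrozenSet[str]


def dependences(first: Instr, second: Instr) -> List[str]:
    """Return the data dependences of `second` on the earlier `first`."""
    found = []
    if first.writes & second.reads:
        found.append("RAW")  # flow: second reads what first wrote
    if first.reads & second.writes:
        found.append("WAR")  # anti: second overwrites what first read
    if first.writes & second.writes:
        found.append("WAW")  # output: both write the same register
    return found


i1 = Instr("r1 = r2 + r3", frozenset({"r1"}), frozenset({"r2", "r3"}))
i2 = Instr("r4 = r1 * r5", frozenset({"r4"}), frozenset({"r1", "r5"}))
i3 = Instr("r2 = r6 - r7", frozenset({"r2"}), frozenset({"r6", "r7"}))
i4 = Instr("r1 = r8 + r9", frozenset({"r1"}), frozenset({"r8", "r9"}))

print(dependences(i1, i2))  # RAW on r1
print(dependences(i1, i3))  # WAR on r2
print(dependences(i1, i4))  # WAW on r1
```

Only the RAW case is a true dependence. WAR and WAW come from register reuse and can in principle be removed by renaming.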

6
Achieving ILP: Pipelining

Control dependencies affect pipelined execution
Data dependencies affect pipelined execution
Resource conflicts affect pipelined execution
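
A rough way to see how these three effects hurt a pipeline is the usual textbook cycle-count model, in which every dependence or conflict shows up as extra stall cycles. The function name, the 5-stage depth, and the stall counts below are illustrative assumptions, not figures from the thesis.

```python
def pipeline_cycles(n_instructions: int, n_stages: int, stall_cycles: int = 0) -> int:
    """Cycles to run n_instructions on an n_stages-deep scalar pipeline.

    Textbook model: the first instruction needs n_stages cycles, each later
    one completes one cycle after its predecessor, and every stall (from a
    control dependence, data dependence, or resource conflict) adds a cycle.
    """
    if n_instructions <= 0:
        return 0
    return n_stages + (n_instructions - 1) + stall_cycles


ideal = pipeline_cycles(100, 5)           # no hazards
stalled = pipeline_cycles(100, 5, 20)     # 20 stall cycles, an assumed figure
unpipelined = 100 * 5                     # every instruction takes all 5 stages

print(ideal, stalled, unpipelined)
print(f"speed-up without stalls: {unpipelined / ideal:.2f}")
print(f"speed-up with stalls:    {unpipelined / stalled:.2f}")
```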

7
Achieving ILP: Superscalar Architectures

In-order issue with in-order completion
In-order issue with out-of-order completion
Out-of-order issue with out-of-order completion

11
Achieving ILP: VLIW Processors

Lower circuit overhead than superscalar processors
Limited number of resources
Explicit insertion of NOPs increases code size
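
The code-size cost of explicit NOPs can be estimated by counting the empty slots in a fixed-width schedule. This is a minimal sketch under simple assumptions: every cycle is encoded as one long instruction of `width` slots, and the schedule and operation names are made up for illustration. It does not model the encoding used in the thesis.

```python
from typing import List


def fixed_width_slots(schedule: List[List[str]], width: int) -> int:
    """Slots used when every cycle is encoded as one `width`-slot instruction."""
    for ops in schedule:
        if len(ops) > width:
            raise ValueError("more operations in a cycle than issue slots")
    return len(schedule) * width


def nop_count(schedule: List[List[str]], width: int) -> int:
    """Slots that hold an explicit NOP rather than a useful operation."""
    useful = sum(len(ops) for ops in schedule)
    return fixed_width_slots(schedule, width) - useful


# Hypothetical 4-cycle schedule for a 4-issue machine; [] is an empty cycle.
schedule = [["ld", "ld", "add"], ["mul"], [], ["st", "br"]]
width = 4

total = fixed_width_slots(schedule, width)
nops = nop_count(schedule, width)
print(f"{nops} of {total} slots are NOPs")
```

In this made-up schedule more than half of the slots carry no useful work, which is the overhead that a NOP-eliminating encoding tries to recover.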

13
Extracting ILP: Basic-Block Scheduling
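
The slide content is not in the transcript, so the following is only a generic sketch of how a basic block might be scheduled: a greedy list scheduler that issues up to `width` ready operations per cycle and prefers operations on the longest remaining dependence path. The dependence graph, latencies, and function name are assumptions chosen for illustration. The graph must be acyclic and latencies must be at least 1.

```python
from typing import Dict, List, Set


def list_schedule(deps: Dict[str, Set[str]],
                  latency: Dict[str, int],
                  width: int) -> Dict[str, int]:
    """Map each operation to its issue cycle.

    deps[op] is the set of operations whose results op needs.
    """
    succs: Dict[str, List[str]] = {op: [] for op in deps}
    for op, preds in deps.items():
        for p in preds:
            succs[p].append(op)

    # Priority: latency-weighted length of the longest path to the block end.
    height: Dict[str, int] = {}

    def compute_height(op: str) -> int:
        if op not in height:
            below = max((compute_height(s) for s in succs[op]), default=0)
            height[op] = latency[op] + below
        return height[op]

    for op in deps:
        compute_height(op)

    issue: Dict[str, int] = {}
    cycle = 0
    while len(issue) < len(deps):
        # An op is ready once all its predecessors have produced their result.
        ready = [op for op in deps
                 if op not in issue
                 and all(p in issue and issue[p] + latency[p] <= cycle
                         for p in deps[op])]
        ready.sort(key=lambda op: (-height[op], op))
        for op in ready[:width]:
            issue[op] = cycle
        cycle += 1
    return issue


# Hypothetical block: two loads feed an add, which feeds a store; e is independent.
deps = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"c"}, "e": set()}
latency = {"a": 2, "b": 2, "c": 1, "d": 1, "e": 1}
print(list_schedule(deps, latency, width=2))
```

With two issue slots the loads go first because they sit on the critical path, and the independent operation `e` fills a slot that would otherwise be empty while the loads complete.
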
14
Extracting ILP: Superblock Scheduling
15
Extracting ILP: Predicated Execution
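
As a minimal illustration of the idea, the sketch below contrasts a branching computation with an if-converted one in which both arms are always issued and a predicate decides which result is kept. The `guarded` helper stands in for a predicated operation and is hypothetical, as is the example function.

```python
def abs_diff_branching(a: int, b: int) -> int:
    """Original form: a control dependence selects one of two arms."""
    if a > b:
        r = a - b
    else:
        r = b - a
    return r


def guarded(pred: bool, value: int, old: int) -> int:
    """Model of a predicated operation: the result is kept only if pred holds."""
    return value if pred else old


def abs_diff_predicated(a: int, b: int) -> int:
    """If-converted form: no branch, both arms execute under a predicate."""
    p = a > b                      # compare sets the predicate
    r = 0                          # placeholder; one guarded op always writes r
    r = guarded(p, a - b, r)       # (p)  r = a - b
    r = guarded(not p, b - a, r)   # (!p) r = b - a
    return r


print(abs_diff_branching(7, 3), abs_diff_predicated(7, 3))
print(abs_diff_branching(2, 9), abs_diff_predicated(2, 9))
```

The control dependence becomes a data dependence on `p`, so the two guarded operations can be scheduled like any other instructions, at the cost of executing work whose result is discarded.
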
16
Power Consumption in CMOS Circuits: Parallelism
for Energy Efficiency
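
The usual argument behind this slide title rests on the standard dynamic-power relation for CMOS, P = a * C * V^2 * f (activity factor, switched capacitance, supply voltage, clock frequency). Doubling the hardware and halving the clock keeps throughput constant, and the slower clock may permit a lower supply voltage. The sketch below uses made-up numbers. In particular, the voltage assumed for the slower clock is an illustrative assumption, not a value from the thesis.

```python
def dynamic_power(activity: float, capacitance: float,
                  voltage: float, frequency: float) -> float:
    """Dynamic switching power of a CMOS circuit: a * C * V^2 * f (watts)."""
    return activity * capacitance * voltage ** 2 * frequency


# Baseline: one unit at full clock and full supply voltage (assumed values).
base = dynamic_power(activity=0.5, capacitance=1e-9, voltage=2.0, frequency=100e6)

# Parallel: two units (twice the capacitance) at half the clock, same throughput.
# Running at 1.0 V is an assumption about what the slower clock allows.
parallel = dynamic_power(activity=0.5, capacitance=2e-9, voltage=1.0, frequency=50e6)

print(f"baseline: {base:.3f} W")
print(f"parallel: {parallel:.3f} W")
print(f"ratio:    {parallel / base:.2f}")
```

The capacitance and frequency factors cancel, so the saving comes entirely from the squared voltage term. This simple model ignores leakage and the overhead of the extra hardware.
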
18
Available Mobile and VLIW Processors

The ARM Family
The ARM7 Generation
The StrongARM
The ARM Thumb Option
The ARM Piccolo Option
The ARM9 and ARM10

19
Available Mobile and VLIW Processors

The Motorola M-Core
The LSI TinyRisc
The Hitachi SuperH Family
VLIW Processors
The Motorola-Lucent StarCore
The Philips TriMedia
The HP/Intel IA-64

20
High-Level Evaluation of A Low-Power VLIW
Processor

Energy consumption distribution

21
High-Level Evaluation of A Low-Power VLIW
Processor

NOP Elimination in VLIW Processor

22
High-Level Evaluation of A Low-Power VLIW
Processor

Speed-up Comparison

23
High-Level Evaluation of A Low-Power VLIW
Processor

Energy Comparison

24
High-Level Evaluation of A Low-Power VLIW
Processor

Energy-Delay Product Comparison
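
The energy-delay product multiplies the energy a program consumes by its execution time, so a design is rewarded only if it does not trade one for the other too heavily. The comparison data from the slide is not in the transcript. The two design points below are hypothetical numbers used only to show the calculation.

```python
def energy_delay_product(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay product in joule-seconds; lower is better."""
    return energy_joules * delay_seconds


# Hypothetical design points: B spends 20% more energy but finishes 40% sooner.
design_a = energy_delay_product(energy_joules=1.0, delay_seconds=1.0)
design_b = energy_delay_product(energy_joules=1.2, delay_seconds=0.6)

print(f"A: {design_a:.2f} J*s")
print(f"B: {design_b:.2f} J*s")
print("B is better on EDP" if design_b < design_a else "A is better on EDP")
```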

25
The DEVIL Low-Power Processor

Complexity in VLIW Architectures
Hardware Duplication
FUs and number of registers as well as ports
Number of FUs versus type of FU
Number of FUs versus available ILP

26
The DEVIL Low-Power Processor

Code Memory

27
The DEVIL Low-Power Processor
28
The DEVIL Low-Power Processor

Instruction Fetch Mechanism

29
The DEVIL Low-Power Processor

Branch Prediction Mechanism

30
The DEVIL Low-Power Processor

Performance with and without superscalar
optimizations

31
The DEVIL Low-Power Processor

Effect of superscalar optimization on code size

32
The DEVIL Low-Power Processor

Effect of NOP elimination on code size

33
The DEVIL Low-Power Processor

Effect of NOP elimination on the number of
accesses to code memory

34
The DEVIL Low-Power Processor

Effect of instruction fetch mechanism on code size

35
The DEVIL Low-Power Processor

Code size comparison with existing mobile
processors

36
A Step Towards Predicated Execution

Compiler techniques for reducing predicated code
size
Reduction of number of Control Instructions
Predicate promotion and Instruction merging
Instruction reduction for advanced code generation

37
A Step Towards Predicated Execution: Reduction of
number of Control Instructions
38
A Step Towards Predicated Execution: Predicate
promotion and Instruction merging
39
A Step Towards Predicated Execution

Introducing predication support into processor
Effect on code size of full predication
Predication code size and execution
characteristics
Prefix-based predication

40
A Step Towards Predicated Execution

Relative number of predicated instructions

41
A Step Towards Predicated Execution

Code expansion considering predication

42
A Step Towards Predicated Execution

Code reductions due to predicated execution

43
Conclusions

A synergistic hardware-compiler approach for
low-power processors
A new VLIW architecture to limit the increase in
code size
A prefix-based predicated execution architecture
framework
