Title: PowerPoint Presentation
1 Embedded Software: How to make it efficient?
2 What is an embedded system?
These are not the embedded systems we will talk about!
3 Embedded Systems
Embedded systems: information processing systems embedded into a larger product
- Main reason for buying is not information processing
Characteristics
- not recognised as information processing systems
- frequently real-time behaviour required
- must be dependable and guarantee privacy
- many of these systems are mobile systems
- fundamental technology for pervasive computing/ambient intelligence, implemented in complex software
4 Views on embedded software
- H. de Man: "On Nanoscale Integration and Gigascale Complexity in the Post-.com World", Keynote, DATE 2002
- "For many products in the area of consumer electronics the amount of code is doubling every two years" (Fritz Vaandrager, in: Rozenberg, Vaandrager (eds.): Lectures on Embedded Systems, LNCS, Vol. 1494, 1998)
5 The energy/flexibility conflict - Intrinsic Power Efficiency -
[Figure: intrinsic power efficiency in operations/Watt (MOPS/mW, axis from 0.01 to 10) versus technology (1.0µ, 0.5µ, 0.25µ, 0.13µ, 0.07µ); labelled regions: Ambient Intelligence, DSP-ASIPs, hardwired muxed, Reconfigurable Computing, Processors, µPs]
Necessary to optimize software; otherwise the price for software flexibility cannot be paid!
H. de Man, Keynote, DATE 2002; T. Claasen, ISSCC 1999
6 Importance of Power and Energy Consumption
"Power is considered as the most important constraint in embedded systems" (in: L. Eggermont (ed.): Embedded Systems Roadmap 2002, STW)
"Current UMTS phones can hardly be operated for more than an hour, if data is being transmitted." (from a report of the Financial Times, Germany, on an analysis by Credit Suisse First Boston, http://www.ftd.de/tm/tk/9580232.html?nvse)
7 Key requirements for embedded software
- Hardware/software efficiency
- run-time efficiency, code-size efficiency,
- energy efficiency, power consumption, ...
Many standards are published as reference implementations (these just provide the correct results and do not care about efficiency)
→ proposal of the "software washing machine" (Catthoor)
8 Generating efficient software requires work at all levels
- Algorithmic level (using the most efficient algorithm and data structures)
- High-level source code transformations
- Compiler optimizations
- Code compression
- Operating system support (e.g. for minimizing power consumption)
9 Algorithmic level
- Choosing the best decoding/filtering etc. algorithm and data structures
- Example: MPEG-2 data structures; the Inverse Discrete Cosine Transform (IDCT) is the most power/cycle hungry hot spot.
- Transformations
- Replacing double by float: still acceptable quality
- Energy consumption reduced to 34 %, cycles reduced to 35 %
- Standard IDCT → Fast IDCT (double, float → integer): significant loss of precision.
- Energy consumption reduced to 4.86 %, cycles reduced to 5.10 %
T. Huels, Inf 12, UniDo, 2002
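To make the "double by float" replacement concrete, here is a minimal sketch of an 8-point 1-D IDCT that is instantiated once in double and once in float, plus a helper that measures how far the two results drift apart. This is not the code measured by Huels: it uses the direct (non-fast) formula, the names `idct8_double`, `idct8_float` and `idct8_max_deviation` are made up for illustration, and the cosine table is only there so that the sketch needs no math library.

```c
#include <stddef.h>

/* cos(k*pi/16) for k = 0..8; the remaining angles follow by symmetry. */
static const double COS16[9] = {
    1.0, 0.98078528040323, 0.92387953251129, 0.83146961230255,
    0.70710678118655, 0.55557023301960, 0.38268343236509,
    0.19509032201613, 0.0
};

/* Returns cos(k*pi/16) for any k >= 0 using the period 32 and symmetries. */
static double cos16(unsigned k)
{
    k %= 32u;
    if (k > 16u) k = 32u - k;            /* cos(2*pi - a) = cos(a)  */
    return (k > 8u) ? -COS16[16u - k]    /* cos(pi - a)   = -cos(a) */
                    : COS16[k];
}

/* Generates an orthonormal 8-point 1-D IDCT that computes in type T. */
#define DEFINE_IDCT8(NAME, T)                                         \
    static void NAME(const T in[8], T out[8])                         \
    {                                                                 \
        for (unsigned n = 0; n < 8; n++) {                            \
            T sum = (T)0.35355339059327 * in[0]; /* sqrt(1/8) */      \
            for (unsigned u = 1; u < 8; u++)     /* sqrt(2/8) = 0.5 */\
                sum += (T)0.5 * in[u] * (T)cos16((2u * n + 1u) * u);  \
            out[n] = sum;                                             \
        }                                                             \
    }

DEFINE_IDCT8(idct8_double, double)
DEFINE_IDCT8(idct8_float, float)

/* Largest absolute difference between the double and the float result. */
static double idct8_max_deviation(const double coeff[8])
{
    float fin[8], fout[8];
    double dout[8], worst = 0.0;
    for (size_t i = 0; i < 8; i++) fin[i] = (float)coeff[i];
    idct8_double(coeff, dout);
    idct8_float(fin, fout);
    for (size_t i = 0; i < 8; i++) {
        double d = dout[i] - (double)fout[i];
        if (d < 0.0) d = -d;
        if (d > worst) worst = d;
    }
    return worst;
}
```

Calling `idct8_max_deviation` on representative coefficient blocks gives a rough idea of whether the float version stays within an acceptable error for the application; the energy and cycle savings themselves depend on the target processor and are not modelled here.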
10 High-level transformations
Example: Separation of margin handling
Original loop: many if-statements for margin-checking
After separation: only few margin elements to be processed with checks; all other elements need no checking and are processed efficiently
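A small sketch of what separating the margin handling can look like, using a 3-tap smoothing filter chosen purely for illustration (the filter and the function names `smooth_checked` and `smooth_split` are assumptions, not taken from the slides). The first version tests for the margins on every element, the second handles the two margin elements separately and runs the interior loop without any checks.

```c
#include <stddef.h>

/* Version with margin checks inside the loop: two tests per element. */
static void smooth_checked(const int *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int left  = (i == 0)     ? in[i] : in[i - 1];  /* margin check */
        int right = (i + 1 == n) ? in[i] : in[i + 1];  /* margin check */
        out[i] = (left + in[i] + right) / 3;
    }
}

/* Version with separated margin handling: same results, no checks
 * in the loop that processes the interior elements. */
static void smooth_split(const int *in, int *out, size_t n)
{
    if (n == 0) return;
    if (n == 1) { out[0] = in[0]; return; }
    out[0] = (in[0] + in[0] + in[1]) / 3;                  /* left margin  */
    for (size_t i = 1; i + 1 < n; i++)                     /* interior     */
        out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3;
    out[n - 1] = (in[n - 2] + in[n - 1] + in[n - 1]) / 3;  /* right margin */
}
```

Both functions compute the same output; the second one trades a little code size for a check-free inner loop, which is the trade-off the slide describes.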
11 Loop nest splitting at University of Dortmund
Loop nest from MPEG-4 full search motion estimation:

    for (z=0; z<20; z++)
      for (x=0; x<36; x++) { x1=4*x;
        for (y=0; y<49; y++) { y1=4*y;
          for (k=0; k<9; k++) { x2=x1+k-4;
            for (l=0; l<9; l++) { y2=y1+l-4;
              for (i=0; i<4; i++) { x3=x1+i; x4=x2+i;
                for (j=0; j<4; j++) { y3=y1+j; y4=y2+j;
                  if (x3<0 || 35<x3 || y3<0 || 48<y3)
                    then_block_1; else else_block_1;
                  if (x4<0 || 35<x4 || y4<0 || 48<y4)
                    then_block_2; else else_block_2;
      }}}}}}

After loop nest splitting (analysis of polyhedral domains, selection with genetic algorithm):

    for (z=0; z<20; z++)
      for (x=0; x<36; x++) { x1=4*x;
        for (y=0; y<49; y++)
          if (x>=10 || y>=14)        /* both if-conditions are always true here */
            for (; y<49; y++)
              for (k=0; k<9; k++)
                for (l=0; l<9; l++)
                  for (i=0; i<4; i++)
                    for (j=0; j<4; j++) {
                      then_block_1; then_block_2; }
          else { y1=4*y;
            for (k=0; k<9; k++) { x2=x1+k-4;
              for (l=0; l<9; l++) { y2=y1+l-4;
                for (i=0; i<4; i++) { x3=x1+i; x4=x2+i;
                  for (j=0; j<4; j++) { y3=y1+j; y4=y2+j;
                    if (0 || 35<x3 || 0 || 48<y3)   /* x3<0 and y3<0 can never hold */
                      then_block_1; else else_block_1;
                    if (x4<0 || 35<x4 || y4<0 || 48<y4)
                      then_block_2; else else_block_2;
          }}}}}
      }

H. Falk et al., Inf 12, UniDo, 2002
12 Results for loop nest splitting - Execution times -
H. Falk et al., Inf 12, UniDo, 2002
13 Results for loop nest splitting - Code sizes -
H. Falk et al., Inf 12, UniDo, 2002
14 Generating efficient software requires work at all levels
- Algorithmic level (using the most efficient algorithm and data structures)
- High-level source code transformations
- Compiler optimizations
- Code compression
- Operating system support (e.g. for minimizing power consumption)
15 Compilers: Translation from C is a critical bottleneck
[Figure: design flow with the levels (Real-time) UML or equiv., StateCharts/SDL, RT-Java, (sets of) C programs, VHDL, assembly level, HW]
16 Overhead of compilers for DSP processors
DSPStone (Zivojnovic et al.). Example: ADPCM
[Figure: cycle overhead n (axis from 1.0 to 8.0) for the TI-C51 and the ADI-2101]
→ Optimizations exploiting architectural features of embedded processors. Current focus: VLIW processors (powerful multimedia processors). In this talk: focus on energy consumption.
17 Larger off-chip memories need more energy than smaller on-chip memories
Example (CACTI model)
[Figure: energy per access in nJ (axis from 0 to 2.5) versus memory size (64, 128, 256, 512, 1024, 2048, 4096, 8192)]
Steinke et al., Inf 12, UniDo, 2002
18 Example: Off-chip vs. on-chip memories
ARM7TDMI cores, well-known for low power consumption
ARM Atmel Evaluation Board
19 On-chip vs. off-chip: current
Example: Atmel ARM evaluation board
Current reduction: / 3.02
[Figure: board with processor, on-chip memory and on-board memory]
20 On-chip vs. off-chip: energy
Example: Atmel ARM evaluation board
Off-chip access takes more cycles → savings (86 %) are larger than for the current.
Energy reduction: / 7.06
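The two reduction factors can be related with a short calculation. Assuming (this is not stated on the slides) that the supply voltage V is the same in both configurations and that the factors refer to average currents, the energy of a run is E = V · I · t, so the ratio of the energy factor to the current factor estimates how much longer the off-chip run takes:

```latex
\[
\frac{E_{\mathrm{off}}}{E_{\mathrm{on}}}
  = \frac{I_{\mathrm{off}}}{I_{\mathrm{on}}} \cdot \frac{t_{\mathrm{off}}}{t_{\mathrm{on}}}
\quad\Rightarrow\quad
\frac{t_{\mathrm{off}}}{t_{\mathrm{on}}} \approx \frac{7.06}{3.02} \approx 2.34,
\qquad
1 - \frac{1}{7.06} \approx 0.86
\]
```

Under these assumptions the off-chip version needs roughly 2.3 times as long, and a reduction by a factor of 7.06 corresponds to the 86 % saving quoted above.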
21 Exploitation of on-chip memory
Example: [Figure: board with processor, on-chip memory (capacity K) and on-board memory; program segments such as "for i ...", "for j ...", "while ...", "repeat ...", "call ...", "Array ..." and "Int ..." are candidates for the on-chip memory]
Which segment (array, loop, etc.) is to be stored in on-chip memory?
Gain $g_i$ and size $s_i$ for each segment $i$. Maximise gain $G = \sum_i g_i$, respecting the constraint $K \ge \sum_i s_i$ (both sums run over the segments placed on-chip).
Static memory allocation: solution with a knapsack algorithm.
Dynamic reloading: where to insert calls to the copy function? → IP model
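The static allocation problem above is a 0/1 knapsack problem, which can be sketched with the textbook dynamic-programming solution. The function name `allocate_onchip`, the limits `MAX_SEGMENTS` and `MAX_CAPACITY`, and the use of integer gains are assumptions made for this illustration; the slides do not say which knapsack formulation or solver the authors used.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SEGMENTS 16    /* assumed limits for this sketch */
#define MAX_CAPACITY 1024

/*
 * 0/1 knapsack by dynamic programming.
 * gain[i], size[i]: gain g_i (non-negative) and size s_i of segment i.
 * capacity: on-chip memory capacity K.
 * chosen[i] is set to 1 if segment i is placed on-chip, else 0.
 * Returns the maximum total gain G, or -1 if the limits are exceeded.
 */
static long allocate_onchip(const long *gain, const size_t *size, size_t n,
                            size_t capacity, int *chosen)
{
    static long best[MAX_CAPACITY + 1];   /* best[c]: max gain using c units */
    static unsigned char take[MAX_SEGMENTS][MAX_CAPACITY + 1];

    if (n > MAX_SEGMENTS || capacity > MAX_CAPACITY) return -1;
    memset(best, 0, sizeof best);
    memset(take, 0, sizeof take);

    for (size_t i = 0; i < n; i++) {
        /* c runs from capacity down to size[i], so that each segment
         * is counted at most once. */
        for (size_t c = capacity + 1; c-- > size[i]; ) {
            long with = best[c - size[i]] + gain[i];
            if (with > best[c]) { best[c] = with; take[i][c] = 1; }
        }
    }

    /* Walk back through the recorded decisions to recover the selection. */
    size_t c = capacity;
    for (size_t i = n; i-- > 0; ) {
        chosen[i] = take[i][c];
        if (chosen[i]) c -= size[i];
    }
    return best[capacity];
}
```

For example, with gains {60, 100, 120}, sizes {10, 20, 30} and K = 50, the function selects the second and third segment for a total gain of 220. The dynamic-reloading variant needs a different model (the slide mentions an IP model) and is not covered by this sketch.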
22 Why not just use a cache?
Energy consumption in tags, comparators and muxes is significant.
R. Banakar, S. Steinke, B.-S. Lee, 2001
23 Results for optimization algorithm
Steinke et al., Inf 12, UniDo, 2002
24 Total energy reduction for MPEG-2
Energy relative to the original version (original = 100):
Original               100
Algorithm (float)      33.97
High-level opt.        31.83
Compiler opt.          21.68
Cache                  6.21
Scratch pad (static)   4.87
T. Huels, Inf 12, UniDo, 2002
25 Optimization technique for microcontrollers and network processors: Bit-field detection
[Figure: data-flow graph with operands a and b, constants 7, 1 and 0xF1, and result b]
- Assembly
- mov b, 1, a, 0, 3 (Cost: 1)
Wagner, Inf 12, UniDo, 2002
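One plausible reading of the example on this slide is that a shift-and-mask expression over a and b, with the constants 7, 1 and 0xF1, is recognised as a single bit-field move. The sketch below spells that reading out in C; both the C expression and the operand order assumed for `mov` (destination, destination offset, source, source offset, width) are reconstructions for illustration and are not confirmed by the slide. `bitfield_move` is a hypothetical software model, not a real instruction or intrinsic.

```c
#include <stdint.h>

/* Shift-and-mask form as it might be written in C:
 * copy the 3 low bits of a into bits 1..3 of b. */
static uint8_t insert_shift_mask(uint8_t a, uint8_t b)
{
    return (uint8_t)((b & 0xF1u) | ((a & 7u) << 1));
}

/* Software model of a generic bit-field move (hypothetical semantics):
 * copy `width` bits of src, starting at bit src_off, into dst at dst_off. */
static uint8_t bitfield_move(uint8_t dst, unsigned dst_off,
                             uint8_t src, unsigned src_off, unsigned width)
{
    unsigned field_mask = (1u << width) - 1u;
    unsigned field = ((unsigned)src >> src_off) & field_mask;
    return (uint8_t)((dst & ~(field_mask << dst_off)) | (field << dst_off));
}
```

Under this reading, `insert_shift_mask(a, b)` and `bitfield_move(b, 1, a, 0, 3)` give the same result for every 8-bit a and b, which is the equivalence a bit-field detection pass would have to establish before replacing the several C-level operations by one instruction.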
26 Results available to industry?
Yes!
Informatik 12, UniDo
Center of excellence (IMEC)
ICD e.V. (technology transfer center)
partners of the trinity model
Design houses / semiconductor vendors
CAD vendors
27 Generating efficient software requires work at all levels
- Algorithmic level (using the most efficient algorithm and data structures)
- High-level source code transformations
- Compiler optimizations
- Code compression
- Operating system support (e.g. for minimizing power consumption)
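28 Code compression/decompression
Key idea
[Figure: two configurations: a µP sending addresses (Addr) directly to a ROM, and a µP sending addresses to a ROM with a decompressor between µP and ROM]
Very good survey: Rik van de Wiel: The Code Compaction Bibliography, www.extra.research.philips.com/ccb/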
29 Variable-voltage/frequency example: INTEL XScale
OS should schedule distribution of the energy budget.
From Intel's Web Site
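As an illustration of what OS support for voltage/frequency scaling can mean, the following sketch picks the lowest-voltage operating point that still meets a task deadline. It assumes the common approximation that dynamic energy is proportional to V² times the number of cycles; the type `op_point` and the functions `pick_op_point` and `relative_energy` are hypothetical, and any operating points passed in are the caller's data, not XScale data-sheet values.

```c
#include <stddef.h>

/* One voltage/frequency setting of a processor (values supplied by the
 * caller; nothing here is taken from a real data sheet). */
struct op_point {
    unsigned freq_mhz;    /* clock frequency in MHz */
    unsigned voltage_mv;  /* supply voltage in mV   */
};

/* Returns the index of the operating point with the lowest voltage that
 * still finishes `cycles` within `deadline_us`, or -1 if none does. */
static int pick_op_point(const struct op_point *pts, size_t n,
                         unsigned long cycles, unsigned long deadline_us)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        /* A frequency in MHz equals the number of cycles per microsecond. */
        unsigned long budget = (unsigned long)pts[i].freq_mhz * deadline_us;
        if (budget < cycles) continue;  /* too slow for the deadline */
        if (best < 0 || pts[i].voltage_mv < pts[(size_t)best].voltage_mv)
            best = (int)i;
    }
    return best;
}

/* Relative dynamic energy of a task, proportional to V^2 * cycles. */
static unsigned long long relative_energy(const struct op_point *p,
                                          unsigned long cycles)
{
    return (unsigned long long)p->voltage_mv * p->voltage_mv * cycles;
}
```

A scheduler built on this idea would call `pick_op_point` per task and compare `relative_energy` across candidate schedules; static power and the cost of switching between operating points are ignored in this sketch.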
30 Conclusion
Making embedded software efficient requires efforts at all levels
- At the algorithmic level
- At the level of high-level transformations
- Within the compiler
- At the code compression level
- Within the embedded OS
The focus of this talk was on compilers and energy efficiency: using new algorithms, the energy consumption can be significantly reduced.