Title: Embedded Multimedia Systems
1Embedded Multimedia Systems on Silicon (5P520)
Jef van Meerbergen jef.van.meerbergen_at_philips.com
J.L.v.Meerbergen_at_ele.tue.nl slides and course
notes available via www.ics.ele.tue.nl/jef TU
Eindhoven/EESI
2Embedded System Architectures on Silicon
TIVO
- Application oriented
- smart devices
- adaptable, flexible
- real-time DSP
1 cm2, 1 V, 1 W, 10 Euro, implemented in silicon
Not a Pentium but a domain-specific and programmable embedded system (ES)
3DVP/Viper example (PNX8500)
- ASB and DTV
- domain specific
- and programmable
- 0.18 µm / 8M
- 1.8V / 4.5 W
- 75 clock domains
- 35 M transistors
Santanu Dutta
4Embedded System Architect
Applications, (DSP) algorithms: C/C++, Java, Matlab, SDL, ...
- is responsible for the strategic interaction between the different disciplines
- has a basic knowledge of the different disciplines
- is a generalist, not a specialist
Embedded System Architect
Low power, analog, robustness/DfM: VHDL, Verilog
Challenge: permanently confronted with new domains
5Overview of chapter 1
- 1. Introduction
  - systems in general
  - electronic systems: from analog to digital
  - signals and events
  - embedded electronic systems: examples and classification
- 2. Silicon as a carrier for embedded systems
  - Moore's law
  - ITRS roadmap
  - scaling
- 3. The design process
  - behaviour, implementation, constraints and cost function
  - different abstraction levels: boolean, RT and high level
  - design flow
- 4. Overview of the course and goals
  - types of embedded processors and impact on the design space
6What is a system? What is system level design?
- Can an IC be a system?
- Or is the PCB that uses the IC the real system?
- IC -> PCB -> set -> transmission system
- There is always a larger system surrounding the current one.
- This is often seen as the real system.
- So nobody is doing system level design?
- It's hard to define unless we can find an underlying common characteristic.
- A system is something complex.
7Complexity
DeMan
- Complexity depends on
  - the number of different component types (not the number of components)
  - different types of interactions
  - lack of structure in the interactions
Complex
simple
Complexity is different for the architect and for
the IC technologist
8What is a system ?
Rechtin
- A system is a complex set of heterogeneous elements that all together form an organic whole.
- The whole is more than the sum of the parts.
- The system has properties beyond those of the parts.
- The added value comes from the interaction between the parts.
Ex.: CD player = electronics + optics + mechanics
9Electronic systems: from analog to digital processing
- There is a trend towards digitalisation of signals.
  - speech -> audio -> images -> video (e.g. DTV in the US)
- digital processing: mathematical operations on numbers
- analog processing: depends on physical properties of devices
- advantages of digital
  - reproducible, reliable, testable, low voltage, low power
- drawback of digital
  - need for compute power
  - solution: progress in IC technology
10Examples: ABS, assembly line, navigation, coding, compression, transmission, ...
[Figure (DeMan): digital processing (real time) connected to the external world through a sensor and A/D on the input side and D/A and an actuator on the output side]
An embedded system performs a specific task for the environment, typically with real-time constraints. The real-time constraints are dictated by the environment. Counterexample (non real-time): word processing on a PC.
11Digital Processing
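[Figure (DeMan): digital processing split into a control part (event in, event out) and a DSP part (signal in, signal out); the control part sends mode to the DSP part and receives status from it]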
Digital signals represent measurable (physically
observable) quantities which are sampled
(discrete time), quantised (discrete value) and
binary encoded. A digital signal is a periodic
stream of numbers represented as bitvectors.
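The three steps in this definition (sampling, quantisation, binary encoding) can be illustrated with a minimal Python sketch. The function name `digitise`, the 8 kHz sample rate, the 4-bit word length and the offset-binary code are illustrative assumptions, not taken from the slides.

```python
import math

def digitise(signal, sample_rate_hz, n_samples, n_bits, full_scale=1.0):
    """Sample, quantise and binary-encode a continuous-time signal.

    Returns the periodic stream in0, in1, ... as bit-vector strings.
    The offset-binary encoding is an assumption made for this sketch.
    """
    levels = 2 ** n_bits
    step = 2 * full_scale / levels              # size of one quantisation step
    words = []
    for i in range(n_samples):
        t = i / sample_rate_hz                  # sampling: discrete time
        v = signal(t)
        code = int((v + full_scale) / step)     # quantisation: discrete value
        code = max(0, min(levels - 1, code))    # clip to the available codes
        words.append(format(code, f"0{n_bits}b"))  # binary encoding
    return words

# Hypothetical input: a 1 kHz tone at half of full scale.
tone = lambda t: 0.5 * math.sin(2 * math.pi * 1000 * t)
stream = digitise(tone, sample_rate_hz=8000, n_samples=8, n_bits=4)
```

Each element of `stream` is one bit vector of the kind the slide calls in0, in1, in2, and so on.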
12 [Figure: each sample in_i is a bit vector (b0, b1, b2, ..., bn); the signal is the stream in0, in1, in2, in3, ...]
13Signals can have more dimensions
Example 1: audio, where each sample in_i has a left and a right channel
Example 2: video, where a sample is indexed by frame, line and pixel
[Figure: Frame_i-1, Frame_i, Frame_i+1, spaced 20 msec apart]
14Event: unpredictable and irregular appearance of a (time, value) pair at an IO port.
[Figure: same control/DSP block diagram as on slide 11 (event in, event out, control, mode, status, DSP, signal in, signal out)]
Mode or parameter settings control the DSP part based on external events or internal information (status). Examples:
- adaptive changes to the DSP part dependent on transmission line properties
- conditional access
- quality of service
- fallback mechanisms (e.g. frame skip)
- increase of volume, switch to a different TV channel
15Digital signal: results from sampling an analog signal (a measurable quantity), followed by quantisation and binary encoding. The result is a periodic stream of bit vectors in0, in1, in2, etc., where each in_i = (b_j) for j = 0..n with b_j in {0, 1}. A DSP algorithm consumes input signals and produces output signals.
DeMan
[Figure: input signal and output signal along the time axis]
Event: irregular and unpredictable (time, value) pairs at the ports of the control subsystem. They are also represented by bit vectors. A control algorithm consumes input events and produces output events.
[Figure: input events and output events along the time axis]
16Conclusion: signal processing and control processing are different in type
- 2 schools
  - build implementation hardware which can execute both
    - e.g. general purpose processor
  - separation into 2 different parts
    - e.g. processing of events in software (ARM, MIPS, etc.)
    - processing of signals more hardware oriented
17Example CD system
Servo
index
Decoder
A/D
A/D
D/A
4 laser diodes
loudspeakers
Motor
External world
18Embedded electronic systems
- This is explained in the following slides by comparing the introduction of embedded systems with the introduction of the electric motor in the 19th century.
19Phase 1: the electric factory
- One central large electric motor
- Power was distributed to the workplaces via shafts and belts
20Phase 2: the home electric motor
- Every home got its private electric motor
- A whole suite of appliances could be plugged into this single motor
22Phase 3: ubiquitous electric motor
- The electric motor is embedded in the appliance
- You are often not aware of the fact that it contains an electric motor (e.g. 60 electric motors in a modern high end car)
23Phase 1: the computing factory
- One central large mainframe computer
- Compute power was distributed to the workplace terminals via 9600 bps telephone wires
24Phase 2: the personal computer at home
- Every home got its private computer
- A whole suite of add-ons can be plugged into this single computer
25Phase 3: embedded systems
- The micro-controller is embedded in the appliance
- You are often not aware of the fact that it contains a micro-controller (e.g. 70 micro-controllers in a modern high end car: engine control, ABS, airbag, airco, interior illumination, central lock, alarm, radio, ...)
26Ambient Intelligence, the concept
- An environment that is sensitive, adaptive and responsive to the presence of people or objects
- An environment where technology is embedded, hidden in the background
- An environment that will preserve security, privacy and trustworthiness while utilizing information when needed and appropriate.
People to the foreground, technology to the
background
- Ubiquitous communications
- Distributed computing
- Intelligent interfaces
Boekhorst
27DeMan
28Comparison
PC:
- general purpose ("Who computes, anyway?")
- single hardware platform
- asap
- the environment adapts to the system (wait)
- lower reliability
- difficult to use
- end-user software
- unlimited resources
Embedded system:
- purpose-built and programmable
- appliance oriented, smart devices
- multiple hw/sw platforms
- real-time constraints
- the system adapts to the environment
- high reliability (no reset button)
- user friendly
- deeply embedded software running on limited resources
BUT both use similar technology, e.g. programmable cores, RTK (Win-CE)
29Embedded systems terminology
- safety critical
- reactive systems: fast reaction on critical control events
- portable: weight, power dissipation
- mobile: network protocols, power dissipation
- consumer systems: cost, reliability, user friendly interface
- professional systems: availability, reliability, remote analysis and diagnosis, redundancy
- multimedia: text, graphics, speech, audio, images and video
- internet oriented embedded systems
31Why are embedded systems important now?
The reason is the enormous and continuous progress in IC technology, which makes it possible to build systems with ever more transistors in a reliable way, at an unprecedented rate.
Systems On Silicon
Problem: complexity increase. Moore's law (Gordon Moore, co-founder of Intel): the transistor count doubles every 18 months, i.e. the transistor count increases by 58% per year.
This is the most important trend driving embedded systems.
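The link between an 18-month doubling period and the quoted yearly growth is plain arithmetic. The sketch below works it out; the function names are illustrative. The exact value comes out at roughly 58.7% per year, which the slide quotes as 58.

```python
def annual_growth(doubling_months):
    """Yearly growth rate implied by a given doubling period in months."""
    return 2 ** (12 / doubling_months) - 1

def transistor_count(initial, years, doubling_months=18):
    """Projected transistor count after a number of years."""
    return initial * 2 ** (12 * years / doubling_months)

growth = annual_growth(18)            # about 0.587
after_3_years = transistor_count(1, 3)  # two doublings in three years
```

Three years correspond to two doubling periods, so the count grows by a factor of four.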
32History: divergence and convergence
DeMan
33Moore's law for microprocessors
Rensink
34Moore's law for dynamic memories
Rensink
35Moore's law for programmable logic devices
Rensink
36 1971 vs 1993
4004
Pentium
37ITRS roadmap (International Technology Roadmap for Semiconductors), formerly the SIA (Semiconductor Industry Association) roadmap
- consensus building process w.r.t. future technologies
- every 2 to 3 years a new process generation (x 0.7)
ICS website
38Example: possibilities of a 0.13 micron process
- Logic density: 18 Mtransistors/cm2
- Memory (stand-alone): 4 Gb
- VDD: 1.2-1.5 V
- MIPS R3000 (32-bit processor): 0.5 mm2
- 1 Mbit DRAM: 0.5 mm2
- 1 frame delay (625 x 863 x 8 bits): 2.1 mm2
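The frame-delay figure can be cross-checked against the DRAM density on the same slide. The sketch below reads the garbled frame size as 625 lines x 863 pixels x 8 bits, which is an assumption, and takes 1 Mbit as 2^20 bits. With those assumptions the result is close to the quoted 2.1 mm2.

```python
MM2_PER_MBIT = 0.5  # DRAM density quoted on the slide: 1 Mbit in 0.5 mm2

def frame_store_area_mm2(lines, pixels_per_line, bits_per_pixel, mbit=2 ** 20):
    """Silicon area needed to store one video frame in embedded DRAM."""
    bits = lines * pixels_per_line * bits_per_pixel
    return bits / mbit * MM2_PER_MBIT

# Assumed reading of the frame size: 625 lines x 863 pixels x 8 bits.
area = frame_store_area_mm2(625, 863, 8)
```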
39Consequence 1: software centric
- Importance of embedded software will increase
  - => software centric systems
  - see facts (next slide)
- 2 reasons
  - features changing after the spec is frozen
  - deep sub-micron effects make VLSI design very complex
- Note the difference between PC software and embedded software
  - (no unlimited Bill-Gates-type-of-resources)
40Increase in SW content for TV and VCR
41 1st reason for SW: flexibility (TV example)
42 2nd reason for SW: design time of HW
Deep submicron transistors: absolute distances
[Figure: transistor cross-sections for a 0.5 µm and a 0.1 µm process; labels: Al, W, Cu, n]
43Deep submicron transistors: relative distances
[Figure: transistor cross-sections for a 0.5 µm and a 0.1 µm process; labels: TaN, Al, Ti/TiN, W, n]
44Multi-layer interconnect
Deep-submicron twin well, 0.25 micron. Source: H. Veendrick, Deep-Submicron CMOS ICs: From Basics to ASICs (Kluwer)
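45RC delay of the interconnect becomes important
$\rho = 30\ \mathrm{m\Omega\,mm^2/m} = 3\ \mu\Omega\,\mathrm{cm}$
$R = 3 \cdot 10^{-6}\ \Omega\,\mathrm{cm} \times 10^{4}\ \mu\mathrm{m/cm} \,/\, 0.25\ \mu\mathrm{m} = 0.12\ \Omega/\square$
C is more difficult because of assumptions on the environment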
46Wire delay (metal 1) vs. gate delay
Delay increases quadratically with the length
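The quadratic dependence follows because both the resistance and the capacitance of a wire grow linearly with its length. The sketch below uses the 0.12 ohm per square from the previous slide; the wire width, the capacitance per micron and the Elmore factor of 0.5 for a distributed RC line are assumptions chosen only for illustration.

```python
R_SHEET_OHM = 0.12        # ohm per square, from the RC-delay slide
C_PER_UM_F = 0.2e-15      # assumed wire capacitance per micron (hypothetical)

def wire_delay_s(length_um, width_um=0.25):
    """Elmore delay of a distributed RC wire (0.5 * R * C).

    The 0.25 um width is an assumption made for this sketch.
    """
    r = R_SHEET_OHM * length_um / width_um   # total wire resistance
    c = C_PER_UM_F * length_um               # total wire capacitance
    return 0.5 * r * c

ratio = wire_delay_s(2000) / wire_delay_s(1000)  # doubling the length
```

Doubling the length doubles both R and C, so `ratio` comes out at four.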
47Gate vs. interconnect delay: can technology help?
430 ?
48Overview of trends
- 1. Importance of embedded software
  - features changing after the spec is frozen
  - deep sub-micron effects make VLSI design very complex
- 2. True limit: power dissipation, not area
  - Ex. 1: MIPS core in a 0.35 µm process
    - 324 mW @ 81 MHz, 20 mm2, 4 nJ/op
  - Ex. 2: TM core in a 0.18 µm process
    - 1.4 W @ 166 MHz, 17 mm2, 2.1 nJ/op
  - Ex. 3: 32-bit accumulator in a 0.35 µm process
    - 1 mW @ 100 MHz, 0.02 mm2, 0.01 nJ/op
- 3. The basic contradiction between 1 and 2 leads to new architectures
  - reconfigurable computing
  - distributed embedded memories
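The energy-per-operation figures in the three examples can be reproduced from power and clock frequency. In the sketch below, the value of 4 operations per cycle for the TM core is an inference that makes the quoted 2.1 nJ/op come out; it is not stated on the slide. The function names are illustrative.

```python
def energy_per_op_nj(power_w, freq_hz, ops_per_cycle=1):
    """Energy per operation in nJ, from power and operation rate."""
    return power_w / (freq_hz * ops_per_cycle) * 1e9

def mops_per_watt(nj_per_op):
    """Computational efficiency: 1 nJ/op corresponds to 1000 MOPS/W."""
    return 1e3 / nj_per_op

mips = energy_per_op_nj(0.324, 81e6)                   # about 4 nJ/op
tm = energy_per_op_nj(1.4, 166e6, ops_per_cycle=4)     # about 2.1 nJ/op (4 ops/cycle assumed)
acc = energy_per_op_nj(1e-3, 100e6)                    # about 0.01 nJ/op

efficiency_gap = mops_per_watt(acc) / mops_per_watt(mips)
```

The dedicated accumulator is about 400 times more energy efficient per operation than the programmable MIPS core, which is the gap that drives the search for new architectures.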
49Scaling
50Scaling
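[Figure: MOS transistor with drain, gate and source]
$I_{ds} = \frac{\beta}{2}\,\frac{W}{L}\,(V_{gs} - V_t)^2$
$\beta = \mu_n\,\varepsilon_0\,\varepsilon_{ox} / t_{ox}$
Example: $\beta \approx 240\ \mu\mathrm{A/V^2}$ for an n-channel transistor in a 0.25 µm process ($t_{ox}$ = 5 nm = 50 Å)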
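51Scaling
[Figure: CMOS inverter (p-channel and n-channel transistor) between VDD and VSS, driving a load capacitance C]
With dimensions scaled by a factor $s$ and voltages by a factor $p$, so that $C \propto s$, $V \propto p$ and $I_{ds} \propto p^2/s$:
$T_{gate} = Q / I_{ds} = C V / I_{ds} \propto s\,p / (p^2/s) = s^2/p$
clock frequency: $f \propto p/s^2$
energy per transition: $\tfrac{1}{2} C V^2 \propto s\,p^2$
power dissipation: $f\,C V^2 \propto p^3/s$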
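The scaling relations on this slide can be evaluated numerically. The sketch below assumes that s scales the dimensions and p scales the voltages, so that C scales with s, V with p and Ids with p^2/s. The power density (power per unit area) is an added quantity, not on the slide, included because it shows why power rather than area becomes the limit.

```python
def scale(s, p):
    """Relative change of gate-level figures for scaling factors s and p.

    Assumes C ~ s, V ~ p and Ids ~ p**2 / s, as in the relations above.
    """
    c, v, i_ds = s, p, p ** 2 / s
    delay = c * v / i_ds            # T_gate ~ s**2 / p
    freq = 1 / delay                # f ~ p / s**2
    energy = c * v ** 2             # energy per transition ~ s * p**2
    power = freq * energy           # power per gate ~ p**3 / s
    area = s ** 2                   # gate area ~ s**2
    return {"delay": delay, "freq": freq, "energy": energy,
            "power": power, "power_density": power / area}

constant_field = scale(0.7, 0.7)    # voltage scaled along with dimensions
constant_voltage = scale(0.7, 1.0)  # supply voltage kept fixed
```

With voltage scaled along with the dimensions the power density stays constant; with a fixed supply voltage it rises by almost a factor of three per generation.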
52Scaling exercise
- Fill in the table
- Verify that the limiting factor is not area but power dissipation. Analyze the trend.
54Basic concepts
Behaviour: WHAT the circuit is supposed to do (relation between inputs and outputs)
Implementation: HOW the circuit is designed (internals of the circuit)
55Basic concepts
Behaviour specification: no implementation bias (ideally); must be validated; important as documentation (not poetry)
Constraints (sometimes considered as part of the specification)
Design (synthesis), guided by the cost function
Feasible space: constraints are satisfied
Optimal implementation: minimal cost
Implementation space (can be very large)
56Basic concepts
Specification of behaviour
simulation
Design (mapping)
verification
Implementation
simulation
57Examples of constraints and cost functions
Candidates: area, timing, power dissipation, quality (perceptual, difficult to measure)
- Signal processing (embedded): constraint = timing (signal rates); cost = area, power
- Event processing (embedded): constraint = timing (latency); cost = area, power
- General purpose computing: cost = execution time; constraint = area
58Hard real-time: missing a deadline is catastrophic, e.g. safety critical ES.
Soft real-time: a deadline can be missed without compromising the system integrity.
[Figure: two plots of utility versus time, each with a region labelled damage]
59Example at the boolean level
carry = ab + bd + ad
sum = (a + b + d) !carry + abd
[Figure: gate-level schematic with inputs a, b, d and outputs sum, carry]
60Example of pure behaviour (exhaustive enumeration
of IO patterns) at the boolean level
[Figure: inputs a, b, d and outputs sum, carry]
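A pure behavioural description by exhaustive enumeration lists the outputs for every input pattern. The sketch below generates that table for the full adder of the previous slide, reading its equations as carry = ab + bd + ad and sum = (a + b + d)·!carry + abd, where the OR and AND operators are reconstructed. It then checks the table against ordinary addition.

```python
from itertools import product

def full_adder(a, b, d):
    """Boolean equations of the full adder, with 0/1 integers as bits."""
    carry = (a & b) | (b & d) | (a & d)
    total = ((a | b | d) & (1 - carry)) | (a & b & d)
    return total, carry

# Exhaustive enumeration of all IO patterns: rows of (a, b, d, sum, carry).
table = [(a, b, d) + full_adder(a, b, d)
         for a, b, d in product((0, 1), repeat=3)]

for row in table:
    print(*row)
```

The table has 2^3 = 8 rows here; it doubles with every extra input, which is why this style only works for very small circuits.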
61Behaviour versus structure
Pure behavioural description is difficult and only possible for very small circuits.
Func main(x) -> y: u = f1(x); v = f2(x); y = f3(u, v)
Structural interpretation: hierarchy is a hint for the implementation
Behavioural interpretation: hierarchy is only an aid for specification
[Figure: block main with input x and output y, containing f1 and f2 (both fed by x) and f3]
62Example 2: RT-level behaviour
process
  variable z : t_z;
begin
  wait until clock'event and clock = '1';
  if (reset = '1') then
    state <= 0;
  else
    case state is
      -- the operator between x4 and z in states 0 and 1 is lost in the source; '+' is assumed
      when '0' => z := x1 - r; r <= x4 + z; state <= (z > 0) ? 1 : 2;
      when '1' => z := x2 - r; r <= x4 + z; state <= 2;
      when '2' => z := x3 - r; r <= x4 - z; state <= 0;
    end case;
  end if;
end process;
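The clocked process above can be simulated one clock edge at a time. The sketch below is a Python model under one assumption: the operator between x4 and z that is lost in the transcript for states 0 and 1 is read as a plus. The function name `step` and the input values are illustrative.

```python
def step(state, r, x1, x2, x3, x4, reset=False):
    """One clock edge of the RT-level process: returns (next_state, next_r)."""
    if reset:
        return 0, r                       # reset only forces the state
    if state == 0:
        z = x1 - r
        return (1 if z > 0 else 2), x4 + z   # '+' is an assumed operator
    if state == 1:
        z = x2 - r
        return 2, x4 + z                     # '+' is an assumed operator
    z = x3 - r                            # state 2
    return 0, x4 - z

# Trace three clock cycles from state 0 with hypothetical inputs.
state, r = 0, 0
trace = []
for _ in range(3):
    state, r = step(state, r, x1=5, x2=3, x3=1, x4=2)
    trace.append((state, r))
```

With these inputs z is positive in state 0, so the machine visits states 1 and 2 before returning to state 0.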
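63RT-level structure: FSMD
[Figure: FSMD datapath and controller. Datapath: inputs x1, x2, x3, x4; multiplexers M1, M2, M3; functional units A1, S1 (-), S2 (-); register r; flags f0, f1, s; condition z > 0. Controller: FSM with state st (2 bits), clock, and control signals cm1, cm2, cm4.]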
cm1 st1 cm2 cm4 cm4 st1 !st0
st0 !cm4 !st0 !s st1 !cm4
64FSMD design
- FSMD construction rules
  - each variable (which is transferred between states) and each constant corresponds to a register
  - each operator corresponds to a functional unit
  - functional units are shared if they are not used in the same clock cycle
  - connect outputs of registers to inputs of functional units
    - when multiple outputs connect to the same input: MUX or bus with tristate drivers
  - connect outputs of functional units to inputs of registers
    - when multiple outputs connect to the same input: MUX or bus with tristate drivers
- Controller is an FSM
65High-level behaviour
$Y/X = (8 - 7z^{-1}) / (2 - z^{-1})$ if h = 1
$Y/X = (2 - z^{-1}) / (8 - 7z^{-1})$ if h = 0
int filter(int x, int h, int reset)
{
  static int y, xd; /* storage vars */
  if (reset) {
    y = 0; xd = 0;
  } else {
    if (h) y = (x + 7 * (x - xd) + y) / 2;
    else   y = (x + (x - xd) + 7 * y) / 8;
    xd = x; /* remember the previous input */
  }
  return (y);
}
DeMan
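Both transfer functions have a DC gain of 1, since (8 - 7)/(2 - 1) = 1 and (2 - 1)/(8 - 7) = 1. A quick way to check a reconstruction of the filter code is its step response. The sketch below is a floating-point Python model, not the integer C code; the names `make_filter` and `step_response` and the input value 100 are illustrative.

```python
def make_filter():
    """Floating-point model of the filter with its two storage variables."""
    y, xd = 0.0, 0.0

    def step(x, h, reset=False):
        nonlocal y, xd
        if reset:
            y, xd = 0.0, 0.0
        else:
            if h:
                y = (x + 7 * (x - xd) + y) / 2   # Y/X = (8 - 7z^-1)/(2 - z^-1)
            else:
                y = (x + (x - xd) + 7 * y) / 8   # Y/X = (2 - z^-1)/(8 - 7z^-1)
            xd = x
        return y

    return step

def step_response(h, n, amplitude=100.0):
    """Output samples for a constant input applied from time zero."""
    f = make_filter()
    return [f(amplitude, h) for _ in range(n)]

high = step_response(h=1, n=60)    # starts at 400 and settles to the input value
low = step_response(h=0, n=300)    # starts at 25 and settles to the input value
```

With h = 1 the filter emphasises the input transition and overshoots; with h = 0 it smooths the transition. Both settle to the input value.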
66process
  variable u, y, d : t_22;
begin
  wait until clock'event and clock = '1';
  x := sign-extended input; /* 16 -> 22 bit sign extension */
  if (reset = '1') then
    t <= 0; ry <= 0; xd <= 0; high <= h; state <= 1;
  else
    case state is
      when '1' =>
        if (start) then
          d := x - xd; xd <= x; high <= h; state <= 2;
          if (h) then t <= (d << 3) - d;
          else t <= (ry << 3) + d;
          end if;
        end if;
      when '2' =>
        if (high) then u := t + xd + ry; y := u >> 1;
        else u := t + xd - ry; y := u >> 3;
        end if;
        ry <= y; ready <= 1; state <= 1;
    end case;
  end if;
end process;
DeMan
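The two-state RT-level schedule replaces the multiplications by 7 and the divisions by 2 and 8 with shifts, additions and subtractions. The sketch below checks that this schedule computes the same value as the high-level filter equations, reading the operators lost in the transcript as t = (d << 3) - d or (ry << 3) + d, and u = t + xd + ry or t + xd - ry. Python integers are used, where both `//` and `>>` round towards minus infinity. In C, `/` truncates towards zero, so results for negative values can differ there.

```python
def high_level(x, xd, y, h):
    """High-level filter update with floor division."""
    if h:
        return (x + 7 * (x - xd) + y) // 2
    return (x + (x - xd) + 7 * y) // 8

def rt_level(x, xd, ry, h):
    """Same update as scheduled over the two RT-level states."""
    # State 1: difference, shift-based multiply, update of xd.
    d = x - xd
    t = (d << 3) - d if h else (ry << 3) + d
    xd = x
    # State 2: final sum and shift-based divide.
    u = t + xd + ry if h else t + xd - ry
    return u >> 1 if h else u >> 3

mismatches = [(x, xd, y, h)
              for x in range(-20, 21, 5)
              for xd in range(-20, 21, 5)
              for y in range(-20, 21, 5)
              for h in (0, 1)
              if high_level(x, xd, y, h) != rt_level(x, xd, y, h)]
```

An empty `mismatches` list means the reconstructed schedule and the high-level description agree on the sampled range.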
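67 [Figure: RT-level datapath of the filter with multiplexers (0/1 select inputs), arithmetic shifters (Asl, shift by 3, 1 or 0), 16 -> 22 bit sign extension, and registers xd, t, ry]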
!reset cxd cma cas1 cmc cmb cs30 cas2
cs13 !reset ct cy cry
nextstate !state !reset start cxd ch
reset start !state cma cas1 cmc cs30
reset !state cmb !state !reset start
!h cas2 ...
C logic of controller
h
high
state
nextstate
ch
cl
reset
start
ready
DeMan
68Design flow
what_0
behaviour
implementation
69Design flow
Inputs: application domain, constraints, cost
- Concept phase -> executable spec. (parallel processes), concept architecture
- System level design -> multi-proc. architecture (instance), HL behaviour per proc.
- High level design -> RT behaviour
- RT level design -> RT level architecture, boolean behaviour
- Logic level design -> gate level architecture
- Physical design -> layout
Multi-processor: chapters 7-9. Processor design: chapters 3-6.
70Concept phase
Input: application domain, constraints, cost function, desired features, rough planning, ...
Outputs: executable spec
- sequential or parallel processes
- heterogeneous (dataflow + FSM)
Reduce uncertainty: market/domain analysis with a small architecting team
71Control task 3
Control task 4
Dsp task 1
sum[0] = 0; for i = 1 to n-1: sum[i] = sum[i-1] + a[i]
DSP task 2
Example of HL behaviour
Phideo proc.
Real DSP
MS_int
MS_int
BCU
RAM
I
D
MIPS3930 8KB I 4KB D
M_int
72instr.
C o n t r o l
clock
clock
rega
regb
RAM
rega
addr
alu
regsum
clock
0
flags
alu: block
begin
  with c select
    out <= a + b     when 000,
           a - b     when 001,
           a + b + 1 when 010,
           ...
end block alu;
switch (pc) {
  1: sum = 0;          pc = pc + 1;
  2: addr = addr + 1;  pc = pc + 1;
  3: read(a, addr);    pc = pc + 1;
  4: read(b, addr);    pc = pc + 1;
  5: b = a x c;        pc = pc + 1;
  6: sum = sum + b;    pc = pc + 1;
}
73a
b
FU
b2
direction
74Design flow summary
We concentrate on the levels above RT because the impact of a design decision at higher levels is larger. Vendors concentrate on lower levels (more general solutions), e.g. Cadence (Ambit), Mentor Graphics (Autologic), Synopsys (Design Compiler).
75Impact of a design decision
[Figure: impact of a design decision and complexity across the conceptual level, high level, RT level, gate level and transistor level]
77It's all about costs
- Design / volume -> fewer designs, more GP (general purpose)
- Masks / volume -> fewer designs, more GP
- Processed wafer / GDW -> lower area, more AS (application specific)
- Packaging -> lower power, more AS
- Test
79DSP
Programmable CPU
Programmable DSP
Application specific instruction set processor
(ASIP)
Application specific processor
80 [Figure: efficiency (high, medium, low) versus flexibility (low, medium, high) for ASIC, ASIP, DSP, GP proc and FPGA]
81The limiting factor for integration on a single chip is no longer area but power dissipation.
Ex.: TM core in a 0.18 µm process: 1.4 W @ 166 MHz, 17 mm2
=> Computational efficiency (MOPS/Watt) differs by more than 2 orders of magnitude
82ICE of silicon
83Types of Embedded Cores
- weakly-programmable (parametrizable) functions (FSMD), e.g. video functions
- application specific and programmable cores (ASIP), e.g. Tensilica, ART VLIW
- programmable DSP cores, e.g. R.E.A.L., Palm, Oak, ...
- programmable general purpose CPU cores, e.g. ARM, MIPS, TM, ...
- embedded memories
- a heterogeneous multiprocessor architecture
84Consequences
- Consequence 1: more software
- Consequence 2: power dissipation becomes the limiting factor
- The contradiction leads to new architectures
  - heterogeneous architectures combining different types of embedded cores
  - reconfigurable architectures
  - embedded distributed memory
- Large design space and many exponential processes, but a small margin on cost defines ultimate failure or success
85Overview
2. General topics on specification and mapping
3. Programmable CPU processors
4. Programmable DSP processors
5. Application domain specific processors (synthesized, prog.)
6. Application specific processors (synthesized)
7. Embedded multiprocessor systems
8. Simple multiprocessor architectures with one bus
9. Complex multiprocessor architectures with hierarchy
86Course goals
- understand the design space and the trade-offs between area, time and power
- understand the trends and the driving forces behind the different types of embedded cores (hardware and software)
- understand the role and the task of the system level architect
- understand the different ways of implementing the circuit on silicon