Advanced Tutorial on Hardware Design and Petri nets

Slides: 104
Provided by: Josep430
Learn more at: https://www.cs.upc.edu
1
Advanced Tutorial on Hardware Design and Petri
nets
  • Jordi Cortadella Univ. Politècnica de Catalunya
  • Luciano Lavagno Università di Udine
  • Alex Yakovlev Univ. Newcastle upon Tyne

2
Tutorial Outline
  • Introduction
  • Modeling Hardware with PNs
  • Synthesis of Circuits from PN specifications
  • Circuit verification with PNs
  • Performance analysis using PNs

3
Introduction: Outline
  • Role of Hardware in modern systems
  • Role of Hardware design tools
  • Role of a modeling language
  • Why Petri nets are good for Hardware Design
  • History of relationship between Hardware Design
    and Petri nets
  • Asynchronous Circuit Design

4
Role of Hardware in modern systems
  • Technology will soon allow putting 1 billion
    transistors on a chip
  • Systems on chip are a reality: 1 billion
    operations per second
  • Hardware and software designs are no longer
    separate
  • Hardware becomes distributed, asynchronous and
    concurrent

5
Role of Hardware design tools
  • Design productivity is a problem due to chip
    complexity and time to market demands
  • Need for well-integrated CAD with simulation,
    synthesis, verification and testing tools
  • Modelling of system behaviour at all levels of
    abstraction with feedback to the designer
  • Design re-use is a must but with max technology
    independence

6
Role of Modelling Language
  • Design methods and tools require good modelling
    and specification techniques
  • Those must be formal and rigorous and easy to
    comprehend (cf. timing diagrams, waveforms,
    traditionally used by logic designers)
  • Today's hardware description languages allow a high
    level of abstraction
  • Models must allow for equivalence-preserving
    refinements
  • They must allow for non-functional qualities such
    as speed, size and power

7
Why Petri nets are good
  • Finite State Machine is still the main formal
    tool in hardware design but it may be inadequate
    for distributed, concurrent and asynchronous
    hardware
  • Petri nets offer:
  • simple and easy to understand graphical capture
  • modelling power adjustable to various types of
    behaviour at different abstraction levels
  • formal operational semantics and verification of
    correctness (safety and liveness) properties
  • possibility of mechanical synthesis of circuits
    from net models

8
A bit of history of their marriage
  • 1950s and 60s: Foundations (Muller & Bartky,
    Petri, Karp & Miller, ...)
  • 1970s: Toward Parallel Computations (MIT,
    Toulouse, St. Petersburg, Manchester, ...)
  • 1980s: First progress in VLSI and CAD,
    Concurrency theory, Signal Transition Graphs
    (STGs)
  • 1990s: First asynchronous design (verification
    and synthesis) tools: SIS, Forcage, Petrify
  • 2000s: Powerful asynchronous design flow

9
Introduction to Asynchronous Circuits
  • What is an asynchronous circuit?
  • Physical (analogue) level
  • Logical level
  • Speed-independent and delay-insensitive circuits
  • Why go asynchronous?
  • Why control logic?
  • Role of Petri nets
  • Asynchronous circuit design based on Petri nets

10
What is an asynchronous circuit
  • No global clock: circuits are self-timed or
    self-clocked
  • Can be viewed as hardwired versions of parallel
    and distributed programs: statements are
    activated when their guards are true
  • No special run-time mechanism: the program
    statements are physical components (logic gates,
    memory latches, or hierarchical modules)
  • Interconnections are also physical components
    (wires, busses)

11
Synchronous Design
[Figure: Register Sender, Logic and Register Receiver on a Data path with a common Clock; timing diagram of Clock and Data input with Tsetup and Thold]
Timing constraint: input data must stay unchanged
within a setup/hold window around the clock event.
Otherwise, the latch may fail (e.g. metastability).
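The setup/hold constraint can be sketched as a simple interval check. This is a minimal illustration, not a timing analyser: the function name `setup_hold_violations` and the numbers (in nanoseconds) are hypothetical.

```python
def setup_hold_violations(data_change_times, clock_edge, t_setup, t_hold):
    """Return the data changes that fall strictly inside the window
    (clock_edge - t_setup, clock_edge + t_hold)."""
    window_start = clock_edge - t_setup
    window_end = clock_edge + t_hold
    return [t for t in data_change_times if window_start < t < window_end]


# Hypothetical example: clock edge at 10.0 ns, Tsetup = 0.5 ns, Thold = 0.3 ns.
# The change at 9.8 ns lands inside the window and may cause a latch failure.
print(setup_hold_violations([3.0, 9.8], clock_edge=10.0, t_setup=0.5, t_hold=0.3))
```

An empty result means the data input was stable throughout the window for that clock edge.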
12
Asynchronous Design
[Figure: Register Sender, Logic and Register Receiver on a Data path, with Req(uest) and Ack(nowledge) signals in place of the clock]
Req/Ack (local) signal handshake protocol instead
of a global clock. Causal relationship. Handshake
signals implemented with completion detection in
the data path.
13
Physical (Analogue) level
  • Strict view: an asynchronous circuit is a
    (analogue) dynamical system, e.g. to be
    described by differential equations
  • In most cases it can be safely approximated by a
    logic-level (0-to-1 and 1-to-0 transitions)
    abstraction; even hazards can be captured
  • For some anomalous effects, such as metastability
    and oscillations, absolute need for analogue
    models
  • Analogue aspects are not considered in this
    tutorial (cf. reference list)

14
Logical Level
  • Circuit behaviour is described by sequences of up
    (0-to-1) and down (1-to-0) transitions on inputs
    and outputs
  • The order of transitions is defined by causal
    relationship, not by clock (a causes b, directly
    or transitively)
  • The order is partial if concurrency is present
  • A class of async timed (yet not clocked!)
    circuits allows special timing order relations (a
    occurs before b, due to delay assumptions)
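The idea that causality gives only a partial order can be sketched in a few lines. The event names and the `direct` relation below are hypothetical; the code computes "a causes b, directly or transitively" and treats two events as concurrent when neither causes the other.

```python
def transitive_causality(direct):
    """direct: dict mapping an event to the set of events it directly causes.
    Returns a dict mapping each event to everything it causes, directly
    or transitively."""
    events = set(direct) | {e for targets in direct.values() for e in targets}
    reach = {e: set(direct.get(e, ())) for e in events}
    changed = True
    while changed:
        changed = False
        for e in events:
            extra = set()
            for f in reach[e]:
                extra |= reach[f]
            if not extra <= reach[e]:
                reach[e] |= extra
                changed = True
    return reach


def concurrent(reach, a, b):
    """Two distinct events are concurrent if neither causes the other."""
    return a != b and b not in reach[a] and a not in reach[b]


# Hypothetical diamond: a causes b and c, and both b and c cause d.
reach = transitive_causality({"a": {"b", "c"}, "b": {"d"}, "c": {"d"}})
print(concurrent(reach, "b", "c"))  # b and c are unordered
print(concurrent(reach, "a", "d"))  # a causes d transitively
```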

15
Simple circuit example
[Figure, built up over slides 15-17: circuit computing out = (x+y)*(a+b) with inputs x, y, a, b, a C-element and handshake pairs req1/ack1, req2/ack2, req3/ack3; its data flow graph; its control flow graph drawn as a Petri net over req1, ack1, req2, ack2, req3, ack3]
18
Muller C-element
Key component in asynchronous circuit design;
behaves like a Petri net transition
y = x1·x2 + (x1 + x2)·y
[Figure: C-element with inputs x1, x2 and output y]
20
Muller C-element
Key component in asynchronous circuit design;
behaves like a Petri net transition
y = x1·x2 + (x1 + x2)·y
[Figure: C-element with inputs x1, x2 and output y; labels Set-part and Reset-part. Slides 20-28 animate one full cycle:]
  slide  x1     x2     y  note
  20     0      0      0
  21     0->1   0      0
  22     0->1   0->1   0  excited
  23     1      1      0  excited
  24     1      1      1  stable (new value)
  25     1      1      1
  26     1->0   1      1
  27     1->0   1->0   1  excited
  28     0      0      0  stable (new value)
29
Muller C-element
Key component in asynchronous circuit design;
behaves like a Petri net transition
y = x1·x2 + (x1 + x2)·y
[Figure: C-element with inputs x1, x2 and output y]
It acts symmetrically for pairs of 0-1 and 1-0
transitions: waits for both input events to occur
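A minimal behavioural sketch of the C-element, using the standard next-state function y = x1·x2 + (x1 + x2)·y over 0/1 values. The function names and the input sequence are illustrative assumptions, not part of the original slides.

```python
def c_element(x1, x2, y):
    """Next value of the output: set when both inputs are 1,
    reset when both are 0, otherwise hold the current value y."""
    return (x1 & x2) | ((x1 | x2) & y)


def is_excited(x1, x2, y):
    """The output is excited when its next value differs from its current one."""
    return c_element(x1, x2, y) != y


# Replay a hypothetical input sequence and record (x1, x2, y) after each step.
y = 0
trace = []
for x1, x2 in [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]:
    y = c_element(x1, x2, y)
    trace.append((x1, x2, y))
print(trace)
```

The output only rises after both inputs have risen and only falls after both have fallen, which is the "wait for both input events" behaviour that makes the element resemble a Petri net transition with two input places.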
31
Muller C-element
NMOS circuit implementation
[Figure, animated over slides 31-33: transistor-level circuit between Power and Ground with inputs x1, x2 and output y]
34
Why asynchronous is good
  • Performance (work on actual, not max delays)
  • Robustness (operationally scalable; no clock
    distribution; important when gate-to-wire delay
    ratio changes)
  • Low Power (change-based computing: fewer
    signal transitions)
  • Low Electromagnetic Emission (more even
    power/frequency spectrum)
  • Modularity and re-use (parts designed
    independently; well-defined interfaces)
  • Testability (inherent self-checking via ack
    signals)

35
Obstacles to Async Design
  • Design tool support: commercial design tools are
    aimed at clocked systems
  • Difficulty of production testing: production
    testing is heavily committed to use of clock
  • Aversion of the majority of designers, trained
    with clock: the biggest obstacle
  • Overbalancing effect of periodic (every 10 years)
    asynchronous euphoria

36
Why control logic
  • Customary in hardware design to separate control
    logic from datapath logic due to different design
    techniques
  • Control logic implements the control flow of a
    (possibly concurrent) algorithm
  • Datapath logic deals with operational part of the
    algorithms
  • Datapath operations may have their (lower level)
    control flow elements, so the distinction is
    relative
  • Examples of control-dominated logic: a bus
    interface adapter, an arbiter, or a modulo-N
    counter
  • Their behaviour is a combination of partial
    orders of signal events
  • Examples of data-dominated logic are a register
    bank or an arithmetic-logic unit (ALU)

37
Role of Petri Nets
  • We concentrate here on control logic
  • Control logic is behaviourally more diverse than
    data path
  • Petri nets capture causality and concurrency
    between signalling events, deterministic and
    non-deterministic choice in the circuit and its
    environment
  • They allow:
  • composition of labelled PNs (transition or place
    synchronisation)
  • refinement of event annotation (from abstract
    operations down to signal transitions)
  • use of observational equivalence (lambda-events)
  • clear link with state-transition models in both
    directions

38
Design flow with Petri nets
[Figure: design flow diagram with the following stages and artefacts]
  • Abstract behaviour synthesis
  • Abstract behavioural model: Labelled Petri nets (LPNs)
  • Signalling refinement
  • Timing diagrams
  • Verification and Performance analysis
  • Logic behavioural model: Signal Transition Graphs (STGs)
  • STG-based logic synthesis (deriving boolean functions)
  • Syntax-directed translation (deriving circuit structure)
  • Decomposition and gate mapping
  • Circuit netlist
  • Library cells
39
Tutorial Outline
  • Introduction
  • Modeling Hardware with PNs
  • Synthesis of Circuits from PN specifications
  • Circuit verification with PNs
  • Performance analysis using PNs

40
Modelling: Outline
  • High level modelling and abstract refinement:
    processor example
  • Low level modelling and logic synthesis:
    interface controller example
  • Modelling of logic circuits: event-driven and
    level-driven parts
  • Properties analysed

41
High-level modelling: Processor Example
[Figure: Petri net with Instruction Fetch and Instruction Execution]
42
High-level modelling: Processor Example
[Figure, animated over slides 42-53: refined Petri net with Instruction Fetch and Instruction Execution and the further labels One-word Instruction Decode, One-word Instruction Execute, Memory Read, Two-word Instruction Execute, Program Counter Update, Memory Address Register Load, Two-word Instruction Decode and Instruction Register Load. Slide 50 annotates Instruction Execution with "(not exactly yet!)" and slide 52 with "(now it is!)".]
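The token game behind these slides can be sketched with a tiny place/transition net. Only the two transition names come from the slides; the place names `ready` and `fetched`, the class `PetriNet` and the helper `fetch_execute_net` are hypothetical, and the net is just the unrefined fetch/execute cycle, not the full processor model.

```python
class PetriNet:
    """Minimal place/transition net with weighted arcs."""

    def __init__(self, pre, post, marking):
        self.pre = pre            # transition -> {input place: weight}
        self.post = post          # transition -> {output place: weight}
        self.marking = dict(marking)

    def enabled(self, t):
        # A transition is enabled when every input place holds enough tokens.
        return all(self.marking.get(p, 0) >= w for p, w in self.pre[t].items())

    def fire(self, t):
        if not self.enabled(t):
            raise ValueError(f"{t!r} is not enabled")
        for p, w in self.pre[t].items():
            self.marking[p] = self.marking.get(p, 0) - w
        for p, w in self.post[t].items():
            self.marking[p] = self.marking.get(p, 0) + w


def fetch_execute_net():
    # Place names are assumptions made for this sketch.
    pre = {"Instruction Fetch": {"ready": 1},
           "Instruction Execution": {"fetched": 1}}
    post = {"Instruction Fetch": {"fetched": 1},
            "Instruction Execution": {"ready": 1}}
    return PetriNet(pre, post, {"ready": 1, "fetched": 0})


net = fetch_execute_net()
for t in ["Instruction Fetch", "Instruction Execution", "Instruction Fetch"]:
    net.fire(t)
    print(t, "->", net.marking)
```

Refinement in the sense of the slides would replace each of the two transitions by a subnet (decode, memory read, register loads and so on) while keeping the same interface places.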
54
High-level modelling: Processor Example
  • The details of further refinement, circuit
    implementation (by direct translation) and
    performance estimation (using UltraSan) are in:
  • A. Semenov, A.M. Koelmans, L. Lloyd and A.
    Yakovlev. Designing an asynchronous processor
    using Petri Nets, IEEE Micro, 17(2):54-64, March
    1997
  • For use of Coloured Petri net models and use of
    Design/CPN in processor modelling:
  • F. Burns, A.M. Koelmans and A. Yakovlev.
    Analysing superscalar processor architectures with
    coloured Petri nets, Int. Journal on Software
    Tools for Technology Transfer, vol. 2, no. 2, Dec.
    1998, pp. 182-191.

55
Low-level Modelling: Interface Example
  • Insert VME bus figure 1: timing diagrams

56
Low-level Modelling: Interface Example
  • Insert VME bus figure 2: STG

57
Low-level Modelling: Interface Example
  • Details of how to model interfaces and design
    controllers are in:
  • A. Yakovlev and A. Petrov,
  • complete the reference

58
Low-level Modelling: Interface Example
  • Insert VME bus figure 3: circuit diagram

59
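Logic Circuit Modelling
Event-driven elements and their Petri net equivalents
[Figure: Muller C-element (C) and Toggle with their Petri net equivalents]
60
Logic Circuit Modelling
Level-driven elements and their Petri net equivalents
[Figure: NOT gate (signals x, y) and NAND gate (signals x, y, z) with their Petri net equivalents; places carry signal-value labels such as x=0, x=1, y=0, y=1, z=0, z=1]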
61
Event-driven circuit example
  • Insert the eps file for fast fwd pipeline cell
    control

62
Level-driven circuit example
  • Insert the eps file for the example with
    two inverters and OR gate

63
Properties analysed
  • Functional correctness (need to model
    environment)
  • Deadlocks
  • Hazards
  • Timing constraints:
  • Absolute (need for Time(d) Petri nets)
  • Relative (compose with a PN model of order
    conditions)

64
Adequacy of PN modelling
  • Petri nets have events with atomic action
    semantics
  • Asynchronous circuits may exhibit behaviour that
    does not fit within this domain due to inertia

[Figure: state graph over signals a, b with states 00, 10, 01, 11 and transitions labelled a and b]
65
Other modelling examples
  • Examples with mixed event and level based
    signalling
  • Lazy token ring arbiter spec
  • RGD arbiter with mutex

66
Lazy ring adaptor
[Figure, animated over slides 66-70: ring adaptor block with signals Lr, La, Rr, Ra, R, G, D and its Petri net with dummy events (dum) and token state t (t=0, t=1)]
  • Slide 66: t=0 (token isn't initially here)
  • Slide 67: t=0->1->0 (token must be taken from the
    right and passed to the left)
  • Slide 68: t=1 (token is already here)
  • Slide 69: t=0->1 (token must be taken from the right)
  • Slide 70: t=1 (token is here)
71
Tutorial Outline
  • Introduction
  • Modeling Hardware with PNs
  • Synthesis of Circuits from PN specifications
  • Circuit verification with PNs
  • Performance analysis using PNs

72
Synthesis: Outline
  • Abstract synthesis of LPNs from transition
    systems and characteristic trace specifications
  • Handshake and signal refinement (LPN-to-STG)
  • Direct translation of LPNs and STGs to circuits
  • Examples
  • Logic synthesis from STGs
  • Examples

73
Synthesis from trace specs
  • Modelling behaviour in terms of characteristic
    predicates on traces (produce LPN snippets)
  • Construction of LPNs as compositions of snippets
  • Examples: n-place buffer, 2-way merge

74
Synthesis from transition systems
  • Modelling behaviour in terms of a sequential
    capture: a transition system (TS)
  • Synthesis of LPN (distributed and concurrent
    object) from TS (using theory of regions)
  • Examples: one-place buffer, counterflow pp
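As a hedged illustration of the theory of regions: a set of states is a region when every event crosses it uniformly, that is, all transitions with that label enter the set, all exit it, or none cross its border. Regions then become places of the synthesised net. The transition system below (two concurrent events a and b) and the function names are assumptions for this sketch only.

```python
def crossing_kinds(ts, region, event):
    """Classify how transitions labelled `event` relate to `region`.
    ts is a list of (source, event, target) triples."""
    kinds = set()
    for source, label, target in ts:
        if label != event:
            continue
        if source not in region and target in region:
            kinds.add("enter")
        elif source in region and target not in region:
            kinds.add("exit")
        else:
            kinds.add("nocross")
    return kinds


def is_region(ts, region):
    """A region is crossed uniformly by every event of the transition system."""
    events = {label for _, label, _ in ts}
    return all(len(crossing_kinds(ts, region, e)) == 1 for e in events)


# Hypothetical transition system: a and b are concurrent (a diamond).
ts = [("s0", "a", "s1"), ("s0", "b", "s2"),
      ("s1", "b", "s3"), ("s2", "a", "s3")]

print(is_region(ts, {"s0", "s2"}))  # states before a: every a exits, b never crosses
print(is_region(ts, {"s0"}))        # a exits from s0 but does not cross from s2
```

In this example {s0, s2} would play the role of the input place of a, and {s1, s3} the role of its output place.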

75
Synthesis from process-based languages
  • Modelling behaviour in terms of process
    (-algebraic) specifications (CSP, ...)
  • Synthesis of LPN (concurrent object with explicit
    causality) from process-based model (concurrency
    is explicit but causality implicit)
  • Example: modulo-N counter

76
Refinement at the LPN level
  • Examples of refinements, and introduction of
    silent events
  • Handshake refinement
  • Signalling protocol refinement (return-to-zero
    versus non-return-to-zero, NRZ)
  • Arbitration refinement
  • Brief comment on what is implemented in Petrify
    and what isn't yet
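Handshake and signalling refinement can be sketched as a rewrite of abstract events into signal transitions. The sketch assumes the usual conventions: a return-to-zero (4-phase) handshake is req+, ack+, req-, ack-, and a non-return-to-zero (2-phase) handshake is one toggle on req followed by one toggle on ack. The `channel.req` naming, the `~` toggle notation and both function names are hypothetical.

```python
def refine_event(channel, protocol):
    """Expand one abstract communication event into signal transitions."""
    if protocol == "RTZ":   # return-to-zero, 4-phase
        return [f"{channel}.req+", f"{channel}.ack+",
                f"{channel}.req-", f"{channel}.ack-"]
    if protocol == "NRZ":   # non-return-to-zero, 2-phase ("~" marks a toggle)
        return [f"{channel}.req~", f"{channel}.ack~"]
    raise ValueError(f"unknown protocol: {protocol}")


def refine_trace(trace, protocol):
    """Refine a sequence of abstract events, one after another."""
    refined = []
    for channel in trace:
        refined.extend(refine_event(channel, protocol))
    return refined


print(refine_trace(["a", "b"], "RTZ"))
print(refine_trace(["a", "b"], "NRZ"))
```

This sequential expansion is the simplest choice; a real refinement may overlap the return-to-zero phase of one handshake with later events to gain concurrency.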

77
Translation of LPNs to circuits
  • Examples of refinements, and introduction of
    silent events

78
Why direct translation?
  • Logic synthesis has problems with state space
    explosion, repetitive and regular structures
    (log-based encoding approach)
  • Direct translation has linear complexity but can
    be area inefficient (inherent one-hot encoding)
  • What about performance?
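The area trade-off can be made concrete with a rough count of state-holding elements. This is a back-of-the-envelope sketch only: it assumes one cell per state for one-hot encoding and the minimum number of bits for a log-based encoding, and it ignores any extra state signals a real synthesis flow may need. The function name is hypothetical.

```python
def state_elements(n_states):
    """Return (one_hot_cells, log_encoded_bits) for n_states states."""
    if n_states < 1:
        raise ValueError("n_states must be positive")
    one_hot = n_states
    # ceil(log2(n_states)) computed with integers; at least one bit.
    log_encoded = max(1, (n_states - 1).bit_length())
    return one_hot, log_encoded


for n in (6, 80):
    cells, bits = state_elements(n)
    print(f"{n} states: {cells} one-hot cells vs {bits} encoded bits")
```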

79
Direct Translation of Petri Nets
  • Previous work dates back to the 1970s
  • Synthesis into event-based (2-phase) circuits
    (similar to micropipeline control):
  • S. Patil, F. Furtek (MIT)
  • Synthesis into level-based (4-phase) circuits
    (similar to synthesis from one-hot encoded FSMs):
  • R. David (1969, translation of FSM graphs to CUSA
    cells)
  • L. Hollaar (1982, translation from parallel
    flowcharts)
  • V. Varshavsky et al. (1990, 1996, translation from
    PN into an interconnection of David Cells)

80
Synthesis into event-based circuits
  • Patil's translation method for simple PNs
  • Furtek's extension to 1-safe nets
  • Pragmatic extensions to Patil's set (for
    non-simple PNs)
  • Examples: modulo-N counter, Lazy ring adapter

81
Synthesis into level-based circuits
  • David's method for FSMs
  • Hollaar's extensions to parallel flow charts
  • Varshavsky's method for 1-safe Petri nets
  • Examples: counter, VME bus, butterfly circuit

82
David's original approach
[Figure: fragment of flow graph with states a, b, c, d and inputs x1, x2; CUSA for storing state b, with signals ya, yb, yc]
83
Hollaar's approach
[Figure, animated over slides 83-85: fragment of flow-chart with nodes K, L, M, N, A, B and the corresponding one-hot circuit cell; the 0/1 signal values change from slide to slide]
86
Varshavsky's Approach
[Figure, animated over slides 86-88: Petri net fragment with places p1, p2 and a Controlled Operation, and the corresponding circuit with an output "To Operation"; slides 87-88 show the 0->1 and 1->0 signal transitions]
89
Translation in brief
This method has been used for designing the control
of a token ring adaptor [Yakovlev et al., Async.
Design Methods, 1995]. The size of the control was
about 80 David Cells with 50 controlled
handshakes.
90
Direct translation examples
  • In this work we tried direct translation:
  • From STG-refined specification (VME bus
    controller):
  • Worse than logic synthesis
  • From a largish abstract specification with high
    degree of repetition (mod-6 counter):
  • Considerable gain over logic synthesis
  • From a small concurrent specification with dense
    coding space (butterfly circuit):
  • Similar or better than logic synthesis

91
Example 1: VME bus controller
Result of direct translation (DC unoptimised)
92
VME bus controller
After DC-optimisation (in the style of Varshavsky
et al., WODES'96)
93
David Cell library
94
Data path control logic
Example of interface with a handshake control
(DTACK, DSR/DSW)
95
Example 2: Flat mod-6 Counter
  • TE-like Specification
  • ((p? q!)^5 p? c!)
  • Petri net (5-safe)

[Figure: Petri net with transitions p?, q!, c! and two labels 5]
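The intended behaviour of the counter can be sketched at the trace level, assuming the specification means five p?/q! exchanges followed by one p?/c! exchange, repeated forever. That reading, the function name `counter_outputs` and the parameter `modulo` are assumptions of this sketch.

```python
def counter_outputs(n_inputs, modulo=6):
    """Return the output event produced for each of n_inputs p? events:
    q! for the first modulo-1 inputs of every round, c! for the last one."""
    outputs = []
    count = 0
    for _ in range(n_inputs):
        count += 1
        if count == modulo:
            outputs.append("c!")
            count = 0
        else:
            outputs.append("q!")
    return outputs


print(counter_outputs(12))
```

A flat Petri net for this behaviour has to remember how many p? events have been seen in the current round, which is consistent with the slide describing the net as 5-safe.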
96
Flat mod-6 Counter
Refined (by hand) and optimised (by Petrify)
Petri net
97
Flat mod-6 counter
Result of direct translation (optimised by hand)
98
David Cells and Timed circuits
(a) Speed-independent
(b) With Relative Timing
99
Flat mod-6 counter
(a) speed-independent
(b) with relative timing
100
Butterfly circuit
[Figure: Initial Specification and STG after CSC resolution, with transitions on signals a, b, x, y, z]
101
Butterfly circuit
Speed-independent logic synthesis solution
102
Butterfly circuit
Speed-independent DC-circuit
103
Butterfly circuit
DC-circuit with aggressive relative timing