Title: Priority encoder
1. Priority encoder
2. Overview
- Priority encoder: theoretical view
- Other implementations
- The chosen implementation: simulations
- Calculations and comparisons
3. The target of the project
- Building a priority encoder using the multilevel lookahead and folding techniques
4. Uses of priority encoding
- INR: interconnection network router
- Design of the SAE (sequential address encoder) of a content-addressable memory (CAM)
- Microcontrollers and microprocessors (incrementer / decrementer)
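5. Basic concepts of priority encoders
- The i-th output bit: $EP_i = D_i \cdot P_i$
- $D_i$: the input data
- $P_i$: the priority token passed into this bit
- The relationship between $P_i$ and $P_{i-1}$:
- $P_i = \overline{D_{i-1}} \cdot P_{i-1}$
- The generated $EP_i$ is
- $EP_i = D_i \cdot \overline{D_{i-1}} \cdot \overline{D_{i-2}} \cdots \overline{D_1} \cdot \overline{D_0}$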
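The token recurrence above can be sketched in a few lines of Python. This is only an illustrative model, not the circuit from the slides: the function name `priority_encode` is hypothetical, bit 0 is taken as the highest-priority input, and the initial token $P_0 = 1$ is an assumption (the slides do not state it).

```python
def priority_encode(d):
    """Model of EP_i = D_i AND P_i with P_i = NOT(D_{i-1}) AND P_{i-1}.

    d is a list of 0/1 input bits; d[0] has the highest priority.
    Returns the list of output bits EP (at most one bit is 1).
    """
    ep = []
    p = 1  # assumed initial token P_0 = 1 (not given in the slides)
    for bit in d:
        ep.append(bit & p)   # EP_i = D_i . P_i
        p = p & (1 - bit)    # P_{i+1} = NOT(D_i) . P_i
    return ep


if __name__ == "__main__":
    print(priority_encode([0, 1, 1, 0]))
```

Because the token ripples through every bit, the delay of this scheme grows linearly with the number of inputs, which is the cost the lookahead and folding techniques aim to reduce.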
6. Different implementations
- For a 4-bit priority encoder
7. Matrix
Sum of minterms, the straightforward implementation
- Because a minimum spacing is required between the lines, the layout is large and complicated.
8. Basic units
- The structure is built from identical units. Each unit calculates $y_i$ and $xp_i$ for the i-th bit
9. - Then, by chaining the units, we construct the output
In this implementation we save silicon area, but pay in propagation delay
10. Tree
- Tree of multiplexers implemented by butterflies
- Efficient in area and power, but has a longer propagation delay than the folding technique
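11. The multilevel lookahead structure
- The output third-level lookahead signal of the i-th 8-bit macro is ($+$ denotes logical OR)
- $LA3_i = D_{8i+7} + D_{8i+6} + D_{8i+5} + D_{8i+4} + D_{8i+3} + D_{8i+2} + D_{8i+1} + D_{8i} + LA3_{i-1}$, for $i = 0, \ldots, n-1$
- $LA3_{-1} = 0$
- $n = N/8$
- $N$: number of input bits
- The i-th 4-bit sub-macro:
- $LA2_i = D_{8i+3} + D_{8i+2} + D_{8i+1} + D_{8i} + LA3_{i-1}$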
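12. The 8-bit macro formulas
- $EP_{8i} = D_{8i} \cdot \overline{LA3_{i-1}}$
- $EP_{8i+1} = D_{8i+1} \cdot \overline{D_{8i}} \cdot \overline{LA3_{i-1}}$
- $EP_{8i+2} = D_{8i+2} \cdot \overline{D_{8i+1}} \cdot \overline{D_{8i}} \cdot \overline{LA3_{i-1}}$
- $EP_{8i+3} = D_{8i+3} \cdot \overline{D_{8i+2}} \cdot \overline{D_{8i+1}} \cdot \overline{D_{8i}} \cdot \overline{LA3_{i-1}}$
- $EP_{8i+4} = D_{8i+4} \cdot \overline{LA2_i}$
- $EP_{8i+5} = D_{8i+5} \cdot \overline{D_{8i+4}} \cdot \overline{LA2_i}$
- $EP_{8i+6} = D_{8i+6} \cdot \overline{D_{8i+5}} \cdot \overline{D_{8i+4}} \cdot \overline{LA2_i}$
- $EP_{8i+7} = D_{8i+7} \cdot \overline{D_{8i+6}} \cdot \overline{D_{8i+5}} \cdot \overline{D_{8i+4}} \cdot \overline{LA2_i}$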
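A behavioral sketch of one 8-bit macro, and of a plain chain of macros, may help when reading the formulas above. It is written under stated assumptions: the lookahead signals are treated as active-high ORs of the higher-priority inputs (so an output is enabled only when its lookahead is 0), $LA3_{-1} = 0$, and the names `macro8` and `chain_encode` are hypothetical. It models logic values only, not gate delays or the dynamic-logic circuit.

```python
def macro8(d, la3_prev):
    """One 8-bit macro. d holds D[8i]..D[8i+7], d[0] has the highest priority.

    Returns (ep, la3): the 8 output bits and the lookahead passed to the next macro.
    """
    # LA2_i: a request exists in the lower 4 bits or in an earlier macro
    la2 = d[0] | d[1] | d[2] | d[3] | la3_prev
    # LA3_i: a request exists anywhere in this macro or in an earlier macro
    la3 = la2 | d[4] | d[5] | d[6] | d[7]

    ep = [0] * 8
    blocked = la3_prev            # lower half is gated by LA3_{i-1}
    for k in range(4):
        ep[k] = d[k] & (1 - blocked)
        blocked |= d[k]
    blocked = la2                 # upper half is gated by LA2_i
    for k in range(4, 8):
        ep[k] = d[k] & (1 - blocked)
        blocked |= d[k]
    return ep, la3


def chain_encode(d):
    """Chain N/8 macros; LA3 of each macro feeds the next one (no folding)."""
    assert len(d) % 8 == 0
    la3 = 0                       # LA3_{-1} = 0
    out = []
    for i in range(0, len(d), 8):
        ep, la3 = macro8(d[i:i + 8], la3)
        out.extend(ep)
    return out


if __name__ == "__main__":
    bits = [0] * 32
    bits[13] = bits[20] = bits[31] = 1
    print(chain_encode(bits).index(1))
```

In the chained form the LA3 signal still ripples from macro to macro, so the number of macro stages on the critical path grows with N/8.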
13. 8-bit macro cell
14. Diagram of a 32-bit chain-designed encoder
15. The folding technique: first-level folding
- The $LA3_i$ generated by a macro with higher priority can be connected to other macros with lower priority.
- Such a connection can make the critical path shorter
- With this connection we lose the advantage in layout arrangement and wiring complexity
16. Folding: implementation
- We connect $LA3_0$ to the second and the fourth macros (not to the third), which gives a 2x2 matrix
- In this way the fourth macro is connected to 2 neighboring macros
- The number of gate delays is reduced to 4 ($< \log_2 32$)
17. Block diagram of a 32-bit priority encoder with folding
18. 64-bit priority encoder with first-level folding
19. Multilevel folding
- To reduce the gate delay below $\log_2 N$ in larger priority encoders, we can apply the folding technique again. For example, for $N = 128$:
- First-level folding: 8 gate delays
- Second-level folding: 7 gate delays
- Third-level folding: < 7 gate delays
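As a quick check of the figures quoted above against the bound (this only restates the slide's numbers next to the base-2 logarithm of the input width):

```latex
\log_2 128 = 7, \qquad
\underbrace{8}_{\text{first level}} > 7, \qquad
\underbrace{7}_{\text{second level}} = 7, \qquad
\text{third level} < 7
```

So for $N = 128$, it is the third level of folding that brings the gate delay below $\log_2 N$.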
20. 64-bit priority encoder with 2 levels of folding
21. - For a 256-bit priority encoder, the new design can achieve about 10 times the performance while consuming ½ the power.
22. The implementation
- We decided to implement the project bottom-up, starting with a 1-bit unit.
- Each stage is checked separately.
- We move to the next stage only after the previous stage is finished
23. 1-bit unit
- First we implemented a 1-bit unit and checked it.
- The circuit
24. The simulation
[Waveform figure. Trace labels: the output, the lookahead bit, the input, the clock]
25. The 4-bit unit
26. The input signals
27. The outputs
[Waveform figure. Trace labels: lookahead, outputs]
When the lookahead is high, all the outputs equal zero
28. The 8-bit unit
29. The output signals
[Waveform figure. Trace labels: v3, not valid, v0]
30. The next lookahead
[Waveform figure. Trace labels: v7, v4]
31. The 32-bit chain encoder
32. The results
33. The problem we encountered
Glitches
34. The glitch
The glitch starts after the rising clock edge
[Waveform figure. Label: clock rising]
35. The widest glitch comes at the higher bits
[Waveform figure. Trace labels: clock, bit 60]
36. 32-bit folding
37. 64-bit first-level folding
38. 64-bit second-level folding
40. 64-bit second-level folding with one critical path
41. Propagation delay reduction
- To minimize the propagation delay of the EP, we made the following changes:
- Reduced the clock period from 200 ns to 20 ns.
- Divided the clock pulse into different periods for low time and high time.
- These changes were made under the following constraints:
- Keeping the high pulse length at 80% of the base pulse.
- Making sure all the requested changes and currents are stable before the rising clock edge.
- The optimum we found for the clock period is 5 ns low time and 15 ns high time.
42. Results: 32-bit
43. Results: 64-bit
44. Results: 64-bit (high)
45. 80% high pulse
46. The VHDL simulation
47. The VHDL simulation of a 32-bit priority encoder
Here the LSB of the input changes from 0 to 1, and the output changes
49. Comparison table
                    unit     matrix   tree     folding
Area (mm²)          0.076    0.076    0.043    0.053
Power (10^-11 fw)   149.6    173.4    112.8    127.5
Time (ns)           241.2    75       18       8
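The figures in the table can be compared directly with a small script. This is only arithmetic on the values quoted above; the dictionary layout and the helper names `speedup_of_folding_over` and `area_delay_product` are hypothetical, and the power column is carried in the table's own unit.

```python
# Values copied from the comparison table (area in mm^2, time in ns,
# power in the unit given in the table).
TABLE = {
    "unit":    {"area": 0.076, "power": 149.6, "time": 241.2},
    "matrix":  {"area": 0.076, "power": 173.4, "time": 75.0},
    "tree":    {"area": 0.043, "power": 112.8, "time": 18.0},
    "folding": {"area": 0.053, "power": 127.5, "time": 8.0},
}


def speedup_of_folding_over(name):
    """How many times faster the folding design is than the named design."""
    return TABLE[name]["time"] / TABLE["folding"]["time"]


def area_delay_product(name):
    """Area times delay (mm^2 * ns); a common figure of merit, lower is better."""
    return TABLE[name]["area"] * TABLE[name]["time"]


if __name__ == "__main__":
    for name in TABLE:
        print(name,
              round(speedup_of_folding_over(name), 2),
              round(area_delay_product(name), 3))
```

On these numbers the folding design is the fastest, while the tree design remains the smallest and lowest-power option.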