Title: Looking Under the Hood
1Looking Under the Hood
2Objectives
After completing this module, you will be able to:
- Understand what is happening behind the scenes and how your decisions can impact the quality of the result
- Use good design habits to achieve the best solution
3Outline
- Quantization and Overflow
- The Hardware Costs of System Abstraction
- Lab 4: Looking under the Hood
- Bit Picking
- Tips for Good Designs Using System Generator
4Quantization and Overflow
5Quantization
- Occurs if the number of fractional bits is insufficient to represent the fractional portion of a value
- Users can choose to
  - Truncate: discard bits to the right of the least significant bit
  - Round: round to the nearest representable value, or to the value farthest from zero if there are two equidistant nearest representable values
6Quantization
- A signed full precision number will have a different output depending on whether truncation or rounding is employed
Full Precision: 101.10111101010000 = -2.2607421875
Truncate (FIX_12_9): -2.26171875
Round (FIX_12_9): -2.259765625
7Quantization
- An unsigned full precision number will have a different output depending on whether truncation or rounding is employed
Full Precision: 101.10111101010000 = 5.7392578125
Truncate (UFIX_12_9): 5.73828125
Round (UFIX_12_9): 5.740234375
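The two quantization examples above can be reproduced with a small model. This is a minimal sketch and not System Generator's implementation: the function name `quantize` and its parameters are hypothetical, and the rounding rule (add half an LSB, then truncate) is an assumption chosen because it reproduces the values shown on the slides. Other tie-breaking rules can give a different result for the negative example, which sits exactly halfway between two representable values.

```python
import math

def quantize(value, frac_bits, mode):
    """Quantize `value` onto a grid with `frac_bits` fractional bits (hypothetical helper)."""
    scaled = value * (1 << frac_bits)  # value expressed in output LSBs
    if mode == "truncate":
        # Dropping low-order bits of a two's complement number rounds
        # toward negative infinity, which floor() models.
        raw = math.floor(scaled)
    elif mode == "round":
        # Assumed tie handling: add half an LSB, then truncate.
        raw = math.floor(scaled + 0.5)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return raw / (1 << frac_bits)

signed_full = -2.2607421875   # full precision value from the signed slide
unsigned_full = 5.7392578125  # full precision value from the unsigned slide

print(quantize(signed_full, 9, "truncate"))    # -2.26171875
print(quantize(signed_full, 9, "round"))       # -2.259765625
print(quantize(unsigned_full, 9, "truncate"))  # 5.73828125
print(quantize(unsigned_full, 9, "round"))     # 5.740234375
```

All values involved are exact binary fractions, so floating point introduces no error in this sketch.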
8Overflow
- Occurs if a value lies outside the representable range
- Users can choose to
  - Saturate to the largest positive (or most negative) value
  - Wrap the value (i.e., discard any significant bits beyond the most significant bit in the fixed-point number)
  - Flag an overflow as a Simulink error during simulation
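The three overflow options can be illustrated on raw integer codes. This is a sketch under stated assumptions: `handle_overflow` and `fixed_range` are hypothetical names, the value is assumed to be already scaled to an integer number of LSBs, and the "flag" option is modeled as a Python exception rather than a Simulink error.

```python
def fixed_range(n_bits, signed):
    """Smallest and largest raw integer codes for an n_bits fixed-point word."""
    if signed:
        return -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    return 0, (1 << n_bits) - 1

def handle_overflow(raw, n_bits, signed, mode):
    """Apply an overflow option to a raw integer code (hypothetical helper)."""
    lo, hi = fixed_range(n_bits, signed)
    if lo <= raw <= hi:
        return raw  # no overflow, value passes through unchanged
    if mode == "saturate":
        return hi if raw > hi else lo
    if mode == "wrap":
        # Keep only the low n_bits, then reinterpret as two's complement.
        wrapped = raw & ((1 << n_bits) - 1)
        if signed and wrapped > hi:
            wrapped -= 1 << n_bits
        return wrapped
    if mode == "flag":
        raise OverflowError(f"{raw} does not fit in {n_bits} bits")
    raise ValueError(f"unknown mode: {mode}")

print(handle_overflow(9, 4, True, "saturate"))    # 7
print(handle_overflow(9, 4, True, "wrap"))        # -7
print(handle_overflow(-9, 4, True, "saturate"))   # -8
print(handle_overflow(17, 4, False, "wrap"))      # 1
```

Wrapping costs nothing beyond dropping wires, whereas saturation needs a range comparison and a multiplexer, which is why it consumes extra logic.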
9Quantization and Overflow
- Whatever option is selected, the generated HDL model and the Simulink model will behave identically
- This also means that rounding and saturation will use FPGA resources
10Outline
- Quantization and Overflow
- The Hardware Costs of System Abstraction
- Lab 4: Looking under the Hood
- Bit Picking
- Tips for Good Designs Using System Generator
11IP Wrappers
- Every IP core has a wrapper to interface between Simulink and hardware
- Each SysGen block has an RTL HDL wrapper to
  - Extend IP core functionality
  - Simplify the IP core interface
  - Support fixed-point arithmetic
    - Number of bits, binary point
    - Overflow and quantization
  - Valid bit control on some cores
- Note: a Simulink parameter may not be identical to the corresponding COREGen parameter
- The wrapper file has xl<core-name> in its filename, without the core keyword
xlMult.vhd
COREGen IP Core (xmult_x_0_core.vhd)
SysGen VHDL IP Core Wrapper (xmult_x_0.vhd)
12Implications of IP Wrappers
- Saturation arithmetic and rounding require hardware (full adder)
- Excess latency is implemented with a shift register (SRL16) on the core output
- Some SysGen blocks perform implicit conversion of inputs
  - Unsigned to signed
  - Sign extension
  - Zero padding
- SysGen mask parameters may not be identical to COREGen parameters
- The valid bit pipeline runs in parallel with the data path and is typically implemented using SRL16
System-level abstraction is very expressive and powerful, but comes at some expense in hardware. Be aware!
13Outline
- Quantization and Overflow
- The Hardware Costs of System Abstraction
- Lab 4: Looking under the Hood
- Bit Picking
- Tips for Good Designs Using System Generator
14Lab 4
- Looking Under the Hood
- You are given a simple adder circuit design
- Make changes to a System Generator model to see what the results would be in hardware by changing saturation arithmetic and rounding
- Use the Resource Estimator block to estimate the resource usage in each case
- Use an RTL viewer to see the changes
- Use the Xilinx implementation results to determine the cost of these system-level decisions
15Outline
- Quantization and Overflow
- The Hardware Costs of System Abstraction
- Lab 4: Looking under the Hood
- Bit Picking
- Tips for Good Designs Using System Generator
16Picking Bits: Why We Do It
- To combine two data buses together to form a new bus
- To force a conversion of data type, including the number of bits and binary point
- To reinterpret unsigned data as signed, or the converse
- To extract certain bits of data, especially when there is bit growth
17The Xilinx Blocks
- Concat
  - Available in the basic elements, data types, and index libraries
- Convert
  - Available in the basic elements, data types, math, and index libraries
- Slice
  - Available in the basic elements, control logic, data types, and index libraries
- Reinterpret
  - Available in the basic elements, math, and index libraries
18The Concat Block
- Performs a concatenation of two bit vectors
- Both inputs must be unsigned integers, i.e., two unsigned numbers with binary points at position zero
- The Reinterpret block provides signed-to-unsigned conversion capabilities that can extend the functionality of the Concat block
- Does not use a Xilinx LogiCORE or hardware resources
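Concatenation of two unsigned integers amounts to a shift and an OR, which is why it costs only wiring. The sketch below is illustrative: `concat` and `to_unsigned_bits` are hypothetical helper names, and the first operand is assumed to land in the most significant bits. The second helper mimics reinterpreting a signed value as unsigned before concatenating it.

```python
def to_unsigned_bits(value, n_bits):
    """Reinterpret a signed integer as its n_bits two's complement bit pattern."""
    return value & ((1 << n_bits) - 1)

def concat(hi, hi_bits, lo, lo_bits):
    """Concatenate two unsigned integers; `hi` occupies the upper bits (assumption)."""
    for value, n_bits in ((hi, hi_bits), (lo, lo_bits)):
        if not 0 <= value < (1 << n_bits):
            raise ValueError("inputs must be unsigned and fit their widths")
    return (hi << lo_bits) | lo, hi_bits + lo_bits

# 3-bit 101 followed by 4-bit 0011 gives the 7-bit vector 1010011
value, width = concat(0b101, 3, 0b0011, 4)
print(value, width)  # 83 7

# A signed -3 in 4 bits has the bit pattern 1101, which can then be concatenated
pattern = to_unsigned_bits(-3, 4)
print(concat(pattern, 4, 0b01, 2))  # (53, 6)
```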
19The Convert Block
- The Xilinx Convert block converts each input sample to a number of a desired arithmetic type
- A number can be converted to a signed (two's complement) or unsigned value
- Total number of bits and binary point are specified by the user
- Overflow and quantization options apply to the output value
- Does not use a Xilinx LogiCORE, but may use additional hardware depending on the overflow and quantization options
20The Convert Block
- What is it doing?
- The user specifies the total number of bits, where the binary point is, and the arithmetic type (signed or unsigned)
- First, it lines up the binary point between the input and output port types
- Next, the total number of bits and binary point the user specifies are applied; depending on the overflow and quantization options selected, the output value may change, as opposed to bits simply being dropped
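The steps above can be sketched as a small behavioral model. This is not the Convert block's actual implementation: the function `convert`, its parameter names, the add-half-LSB rounding rule, and the order of operations (quantize first, then handle overflow) are all assumptions made for illustration.

```python
import math

def convert(value, n_bits, bin_pt, signed=True,
            quantization="truncate", overflow="wrap"):
    """Behavioral sketch of a fixed-point conversion (hypothetical helper)."""
    # Step 1: line up the binary point by expressing the value in output LSBs.
    scaled = value * (1 << bin_pt)

    # Step 2: quantization decides what happens to bits below the output LSB.
    if quantization == "truncate":
        raw = math.floor(scaled)
    elif quantization == "round":
        raw = math.floor(scaled + 0.5)  # assumed rule: add half an LSB, truncate
    else:
        raise ValueError(f"unknown quantization: {quantization}")

    # Step 3: overflow decides what happens to bits above the output MSB.
    lo = -(1 << (n_bits - 1)) if signed else 0
    hi = (1 << (n_bits - 1)) - 1 if signed else (1 << n_bits) - 1
    if raw < lo or raw > hi:
        if overflow == "saturate":
            raw = hi if raw > hi else lo
        elif overflow == "wrap":
            raw &= (1 << n_bits) - 1
            if signed and raw > hi:
                raw -= 1 << n_bits
        elif overflow == "flag":
            raise OverflowError(f"{value} does not fit the output type")
        else:
            raise ValueError(f"unknown overflow: {overflow}")
    return raw / (1 << bin_pt)

print(convert(1.75, 8, 4))                        # 1.75 (fits, value unchanged)
print(convert(5.5, 4, 2, overflow="wrap"))        # 1.5  (upper bits dropped)
print(convert(5.5, 4, 2, overflow="saturate"))    # 1.75 (largest positive value)
print(convert(1.75, 6, 0, quantization="round"))  # 2.0
print(convert(1.75, 6, 0))                        # 1.0  (fraction dropped)
```

With truncate and wrap the model only drops bits; the other options change the value through extra arithmetic, which corresponds to extra hardware.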
21The Convert Block
- Passing the following value through the Convert block would result in the same value, represented with a different number of bits and binary point
22The Convert Block
- Saturating on overflow may change the fractional part of the number in order to reach the saturated value
- Rounding during quantization may also affect the value to the left of the binary point (the whole number)
23The Convert Block
- When we convert to a FIX_6_0, how do we get two different values?
Quantization: Truncate, Round
Overflow: Wrap, Saturate, Flag Error
Input: FIX_10_8
Round (FIX_6_0 output): decimal 2; add 1 to round
Truncate (FIX_6_0 output): decimal 1; drop the bits
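The two results can be traced at the bit level. The sketch below assumes the input value is 1.5, the value the Reinterpret slide uses for its FIX_10_8 example; this slide does not state the input explicitly. It also interprets "add 1 to round" as adding a 1 at the bit position just below the new least significant bit before dropping the fractional bits. Variable names are illustrative.

```python
IN_FRAC_BITS = 8    # FIX_10_8 input: 8 fractional bits
OUT_FRAC_BITS = 0   # FIX_6_0 output: no fractional bits
shift = IN_FRAC_BITS - OUT_FRAC_BITS  # number of bits to discard

raw_in = int(1.5 * (1 << IN_FRAC_BITS))  # assumed input: 1.5 -> 0b0110000000
print(format(raw_in, "010b"))            # 0110000000

# Truncate: drop the 8 fractional bits.
truncated = raw_in >> shift
print(truncated)  # 1

# Round: add a 1 just below the new LSB (half an output LSB), then drop the bits.
rounded = (raw_in + (1 << (shift - 1))) >> shift
print(rounded)    # 2
```

The rounding path needs an adder in front of the bit drop, while truncation is wiring only.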
24The Reinterpret Block
- Forces its output to a new type without any regard for retaining the numerical value represented by the input
- Total number of bits in = total number of bits out
- Allows unsigned data to be reinterpreted as signed data, and the converse
- Also allows scaling of the data through the repositioning of the binary point
- Does not use a Xilinx LogiCORE or hardware resources
25The Reinterpret Block
- Reinterpret the FIX_10_8 number and force the binary point to position 5
Input: 1.5 (FIX_10_8)
Output: 12 (FIX_10_5)
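Reinterpretation leaves the bit pattern untouched and changes only how it is read. A minimal sketch, with hypothetical helper names `to_raw` and `from_raw`, reproduces the example above and also shows an unsigned pattern being read as signed.

```python
def to_raw(value, frac_bits):
    """Raw integer code of a value that is exactly representable."""
    return int(value * (1 << frac_bits))

def from_raw(raw, n_bits, frac_bits, signed):
    """Read an n_bits pattern with a given binary point and signedness."""
    if signed and raw >= (1 << (n_bits - 1)):
        raw -= 1 << n_bits  # two's complement: MSB set means negative
    return raw / (1 << frac_bits)

raw = to_raw(1.5, 8)               # 0b0110000000
print(from_raw(raw, 10, 8, True))  # 1.5  (read as FIX_10_8)
print(from_raw(raw, 10, 5, True))  # 12.0 (same bits read as FIX_10_5)

# Same idea for signedness: one bit pattern, two readings.
pattern = 0b1110000000
print(from_raw(pattern, 10, 8, False))  # 3.5  (UFIX_10_8)
print(from_raw(pattern, 10, 8, True))   # -0.5 (FIX_10_8)
```

Moving the binary point three places to the right scales the value by 8, which is how 1.5 becomes 12.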
26The Slice Block
- The Xilinx Slice block allows you to slice off a sequence of bits from your input data and create a new data value
- The output data type is unsigned with its binary point at zero
27The Slice Block
- Take a slice of the FIX_10_8 number by taking a 4-bit slice and offsetting the bottom bit of the slice by 5 bits
- Upper Bit Location + Width: offset of top bit from MSB = 0 and width = 4
- Two Bit Locations: offset of top bit from MSB of input = -1 and offset of bottom bit from LSB of input = 5
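Bit slicing can be sketched with a shift and a mask. The helpers below are hypothetical and model two ways of specifying the same slice: a bottom offset plus a width, and two bit locations (top offset from the MSB, bottom offset from the LSB). The input value 1.5 is an assumption, reused from the Reinterpret example, since this slide does not state one.

```python
def slice_lower_width(raw, n_bits, bottom_offset, width):
    """Take `width` bits starting `bottom_offset` bits above the LSB."""
    if bottom_offset < 0 or width <= 0 or bottom_offset + width > n_bits:
        raise ValueError("slice falls outside the input word")
    pattern = raw & ((1 << n_bits) - 1)  # work on the unsigned bit pattern
    return (pattern >> bottom_offset) & ((1 << width) - 1)

def slice_two_locations(raw, n_bits, top_offset_from_msb, bottom_offset_from_lsb):
    """Take the bits between two locations; the top offset is 0 or negative."""
    top = (n_bits - 1) + top_offset_from_msb
    width = top - bottom_offset_from_lsb + 1
    return slice_lower_width(raw, n_bits, bottom_offset_from_lsb, width)

raw = int(1.5 * (1 << 8))  # assumed FIX_10_8 input 1.5 -> 0b0110000000

# 4-bit slice whose bottom bit sits 5 bits above the LSB: bits 8..5 = 1100
print(slice_lower_width(raw, 10, 5, 4))     # 12
# Same slice given as top offset -1 from the MSB and bottom offset 5 from the LSB
print(slice_two_locations(raw, 10, -1, 5))  # 12
```

The result is an unsigned integer with its binary point at zero, matching the Slice block's output type.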
28What Values Do You Expect?
- Signed Data
- Truncate and Wrap
Signed Data; Output Binary Point of 3
Total Number of Bits: 3; Bottom Slice Offset by 5 from the LSB
29Outline
- Quantization and Overflow
- The Hardware Costs of System Abstraction
- Lab 4: Looking under the Hood
- Bit Picking
- Tips for Good Designs Using System Generator
30SysGen Design Tips
- Remember that saturation arithmetic and rounding have area and performance costs. Use them only as necessary.