Title: Lab 3: FPGA Implementation
1. Lab 3: FPGA Implementation
- Specification
- RTL design and simulation
- Logic synthesis
- Gate-level simulation
- ASIC layout
- FPGA implementation
2. Why Top-Down?
- Design of complex systems
- Reduce time-to-market
- Shorten the design verification loop
- Focus on functionality
- Easier and cheaper to explore different design options
3. RTL Design
- Characteristics
- fully clock-driven RTL code with some behavioral constructs
- contains a complete functional description
- cycle accurate
- Coding style
- structural description (component connections/netlist)
- data flow description (continuous assignment)
- RTL description (always block)
- combinational RTL
- sequential RTL
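The coding styles listed above can be compared on one small circuit. The following is a minimal sketch, not taken from the slides: the module and port names are hypothetical, and a 2-to-1 multiplexer is assumed purely for illustration.

```verilog
// Hypothetical example: the same 2-to-1 multiplexer in each coding style.

// 1) Structural description: gate primitives connected by nets.
module mux2_structural (input a, input b, input sel, output y);
  wire sel_n, t0, t1;
  not u0 (sel_n, sel);
  and u1 (t0, a, sel_n);
  and u2 (t1, b, sel);
  or  u3 (y, t0, t1);
endmodule

// 2) Data flow description: a continuous assignment.
module mux2_dataflow (input a, input b, input sel, output y);
  assign y = sel ? b : a;
endmodule

// 3a) Combinational RTL: always block sensitive to every input.
module mux2_rtl (input a, input b, input sel, output reg y);
  always @(*) begin
    if (sel) y = b;
    else     y = a;
  end
endmodule

// 3b) Sequential RTL: the mux output is registered on the clock edge.
module mux2_registered (input clk, input a, input b, input sel, output reg q);
  always @(posedge clk)
    q <= sel ? b : a;
endmodule
```

The first three modules describe the same combinational function; only the last one adds storage.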
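4. Logic Synthesis
- Translate synthesizable RTL code into a gate-level design
  // Module header reconstructed: name, port list, and widths are assumptions.
  // The assignment operators were lost in the slide text; "<=" is assumed.
  module sel_example (input clk, input sel1, sel2, sel3, sel4,
                      input in1, in2, in3, in4, output reg out);
    always @(posedge clk) begin
      if (sel1) begin
        if (sel2) out <= in1;
        else      out <= in2;
      end
      else if (sel3) begin
        if (sel4) out <= in3;
        else      out <= in4;
      end
    end
  endmodule
Gate-level circuits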
5. Structural Mapping
6. Resource Sharing
- Example
- if (op_code == 0)
-   r = a + c
- else
-   r = a + b
- Sharing
- a single ALU for the two additions
- a MUX for the second input of the ALU
- No sharing
- two adders for the two additions
- an output MUX to select the output
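The two alternatives above can be written out explicitly. This is a sketch under stated assumptions: the module names, port names, and the 8-bit width are hypothetical, and a synthesis tool may choose either structure on its own regardless of how the source is written.

```verilog
// Sketch only: names and the 8-bit width are assumptions.

// No sharing: two adders, then an output MUX selects the result.
module add_no_sharing (
  input  [7:0] a, b, c,
  input        op_code,
  output [7:0] r
);
  wire [7:0] sum_ac = a + c;
  wire [7:0] sum_ab = a + b;
  assign r = (op_code == 1'b0) ? sum_ac : sum_ab;
endmodule

// Sharing: a MUX picks the second operand, then a single adder is used.
module add_sharing (
  input  [7:0] a, b, c,
  input        op_code,
  output [7:0] r
);
  wire [7:0] operand = (op_code == 1'b0) ? c : b;
  assign r = a + operand;
endmodule
```

Both modules compute the same result; the shared version trades one adder for a MUX in front of the remaining adder.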
7. Register Inferencing
- Determines which signals must be preserved across cycle boundaries
- incomplete logic specification (missing branches)
- explicit register instantiation
- always @(posedge clk)
- signal used before assigned
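A short illustration of the first and third cases above may help. The modules below are hypothetical examples, not part of the lab: the first leaves a branch unspecified, the second completes it, and the third uses an edge-triggered always block.

```verilog
// Illustrative modules; all names are hypothetical.

// Missing else branch: q must hold its value when en is 0,
// so synthesis infers a level-sensitive latch.
module infer_latch (input en, input d, output reg q);
  always @(*) begin
    if (en) q = d;
  end
endmodule

// Complete specification: every branch assigns q, so the result
// is purely combinational logic with no storage.
module no_storage (input en, input d, output reg q);
  always @(*) begin
    if (en) q = d;
    else    q = 1'b0;
  end
endmodule

// Edge-triggered always block: q is inferred as a flip-flop.
module infer_ff (input clk, input d, output reg q);
  always @(posedge clk)
    q <= d;
endmodule
```

Unintended latches from missing branches are a common source of synthesis warnings, which is why combinational always blocks usually assign every output on every path.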
8. Two-level Logic Optimization
- AND-OR representations
- easy implementation as PLAs and PLDs
- a key optimization technique
- efficient algorithms and heuristics exist
- in commercial use for several years
- minimize the number of product terms
- Example
- F XYZ XYZ XYZ XYZ XYZ
- F XY YZ
9. Multi-Level Logic Optimization
- Meet performance or area constraints through restructuring and simplifications
- two-level minimization
- common factor extraction
- common expression re-substitution
- Trade-off between area and delay
- In commercial use for several years
- f1 abcdabceabcdabcdaccdfabcdeabcdf
- f2 bdg bdfg bdgbdeg
- f1 c(ax)acx
- f2 gx
- x d(bf) d(be)
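10. Transformation Examples
- Algebraic Factoring (X' denotes the complement of X)
- F = A'C'D' + A'BC' + ABC + ACD' (G = 16)
- Factoring
- F = A'(C'D' + BC') + A(BC + CD') (G = 16)
- Factoring again
- F = A'C'(B + D') + AC(B + D') (G = 12)
- Factoring again
- F = (A'C' + AC)(B + D') (G = 10)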
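The factoring steps can be checked exhaustively in simulation. The sketch below assumes the overbars lost from the slide text are restored as F = A'C'D' + A'BC' + ABC + ACD' for the starting form and F = (A'C' + AC)(B + D') for the final form; that reconstruction, the module name, and the signal names are all assumptions.

```verilog
// Simulation-only check that the SOP form and the fully factored form agree.
// The complemented literals are an assumed reconstruction of the slide.
module factoring_check;
  reg  [3:0] v;                       // v = {A, B, C, D}
  wire A = v[3], B = v[2], C = v[1], D = v[0];

  wire f_sop      = (~A & ~C & ~D) | (~A & B & ~C) | (A & B & C) | (A & C & ~D);
  wire f_factored = ((~A & ~C) | (A & C)) & (B | ~D);

  integer i, mismatches;
  initial begin
    mismatches = 0;
    for (i = 0; i < 16; i = i + 1) begin
      v = i;                          // low 4 bits of i select the input row
      #1;                             // let the continuous assignments settle
      if (f_sop !== f_factored) mismatches = mismatches + 1;
    end
    $display("mismatches = %0d", mismatches);
    $finish;
  end
endmodule
```

Because factoring only regroups terms, the two expressions should agree on all 16 input combinations; only the gate input count G changes.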
11. Transformation Examples
- Decomposition
- The terms B + D' and A'C' + AC can be defined as new functions E and H respectively, decomposing F
- F = E H, E = B + D', and H = A'C' + AC (G = 10)
- This series of transformations has reduced G from 16 to 10, a substantial savings. The resulting circuit has three levels plus input inverters.
12. Transformation Examples
- Substitution of E into F
- Returning to F just before the final factoring step
- F = A'C'(B + D') + AC(B + D') (G = 12)
- Defining E = B + D', and substituting in F
- F = A'C'E + ACE (G = 10)
- This substitution has resulted in the same cost as the decomposition
13. Transformation Examples
- Elimination
- Beginning with a new set of functions
- X = B + C
- Y = A + B
- Z = A'X + CY (G = 10)
- Eliminating X and Y from Z
- Z = A'(B + C) + C(A + B) (G = 10)
- Flattening (converting to SOP expression)
- Z = A'B + A'C + AC + BC (G = 12)
- This has increased the cost, but has provided a new SOP expression for two-level optimization.
14. Transformation Examples
- Two-level Optimization
- The result of 2-level optimization is
- Z = A'B + C (G = 4)
- This example illustrates that
- Optimization can begin with any set of equations, not just with minterms or a truth table
- Increasing gate input count G temporarily during a series of transformations can result in a final solution with a smaller G
15. Transformation Examples
- Extraction
- Beginning with two functions
- E = A'B'D' + A'BD
- H = B'CD' + BCD (G = 16)
- Finding a common factor and defining it as a function
- F = B'D' + BD
- We perform extraction by expressing E and H as the three functions
- F = B'D' + BD, E = A'F, H = CF (G = 10)
- The reduced cost G results from the sharing of logic between the two output functions
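The extraction result maps directly onto shared logic in HDL. The sketch below assumes the lost overbars are restored as F = B'D' + BD, E = A'F, and H = CF; that reconstruction and the module name are assumptions.

```verilog
// Sketch of the extracted form: F is computed once and used by both outputs.
// The complemented literals are an assumed reconstruction of the slide.
module extraction_example (input A, B, C, D, output E, H);
  wire F = (~B & ~D) | (B & D);   // common factor shared by E and H
  assign E = ~A & F;
  assign H = C & F;
endmodule
```

Writing F as a named wire mirrors the extraction step, although a synthesis tool is free to restructure the logic again during optimization.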
16. Technology Mapping
- Translation of a technology-independent representation of a circuit into a circuit in a given technology with optimal cost
- Optimization criteria
- minimum area
- minimum delay
- meeting specified timing constraints
- meeting specified timing constraints with minimum area
- Usages
- Technology mapping after technology-independent logic optimization
17. Sample covers
18. State Machine Synthesis
- Translate state table or graph
- state minimization
- state assignment to minimize the cost function
- Challenges
- state machine decomposition
- state assignment for performance
- state assignment for testability
- extract state graph from implementation
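State assignment becomes concrete once a state graph is written as RTL. The following is a minimal sketch with a hypothetical three-state machine that flags two consecutive 1s on an input; the module, its ports, and the binary encoding are assumptions chosen for illustration.

```verilog
// Hypothetical 3-state machine: asserts found after two consecutive 1s on din.
module seq_detect (input clk, input rst, input din, output found);
  // State assignment: binary encoding. A one-hot assignment would use
  // 3'b001, 3'b010, 3'b100 and a 3-bit state register instead.
  localparam [1:0] IDLE = 2'b00, GOT1 = 2'b01, GOT11 = 2'b10;

  reg [1:0] state, next_state;

  // State register (sequential RTL) with synchronous reset.
  always @(posedge clk) begin
    if (rst) state <= IDLE;
    else     state <= next_state;
  end

  // Next-state logic (combinational RTL).
  always @(*) begin
    case (state)
      IDLE:    next_state = din ? GOT1  : IDLE;
      GOT1:    next_state = din ? GOT11 : IDLE;
      GOT11:   next_state = din ? GOT11 : IDLE;
      default: next_state = IDLE;
    endcase
  end

  assign found = (state == GOT11);
endmodule
```

Changing only the localparam values changes the state assignment, and with it the cost of the next-state and output logic, without touching the state graph itself.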
19. Spartan-II Features
- Plentiful logic and memory resources
- 15K to 200K system gates (up to 5,292 logic cells)
- Up to 57 Kb block RAM storage
- Flexible I/O interfaces
- From 86 to 284 I/Os
- 16 signal standards
- Advanced 0.25/0.22 µm 6-layer metal process
- High performance
- System frequency as high as 200 MHz
- Advanced clock control with 4 dedicated DLLs
- Unlimited re-programmability
- Fully PCI compliant
20. Spartan-II Top-level Architecture
- Configurable logic blocks
- Implement logic here!
- I/O blocks
- Communicate with other chips
- Choose from 16 signal standards
- Block RAM
- On-chip memory for higher performance
21. Spartan-II Top-level Architecture
- Clocks and delay locked loops
- Synchronize to clock on and off chip
- Rich interconnect resources
- Three-state internal buses
- Power down mode
- Lower quiescent power
22. CLB Slice (Simplified)
- 1 CLB holds 2 slices
- Each slice contains two sets of the following
- Four-input LUT
- Any 4-input logic function
- Or 16-bit x 1 RAM
- Or 16-bit shift register
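A four-input LUT can implement any 4-input function because it is simply a 16-entry truth table addressed by the inputs. The behavioral model below is a sketch for intuition only: the module name and the INIT parameter are hypothetical and it is not a vendor primitive.

```verilog
// Behavioral model of a 4-input LUT: a 16-bit truth table indexed by the inputs.
// Hypothetical module for illustration, not a library cell.
module lut4_model #(parameter [15:0] INIT = 16'h8000) (
  input  [3:0] in,
  output       out
);
  wire [15:0] truth = INIT;   // one stored bit per input combination
  assign out = truth[in];     // the inputs act as the read address
endmodule
```

With the default value 16'h8000 only entry 15 is 1, so the model behaves as a 4-input AND; any other 4-input function is obtained by changing the 16 stored bits, which is also why the same structure can serve as a 16 x 1 RAM.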
23. CLB Slice (cont'd)
- Each slice contains two sets of the following
- Carry control
- Fast arithmetic logic
- Multiplier logic
- Multiplexer logic
- Storage element
- Latch or flip-flop
- Set and reset
- True or inverted inputs
- Sync. or async. control
24. Dedicated Expansion Multiplexers
- MUXF5 combines 2 LUTs to form
- 4x1 multiplexer
- Or any 5-input function
- MUXF6 combines 2 slices to form
- 8x1 multiplexer
- Or any 6-input function
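The 4x1 multiplexer case shows why two LUTs plus MUXF5 suffice. In the sketch below the 4-to-1 multiplexer is split by hand into the pieces that would fit each resource; the module and signal names are hypothetical, and the actual mapping is decided by the implementation tools.

```verilog
// A 4-to-1 multiplexer written so the LUT/MUXF5 split is visible.
// Names are hypothetical; the tools decide the real mapping.
module mux4_decomposed (input [3:0] d, input [1:0] sel, output y);
  // Each 2-to-1 mux has three inputs, so it fits in one 4-input LUT.
  wire lo = sel[0] ? d[1] : d[0];
  wire hi = sel[0] ? d[3] : d[2];
  // The final select plays the role of MUXF5.
  assign y = sel[1] ? hi : lo;
endmodule
```

The same idea repeats one level up: two such 4-to-1 structures in neighboring slices, combined by MUXF6, give an 8-to-1 multiplexer.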
25. I/O Block (Simplified)
- Registered input, output, 3-state control
- Programmable slew rate, pull-up, pull-down, keeper and input delay
26. I/O Interface Standards
- I/O can be programmed for 16 different signal standards
- VCCO controls maximum output swing
- VREF sets the input switching threshold
- Different banks can support different standards at the same time
- Logic level translation
- Boards with mixed standards
27. IOBs Organized As Independent Banks
- As many as eight banks on a device
- Package dependent
- Each bank can be assigned any of the 16 signal standards
- XC2S50
- GCK 0 pin 80
- GCK 1 pin 77
- GCK 2 pin 182
- GCK 3 pin 185
28. High Performance Routing
- Hierarchical routing
- Singles, hexes, longs
- Sparse connections on longer interconnects for high speed
- Routing delay depends primarily on distance
- Direction independent
- Device-size independent
- Predictable for early design analysis
29. Power-down Mode
- Controlled by single power down pin
- All inputs blocked, appear low internally
- All outputs disabled
- All register states preserved
- Power-down status pin
- Synchronous wake up
- 100 uA typical
30. Configuration Modes
There are four ways to program a Spartan-II FPGA
31. Spartan-II Family Overview
32. Spartan-II Architecture Summary
- Delivers all the key requirements for ASIC replacement
- 200,000 gates
- 200 MHz
- Flexible I/O interfaces
- On-chip distributed and block RAM
- Clock management
- Low power
- Complete development system support
33. Xilinx ISE 8
- Integrated Software Environment
34. Foundation Project Manager
- Integrates all tools into one environment
35. Schematic Entry
36. State Machine Graphical Editor
- Graphical editor synthesizes into ABEL or VHDL code
37. Simulation - Easy to Use and Learn
- Generate stimulus easily and quickly
- Keyboard toggling
- Simple clock stimulus
- Custom formulas
- Easy debugging
- Waveform viewer
- Signals easily added and removed
- Simulator access from schematic
- Color-coded values on schematic
- Script Editor
38. What is Implementation?
- More than just Place & Route
- Implementation includes many phases
- Translate: Merge multiple design files into a single netlist
- Map: Group logical symbols from the netlist (gates) into physical components (CLBs and IOBs)
- Place & Route: Place components onto the chip, connect them, and extract timing data into reports
- Timing (Sim): Generate a back-annotated netlist for timing simulation tools
- Configure: Generate a bitstream for device configuration
39. Terminology
- Project
- Source file has a defined working directory and family
- Version
- A Xilinx netlist translation of the schematic
- Multiple versions result from iterative schematic changes
- Revision
- An implementation of a Xilinx netlist
- Multiple revisions typically result from different options
- Part type
- Specified at translation; can be changed in a new revision
40. Starting the Flow Engine
Foundation Project Manager
41. LP-2900-XC2S50PQ208
42. FPGA XC2S50
43. Data Switches
44. 7-segment LED
45. Keyboard
46. 8x8 LED
47. 8051
48. Lab 3: 7-Segment Display LED
- Input two 4-bit numbers
- num1, num2
- (push buttons sw1-sw8)
- Show the numbers on the 7-segment display (active high)
- Compare num1 and num2
- Use LEDs to show the result
- (LED4 = 1 when num1 > num2
- LED6 = 1 when num1 = num2
- LED8 = 1 when num1 < num2)
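The comparison part of the lab can be sketched in a few lines. This is one possible starting point, not the required solution: the module name and port names are hypothetical, and the mapping of ports to the board's switches and LEDs must come from the lab's own pin assignments.

```verilog
// Sketch of the comparison logic; names are hypothetical and the pin
// assignments for the switches and LEDs are not shown.
module lab3_compare (
  input  [3:0] num1, num2,
  output       led4, led6, led8
);
  assign led4 = (num1 >  num2);   // LED4 = 1 when num1 > num2
  assign led6 = (num1 == num2);   // LED6 = 1 when num1 = num2
  assign led8 = (num1 <  num2);   // LED8 = 1 when num1 < num2
endmodule
```

Exactly one of the three outputs is high for any pair of inputs, which gives a quick sanity check in simulation.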
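49. (Block diagram) A 7485 4-bit comparator with cascade inputs agb, alb, aeb = 3'b001 and outputs agbo, albo, aebo; SW1-4 and SW5-8 each drive a 7-segment decoder (7-seg dec.)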
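Each 7-segment decoder in the diagram turns a 4-bit switch value into seven segment signals. The sketch below assumes active-high segments ordered {a, b, c, d, e, f, g} with a as the most significant bit and shows hexadecimal digits for values 10 to 15; the module name, the segment ordering, and the hex glyphs are assumptions that should be checked against the board documentation.

```verilog
// Active-high hex to 7-segment decoder.
// Assumed segment order: seg = {a, b, c, d, e, f, g}, a is seg[6].
module seven_seg_dec (input [3:0] value, output reg [6:0] seg);
  always @(*) begin
    case (value)
      4'h0: seg = 7'b1111110;
      4'h1: seg = 7'b0110000;
      4'h2: seg = 7'b1101101;
      4'h3: seg = 7'b1111001;
      4'h4: seg = 7'b0110011;
      4'h5: seg = 7'b1011011;
      4'h6: seg = 7'b1011111;
      4'h7: seg = 7'b1110000;
      4'h8: seg = 7'b1111111;
      4'h9: seg = 7'b1111011;
      4'hA: seg = 7'b1110111;
      4'hB: seg = 7'b0011111;
      4'hC: seg = 7'b1001110;
      4'hD: seg = 7'b0111101;
      4'hE: seg = 7'b1001111;
      4'hF: seg = 7'b1000111;
      default: seg = 7'b0000000;  // blank for unknown inputs in simulation
    endcase
  end
endmodule
```

Two instances, one fed by SW1-4 and one by SW5-8, would cover both numbers; if the display were active low, each pattern would simply be inverted.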
50. Example