Title: Computer-Aided System Design: FPGA
1Computer-Aided System Design: FPGA
- Prof. Chien-Mo Li
- Graduate Institute of Electronics Engineering
- National Taiwan University
2Where We Are in CASD Flow
specifications
behavior.v
RTL Coding Verilog-XL
FPGA Compiler Quartus
rtl.v
Synthesis Design Compiler
gate.v
FPGA Implementation
Dft Insertion dft compiler
gate_scan.v
Place and route Silicon Ensemble/ Apollo
.gds2
Tape out
3Outline
- Introduction
- FPGA Structure
- Design Verification
- Conclusions
4What is FPGA?
- Field Programmable Gate Array
- i.e. blank IC
- FPGA usually consists of
- Programmable logic
- Programmable interconnect
- Programmable IO
- Functions can be configured
- Electrically in the field
Example configuration
5Evolution of Programmable Devices
- PROM (Programmable ROM)
- Logic implemented by ROM
- EPROM erasable
- EEPROM electrically erasable
- PLD (Programmable Logic Device)
- AND-OR expression
- PLA both AND OR planes programmable
- PAL only AND plane programmable OR plane fixed
- MPGA (Mask Programmable Gate Array)
- Fixed transistors; interconnect designed by the user
- Not field programmable
- FPGA (Field Programmable Gate Array)
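The AND-OR structure of a PLD can be sketched in a few lines of Python. This is a minimal illustration, not a model of any real device: the function names and the example programming (XOR and AND of two inputs) are assumptions made for the sketch. In a PLA both `TERMS` and `CONNECTIONS` would be programmable; in a PAL only `TERMS` would be, with `CONNECTIONS` fixed by the manufacturer.

```python
from typing import List, Optional, Sequence

# One product term: per input, 1 = true literal, 0 = complemented literal,
# None = input not connected to this AND gate.
Term = Sequence[Optional[int]]


def and_plane(inputs: Sequence[int], terms: Sequence[Term]) -> List[int]:
    """Evaluate every programmed product term (one AND gate per term)."""
    return [
        int(all(lit is None or bit == lit for bit, lit in zip(inputs, term)))
        for term in terms
    ]


def or_plane(products: Sequence[int], connections: Sequence[Sequence[int]]) -> List[int]:
    """OR together the product terms selected for each output."""
    return [
        int(any(p and c for p, c in zip(products, row)))
        for row in connections
    ]


def pla(inputs, terms, connections):
    """AND plane followed by OR plane."""
    return or_plane(and_plane(inputs, terms), connections)


# Hypothetical programming for inputs (a, b): out0 = a XOR b, out1 = a AND b.
TERMS = [(1, 0), (0, 1), (1, 1)]        # a.b', a'.b, a.b
CONNECTIONS = [(1, 1, 0), (0, 0, 1)]    # which terms feed each OR gate
```

Calling `pla((0, 1), TERMS, CONNECTIONS)` evaluates both outputs for a = 0, b = 1.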
6Categories of FPGA
- Classified by way of configuration
- SRAM based
- Volatile configuration (lost when power is off)
- Need external support to download configuration data
- Easier to reconfigure
- Antifuse
- Nonvolatile configuration
- Faster than SRAM based
- No reconfiguration
- EEPROM
- EPROM
7Major FPGA Providers
- Xilinx since 1984
- SRAM based FPGA leader
- Altera since 1983
- Atmel ? Actel since 1984
- Antifuse
8Why FPGA
- Design Verification
- Emulation usually runs faster than simulators
- FPGA better than ASIC in certain applications
- E.g. network communication chips
- Protocols changed by downloading a new FPGA configuration
- FPGA costs less when volume is low
- Penalty of FPGA
- Performance not as good as ASIC
- Silicon area utilization not as high as ASIC
[Figure: cost versus volume for FPGA and ASIC]
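The cost-versus-volume trade-off can be made concrete with a simple linear cost model. This is a sketch under stated assumptions: total cost is taken as a one-time (NRE) cost plus a per-unit cost times volume, and every number in the usage note is hypothetical, chosen only to show where the two curves cross.

```python
def total_cost(nre: float, unit_cost: float, volume: int) -> float:
    """Assumed linear model: one-time cost plus per-unit cost times volume."""
    return nre + unit_cost * volume


def break_even_volume(asic_nre: float, asic_unit: float,
                      fpga_nre: float, fpga_unit: float) -> float:
    """Volume at which both totals are equal.

    Only meaningful when the FPGA has the higher unit cost and the ASIC
    has the higher one-time cost, which is the case the slide describes.
    """
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)
```

With hypothetical figures of 1,000,000 one-time cost and 5 per unit for the ASIC, and no one-time cost and 55 per unit for the FPGA, `break_even_volume(1_000_000, 5, 0, 55)` gives 20,000 units. Below that volume the FPGA is cheaper in this model; above it the ASIC is.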
9Performance Comparison
- Full Custom ASIC Design
- Semi Custom ASIC Design
- Standard Cell
- Mask Programmable Gate Array
- FPGA
- General purpose processor software
Top of the list: high performance, high design effort
Bottom of the list: high programmability, fast design cycle
10CIC Flow
Full Custom IC
Cell-Based IC
FPGA
(SPW, Ansoft)
System Design
Verilog
Functional Sim/Ver
Synopsys
Logic Syn/Ver
Altera Xilinx
Circuit Sim/Ver
TimeMill, PowerMill
Physical Design
Dracula
SE, Apollo
11Outline
- Introduction
- FPGA Structure
- Xilinx
- Altera
- Design Verification
- Conclusions
12Xilinx FPGA Architecture
- Xilinx FPGA consists of
- N x N Configurable Logic Block (CLB)
- Programmable I/O Blocks (IOB)
- programmable interconnect network
[Figure: 3 x 3 array of CLBs surrounded by IOBs on all four sides]
13What is a CLB?
- CLB can implement different logic functions
- CLB contains
- Look-Up Table (LUT)
- Small RAM, writable only during configuration
- Mux, DFF
- A simplified CLB example
[Figure: simplified CLB with a LUT on inputs F0 and F1, input Cin, two MUXes, a DFF, and output Out; C = configuration bits]
14A CLB Configuration Example
- AND gate followed by DFF
- LUT content 0,0,0,1
[Figure: AND gate of F0 and F1 followed by a DFF, and the corresponding CLB configuration]
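The configuration example above (LUT content 0,0,0,1 followed by a DFF) can be mimicked with a small behavioral model. This is a simplified sketch, not the actual CLB circuit: the class name, the single `use_dff` configuration bit, and the omission of the Cin input are assumptions made to keep the example short.

```python
class SimpleCLB:
    """2-input LUT with an optional D flip-flop on its output (simplified)."""

    def __init__(self, lut_bits, use_dff):
        assert len(lut_bits) == 4
        self.lut = list(lut_bits)  # configuration bits, addressed by (F1, F0)
        self.use_dff = use_dff     # configuration bit of the output MUX
        self.q = 0                 # current DFF state

    def lut_out(self, f1, f0):
        """Combinational LUT output: a plain memory read."""
        return self.lut[(f1 << 1) | f0]

    def clock(self, f1, f0):
        """Rising clock edge: the DFF captures the LUT output."""
        self.q = self.lut_out(f1, f0)

    def out(self, f1, f0):
        """Out pin: registered value if the DFF is selected, else the LUT."""
        return self.q if self.use_dff else self.lut_out(f1, f0)
```

With `SimpleCLB([0, 0, 0, 1], use_dff=True)`, the output stays at the old DFF value until `clock(1, 1)` is called, after which it reads 1 regardless of the current inputs until the next clock edge.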
15Interconnect Network
- Wires (aka. Interconnect)
- Connection Box (CB), aka IO MUX
- Switch Box (SB), aka Switch Matrix (SM) or Programmable Switch Box (PSB)
[Figure: array of CLBs with CBs between them, SBs at the wire intersections, and IOBs around the edge]
16What is CB?
- CB controls inputs and outputs of CLB
- Consists of configurable MUXes and DeMUXes
- A simplified example
[Figure: CBs with configurable MUXes (configuration bits C) on the F0, F1, and Cin inputs and the Out output of a CLB]
17What is SB?
- Switches wires from four directions: N, E, W, S
- Consists of programmable switches
[Figure: switch box with North, East, West, and South ports; C = configuration bit]
18PIP
- Programmable Interconnect Points (PIPs, Xilinx)
- Aka Configurable Interconnect Points (CIPs, Lucent)
- Consist of one configuration bit and a transistor
- Open PIP: C = 0
- Closed PIP: C = 1
- Basic component of programmable switches
[Figure: PIP transistor controlled by configuration bit C, shown for C = 0 and C = 1]
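To see how PIPs add up to a programmable switch, the sketch below places one PIP between every pair of the four switch-box directions and reports which wires end up electrically connected for a given configuration. The one-PIP-per-pair arrangement and all names are assumptions made for illustration; real switch boxes use other topologies.

```python
from itertools import combinations

DIRECTIONS = ("N", "E", "W", "S")
PAIRS = list(combinations(DIRECTIONS, 2))  # six candidate PIP locations


def connected_groups(config):
    """Return the sets of directions joined by closed PIPs.

    config maps a pair such as ("N", "S") to its configuration bit C;
    C = 1 closes the PIP, and missing pairs default to open (C = 0).
    """
    groups = {d: {d} for d in DIRECTIONS}
    for a, b in PAIRS:
        if config.get((a, b), 0) == 1:
            merged = groups[a] | groups[b]
            for d in merged:
                groups[d] = merged
    return {frozenset(g) for g in groups.values()}
```

For example, `connected_groups({("N", "S"): 1})` joins North and South and leaves East and West isolated, which corresponds to a straight vertical connection through the switch box.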
19Common Programmable Switches
- Break points
- Cross points
- MUX
- All-direction (draw it by yourself!)
[Figure: structure and notation of each switch type, drawn between wires A and B with configuration bit C]
20Wires
- Many different types of wires, from all directions
- Buffered, unbuffered
- Unidirectional, bidirectional
- Xilinx example
- Direct lines from CLB to CLB
- from SB to SB
- Single lines connect adjacent SB
- Hex lines connect SBs 3 or 6 blocks apart
- Long lines connect SBs 6 or 12 blocks apart
- Global lines
- Clock, reset
21Wires
[Figure: wire types around a CB: horizontal single lines (to next SB), CLB-to-CLB direct lines (to next CLB), horizontal long lines, vertical long lines, vertical single lines]
22Signal Routing
- (1) Take direct route from CLB to CLB: very fast, less flexible
- (2) Take single lines via SB switching: flexible
- (3) Take long lines: long distance, less flexible
[Figure: routes (1), (2), and (3) drawn between two CBs]
23What is in an IOB?
- Pull-up, pull-down (for floating input pin)
- Tri-state buffers, FF, MUX
- A simplified example
[Figure: simplified IOB: PAD with output enable and output data paths, D FFs, a delay element, and a MUX selecting direct or registered input data; C = configuration bits]
24Detailed FPGA Flow
specifications
RTL Coding Verilog-XL
rtl.v
FPGA Compiler
Configuration bit stream
Configure FPGA
Configured FPGA
Run testbench (emulation)
Emulation results
25Compilation
- Translate RTL code into
- FPGA configuration bits
- Includes
- Logic optimization, minimization
- Mapping to FPGA logic blocks
- Placement
- Routing
- FPGA compilation = synthesis + place and route in cell-based ASIC
- Compilation software usually provided by FPGA vendor
- Very FPGA structure dependent
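The "mapping to FPGA logic blocks" step ultimately turns each small Boolean function into LUT contents. A minimal sketch of that idea follows; the helper name is hypothetical, and real compilers also perform optimization, packing, placement, and routing, none of which is shown here.

```python
from itertools import product


def lut_contents(func, num_inputs):
    """Enumerate all input combinations and record the function value.

    The result is the list of 2**num_inputs configuration bits, ordered
    by input address 00...0 up to 11...1.
    """
    return [int(bool(func(*bits))) for bits in product((0, 1), repeat=num_inputs)]


# Two-input examples: the AND function reproduces the LUT content 0,0,0,1.
AND_LUT = lut_contents(lambda a, b: a and b, 2)
XOR_LUT = lut_contents(lambda a, b: a != b, 2)
```

Because a LUT simply stores the truth table, any function of its inputs costs the same area and delay; only the stored bits differ.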
26Compilation Results
- Highlighted wires are signal paths
[Figure: compiled design with highlighted signal paths through CLBs and SBs]
27Configure FPGA
- FPGA in configure mode
- Shift configuration bit streams into FPGA
- Via long scan chains
- IOB, CLB, CB, SB,
[Figure: bit stream ...010 shifted through a chain of IOBs and CLBs]
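Shifting a bit stream through a scan chain can be modeled as a plain shift register. This is a minimal sketch with a hypothetical function name; it ignores how a real device frames, addresses, or checks its configuration data.

```python
def shift_in(chain_length, bitstream):
    """Shift bits into a scan chain, one bit per clock.

    Position 0 is the scan input end. On every clock each stored bit
    moves one position further along and the new bit enters at position 0.
    """
    chain = [0] * chain_length
    for bit in bitstream:
        chain = [bit] + chain[:-1]
    return chain
```

After exactly `chain_length` clocks the first bit sent sits at the far end of the chain, so `shift_in(4, [1, 1, 0, 1])` returns `[1, 0, 1, 1]`. The bit stream therefore has to be ordered so that each configuration cell receives its own bit once shifting stops.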
28Run Test Bench on Your Design
- FPGA in function mode
- Stimulus provided by host computer
- Results observed by host computer
FPGA Board
29Outline
- Introduction
- FPGA Structure
- Xilinx
- Altera
- Design Verification
- Conclusions
30Altera FPGA Architecture
- Consists of
- Logic Array Blocks (LAB)
- IO Element (IOE)
- Interconnect
31Altera FLEX8000
32Logic Element (LE)
- One LAB contains 8 LE
- LUT, DFF, MUX
- Carry chain provides carry signal propagation
33IOE
34Interconnect
- Row and column routers
- named FastTrack
- Once signal on FastTrack
- Accessible to all LE along the same track
- Constant delay
- Major difference between Xilinx and Altera
- Xilinx: segmented routing
- More flexible routing
- Altera: hierarchical routing
- Constant signal delay
35Outline
- Introduction
- Xilinx FPGA Structure
- Design Verification
- Conclusions
36Verification
- What is verification?
- Check if design conforms to its specifications
- Why verification?
- Guarantee the correct operation
- Cost of bug >> cost of verification
- Case examples
- Pentium division bug
- 475 million USD (EE Times, 97)
- Toshiba floppy disk controller
- 2 billion USD (PC World, 99)
Verification is expensive, but bugs are much more expensive!
37How Many Potential Bugs?
- Assume
- 1 error in every 10K lines of RTL code
- 1 line of RTL code ≈ 10 transistors
- 10M transistor chip (now)
- 100 bugs per chip
- 100M transistor chip (2007)
- 1,000 bugs per chip
- Potential bugs are more than you can imagine
- Designers spend more time on debug/verification than design
- Estimate 30% design, 70% debug/verification
- Do you call yourself a design engineer?
- or verification engineer?
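The bug estimate above follows directly from the two stated assumptions (1 error per 10K lines of RTL, about 10 transistors per line). The short calculation below reproduces it; the function name is only for illustration.

```python
def estimated_bugs(transistors, transistors_per_line=10, lines_per_bug=10_000):
    """Potential bugs = (transistors / transistors per line) / lines per bug."""
    rtl_lines = transistors // transistors_per_line
    return rtl_lines // lines_per_bug


bugs_10m = estimated_bugs(10_000_000)    # 1M lines of RTL -> 100 bugs
bugs_100m = estimated_bugs(100_000_000)  # 10M lines of RTL -> 1,000 bugs
```

The estimate scales linearly with transistor count, so a tenfold larger chip carries ten times as many potential bugs under these assumptions.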
38Verification Techniques
- Dynamic verification
- Exercising a model of a design with a set of verification stimuli
- e.g. Simulations
- Static verification
- Exploit formal mathematical techniques to verify a design without use of stimuli
- e.g.
- Formal Verification
- Static timing analysis
- Emulation
- Exercise a hardware implementation (usually a Field Programmable Gate Array, FPGA) of a design with a set of verification stimuli
- e.g.
- Hardware accelerator
- Rapid prototyping system
39Simulation vs. Emulation
[Figure: Simulator (stimuli and control in, responses out, operating on a model) compared with Emulator (stimuli and control in, responses out)]
40Verification by Simulations
specifications
behavior.v
Simulation Verilog-XL
RTL Coding Verilog-XL
rtl.v
Synthesis Design Compiler
Simulation Verilog-XL
Pre-layout simulation
gate.v
.sdf
Dft Insertion dft compiler
Simulation Verilog-XL
gate_scan.v
.sdf
Place and route Silicon Ensemble/ Apollo
Simulation Verilog-XL
post-layout simulation
RC Dracula
.sdf
Tape out
41Verification by FPGA
specifications
behavior.v
RTL Coding Verilog-XL
FPGA Compiler Quartus
rtl.v
Synthesis Design Compiler
gate.v
FPGA Implementation
Dft Insertion dft compiler
gate_scan.v
Place and route Silicon Ensemble/ Apollo
.gds2
Tape out
42Emulation
- Hardware accelerator
- DUT implemented by programmable hardware (FPGA)
- Stimuli provided by host computer
- Every internal node accessible
- Just like a very fast simulator for user
- Can be a stand-alone system or a board plugged into host computers
Source: Axis Corp.
43Emulation (cont'd)
- Rapid prototyping (aka. In-circuit emulation)
- DUT implemented by programmable hardware
- can be directly plugged in system
- Sometimes at-speed emulation possible
Source: Axis Corp.
44Comparison of Verification Techniques
- Circuit Simulations
- Very accurate results
- Very slow even for medium size circuit
- Logic Simulations
- Verifies functions as well as timing
- Slow for a large chip
- What input stimuli to apply?
- RTL/ behavior Simulations
- fast
- Does not guarantee actual silicon behavior
- Emulation
- Fast
- expensive
45Comparison of Verification Techniques
- Static Timing Analysis
- Faster than logic simulation
- No input stimuli needed
- Can report false path
- Formal Verification
- No input stimuli needed
- Guarantee correctness
- Detects high quality bugs
- Computation complexity high
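Static timing analysis needs no stimuli because, at its core, it is a longest-path computation over the circuit graph. The sketch below illustrates that idea on a hypothetical netlist; the function name, node names, and delay values are assumptions, and real tools also handle clocks, setup and hold checks, and wire delays. Because the computation never checks whether a path can actually be sensitized, it can report false paths.

```python
def longest_path_delay(delays, fanin):
    """Worst-case arrival time over all topological paths.

    delays maps each node to its own delay; fanin maps a node to the
    list of nodes driving it (primary inputs have no entry).
    """
    arrival = {}

    def arrive(node):
        if node not in arrival:
            latest_input = max((arrive(p) for p in fanin.get(node, [])), default=0)
            arrival[node] = latest_input + delays[node]
        return arrival[node]

    return max(arrive(node) for node in delays)


# Hypothetical circuit: inputs a, b; gates g1, g2, g3 with made-up delays.
DELAYS = {"a": 0, "b": 0, "g1": 2, "g2": 3, "g3": 1}
FANIN = {"g1": ["a", "b"], "g2": ["g1", "b"], "g3": ["b"]}
```

Here the critical path runs through g1 and then g2, so `longest_path_delay(DELAYS, FANIN)` is the sum of those two gate delays.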
46Current Practice of Verification
- Behavior simulation
- Used in system level simulations
- RTL Simulation
- Used in chip level or IP level simulations
- Logic Simulation
- Used in block level simulation
- Verify block functions as well as timing
- Circuit Simulation
- Only used in critical paths, customized blocks
- Static timing analysis
- Used in block level
- Make sure the true critical path meets specification
- Formal verification
- Used in block level to find unexpected bugs
- Emulation
- Used in chip or IP level verification
- Suitable for expensive chips or IP, like processors
47Appendix
48Xilinx XC3000 CLB
- 6 inputs (A, B, C, D, E, Din), two outputs (X, Y)
- One LUT, 2 FFs
From Xilinx databook
49Xilinx XC4000 CLB
50Xilinx Virtex CLB
51Xilinx XC3000 IOB
From Xilinx Databook
52Xilinx XC 4000 IOB
53- Thick line PIP on ? no conduction
- Thin line PIP off ? conduction