Title: ECE 224a CMOS VLSI Design Lab
1ECE 224a CMOS VLSI Design Lab
2ECE 224a
- Fabricate a real design
- MMI/Cadence/Mentor/Synopsys Tools
- MMI Full Custom (Cell, Array, Data-Path)
- Cadence/Synopsys PR (digital path)
- Not a first class in VLSI
- 124a or equivalent required; 124d is a good plan
- Review Essential Concepts
- FET, Diode, Transient Model (Elmore), Sizing
- Layout/Design Rules, Wire Planning, Gradient Variation, Tricks of the Trade
3Class Logistics
- Homework (out wed, due 1 week)
- Quizzes (3 in-class)
- No Final
- Design Proposal
- Design Review
- Submitted Project Report
4The First Integrated Circuits
Bipolar logic 1960s
ECL 3-input Gate Motorola 1966
5 Intel 4004 Micro-Processor
1971, about 2,300 transistors, sub-1 MHz operation (740 kHz maximum clock)
6ECE224a Project
- 0.6um 3M
- 3.3-5V bulk CMOS
- P1/P2 CAP
- Poly Resistor
- HV Implants (up to 40V!)
- 2.25 mm^2
- 1.5 mm x 1.5 mm
- 9-week design cycle, 3-person team
7Current State of Affairs
- High-End Technology (32-22nm) still a driver
- Limited to large design efforts (NRE)
- Small number of Players
- FPGA: Actel, Lattice, Xilinx, Altera
- Processor: AMD, Intel, IBM
- SOC: Conexant, Cisco, Juniper, Nintendo
- Structured ASIC: NEC, Fujitsu, Hitachi, Samsung
- Most design starts > 0.09um!
- Mixed Signal Applications
- Mature Technology: Lower NRE and Risk
- High Potential for Innovative Design/Architecture
8 224a Project Limits
- One 1.5 mm x 1.5 mm design per 2-3 students
- 1500 Standard Cell Gates
- 50kbits ROM/5kbits SRAM
- 64 Comparators/ 15 Op-Amps
- 40-48 pins (at least 8 used for Pwr/Gnd)
- 100 MHz practical large-swing (3.3V) limit
- 800 MHz differential (300 mV swing)
- 3.3 or 5V default, 12V possible
9Design Schedule
- 9 week design flow
- 1 week project definition
- 3 weeks schematic/simulation test design
- 2 weeks layout
- 2 weeks design verification and tweak
- Tape Out
- Must be DRC, LVS Clean
- Must have Full Die Simulation/Sanity
- Must have test plan and agree to physical test
10Survival Guide
- Choose Team to Complement Skills!
- No more than 3. 2 is fine, 1 if enough project slots
- Under-Specify/Over-Deliver
- If you cannot finish basic design in 1 week, simplify design!
- Basic Design through layout before adding features!
- Make decisions early, stick to them
- Use expert resources: Professors, experienced students
- Goal: Have Fun!
11What to make?
- Mixed Signal Designs Rock
- Pure Digital 1-bit signal processing
- Analog Sensors/Digital Output: Good Choice
- Temperature, Light, Magnetic, RF, Field, voltage, current, time, phase
- Digital Synthesis/Power also good
- Sound (even music!), RFID/Xmit, motor driver/controller, PLL (clock synthesis or other), Display (LCD) or LED
- Tricky Small Designs
- Journal of Consumer Circuits, JSSC about 10-15 years ago (0.5um in vogue), LFSR tricks
12What to NOT make
- MicroProcessor
- 4-bit possible (8-bit tiny MIPS won't fit w/o reg)
- 1 success in 22 years, 5 months design time
- No non-volatile Memory
- (design some is good, but hard, project!)
- Digital Multiplier/Adder/Function Block
- Space, Pins (How to test!), Why?
- Generic OpAmp
- How to test/characterize?
- If you have a use in mind it is not generic!
13Design Methodology
14Evolution in Complexity
15The Design Productivity Challenge
[Figure: logic transistors per chip (K) and design productivity (transistors per staff-month) plotted by year, 1981-2009]
A growing gap between design complexity and design productivity
Source: ITRS97
16Scaling?
- Technology shrinks by 0.7/generation
- With every generation can integrate 2x more functions per chip; chip cost does not increase significantly
- Cost of a function decreases by 2x
- But
- How to design chips with more and more functions?
- Design engineering population does not double every two years
- Physical design constraints more and more difficult to surmount
- Diminishing Returns for Design Dollars
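The scaling arithmetic on this slide can be checked with a minimal sketch. It assumes ideal scaling (every linear dimension shrinks by 0.7 and die area stays fixed), which the slide implies but does not state; the function name is illustrative only.

```python
def functions_per_chip(generations, shrink=0.7):
    """Relative number of functions on a fixed-area die after n generations.

    Assumption: area per function scales with shrink**2 (ideal scaling).
    """
    area_per_function = shrink ** (2 * generations)
    return 1.0 / area_per_function


def relative_cost_per_function(generations, shrink=0.7):
    """Relative cost per function, assuming chip cost stays constant."""
    return 1.0 / functions_per_chip(generations, shrink)


for n in range(4):
    print(n, functions_per_chip(n), relative_cost_per_function(n))
```

A 0.7 linear shrink gives 0.49 of the area per function, so the function count roughly doubles and the cost per function roughly halves each generation.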
17The Custom Approach
Intel 4004
Courtesy Intel
18Transition to Automation and Regular Structures
Courtesy Intel
19Automating Design
- Exploitation By Algorithms
- Regular Structures
- Logic Synthesis
- Regularization of Connection
- Floorplanning (Localization of function)
- System Level Performance/Power/Cost
- Allocation of Physical Resources
- Communication/Interconnect
- Hierarchy based on Sensitivity to Latency
- Wires to Link Protocols
20A System-on-a-Chip Example
Courtesy Philips
21Design Methodology
- Design process traverses iteratively between three abstractions: behavior, structure, and geometry
- Trick: automate these steps
22Implementation Choices
23Implementation Strategies
- Data-Path
- 1D tiling, custom in depth
- Cell based logic
- Technology confined to cells (area)
- 2D via 1D cell rows, automatic PR
- 2D Arrays (Memory, CAM, CCD, MPY)
- Dense but very constrained
- Design time consuming!
24 2-d Cell Based Hard Modules
256x32 (or 8192-bit) SRAM generated by hard-macro module generator
25 1-d Cell-based Design (standard cells)
[Figure labels: logic cell, feedthrough cell, routing channel, rows of cells, functional module (RAM, multiplier, ...)]
Routing channel requirements are reduced by presence of more interconnect layers
26Concepts of Placement
- Standard cells are placed in placement rows
- Cells in a timing-critical path are placed close together to reduce routing-related delays (Timing Driven)
- Placement rows can be abutting or non-abutting
27Concepts of Routing
- Connecting between metal layers requires one or more vias
- Metal Layers have preferred routing directions
- Metal 1 (Blue): Horizontal
- Metal 2 (Yellow): Vertical
- Metal 3 (Red): Horizontal
28Concept of Routing Tracks
- Metal routes must meet minimum width and spacing design rules to prevent open and short circuits during fabrication
- In grid-based routing systems, these design rules determine the minimum center-to-center distance for each metal layer (Track/Grid spacing)
- Congestion occurs if there are more wires to be routed than available tracks
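As a rough illustration of how width and spacing rules set the track pitch, and how congestion follows from the track count, here is a small sketch. It assumes the simplest line-to-line case (pitch = minimum width + minimum spacing) and uses made-up rule values in nanometers, not the actual rules of the class process; all function names are hypothetical.

```python
def track_pitch(min_width_nm, min_spacing_nm):
    """Center-to-center distance between adjacent tracks (line-to-line case)."""
    return min_width_nm + min_spacing_nm


def available_tracks(region_extent_nm, min_width_nm, min_spacing_nm):
    """Whole tracks that fit across a routing region of the given extent."""
    return region_extent_nm // track_pitch(min_width_nm, min_spacing_nm)


def is_congested(wires_needed, region_extent_nm, min_width_nm, min_spacing_nm):
    """True if more wires must cross the region than there are tracks."""
    tracks = available_tracks(region_extent_nm, min_width_nm, min_spacing_nm)
    return wires_needed > tracks


# Hypothetical rules: 900 nm width, 900 nm spacing, 18 um wide region.
print(track_pitch(900, 900))
print(available_tracks(18000, 900, 900))
print(is_congested(12, 18000, 900, 900))
```

Integer nanometers are used so the floor division is exact; real routers also account for vias and edge effects, which this sketch ignores.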
29Grid-Based Routing System
- Metal traces (routes) are built along and centered around routing tracks
- Each metal layer has its own tracks and preferred routing direction
- Metal 1: Horizontal
- Metal 2: Vertical
- Track and pitch information can be located in the technology file
- Design Rules
30Standard Cell Old Example
- Automation
- Height fixed
- Width variable
- Channel routing
- Optimization
- Place by annealing to minimize wire-length and
net criticality
Brodersen92
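Placement by annealing can be sketched in a few lines. This is a toy version under stated assumptions: cells sit in a single row, the cost is the total net span (a 1-D stand-in for wire length), and net criticality is ignored; it is not the algorithm of any particular tool, and every name and parameter here is hypothetical.

```python
import math
import random


def wirelength(order, nets):
    """Sum over nets of the distance between leftmost and rightmost cell."""
    pos = {cell: i for i, cell in enumerate(order)}
    return sum(max(pos[c] for c in net) - min(pos[c] for c in net)
               for net in nets)


def anneal_placement(cells, nets, t_start=5.0, t_end=0.01, alpha=0.95,
                     moves_per_temp=50, seed=0):
    """Simulated annealing over pairwise swaps; returns (best_order, cost)."""
    rng = random.Random(seed)
    order = list(cells)
    cost = wirelength(order, nets)
    best, best_cost = order[:], cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            i, j = rng.sample(range(len(order)), 2)
            order[i], order[j] = order[j], order[i]
            new_cost = wirelength(order, nets)
            delta = new_cost - cost
            # Always accept improvements; accept uphill moves with
            # probability exp(-delta / t) so the search can escape minima.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = order[:], cost
            else:
                order[i], order[j] = order[j], order[i]  # undo the swap
        t *= alpha  # geometric cooling schedule
    return best, best_cost


cells = ["a", "b", "c", "d", "e", "f"]
nets = [("a", "f"), ("b", "e"), ("c", "d"), ("a", "c")]
print(wirelength(cells, nets), anneal_placement(cells, nets))
```

Net criticality could be folded in by weighting each net's span, which would bias the search toward shortening timing-critical nets.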
31Standard Cell The New Generation
Cell structure hidden under interconnect layers
Same basic scheme: more layers, wires over cells, power/clock plan, leave spaces for filler/bypass and buffer cells
32Standard Cell - Example
3-input NAND cell (from ST Microelectronics)
C = load capacitance, T = input rise/fall time
33Soft MacroModules
Synopsys DesignCompiler
34Gate Array Sea-of-gates
Uncommitted Cell
Committed Cell (4-input NOR)
35Sea-of-gate Primitive Cells
Using oxide-isolation
Using gate-isolation
36Sea-of-gates
Random Logic
Memory Subsystem
LSI Logic LEA300K (0.6 um CMOS)
Courtesy LSI Logic
37The return of gate arrays?
Via-programmable gate array (VPGA)
Via-programmable cross-point
metal-6
metal-5
programmable via
Exploits regularity of interconnect
Pileggi02
38Pre-wired Arrays
- Classification of prewired arrays (or field-programmable devices)
- Based on Programming Technique
- Fuse-based (program-once)
- Non-volatile EPROM based
- RAM based
- Programmable Logic Style
- Array-Based
- Look-up Table
- Programmable Interconnect Style
- Channel-routing
- Mesh networks
39Fuse-Based FPGA
[Figure labels: antifuse polysilicon, ONO dielectric, n+ antifuse diffusion, 2λ dimension]
Open by default, closed by applying current pulse
From Smith97
40Array-Based Programmable Logic
[Figure: array-based structures with outputs O1-O3; array labels: programmable AND array, programmable OR array, fixed OR array]
PLA (flexible sizing)
PROM (dense)
PAL (uniform load)
41Programming a PROM
42Rent's Rule
- Rent described a relation between the number of components in a subsystem and the number of wires to connect it.
- The rule was developed for large digital systems, but is reflected in all human design (hierarchy)
- A Rent exponent (coefficient r) of 0.5 corresponds to a planar scalable design, i.e. the perimeter (where wires go) grows to support the area of a planar figure.
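The slides do not write the relation out, so as a hedged sketch: Rent's rule is normally stated as a power law, T = K * N**r, with N the number of components, T the number of external connections, and r and K the fitted values shown on the plot slide. The function name below is illustrative.

```python
def rent_terminals(num_components, k, r):
    """Rent's rule estimate of external connections: T = K * N**r."""
    return k * num_components ** r


# (r, K) pairs as printed on the Rent's rule plot slide.
fits = [(0.5, 1.9), (0.63, 1.4), (0.45, 0.82), (0.12, 6)]

for r, k in fits:
    print(r, k, rent_terminals(10_000, k, r))

# With r = 0.5, quadrupling the component count doubles the connections,
# which is the perimeter-versus-area behavior of a planar design.
ratio = rent_terminals(40_000, 1.9, 0.5) / rent_terminals(10_000, 1.9, 0.5)
print(ratio)
```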
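43Rent's Rule
[Figure: log-log Rent's rule plot with axis ticks from 10 to 10,000 and from 10 to 1,000,000. Curve labels: board level, high performance computers, gate arrays, chip level, microprocessors, static RAM, dynamic RAM. Fitted values printed on the plot: r = 0.5, K = 1.9; r = 0.63, K = 1.4; r = 0.45, K = 0.82; r = 0.12, K = 6]
Bakoglu, 1987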
44Design Flow
45Design Flow - Overview
- Generic VLSI Design Flow from System Specification to Fabrication and Testing
- Steps prior to Circuit/Physical design are part of the FRONT-END flow
- Physical Level Design is part of the BACK-END flow
- Physical Design is also known as Place and Route
- CAD tools are involved in all stages of VLSI design flow
- Different tools can be used at different stages due to EDA common data formats
46Where does the Gate Level Netlist come from?
47Floorplanning
- 2D layout
- Area does not correspond to architecture
- Communication on boundaries wire length
48Design Must Be Floorplanned Before PR
- Floorplan of design
- Core area defined with large macros placed
- Periphery area defined with I/O macros placed
- Power and Ground Grid (Rings and Straps) established
- Utilization
- Percentage of the core used by placed standard cells and macros
- Typically 80-85%
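Utilization as defined on this slide is a simple area ratio. The sketch below uses made-up areas in mm^2 and hypothetical function names; the 80-85% window is the typical range quoted above.

```python
def core_utilization(std_cell_area, macro_area, core_area):
    """Fraction of the core occupied by placed standard cells and macros."""
    return (std_cell_area + macro_area) / core_area


def in_typical_range(utilization, low=0.80, high=0.85):
    """True if utilization falls in the typical 80-85% window."""
    return low <= utilization <= high


# Hypothetical floorplan: 1.0 mm^2 core, 0.5625 mm^2 cells, 0.25 mm^2 macros.
u = core_utilization(0.5625, 0.25, 1.0)
print(u, in_typical_range(u))
```

Utilization well above the window tends to leave too little room for routing and for the buffers added later in the flow.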
49I/O Placement and Chip Package Requirements
- Some Bond Wire requirements
- No Crossing
- Minimum Spacing
- Maximum Angle
- Maximum Length
50Guidelines for a Good Floorplan
- A few quick iterations of place and route with
timing checks may reveal the need for a different
floorplan
51Defining the Power/Ground Grid and Blockages
- Purpose of Grid is to take the VDD and VSS received from the I/O area and distribute it over the core area
- Blockages can also be added in the floorplan to prohibit standard cells from being placed in those areas
- Loading IR drop and noise issues
- Sometimes need Guard rings around critical regions
52Design Flow Timing Driven Placement
- Astro optimizes, places, and routes the logic gates to meet all timing constraints
- Balancing design requirements
- Timing
- Area
- Power
- Signal Integrity
53Timing Constraints
- Astro needs constraints to understand the timing intentions
- Arrival time of inputs
- Required arrival time at outputs
- Clock period
- Constraints come from the Logic Synthesis tool
- SDC (Synopsys Design Constraints) format
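To show how input arrival times and required output times combine into a pass/fail number, here is a minimal sketch of slack on a small combinational network. It is a generic longest-path calculation with made-up delays in ns, not Astro's implementation and not SDC syntax; all names are hypothetical.

```python
def arrival_times(fanins, cell_delay, input_arrivals):
    """Latest arrival time at every node of a combinational DAG.

    fanins: node -> list of nodes driving it
    cell_delay: node -> delay of the cell that drives the node
    input_arrivals: primary input -> arrival time constraint
    """
    arrival = dict(input_arrivals)

    def visit(node):
        if node not in arrival:
            arrival[node] = max(visit(f) for f in fanins[node]) + cell_delay[node]
        return arrival[node]

    for node in fanins:
        visit(node)
    return arrival


def slack(required_time, arrival_time):
    """Positive slack meets timing; negative slack is a violation."""
    return required_time - arrival_time


fanins = {"n1": ["a", "b"], "out": ["n1", "a"]}
cell_delay = {"n1": 2.0, "out": 1.5}
arrival = arrival_times(fanins, cell_delay, {"a": 0.5, "b": 1.0})
print(arrival["out"], slack(5.0, arrival["out"]))
```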
54Cell and Net Delays
- Astro calculates delay for every cell and every net
- To calculate delays, Astro needs to know the resistance and capacitance of each net
- Uses geometry of net and Look Up Tables to estimate the resistances and capacitances
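Once a net's resistances and capacitances are known, a first-order net delay can be estimated with the Elmore model listed among the review concepts. The slides do not say which delay model Astro uses, so treat this as an assumption; the net is modeled as a simple RC ladder and the values are made up.

```python
def elmore_delay(segments, load_cap=0.0):
    """Elmore delay at the far end of an RC ladder.

    segments: list of (R, C) pairs from driver to sink, with each C placed
    at the far end of its R. load_cap is added at the last node.
    Each capacitor contributes C times the total resistance upstream of it.
    """
    delay = 0.0
    upstream_r = 0.0
    last = len(segments) - 1
    for idx, (r, c) in enumerate(segments):
        upstream_r += r
        if idx == last:
            c += load_cap
        delay += upstream_r * c
    return delay


# Hypothetical wire: three segments of 100 ohm and 10 fF, driving a 20 fF pin.
wire = [(100.0, 10e-15)] * 3
print(elmore_delay(wire, load_cap=20e-15))
```

For this wire the terms are 100*10 fF + 200*10 fF + 300*30 fF, about 12 ps; cell delays would come from the library look-up tables rather than from this formula.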
55Timing Driven Placement
- Timing Driven Placement places critical path cells close together to reduce net RC
- Prior to routing, RC are based on Virtual Routes
- What if critical paths do not meet timing constraints with placement?
56Logic Optimizations
- These optimizations can be done during pre-place, in-place, or post-place stages of placement
- Each optimization can be done separately or all done concurrently during placement (none, one, or all)
57The Design Closure Problem
Iterative Removal of Timing Violations (white lines)
Problem: no guarantee of convergence; can take weeks
Synopsys
58Timing Closure and Mask Verification
RTL ECO
Constraint, Library ECO
Density, Order ECO
Fail Timing
59Design Flow Clock Tree Synthesis
- All clock pins are driven by a single clock source
- Large delay and transition time due to length of net
- Clock signal reaches some registers before others (Skew)
60After Clock Tree Synthesis
- A clock (buffer) tree is built to balance the output loads and minimize the clock skew
- A delay line can be added to the network to meet the minimum insertion delay (clock balancing)
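Skew and balancing can be expressed numerically. The sketch below assumes the clock arrival time at each register is already known (values in ns are made up) and computes the skew plus the extra delay each branch would need to match the slowest one. A real CTS tool sizes and places buffers rather than padding delay, so this only illustrates the goal; the names are hypothetical.

```python
def clock_skew(sink_arrivals):
    """Skew = latest minus earliest clock arrival over all sinks."""
    times = list(sink_arrivals.values())
    return max(times) - min(times)


def padding_to_balance(sink_arrivals):
    """Extra delay each sink needs to line up with the slowest sink."""
    latest = max(sink_arrivals.values())
    return {sink: latest - t for sink, t in sink_arrivals.items()}


arrivals = {"ff1": 0.5, "ff2": 0.75, "ff3": 1.25}
print(clock_skew(arrivals))
print(padding_to_balance(arrivals))
```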
61Gated - CTS
- Clocks may not be generated directly from I/O
- Power saving techniques such as clock-gating are used to turn off the clock to sections of the design
62Effects of CTS
- Many (hundreds/thousands of) clock buffers added to the design
- Placement / Routing congestion may increase
- Non-clock cells may have been moved to less ideal locations
- Timing violations can be introduced
63Timing Driven Routing
- Routing along the timing-critical path is given priority
- Creates shorter, faster connections
- Non-critical paths are routed around critical areas
- Reduces routing congestion problems for critical paths
- Does not adversely impact timing of non-critical paths
64Timing Verification
- Calibre PEX performs the layout parasitic extraction of the resistances and capacitances of all wires and devices in the design
- Results in Augmented Spice Deck and in logic format decks such as SPEF (Standard Parasitic Exchange Format)
- SPEF is an extended form of Standard Parasitic Format (SPF), which enables the transfer of design-specific resistances and capacitances from physical design to timing analysis and simulation tools
- PrimeTime performs static timing analysis
- Detects timing violations by combining SPEF and netlist and checking against the design timing constraints (setup and hold times)
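The setup and hold checks mentioned above reduce to two inequalities for a register-to-register path. This sketch uses the textbook form with made-up numbers in ns; it is not PrimeTime's algorithm, and the parameter names are hypothetical. Skew is taken as capture clock arrival minus launch clock arrival.

```python
def setup_slack(clock_period, clk_to_q, max_logic_delay, setup_time, skew=0.0):
    """Data must arrive setup_time before the next capturing clock edge."""
    return (clock_period + skew) - (clk_to_q + max_logic_delay + setup_time)


def hold_slack(clk_to_q, min_logic_delay, hold_time, skew=0.0):
    """Data must stay stable for hold_time after the same capturing edge."""
    return (clk_to_q + min_logic_delay) - (hold_time + skew)


# Hypothetical path at a 10 ns period (100 MHz).
print(setup_slack(10.0, 0.5, 8.0, 0.5))
print(hold_slack(0.5, 0.25, 0.25))
print(hold_slack(0.5, 0.25, 0.25, skew=1.0))
```

Positive skew helps setup and hurts hold, which is one reason clock tree changes can introduce new violations.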
65Physical Verification
- Checks the design for fabrication feasibility and physical defects that could prevent the design from functioning properly
- 3 checks (DRC, ERC, and LVS)
- Design Rule Checks (DRC)
- Verifies that design does not violate any fabrication rules associated with the target process technology (metal width/space, antenna wires, fill ratio, etc.)
- Electrical Rules Checks (ERC)
- Verifies that there are no short or open circuits with power and ground, and no resistors/capacitors/transistors with floating nodes (part of LVS)
- Layout Versus Schematic (LVS)
- Final physical design matches the logical (schematic) version in terms of correct connectivity and number of electrical devices
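To make the metal width/space checks concrete, here is a toy design rule check on axis-aligned rectangles of a single layer. It is only a sketch: the rule values and coordinates are made up, touching or overlapping shapes are assumed to merge, and corner-to-corner spacing is measured as Euclidean distance. It is not how Calibre rule decks are written, and the function names are hypothetical.

```python
import math
from itertools import combinations


def width_violations(rects, min_width):
    """Rectangles (x1, y1, x2, y2) whose narrow dimension is below min_width."""
    return [r for r in rects if min(r[2] - r[0], r[3] - r[1]) < min_width]


def spacing_violations(rects, min_spacing):
    """Pairs of separate rectangles that are closer than min_spacing."""
    bad = []
    for a, b in combinations(rects, 2):
        dx = max(b[0] - a[2], a[0] - b[2], 0)
        dy = max(b[1] - a[3], a[1] - b[3], 0)
        if dx == 0 and dy == 0:
            continue  # touching or overlapping shapes are treated as one net
        if math.hypot(dx, dy) < min_spacing:
            bad.append((a, b))
    return bad


wires = [(0, 0, 10, 3), (0, 5, 10, 8), (0, 9, 10, 11)]
print(width_violations(wires, min_width=3))
print(spacing_violations(wires, min_spacing=2))
```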
66Fabrication
- Physical Design process is complete upon successful completion of timing, functional, and physical verification
- The design can be Taped-Out and GDSII created for the manufacturer
- GDSII (Graphic Design System II) is a binary format containing the physical geometry information of the design.
- The shapes are assigned numeric attributes in the form of Layer Number and Data Type (Metal 1 -> 1000)
- Fabrication and Test determine which chips can be implemented into the system (yield)
67UCSB Tools
- Tools from all the major vendors
- Cadence (ON-Semi 0.6um via NCSU/OSU SCMOS)
- Synopsys (Logic Synthesis, Simulation, Timing)
- Mentor (Calibre DRC/LVS/PEX)
- MMI (Full Custom, Array and DataPath)
- Final Chip physical verification must be through Calibre full-chip DRC and LVS
- Full-chip extracted spice sanity check
68HW 1
- If 6-inch wafers cost $800 each, 10-inch cost $1400, and 12-inch cost $2000, estimate the die cost of a 5mm x 5mm die in each case given a uniform defect rate of 0.2/cm2. (Hint: A point defect in your chip kills it.)
- Analog circuits typically lag by several orders of magnitude in scale compared to digital ones. How does this relate to the complexity of composing two analog circuits to make a more complex behavior?
- FPGA use is growing, in many cases replacing ASIC designs. Typically, such designs are 30-150x slower than ASIC designs of the same power. Why?
- Why is Timing Closure a serious issue in highly constrained digital designs (big, fast, low-power, lots of pins)?