ASIC Front-End Design

Description:

Lecture 19 ASIC Front-End Design ... CMOS technology implies that all active devices, or transistors, come in pairs of N- and PMOS transistors.

Provided by: tealGmuEd

1
ASIC Front-End Design
ECE 448 Lecture 19
2
Two competing implementation approaches
FPGA: Field Programmable Gate Array
  • no physical layout design; design ends with a bitstream used to configure a device
  • bought off the shelf and reconfigured by designers themselves
ASIC: Application Specific Integrated Circuit
  • designed all the way from behavioral description to physical layout
  • designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry

3
FPGAs vs. ASICs
FPGAs
  • Off-the-shelf
  • Low development costs
  • Short time to the market
  • Reconfigurability
ASICs
  • High performance
  • Low power
  • Low cost (but only in high volumes)
4
ASIC Design Example Factoring circuit/GMU
Global Memory
Local Memory
5
ASIC 130 nm vs. Virtex II 6000 Factoring/GMU
Area of Xilinx Virtex II 6000 FPGA (estimation by R.J. Lim Fong, MS Thesis, VPI, 2004): 19.68 mm x 19.80 mm
Area of an ASIC with equivalent functionality: 2.7 mm x 2.82 mm
Area ratio: 51x
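
The 51x figure can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the four dimensions on the slide pair up as 19.68 mm x 19.80 mm for the FPGA and 2.7 mm x 2.82 mm for the ASIC; the variable names are illustrative only.

```python
# Die dimensions in millimetres, as quoted on the slide.
# Assumption: the values pair up as width x height in this way.
fpga_w, fpga_h = 19.68, 19.80   # Xilinx Virtex II 6000 (estimate)
asic_w, asic_h = 2.7, 2.82      # 130 nm ASIC with equivalent functionality

fpga_area = fpga_w * fpga_h     # mm^2
asic_area = asic_w * asic_h     # mm^2
ratio = fpga_area / asic_area

print(f"FPGA area: {fpga_area:.2f} mm^2")
print(f"ASIC area: {asic_area:.2f} mm^2")
print(f"Ratio    : {ratio:.0f}x")
```

With this pairing the ratio rounds to the 51x shown on the slide.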
6
ASICs vs. FPGAs
  • Source: I. Kuon, J. Rose (University of Toronto), "Measuring the Gap Between FPGAs and ASICs," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 2, Feb. 2007.

11
Simplified ASIC Design Flow
Front-End Design
  • Synthesis
  • Timing Analysis
Back-End Design
  • Floorplanning
  • Placement
  • Clock Tree Synthesis
  • Routing
  • Design for Manufacturing
12
Major ASIC Toolsets
Cadence
Magma
13
Simplified ASIC Design Flow
Synopsys Tools
Front-End Design
  • Synthesis: Design Analyzer
  • Timing Analysis: PrimeTime
Back-End Design: Astro
  • Floorplanning
  • Placement
  • Clock Tree Synthesis
  • Routing
  • Design for Manufacturing
14
A Complete Placed and Routed Chip
15
What is Physical Layout?
Physical Layout: topography of devices and interconnects, made up of polygons that represent different layers of material (diffusion, polysilicon, metal, contact, etc.)
16
Process of Device Fabrication
  • Devices are fabricated vertically on a silicon
    substrate wafer by layering different materials
    in specific locations and shapes on top of each
    other
  • Each of many process masks defines the shapes and
    locations of a specific layer of material
    (diffusion, polysilicon, metal, contact, etc)
  • Mask shapes, derived from the layout view, are
    transformed to silicon via photolithographic and
    chemical processes

Wafer (cross-sectional) view
17
Wafer Representation of Layout Polygons
Wafer Cross-sectional View
18
Front-End Design Flow
19
Simplified RTL Synthesis
20
VHDL vs. Verilog
21
Logic Synthesis
VHDL description
Circuit netlist
    architecture MLU_DATAFLOW of MLU is
      signal A1 : STD_LOGIC;
      signal B1 : STD_LOGIC;
      signal Y1 : STD_LOGIC;
      signal MUX_0, MUX_1, MUX_2, MUX_3 : STD_LOGIC;
    begin
      A1 <= A when (NEG_A = '0') else not A;
      B1 <= B when (NEG_B = '0') else not B;
      Y  <= Y1 when (NEG_Y = '0') else not Y1;
      MUX_0 <= A1 and B1;
      MUX_1 <= A1 or B1;
      MUX_2 <= A1 xor B1;
      MUX_3 <= A1 xnor B1;
      with (L1 & L0) select
        Y1 <= MUX_0 when "00",
              MUX_1 when "01",
              MUX_2 when "10",
              MUX_3 when others;
    end MLU_DATAFLOW;
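
When comparing a synthesized netlist against its VHDL source, a small software reference model can be handy. The following is a sketch only, written in Python rather than VHDL, of the behavior the MLU dataflow description above appears to specify; the function name `mlu` and the use of 0/1 integers for STD_LOGIC values are assumptions made for illustration.

```python
def mlu(a, b, neg_a, neg_b, neg_y, l1, l0):
    """Bit-level reference model of the MLU dataflow description.

    All arguments are 0 or 1. Metavalues of STD_LOGIC ('X', 'Z', ...)
    are not modelled.
    """
    a1 = a if neg_a == 0 else a ^ 1        # optional inversion of input A
    b1 = b if neg_b == 0 else b ^ 1        # optional inversion of input B
    mux = {
        (0, 0): a1 & b1,                   # "00" selects AND
        (0, 1): a1 | b1,                   # "01" selects OR
        (1, 0): a1 ^ b1,                   # "10" selects XOR
    }
    y1 = mux.get((l1, l0), (a1 ^ b1) ^ 1)  # others selects XNOR
    return y1 if neg_y == 0 else y1 ^ 1    # optional inversion of output Y


# Print the truth table for the XOR mode with no inversions.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, mlu(a, b, 0, 0, 0, 1, 0))
```

Such a model could drive a testbench that applies the same vectors to the RTL and to the gate-level netlist.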
22
Logic Synthesis
23
TCL: Tool Command Language
  • Created by John Ousterhout of UC Berkeley
  • Scripting language: makes it very simple to automate routine tasks.
  • Extension language: used to customize tools with user- or company-specific applications.
  • Nearly all modern EDA tools have a TCL interface.
  • Very simple to learn and use.

24
TCL Example
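    # Create the directory if it does not exist, and make sure it is usable.
    # Note: echo is provided by the EDA tool shell; plain tclsh would use puts.
    proc rfmdIfNotDirMkdir {directory} {
        if {![file exists $directory]} {
            file mkdir $directory
            if {![file isdirectory $directory]} {
                echo "Could not make \"$directory\""
                exit 1
            }
        } elseif {![file writable $directory]} {
            echo " \"$directory\" is not writable"
            exit 1
        } else {
            return 1
        }
    }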

25
TCL References
  • Practical Programming in Tcl and Tk, by Brent B. Welch and Ken Jones
  • Tcl/Tk in a Nutshell, by Paul Raines and Jeff Tranter

26
Basic Synthesis Flow
27
Synthesis using Design Compiler
30
Synthesis script (1)
    designer = "Pawel Chodowiec"
    company = "George Mason University"
    search_path = "./opt3/synopsys/TSMCHOME/digital/Front_End/timing_power/tcb013ghp_200a "
    link_library = "* tcb013ghptc.db"    /* Typical case library */
    target_library = "tcb013ghptc.db "
    symbol_library = "tcb013ghp.sdb "
    /* Directory configuration */
    src_directory = /exam1/vhdl/
    report_directory = /exam1/reports/
    db_directory = /exam1/db/

31
Synthesis script (2)
    /* Packages can be only read */
    read_file -format vhdl -rtl src_directory + "components.vhd"
    blocks = {regne, upcount, RAM_16Xn_DISTRIBUTED, exam1}
    foreach (block, blocks) {
      block_source = src_directory + block + ".vhd"
      read_file -format vhdl -rtl block_source
      analyze -format vhdl -lib WORK block_source
    }
    current_design block
    /* All commands now apply to the entity "exam1" */

32
Synthesis script (3)
    uniquify
    /* Creates unique instances of multiply referenced entities */
    link
    check_design
    /* Checks the current design for consistency */
    /******************************************/
    /* apply block attributes and constraints */
    /******************************************/
    create_clock -period 10 clk
    /* Defines that the port "clk" of the current design is the clock
       for the design. Period = 10 ns, 50% duty cycle.
       Use the -waveform option to define a duty cycle other than 50%. */
    set_operating_conditions NCCOM
    /* Normal Case Commercial Operating Conditions */

33
Synthesis script (4)
    /****************************************************/
    /* Apply these constraints to the top-level entity  */
    /****************************************************/
    set_max_fanout 100 block
    set_clock_latency 0.1 find(clock, "clk")
    set_clock_transition 0.01 find(clock, "clk")
    set_clock_uncertainty -setup 0.1 find(clock, "clk")
    set_clock_uncertainty -hold 0.1 find(clock, "clk")
    set_load 0 all_outputs()
    set_input_delay 1.0 -clock clk -max all_inputs()
    set_output_delay -max 1.0 -clock clk all_outputs()
    set_wire_load_model -library tcb013ghptc -name "TSMC8K_Fsg_Conservative"

34
Wireload model basics (1)
35
Wireload model basics (2)
36
Synthesis script (5)
    set_dont_touch block
    compile -map_effort medium
    change_names -rules vhdl
    vhdlout_architecture_name = "sort_syn"
    vhdlout_use_packages = "IEEE.std_logic_1164"
    write -f db -hierarchy -output db_directory + "exam1.db"
    /* write -f vhdl -hierarchy -output db_directory + "exam1_syn.vhd" */
    report -area > report_directory + "exam1.report_area"
    report -timing -all > report_directory + "exam1.report_timing"

37
Results of synthesis
38
Area report after synthesis (1)
    report_area
    Information: Updating design information... (UID-85)
    Report : area
    Design : exam1
    Version: V-2003.12-SP1
    Date   : Tue Nov 15 20:39:06 2005
    Library(s) Used:
        tcb013ghptc (File: /opt3/synopsys/TSMCHOME/digital/Front_End/timing_power/tcb013ghp_200a/tcb013ghptc.db)

39
Area report after synthesis (2)
    Number of ports:             75
    Number of nets:             346
    Number of cells:            107
    Number of references:        28

    Combinational area:       10593.477539
    Noncombinational area:    14295.521484
    Net Interconnect area:    undefined  (Wire load has zero net area)

    Total cell area:          24888.976562
    Total area:               undefined

40
Critical Path (1)
  • Critical Path: The Longest Path From Outputs of Registers to Inputs of Registers

tCritical = tFF-P + tlogic + tFF-setup
(flip-flop propagation delay + combinational logic delay + flip-flop setup time)
41
Critical Path (2)
  • Min. Clock Period = Length of The Critical Path
  • Max. Clock Frequency = 1 / Min. Clock Period
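
The two relations above can be put into a short calculation. This is a sketch with made-up delay values (0.25 ns, 4.5 ns, 0.25 ns); the function names are hypothetical and the three components follow the tCritical breakdown from the previous slide.

```python
def min_clock_period(t_ff_p, t_logic, t_ff_setup):
    """Length of the critical path in ns: FF propagation + logic + FF setup."""
    return t_ff_p + t_logic + t_ff_setup


def max_clock_frequency_mhz(period_ns):
    """Maximum clock frequency in MHz for a minimum clock period in ns."""
    return 1000.0 / period_ns


# Illustrative numbers only, not taken from any library or report.
period = min_clock_period(t_ff_p=0.25, t_logic=4.5, t_ff_setup=0.25)
fmax = max_clock_frequency_mhz(period)
print(f"min. clock period = {period} ns, max. clock frequency = {fmax} MHz")
```

Clock skew and jitter, covered on the following slides, would add further terms to the minimum period; they are left out of this sketch.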

43
Clock Jitter
  • Rising Edge of The Clock Does Not Occur Precisely
    Periodically
  • May cause faults in the circuit

44
Clock Skew
  • Rising Edge of the Clock Does Not Arrive at Clock
    Inputs of All Flip-flops at The Same Time

45
Timing report after synthesis (1)
    Report : timing
            -path full
            -delay max
            -max_paths 1
    Design : exam1
    Version: V-2003.12-SP1
    Date   : Tue Nov 15 20:39:06 2005

    Operating Conditions: NCCOM   Library: tcb013ghptc
    Wire Load Model Mode: segmented

46
Timing report after synthesis (2)
    Startpoint: in_addr(1) (input port clocked by clk)
    Endpoint: RegSUM/Q_reg34 (rising edge-triggered flip-flop clocked by clk)
    Path Group: clk
    Path Type: max

    Des/Clust/Port          Wire Load Model            Library
    ---------------------------------------------------------------
    exam1                   TSMC8K_Fsg_Conservative    tcb013ghptc
    RAM_16Xn_DISTRIBUTED    ZeroWireload               tcb013ghptc
    exam1_DW01_cmp2_32_0    ZeroWireload               tcb013ghptc
    exam1_DW01_cmp2_32_1    ZeroWireload               tcb013ghptc
    exam1_DW01_add_35_0     ZeroWireload               tcb013ghptc
    regne_1                 ZeroWireload               tcb013ghptc
    regne_2                 ZeroWireload               tcb013ghptc
    regne_n35               ZeroWireload               tcb013ghptc

47
Timing report after synthesis (3)
    Point                                           Incr    Path
    --------------------------------------------------------------
    clock clk (rise edge)                           0.00    0.00
    clock network delay (ideal)                     0.10    0.10
    input external delay                            1.00    1.10 f
    in_addr(1) (in)                                 0.00    1.10 f
    U98/Z (CKMUX2D1)                                0.13    1.23 f
    Memory/ADDR1 (RAM_16Xn_DISTRIBUTED)             0.00    1.23 f
    Memory/U41/ZN (INVD1)                           0.08    1.31 r
    Memory/U343/Z (OR3D1)                           0.10    1.41 r
    Memory/U338/ZN (INVD2)                          0.20    1.61 f
    Memory/U40/ZN (MOAI22D0)                        0.17    1.78 f
    Memory/U350/Z (OR4D1)                           0.26    2.03 f
    Memory/DATA_OUT0 (RAM_16Xn_DISTRIBUTED)         0.00    2.03 f

48
Timing report after synthesis (4)
    add_96xplusxplus/B0 (exam1_DW01_add_35_0)       0.00    2.03 f
    add_96xplusxplus/U9/Z (AN2D0)                   0.12    2.15 f
    add_96xplusxplus/U1_1/CO (CMPE32D1)             0.10    2.25 f
    add_96xplusxplus/U1_2/CO (CMPE32D1)             0.10    2.34 f
    add_96xplusxplus/U1_3/CO (CMPE32D1)             0.10    2.44 f
    add_96xplusxplus/U1_4/CO (CMPE32D1)             0.10    2.54 f
    add_96xplusxplus/U1_5/CO (CMPE32D1)             0.10    2.63 f
    add_96xplusxplus/U1_6/CO (CMPE32D1)             0.10    2.73 f
    add_96xplusxplus/U1_7/CO (CMPE32D1)             0.10    2.82 f
    add_96xplusxplus/U1_8/CO (CMPE32D1)             0.10    2.92 f
    add_96xplusxplus/U1_9/CO (CMPE32D1)             0.10    3.02 f
    add_96xplusxplus/U1_10/CO (CMPE32D1)            0.10    3.11 f
    add_96xplusxplus/U1_11/CO (CMPE32D1)            0.10    3.21 f
    add_96xplusxplus/U1_12/CO (CMPE32D1)            0.10    3.31 f
    add_96xplusxplus/U1_13/CO (CMPE32D1)            0.10    3.40 f
    add_96xplusxplus/U1_14/CO (CMPE32D1)            0.10    3.50 f

49
Timing report after synthesis (5)
    add_96xplusxplus/U1_15/CO (CMPE32D1)            0.10    3.60 f
    add_96xplusxplus/U1_16/CO (CMPE32D1)            0.10    3.69 f
    add_96xplusxplus/U1_17/CO (CMPE32D1)            0.10    3.79 f
    add_96xplusxplus/U1_18/CO (CMPE32D1)            0.10    3.88 f
    add_96xplusxplus/U1_19/CO (CMPE32D1)            0.10    3.98 f
    add_96xplusxplus/U1_20/CO (CMPE32D1)            0.10    4.08 f
    add_96xplusxplus/U1_21/CO (CMPE32D1)            0.10    4.17 f
    add_96xplusxplus/U1_22/CO (CMPE32D1)            0.10    4.27 f
    add_96xplusxplus/U1_23/CO (CMPE32D1)            0.10    4.37 f
    add_96xplusxplus/U1_24/CO (CMPE32D1)            0.10    4.46 f
    add_96xplusxplus/U1_25/CO (CMPE32D1)            0.10    4.56 f
    add_96xplusxplus/U1_26/CO (CMPE32D1)            0.10    4.66 f
    add_96xplusxplus/U1_27/CO (CMPE32D1)            0.10    4.75 f
    add_96xplusxplus/U1_28/CO (CMPE32D1)            0.10    4.85 f
    add_96xplusxplus/U1_29/CO (CMPE32D1)            0.10    4.94 f
    add_96xplusxplus/U1_30/CO (CMPE32D1)            0.10    5.04 f
    add_96xplusxplus/U1_31/CO (CMPE32D1)            0.10    5.14 f
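
The carry chain through cells U1_1 to U1_31 makes up a large share of this path. A rough calculation, using only the Path values printed in the report (2.15 ns before the chain, 5.14 ns after it), is sketched below; the variable names are illustrative.

```python
# Cumulative arrival times (ns) read from the Path column of the report.
arrival_before_chain = 2.15   # after add_96xplusxplus/U9/Z
arrival_after_chain = 5.14    # after add_96xplusxplus/U1_31/CO
stages = 31                   # carry cells U1_1 .. U1_31

chain_delay = arrival_after_chain - arrival_before_chain
per_stage = chain_delay / stages

print(f"carry chain delay  : {chain_delay:.2f} ns")
print(f"average per stage  : {per_stage:.4f} ns")
```

An average slightly below 0.1 ns per stage is consistent with the report printing an Incr of 0.10 for every stage while the Path column advances by either 0.09 or 0.10.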

50
Timing report after synthesis (6)
    add_96xplusxplus/U7/Z (AN2D0)                   0.10    5.24 f
    add_96xplusxplus/U5/Z (AN2D0)                   0.08    5.32 f
    add_96xplusxplus/U4/Z (CKXOR2D0)                0.15    5.47 f
    add_96xplusxplus/SUM34 (exam1_DW01_add_35_0)    0.00    5.47 f
    RegSUM/R34 (regne_n35)                          0.00    5.47 f
    RegSUM/U32/Z (AO21D0)                           0.11    5.57 f
    RegSUM/Q_reg34/D (EDFQD1)                       0.00    5.57 f
    data arrival time                                       5.57

51
Timing report after synthesis (7)
    clock clk (rise edge)                          10.00   10.00
    clock network delay (ideal)                     0.10   10.10
    clock uncertainty                              -0.10   10.00
    RegSUM/Q_reg34/CP (EDFQD1)                      0.00   10.00 r
    library setup time                             -0.12    9.88
    data required time                                      9.88
    --------------------------------------------------------------
    data required time                                      9.88
    data arrival time                                      -5.57
    --------------------------------------------------------------
    slack (MET)                                             4.31
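
The slack figure can be reproduced from the numbers in the report. The sketch below recomputes the data required time and the slack, and then estimates the shortest clock period this one path could tolerate. The last step assumes that the input delay, clock latency, uncertainty, and setup time stay fixed and that this path remains the worst one, which the report alone does not guarantee.

```python
# Values in ns, taken from the timing report above.
clock_period = 10.00
clock_latency = 0.10       # clock network delay (ideal)
clock_uncertainty = 0.10
library_setup = 0.12
data_arrival = 5.57

data_required = clock_period + clock_latency - clock_uncertainty - library_setup
slack = data_required - data_arrival

# Assumption: every other constraint is held constant while the period shrinks.
min_period = clock_period - slack
fmax_mhz = 1000.0 / min_period

print(f"data required time : {data_required:.2f} ns")
print(f"slack              : {slack:.2f} ns")
print(f"min. clock period  : {min_period:.2f} ns (about {fmax_mhz:.1f} MHz)")
```

A positive slack means the setup constraint is met, which matches the "slack (MET)" line of the report.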

52
Static Timing Analysis
53
Static Timing Analysis Review
  • Tools will calculate all paths from sequential
    start point to sequential end point.
  • The worst case path will be used for Setup
    analysis, and the best case path will be used for
    hold analysis.
  • All paths are considered for design rule checking
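
The idea of taking the worst-case path for setup and the best-case path for hold can be illustrated on a tiny timing graph. This is a simplified sketch, not how any particular tool is implemented: the graph, node names, and delays are invented, and each edge carries a single delay, whereas real analyzers use separate minimum and maximum delays per cell and net.

```python
from functools import lru_cache

# Hypothetical timing graph: node -> list of (successor, delay in ns).
GRAPH = {
    "reg1/Q": [("u1", 0.25), ("u2", 0.5)],
    "u1": [("u3", 0.25)],
    "u2": [("u3", 0.25), ("reg2/D", 1.0)],
    "u3": [("reg2/D", 0.25)],
    "reg2/D": [],
}


def path_delays(graph, start, end):
    """Return (shortest, longest) delay over all paths from start to end."""

    @lru_cache(maxsize=None)
    def visit(node):
        if node == end:
            return (0.0, 0.0)
        found = []
        for succ, delay in graph[node]:
            sub = visit(succ)
            if sub is not None:          # succ can reach the end point
                found.append((delay + sub[0], delay + sub[1]))
        if not found:
            return None                  # dead end: no path to the end point
        return (min(f[0] for f in found), max(f[1] for f in found))

    return visit(start)


shortest, longest = path_delays(GRAPH, "reg1/Q", "reg2/D")
print(f"hold analysis uses the shortest path: {shortest} ns")
print(f"setup analysis uses the longest path: {longest} ns")
```

Because each node is visited once, the cost grows with the number of edges rather than with the number of paths, which is what makes exhaustive path coverage practical.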

54
Review of Setup and Hold Checks
55
False and Multicycle paths
  • False path
  • Very slow signals, such as reset or test mode enable, that are not used under normal operating conditions are classified as false paths.
  • Multicycle path
  • Paths that take more than one clock cycle are known as multicycle paths.
  • Multicycle paths have to be defined in the analyzer, which then takes those constraints into account during synthesis.
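
The effect of declaring a path as multicycle can be seen in a simple setup check: the required time moves out by whole clock periods. The sketch below uses invented numbers and a hypothetical helper `setup_slack`; in Synopsys tools such exceptions are typically declared with the `set_false_path` and `set_multicycle_path` commands.

```python
def setup_slack(arrival, period, setup, uncertainty=0.0, cycles=1):
    """Setup slack in ns for a path allowed `cycles` clock periods."""
    required = cycles * period - setup - uncertainty
    return required - arrival


# Illustrative path: data arrives 12 ns after the launching clock edge.
single = setup_slack(arrival=12.0, period=10.0, setup=0.25)
double = setup_slack(arrival=12.0, period=10.0, setup=0.25, cycles=2)

print(f"analysed as single-cycle: slack = {single} ns")
print(f"analysed as two-cycle   : slack = {double} ns")
```

The same path fails when checked against one period and passes when checked against two, so the exception is only valid if the design really samples that data every second cycle.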

56
Multicycle path - Example
57
Optimization criteria
58
Degrees of freedom and possible trade-offs
speed
area
power
testability
59
Degrees of freedom and possible trade-offs
speed
latency
area
throughput
60
VHDL Coding for Synthesis
61
Recommended rules for Synthesis
  • Do not split combinational paths across hierarchical boundaries
  • Register all outputs
  • Do not implement glue logic between blocks; partition them well
  • Separate designs on functional boundaries
  • Keep blocks to a reasonable size

62
Avoid hierarchical combinational blocks
  • The path between reg1 and reg2 is divided among three different blocks.
  • Due to hierarchical boundaries, optimization of the combinational logic cannot be achieved.
  • Synthesis tools (Synopsys) maintain the integrity of the I/O ports, so combinational optimization cannot be achieved between blocks (unless grouping is used).
63
Recommended way to handle Combinational Paths
  • All the combinational circuitry is grouped in the same block, whose output is connected to the destination flip-flop.
  • This allows optimal minimization of the combinational logic during synthesis.
  • It also allows a simplified description of the timing interface.
64
Register all outputs
  • Simplifies the synthesis design environment: inputs to the individual blocks arrive within the same relative delay (caused by wire delays).
  • Output requirements do not really need to be specified, since paths start at flip-flop outputs.
  • Take care of fanouts; as a rule of thumb, keep the fanout to 16 (dependent on the technology and on the components being driven by the output).
65
NO GLUE LOGIC between blocks
  • Under time pressure, a bug is found that could simply be fixed by adding some glue logic. RESIST THE TEMPTATION!!!
  • At this level in the hierarchy, the glue logic cannot be absorbed into any lower-level block.
66
Separate design with different goals
  • reg1 may be driven by a time-critical function, and hence will have different optimization constraints.
  • reg3 may be driven by slow logic, hence there is no need to constrain it for speed.
67
Optimization based on design requirements
  • Use different entities to partition design blocks
  • Allows different constraints during synthesis to
    optimize for area or speed or both.

68
Separate FSM from random logic
  • Separating the FSM from the random logic allows you to use FSM-optimized synthesis

69
Maintain a reasonable block size
  • Partition your design such that each block is between 1,000 and 10,000 gates (this is strictly tool and technology dependent)
  • The larger the blocks, the longer the run time -> quick iterations cannot be done.