Title: A SOC DSP Design Methodology Case Study

1
A SOC DSP Design Methodology Case Study

Joseph Williams
Lucent Technologies, Bell Labs Innovations
Room 4e-525
101 Crawfords Corner Rd.
Holmdel, NJ 07733
joew_at_lucent.com

2
Outline

Testchip 1 architecture and methodology summary
Testchip 1 design review
Testchip 2 architecture and methodology summary
Testchip 2 design review
Results comparison

3
Daytona testchip 1
Block diagram (labels recovered from the figure):
32-bit RISC, 64-bit SIMD (four instances)
Hardware Debug (four instances)
L1 Cache (four instances)
PE Controller (four instances)
I/O Subsystem
Memory Controller
SRAM
Arbiters, Semaphores
Transaction Manager, DMA
128-bit Split Transaction Bus
Host Interface
Host I/O
4
Vanilla Flow
Restructure RTL
Implement RTL
RTL
Cell Library
Standard Wire Load
Reoptimize
Synopsys RTL Synthesis
Netlist
Avanti Place & Route
Final Netlist
Layout Parasitics
5
Chip implementation specifics
7
Testchip 1 back-end design debriefing

Back-end process slipped schedule endlessly
Back-end process required 9 months
Well over 18 man-months of effort
The die size was several times larger than predicted
0.35u implementation abandoned for 0.25u due to congestion
Die utilization for final design was below 25%
Timing closure was a nightmare
Initial target of 150 MHz was abandoned
Could not achieve timing closure on most blocks without several time-consuming iterations
Inter-block routing required many manual fixes
Tools required hours or days to produce results and crashed regularly

8
What the hell happened?!!

The design had many characteristics that make the back-end difficult
Many wide buses with large fan-in and fan-out
Centralized state machines controlling vast regions of logic
Timing paths that span multiple blocks
The design methodology was not sufficient to handle a design of this complexity
Pre-layout estimates of parasitics were very inaccurate
No mechanisms existed to predict and manage congestion
Large design database required extensive tool run-times
Individual designers did not understand the implications of their connections for inter-block routing resources

9
Modify the methodology to handle large SOC designs

Invest time in the redesign of the architecture to make the logical and physical hierarchy similar
Partition the physical design early in the synthesis process
Define groupings of cells small enough to be timed accurately with wire load models (local nets)
Identify nets which cross group boundaries (global nets)
Invest time in multiple stages of floorplanning
Use a multipass synthesis strategy with successive refinement of wire parasitics from floorplan
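
The local/global split described above can be sketched in a few lines. This is a minimal illustration, not the flow used on the testchips: the function name classify_nets, the dictionary-based netlist, and the toy cell and group names are all hypothetical. A net is treated as local when every cell it touches belongs to one physical group, and as global otherwise.

```python
def classify_nets(net_to_cells, cell_to_group):
    """Split nets into local (one group) and global (several groups)."""
    local_nets, global_nets = {}, {}
    for net, cells in net_to_cells.items():
        # Collect the physical groups touched by this net.
        groups = {cell_to_group[cell] for cell in cells}
        if len(groups) == 1:
            local_nets[net] = next(iter(groups))
        else:
            global_nets[net] = sorted(groups)
    return local_nets, global_nets


# Hypothetical toy netlist: four cells in two physical groups.
cell_to_group = {"u1": "alu", "u2": "alu", "u3": "regfile", "u4": "regfile"}
net_to_cells = {
    "n1": ["u1", "u2"],  # stays inside "alu"
    "n2": ["u3", "u4"],  # stays inside "regfile"
    "n3": ["u2", "u3"],  # crosses the group boundary
}

local_nets, global_nets = classify_nets(net_to_cells, cell_to_group)
print("local:", local_nets)
print("global:", global_nets)
```

In this sketch only the nets reported as global would need parasitics taken from the floorplan; the local nets would keep their wire load model estimates.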

10
Hierarchical Wire Load Models
Block D: 100K_WLM
Block C: 50K_WLM
Block A: 10K_WLM
Block B: 10K_WLM
11
Table Format Wire Load Model
wire_load_table(10K_WLM) {
  fanout_length(1, 0.002);
  fanout_length(2, 0.005);
  fanout_length(3, 0.013);
  fanout_length(4, 0.022);
  fanout_length(7, 0.033);
  fanout_length(11, 0.054);
  fanout_capacitance(1, 0.002);
  fanout_capacitance(2, 0.005);
  fanout_capacitance(3, 0.013);
  fanout_capacitance(4, 0.022);
  fanout_capacitance(7, 0.033);
  fanout_capacitance(11, 0.054);
  fanout_resistance(1, 0.005);
  fanout_resistance(2, 0.005);
  fanout_resistance(3, 0.139);
  fanout_resistance(4, 0.276);
  fanout_resistance(7, 0.550);
  fanout_resistance(11, 0.785);
  fanout_area(1, 0.5);
  fanout_area(2, 1.0);
  fanout_area(3, 1.5);
  fanout_area(4, 2);
  fanout_area(7, 3.5);
  fanout_area(11, 5.5);
}
12
Wire Load Model Limitations
Block A encloses 50,000 standard cells and uses 50K_WLM
Figure labels: Wire A0, Wire A1, Wire A2
The wire load model assumes that all single-fanout nets have the same capacitance, resistance, and area
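
The limitation can be made concrete with a small lookup sketch. It reuses the 10K_WLM values from slide 11, because the 50K_WLM values are not given in the slides. The lookup helper and its linear interpolation between table entries are assumptions made for illustration; a real synthesis tool may interpolate or extrapolate differently. The three nets are given fanout 1 to match the single-fanout case on the slide.

```python
from bisect import bisect_left

# Values copied from the 10K_WLM table on slide 11 (units not stated there).
FANOUT_CAPACITANCE = [(1, 0.002), (2, 0.005), (3, 0.013),
                      (4, 0.022), (7, 0.033), (11, 0.054)]
FANOUT_RESISTANCE = [(1, 0.005), (2, 0.005), (3, 0.139),
                     (4, 0.276), (7, 0.550), (11, 0.785)]


def lookup(table, fanout):
    """Return the table value for a fanout.

    Assumption: linear interpolation between entries, clamped at both ends.
    """
    fanouts = [f for f, _ in table]
    if fanout <= fanouts[0]:
        return table[0][1]
    if fanout >= fanouts[-1]:
        return table[-1][1]
    i = bisect_left(fanouts, fanout)
    f0, v0 = table[i - 1]
    f1, v1 = table[i]
    return v0 + (v1 - v0) * (fanout - f0) / (f1 - f0)


# Three single-fanout nets that may be routed very differently in layout.
net_fanout = {"A0": 1, "A1": 1, "A2": 1}
estimates = {
    name: (lookup(FANOUT_CAPACITANCE, fo), lookup(FANOUT_RESISTANCE, fo))
    for name, fo in net_fanout.items()
}
for name, (cap, res) in estimates.items():
    print(name, "capacitance:", cap, "resistance:", res)
```

Because the only input is the fanout, all three nets receive identical estimates regardless of how far apart their pins end up in a block of this size.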
13
Partition nets into two classes

Local nets
Typically over 99% of nets fit in this class
Wire load models give reasonably accurate parasitics
Global nets
Typically less than 1% of nets fit in this class
Wire load models are unreasonably inaccurate for large designs; floorplan data must be used instead

Figure: SIMD Datapath Floorplan (legend: Local Nets, Global Nets)
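
One common way to turn floorplan data into a parasitic estimate for a global net is the half-perimeter of the bounding box around the groups it connects. The slides do not say which estimator was used, so the sketch below is an assumption throughout: the helper names, the group coordinates, and the per-unit capacitance and resistance constants are hypothetical placeholders.

```python
# Hypothetical per-unit-length constants; real values come from the process.
CAP_PER_UNIT = 0.0002
RES_PER_UNIT = 0.00008


def half_perimeter(points):
    """Half-perimeter of the bounding box enclosing the given (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def annotate_global_net(group_names, group_centers):
    """Estimate length, capacitance and resistance from group centers."""
    points = [group_centers[name] for name in group_names]
    length = half_perimeter(points)
    return {
        "length": length,
        "capacitance": length * CAP_PER_UNIT,
        "resistance": length * RES_PER_UNIT,
    }


# Hypothetical floorplan: center coordinates of two cell groups.
group_centers = {"alu": (500, 500), "regfile": (2500, 1500)}
annotation = annotate_global_net(["alu", "regfile"], group_centers)
print(annotation)
```

Unlike the fanout-only table lookup, this estimate grows with the distance between the groups, which is the property that matters for the small fraction of nets that cross group boundaries.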
14
Daytona testchip 2
15
Improved Flow Part 1
Restructure RTL
Modify Physical Groupings
Implement RTL
RTL
Cell Library
Standard Wire Load
Synopsys Pass1 RTL Synthesis
Netlist
Cell Clusters
Net Groups
Avanti Floorplan
Layout Parasitics
Cell Placement
16
Improved Flow Part 2
Avanti Floorplan
Cell Placement
Annotated Global Nets
Custom Wire Load
Inplace Optimization
Synopsys Pass2 Resynthesis
Netlist
Cell Clusters
Net Groups
Avanti Place & Route
Layout Parasitics
Final Netlist
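
The flow feeds cell placement back into a custom wire load for the second synthesis pass. The slides do not describe how that custom model was computed, so the following is only a sketch of one plausible approach: average the half-perimeter length of placed nets per fanout to build the fanout_length entries of a table. The function name, the pin-coordinate input format, and the sample coordinates are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def custom_fanout_length(placed_nets):
    """Average half-perimeter length per fanout.

    placed_nets: list of nets, each a list of (x, y) pin positions with the
    driver first, so fanout = number of pins - 1.
    """
    lengths_by_fanout = defaultdict(list)
    for pins in placed_nets:
        fanout = len(pins) - 1
        xs, ys = zip(*pins)
        length = (max(xs) - min(xs)) + (max(ys) - min(ys))
        lengths_by_fanout[fanout].append(length)
    return {fo: mean(vals) for fo, vals in sorted(lengths_by_fanout.items())}


# Hypothetical placed nets inside one cell group.
placed_nets = [
    [(0, 0), (10, 0)],           # fanout 1, length 10
    [(0, 0), (0, 30)],           # fanout 1, length 30
    [(0, 0), (20, 0), (0, 20)],  # fanout 2, length 40
]
table = custom_fanout_length(placed_nets)
for fanout, length in table.items():
    print(f"fanout_length({fanout}, {length})")
```

A table built this way reflects the actual placement of one group rather than a library-wide average, which is the kind of successive refinement the multipass strategy calls for.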
17
Chip implementation specifics
18
Testchip 2 back-end design debriefing

Back-end process significantly accelerated
Back-end process required 2 months
Under 2 man-months of effort
The final die size matched the predicted die size
Physical cell grouping and early floorplanning resulted in no routing congestion in the final layout
Die utilization for final design exceeded 95%
Average cell size 2-3x smaller due to average reduction in net size
Timing closure achieved with only two iterations per pass
No manual fixes required post-layout
Tool runtimes 5-20x faster, few crashes