Title: Ion Mandoiu
1Challenges in Design Automation for Nanoscale
VLSI and DNA Systems
- Ion Mandoiu
- ECE Department, UC San Diego
2Outline
- Challenges for nanoscale VLSI design
- New algorithmic framework for global interconnect synthesis
- New methodology for redundant interconnect
- Future research directions in VLSI and DNA design automation
3Historical Trends in VLSI Scaling
- Exponential integration rate (Moore's Law)
- Exponential decrease in cost/transistor
- Tremendous economic impact
4Will these trends continue?
- Exponential integration rate expected to continue for 1-2 decades
- - Moore (2003): "No exponential is forever, but we can delay forever"
- International Technology Roadmap for Semiconductors (ITRS)
- - 800 experts from industry, academia, governments worldwide
- - Sets targets for R&D needs over a 15-year horizon
6ITRS 2001 targets
- Dynamic Random Access Memory (DRAM) ½ pitch: 22 nm by 2016
- Physical gate length 9 nm by 2016
[Figure: ITRS 2001 targets by production year]
8Nanoscale Integration Challenges
2.2 billion transistors/cm^2 by 2016!
9Nanoscale Integration Challenges
Global interconnect gets slower
Repeaters help
Local interconnect and gates get faster
10Nanoscale Integration Challenges
- Signal integrity
- Power consumption
- Manufacturing reliability
- Verification and test
- Manufacturing cost
11Implications for Nanoscale Design
- Challenges must be addressed at all design phases
- - Flow integration, early planning
- Need new methodologies
- - Interconnect-centric design, design for test, design for manufacturing
- → Need improved optimization algorithms
- - Highly scalable, predictable solution quality
12Outline
- Challenges for nanoscale VLSI design
- New algorithmic framework for global interconnect synthesis
- New methodology for redundant interconnect
- Future research directions in VLSI and DNA design automation
13Nanoscale VLSI Context
- Timing closure and signal integrity require
- Aggressive optimizations
- Buffer insertion
- Buffer sizing
- Pin assignment
- Wire sizing
- Simultaneous control of
- Routing resources
- Congestion
- Power consumption
- Need predictable, scalable, integrated approach
- 10^6 buffers / die in 50 nm technology
14Buffer Planning Methodologies
16Global Buffered Routing Framework
- Tile graph model → captures routing/buffer congestion
- Given
- - Tile graph G with wire capacity w(u,v) (routing channels between tiles u and v) and buffer capacity b(v) (buffer sites in tile v)
- - Netlist (2-pin nets)
- - Maximum buffer load U (in tiles)
- Find
- - Feasible buffered routing minimizing total routing area, a weighted sum of (#buffers) and (total wirelength)
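As a rough illustration of this formulation, the Python sketch below checks whether a set of buffered 2-pin routes respects the wire capacities w(u,v), the buffer capacities b(v), and the buffer load bound U, and then evaluates the weighted area objective. The function names, the representation of a route as a tile list plus buffer positions, and the weights `alpha` and `beta` are assumptions made for this example only.

```python
from collections import Counter

def is_feasible(routes, w, b, U):
    """routes: list of (path, buffer_positions); path is a list of tiles and
    buffer_positions are indices into path where a buffer is inserted."""
    wire_use, buf_use = Counter(), Counter()
    for path, bufs in routes:
        # Source, buffers, and sink split the route into driven segments.
        stops = [0] + sorted(bufs) + [len(path) - 1]
        if any(q - p > U for p, q in zip(stops, stops[1:])):
            return False  # some driver would drive more than U tiles
        for u, v in zip(path, path[1:]):
            wire_use[frozenset((u, v))] += 1
        for i in bufs:
            buf_use[path[i]] += 1
    wires_ok = all(n <= w.get(e, 0) for e, n in wire_use.items())
    bufs_ok = all(n <= b.get(v, 0) for v, n in buf_use.items())
    return wires_ok and bufs_ok

def routing_area(routes, alpha=1.0, beta=1.0):
    """Weighted sum of buffer count and wirelength (in tiles); weights are hypothetical."""
    n_buf = sum(len(bufs) for _, bufs in routes)
    wl = sum(len(path) - 1 for path, _ in routes)
    return alpha * n_buf + beta * wl
```

A route over five tiles in a row with U = 2 is infeasible without a buffer and becomes feasible with one buffer in the middle tile, provided that tile has a free buffer site.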
- Reformulation as integer multicommodity flow
problem
- Relax-and-round approach
- - Provably good solution quality [RaghavanT87]
- - Key to runtime scalability: approximate solution to the fractional relaxation
- - Generalizes edge-capacitated MCF approximation of [GargK98, F99]
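The rounding step can be sketched as randomized rounding in the style of Raghavan and Thompson: each net picks one of its fractional paths with probability equal to that path's share of the net's flow. This is a minimal sketch that assumes the fractional solution is already available as per-net path lists; the data layout and the name `round_paths` are illustrative, not taken from the talk.

```python
import random

def round_paths(fractional, rng):
    """fractional: dict net -> list of (path, flow), flows of a net summing to 1.
    Returns dict net -> one chosen path, drawn with probability equal to its flow."""
    chosen = {}
    for net, options in fractional.items():
        paths = [p for p, _ in options]
        flows = [f for _, f in options]
        chosen[net] = rng.choices(paths, weights=flows, k=1)[0]
    return chosen

# Example use: net "n1" splits its unit of flow between two candidate routes.
example = {"n1": [(("a", "b", "d"), 0.7), (("a", "c", "d"), 0.3)]}
picked = round_paths(example, random.Random(0))
```

The rounded solution may exceed some capacities; the quality guarantee cited on the slide concerns how far such violations can go, which this sketch does not attempt to reproduce.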
22High-Level Algorithm Idea
- Iteratively construct both primal and dual solutions
- In each phase, route one unit of flow for each commodity
- Flow routed along min-weight path w.r.t. dual variables (using Dijkstra)
- Dual variables for vertices/edges on routed path are scaled by a multiplicative factor
- → Exponential dependence on usage (often-used vertices/edges subsequently avoided)
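The phase structure described here can be sketched in Python as follows. This is a simplified, edge-capacity-only version in the spirit of the Garg-Könemann scheme; it leaves out the vertex (buffer) capacities and the final flow scaling, and the names `shortest_path`, `route_phases`, `phases`, and `eps` are assumptions for this example.

```python
import heapq
from collections import defaultdict

def shortest_path(adj, weight, s, t):
    """Dijkstra on an undirected graph; assumes t is reachable from s."""
    dist, prev, done = {s: 0.0}, {}, set()
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == t:
            break
        for v in adj[u]:
            nd = d + weight[frozenset((u, v))]
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path = [t]
    while path[-1] != s:
        path.append(prev[path[-1]])
    return path[::-1]

def route_phases(cap, nets, phases=20, eps=0.1):
    """cap: dict frozenset({u, v}) -> capacity; nets: list of (source, sink).
    Returns the average flow per edge over all phases (a fractional routing)."""
    adj = defaultdict(list)
    for e in cap:
        u, v = tuple(e)
        adj[u].append(v)
        adj[v].append(u)
    weight = {e: 1.0 / c for e, c in cap.items()}  # dual variables
    usage = defaultdict(int)
    for _ in range(phases):
        for s, t in nets:
            path = shortest_path(adj, weight, s, t)
            for u, v in zip(path, path[1:]):
                e = frozenset((u, v))
                usage[e] += 1
                weight[e] *= 1.0 + eps / cap[e]  # penalize used edges
    return {e: n / phases for e, n in usage.items()}
```

Because each use multiplies an edge's weight, heavily used edges become exponentially more expensive, so two nets competing for the same pair of parallel routes end up splitting across them.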
23Extensions
- Pin assignment
- Buffer sizing
- Wire sizing
- Layer assignment
- Sink delay upper bounds (Elmore delay)
- - Delay-constrained min-weight paths
- Multi-pin nets
- Simultaneous optimization!
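For reference, the Elmore delay of one buffered stage (a driver with output resistance, a uniform wire, and a load capacitance) has a simple closed form, sketched below. The parameter names and the numbers in the example are illustrative assumptions, not values from the talk.

```python
def elmore_stage_delay(r_drv, r_unit, c_unit, length, c_load):
    """Elmore delay of a driver feeding a uniform RC wire and a lumped load.

    r_drv: driver output resistance; r_unit, c_unit: wire resistance and
    capacitance per unit length; length: wire length; c_load: load capacitance.
    """
    c_wire = c_unit * length
    r_wire = r_unit * length
    # The driver sees the whole wire plus the load; the distributed wire sees
    # half of its own capacitance plus the load.
    return r_drv * (c_wire + c_load) + r_wire * (c_wire / 2 + c_load)

delay = elmore_stage_delay(r_drv=100, r_unit=1, c_unit=2, length=10, c_load=5)
```

The wire term grows quadratically with length, which is why splitting a long route into several buffered stages can lower the total delay.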
24Experimental Results
27Global Interconnect Summary
- Powerful algorithmic framework based on multicommodity flows
- Simultaneous consideration of wire and buffer congestion, pin and layer assignment, sizing, timing constraints
- Flexible tradeoff between runtime and solution quality
- Ongoing work
- Further improvements in algorithm scalability
- Window vs. tile buffer constraints
28Outline
- Challenges for nanoscale VLSI design
- New algorithmic framework for global interconnect synthesis
- New methodology for redundant interconnect
- Future research directions in VLSI and DNA design automation
29Trends in Manufacturing Reliability
- Defects difficult to control in nanoscale processes
- Interconnect defects increasingly dominant
30Previous Work
- Focused on reduction of short faults
- Conservative design rules
- Decompaction
- Routing for reliable manufacturing
- - DTR: Defect Tolerant Routing (Pitaksanonkul et al. 1985)
- - YOR: Yield Optimizing Routing (Kuo 1993)
- - Reliability-aware routing costs (Huijbregts, Xue & Jess 1995)
- Open faults become dominant due to change from aluminum to copper interconnect
- - Aluminum is etched → short faults
- - Copper is deposited → open faults
31POF Opens vs. Shorts
- Open faults are significantly (3x) more likely to
occur
32Techniques for Open POF Reduction
- Wire doubling
- Redundant interconnect
- Easy to integrate in current flows (post-processing approach)
- Potentially more effective use of resources
- How effective?
33Problem Formulation
- Manhattan Routed Tree Augmentation Problem
- Given
- Tree T routed in the Manhattan plane
- Feasible routing region
- Wirelength increase budget W
- Find
- Augmenting paths A within feasible region
- Such that
- Total length of augmenting paths is at most W
- Total length of biconnected edges in T ∪ A is maximum
- Wirelength increase budget used to balance open
POF decrease with short POF increase
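To make the objective concrete, the sketch below computes the biconnected length for a given set of augmenting paths. It assumes each augmenting path attaches to the tree at two tree nodes, so the tree edges it protects are exactly those on the tree path between its endpoints; the function name and data layout are illustrative.

```python
def biconnected_length(tree_edges, aug_paths):
    """tree_edges: dict (u, v) -> length; aug_paths: list of (a, b) tree nodes
    joined by an augmenting path. Returns total length of protected tree edges."""
    adj = {}
    for u, v in tree_edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    # Root the tree and record parent and depth of every node.
    root = next(iter(adj))
    parent, depth, stack = {root: None}, {root: 0}, [root]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in parent:
                parent[v], depth[v] = u, depth[u] + 1
                stack.append(v)
    covered = set()
    for a, b in aug_paths:
        # Walk both endpoints up to their lowest common ancestor.
        while a != b:
            if depth[a] < depth[b]:
                a, b = b, a
            covered.add(frozenset((a, parent[a])))
            a = parent[a]
    return sum(length for (u, v), length in tree_edges.items()
               if frozenset((u, v)) in covered)
```

On a chain 1-2-3-4 with edge lengths 2, 3, and 4, one augmenting path between nodes 1 and 3 protects the first two edges, for a biconnected length of 5.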
34Types of Allowed Augmenting Paths
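35Integer Linear Program (Type A-C Paths)
- Maximize $\sum_{e \in E(T)} \ell(e)\, y_e$ (total biconnected length)
- Subject to
- - $\sum_{p \in P} \ell(p)\, x_p \le W$ (wirelength budget)
- - $y_e \le \sum_{p \in P:\ p \text{ connects } T_u \text{ and } T_v} x_p$ for every tree edge $e = (u,v)$ ((u,v) is biconnected if some p connects T_u and T_v)
- - $x_p \in \{0,1\}$ for every $p \in P$ (the p with $x_p = 1$ give the augmenting paths)
- - $y_e \in \{0,1\}$ for every $e \in E(T)$ (the e with $y_e = 1$ give the biconnected tree edges)
- P: set of at most O(n^2) augmenting paths; T_u, T_v: the two subtrees of T left after removing edge (u,v); $\ell$: length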
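For very small instances the same optimization can be checked by exhaustive search, which may help when validating an ILP model. The sketch below assumes each candidate augmenting path is given by its length and the set of tree edges it makes biconnected; `best_augmentation` and this input format are hypothetical, and the search is exponential in the number of candidates.

```python
from itertools import combinations

def best_augmentation(edge_len, candidates, W):
    """edge_len: dict tree edge -> length.
    candidates: list of (path_length, set of tree edges the path protects).
    Returns (best biconnected length, indices of the chosen candidates)."""
    best = (0, ())
    for k in range(1, len(candidates) + 1):
        for subset in combinations(range(len(candidates)), k):
            if sum(candidates[i][0] for i in subset) > W:
                continue  # exceeds the wirelength budget
            covered = set().union(*(candidates[i][1] for i in subset))
            gain = sum(edge_len[e] for e in covered)
            if gain > best[0]:
                best = (gain, subset)
    return best
```

A tree edge counts once no matter how many chosen paths protect it, which mirrors the role of the 0/1 edge variables in the program.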
37Empirical Evaluation
- Compared algorithms
- Integer program solved using CPLEX
- Greedy augmentation algorithm
- Best-drop heuristic (Khuller-Raghavachari-Zhu 99)
- Recent genetic algorithm (Raidl-Ljubic 2002)
- Test Cases
- Random nets and nets extracted from real designs
- No routing obstacles
38Biconnectivity-Wirelength Tradeoff
Random 20-terminal nets
68% biconnectivity with 20% wirelength (WL) increase
39Max SPICE Delay (ps)
- 52-56 terminal nets, routed for min-area
- Redundant interconnect improves max delay
- 28% average, 62% max. improvement for 20% WL increase
43Redundant Interconnect Summary
- New methodology for redundant interconnect synthesis
- Easy to integrate in current flows
- Significant biconnectivity increase with small increase in wirelength
- Ongoing work
- Multiple net augmentation
- Simultaneous tree augmentation and decompaction
- Reliability with timing constraints
44Outline
- Challenges for nanoscale VLSI design
- New algorithmic framework for global interconnect synthesis
- New methodology for redundant interconnect
- Future research directions in VLSI and DNA design automation
45Ongoing and Future Research Directions
- VLSI Design Automation
- - Physical design (non-Manhattan interconnect architectures, clock synthesis, ...)
- - Design for test, built-in self-test
- - Design for manufacturing, cost optimizations (multi-project wafers, reduced-field reticles)
- Sensor and Ad Hoc Wireless Networks
- - Broadcasting and routing protocols
- - Power consumption
- DNA Array Design Automation
- - Scalable tools for next-generation DNA arrays
- - Enhanced Design Flow
46DNA Probe Arrays
- Introduced in the early 1990s
- Short DNA probes that hybridize to unknown genetic material
- Used in gene expression monitoring, mutation detection, single nucleotide polymorphism (SNP) analysis, medical diagnosis
47DNA Array Manufacturing Process
Very Large Scale Immobilized Polymer Synthesis (VLSIPS)
48Technology Scaling Challenges
- 1,000x1,000 arrays in commercial production today
- 10,000x10,000 array sites in next generation
- Scaling effects: increased unwanted illumination
49Example Probe Synthesis
50Measure of Unwanted Illumination
Unwanted illumination is measured by border length
51Synchronous Synthesis
- Periodic deposition sequence, e.g., (ACTG)^k
- Probes grow in sync, one nucleotide per period
→ Border length = 2 × Hamming distance
52 2-D Placement Problem (Synchronous Synthesis)
Edge cost = 2 × Hamming distance
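Under synchronous synthesis, two neighboring probes that differ at a position are unmasked in different steps of that position's period, so each mismatch puts a border between them in two masks. The sketch below evaluates the resulting cost of a 2-D placement; the function names and the grid-of-strings representation are assumptions for this example.

```python
def hamming(p, q):
    """Number of positions where two equal-length probes differ."""
    return sum(a != b for a, b in zip(p, q))

def border_length(grid):
    """Total border length of a placement given as a list of rows of probes,
    charging 2 x Hamming distance for each pair of adjacent sites."""
    rows, cols = len(grid), len(grid[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                total += 2 * hamming(grid[r][c], grid[r][c + 1])
            if r + 1 < rows:
                total += 2 * hamming(grid[r][c], grid[r + 1][c])
    return total
```

Placing similar probes next to each other lowers this sum, which is what the placement problem tries to achieve.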
53Previous Approaches
- Hubbell, 1990s
- - Find TSP tour w.r.t. Hamming distance
- - Thread TSP tour into the grid row by row
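A toy version of this two-step idea is sketched below. It substitutes a nearest-neighbor heuristic for a real TSP solver and threads the tour in serpentine order so that consecutive probes stay adjacent at row ends; both choices, and the function names, are assumptions of this sketch rather than details given in the talk.

```python
def hamming(p, q):
    """Number of positions where two equal-length probes differ."""
    return sum(a != b for a, b in zip(p, q))

def nn_tour(probes):
    """Greedy nearest-neighbor ordering, starting from the first probe."""
    tour, rest = [probes[0]], list(probes[1:])
    while rest:
        nxt = min(rest, key=lambda q: hamming(tour[-1], q))
        rest.remove(nxt)
        tour.append(nxt)
    return tour

def thread(tour, cols):
    """Lay the tour into rows of width cols, reversing every other row."""
    rows = [tour[i:i + cols] for i in range(0, len(tour), cols)]
    return [row if i % 2 == 0 else row[::-1] for i, row in enumerate(rows)]

grid = thread(nn_tour(["AA", "TT", "AT", "TA"]), cols=2)
```

Threading only controls similarity along the tour; vertically adjacent probes are not optimized, which is one motivation for the placement methods that follow.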
54Highly-Scalable 2-D Placement
- Epitaxial placement
- Simulates crystal growth
- Efficient tile row versions
57Asynchronous Synthesis
- Arbitrary deposition sequence
- Probes grow at different speeds
- Border depends on embeddings into deposition sequence
- → 3-D placement problem
58 Optimal Probe Embedding
- Dynamic programming algorithm similar to LCS
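One way to read the LCS analogy is the following dynamic program, given as a sketch under stated assumptions: the neighbors' embeddings are fixed, `unmasked[t]` counts how many of the `k` neighbors receive a nucleotide at deposition step `t`, and the cost of an embedding is the number of (step, neighbor) pairs where the probe and the neighbor are in different mask states. The function name and this cost model are illustrative.

```python
def best_embedding_cost(deposition, probe, unmasked, k):
    """Minimum conflict count over all embeddings of probe into deposition.

    D[i][j] = best cost after i deposition steps with j probe letters placed.
    Returns float('inf') if the probe is not a subsequence of the deposition.
    """
    INF = float("inf")
    n, m = len(deposition), len(probe)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0
    for i in range(1, n + 1):
        for j in range(m + 1):
            # Probe masked at step i: conflicts with every unmasked neighbor.
            best = D[i - 1][j] + unmasked[i - 1]
            # Probe unmasked at step i: conflicts with every masked neighbor.
            if j > 0 and deposition[i - 1] == probe[j - 1]:
                best = min(best, D[i - 1][j - 1] + k - unmasked[i - 1])
            D[i][j] = best
    return D[n][m]
```

The table has (n + 1) x (m + 1) entries, so re-embedding a single probe takes time proportional to the product of the deposition length and the probe length.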
59 3-D Placement Algorithms
- Simultaneous placement and alignment (asynchronous epitaxial)
- - Slow, poor solution quality
- Synchronous placement + iterative probe embedding
- - Scalable, better solution quality
- Synchronous placement + asynchronous sliding window matching
- - Scalable, best solution quality
60DNA Arrays Summary
- Experimental flow with fully scalable components improves border length by 5% over industry designs
- Ongoing and future work
- Integrated DNA array flow
- Lab-on-chip sensors
61Thank You for Your Attention!