Title: Strategies for Improving the Parametric Yield and Profits of 3D ICs
1 ICCAD07, San Jose, CA
Strategies for Improving the Parametric Yield and Profits of 3D ICs
Cesare Ferri, Sherief Reda, R. Iris Bahar
Division of Engineering, Brown University
Providence, RI 02912
2 Why 3D ICs?
- Optical lithography is approaching its natural limits → 3D integration can extend Moore's law by going vertically
- Advantages
- Form factor
- Heterogeneous integration
- Better performance
- Practically limited by
- Yield
from future-fab.com
3 Manufacturing steps for 3D ICs
SEM image of TSV (source: RTI International)
Source: IMEC
- Blind through-silicon vias (TSVs), which do not go all the way through the wafer, are created either during transistor manufacturing or immediately after.
- TSVs are coated with a layer of insulating material before copper deposition.
- The wafer is thinned from the back until the TSV vertical wires are exposed, which enables bonding to the next layer.
4 Examples of 3D ICs
- 3D memory
- 3D sensors
- 3D processors (stacked CPU, L2, DRAM)
[Figures: 3D NAND memory (source: Toshiba); 3D processor stack with DRAM, L2/L3 cache, and CPU layers]
5 Bonding techniques in 3D integration
Die-to-wafer is the most promising for large-scale integration
[Figure: die-to-wafer bonding (die placed on Wafer 1) and die-to-die bonding]
6 Impact of defects on 3D IC functional yield
- Each wafer has its own defect map
- Wafer-to-wafer integration can result in dramatic yield loss
- For example, a 60% yield process will lead to
- 36% yield for 3D ICs composed of 2 die
- about 7.8% yield for 3D ICs composed of 5 die (0.6^5)
- Die-to-wafer and die-to-die integration make it possible to identify and use only the known good die
yield wafer maps
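The yield figures in the example above follow from multiplying the per-die yields of the stacked layers. The sketch below reproduces that arithmetic; it assumes that defects on different wafers are independent and that every layer has the same yield, and the name `stacked_yield` is illustrative rather than taken from the talk.

```python
def stacked_yield(die_yield: float, layers: int) -> float:
    """Functional yield of a stack built without pre-selecting good die.

    Assumes independent defects and the same per-die yield on every layer,
    so the stack works only if all `layers` die work.
    """
    if not 0.0 <= die_yield <= 1.0:
        raise ValueError("die_yield must be between 0 and 1")
    if layers < 1:
        raise ValueError("layers must be at least 1")
    return die_yield ** layers


if __name__ == "__main__":
    for layers in (1, 2, 5):
        print(f"{layers} die: {stacked_yield(0.60, layers):.1%}")
```

With a 60% process, two layers give 0.6^2 = 0.36 and five layers give 0.6^5 ≈ 0.078, which is why using only known good die matters more as the stack grows.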
7 Process variations in 2D ICs
- Process variations
- In devices: gate length variations, dopant fluctuations/profile, line-edge roughness (LER)
- In interconnects: spatial variations (non-uniform etch), CMP-induced variations (dishing, erosion)
- Process variations impact key electrical parameters (speed, leakage) of ICs and consequently their parametric yield and revenues
intra-die variations
inter-die variations
8 Impact of process variations on the parametric yield of 3D ICs
- Wafer-to-wafer integration dictates the outcome (and final distribution) and the parametric yield of 3D ICs
- Die-to-wafer / die-to-die offer integration flexibility that can be exploited to maximize the parametric yield of 3D ICs
inter-die variations of wafer 1
inter-die variations of wafer 2
- What is the parametric yield if these two wafers
are stacked for 3D IC production?
9 Objectives of this talk
- Develop a model to characterize the impact of process variations on the parametric yield of 3D ICs
- Design 3D integration strategies to maximize the parametric yield
- More complicated than 2D ICs
- Different die that belong to the same 3D IC are fabricated on different wafers and then integrated together
- The impact of TSVs has been taken into account
10 Modeling the impact of process variations. Design example: logic-memory 3D IC
11 How can we measure the speed or performance of a 3D processor?
- Possible parameter choice
- MIPS: Millions of Instructions Per Second
- MIPS = Clock rate / (CPI × 10^6)
- Decouple the physical aspects (clock rate) from the particular microarchitecture and application (CPI: Cycles Per Instruction)
- In our case, for a given microarchitecture and application (benchmark suite), the CPI depends on the L2 latency
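The MIPS definition on this slide can be written as a one-line function. The sketch below is only an illustration of the formula; the function name and the sample clock rate and CPI are hypothetical values, not measurements from the talk.

```python
def mips(clock_hz: float, cpi: float) -> float:
    """MIPS = clock rate / (CPI * 10^6), with the clock rate in Hz."""
    if clock_hz <= 0 or cpi <= 0:
        raise ValueError("clock_hz and cpi must be positive")
    return clock_hz / (cpi * 1e6)


if __name__ == "__main__":
    # Hypothetical die: 3.0 GHz clock and a CPI of 1.5 for some benchmark.
    print(mips(3.0e9, 1.5))
```

Because the clock rate comes from the CPU die and the CPI depends on the L2 latency, the same function captures how both die in a stack contribute to the final performance figure.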
12 Modeling 3D processor performance under process variations
[Figure labels: L2 cache access time, latency calculator, CPU frequency]
13 2. Designing 3D integration strategies that maximize the performance/parametric yield
[Figure: two wafer maps with per-die values; die positions are not recoverable]
L2 caches, access time (ns): 1.1, 1.1, 1.2, 1.0, 1.0, 1.2, 1.2, 1.0, 1.1, 1.1, 1.0
CPUs, frequency (GHz): 3.1, 3.1, 3.2, 3.2, 3.0, 3.1, 2.9, 2.9, 2.8, 3.1, 3.2
- How can we exploit the flexibility in die-to-wafer and die-to-die integration to maximize the performance/parametric yield of 3D ICs?
14 Heuristic 3D integration strategies
- Fast-Fast (FF): Sort the die in descending order of speed and then integrate in order
- Matches the fast with the fast and the slow with the slow
- Produces systems with extremely fast or extremely slow performance
- Fast-Slow (FS): Sort the die in one wafer in ascending order of speed and the die in the other wafer in descending order
- Matches the fast with the slow and the slow with the fast
- Produces systems with average performance
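The two heuristics can be sketched as sorting followed by pairing in order. The code below is a minimal illustration, assuming CPU speed is given as a frequency (higher is faster) and L2 speed as an access time (lower is faster); the function names and the three-die sample are illustrative.

```python
def _check(cpu_ghz, l2_ns):
    if len(cpu_ghz) != len(l2_ns):
        raise ValueError("need the same number of CPU and L2 die")


def fast_fast(cpu_ghz, l2_ns):
    """FF: pair the fastest CPU with the fastest L2, and so on down the list."""
    _check(cpu_ghz, l2_ns)
    # Fastest CPU first (highest frequency).
    cpus = sorted(range(len(cpu_ghz)), key=lambda i: cpu_ghz[i], reverse=True)
    # Fastest L2 first (lowest access time).
    l2s = sorted(range(len(l2_ns)), key=lambda j: l2_ns[j])
    return list(zip(cpus, l2s))


def fast_slow(cpu_ghz, l2_ns):
    """FS: pair the fastest CPU with the slowest L2, and so on."""
    _check(cpu_ghz, l2_ns)
    cpus = sorted(range(len(cpu_ghz)), key=lambda i: cpu_ghz[i], reverse=True)
    # Slowest L2 first (highest access time).
    l2s = sorted(range(len(l2_ns)), key=lambda j: l2_ns[j], reverse=True)
    return list(zip(cpus, l2s))


if __name__ == "__main__":
    cpu_ghz = [3.1, 2.9, 3.2]   # illustrative per-die frequencies
    l2_ns = [1.1, 1.0, 1.2]     # illustrative per-die access times
    print("FF pairs (cpu index, l2 index):", fast_fast(cpu_ghz, l2_ns))
    print("FS pairs (cpu index, l2 index):", fast_slow(cpu_ghz, l2_ns))
```

Each returned pair is (CPU index, L2 index). Neither heuristic looks at the resulting MIPS of a pair, which is what the optimal strategy addresses.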
15 Optimal integration strategies for 3D ICs with two die
- Graph-theoretic approach
- Nodes represent die
- Edges represent potential 3D ICs
- Edge labels represent the cost of the 3D ICs
- The cost is the characteristic that we want to optimize
- In our case we want to optimize the MIPS
[Figure: graph connecting CPUs from wafer 1 to L2 caches from wafer 2, with edges labeled by cost (MIPS)]
Problem: find an assignment that assigns each die to exactly one 3D IC such that the total cost is optimized.
The problem can be solved optimally in O(N^3) time (N is the number of die) using the Hungarian algorithm.
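As an illustration of the assignment step, the sketch below implements a textbook O(N^3) Hungarian method with row and column potentials and uses it to maximize total MIPS by negating the matrix. It is not the authors' code; the function names and the 3x3 MIPS matrix are hypothetical, and the brute-force helper is included only as a cross-check for tiny inputs.

```python
from itertools import permutations


def hungarian_min(cost):
    """Return assign[i] = column matched to row i, minimizing total cost.

    Square-matrix Hungarian method using row/column potentials.
    Indices are 1-based internally; index 0 is a sentinel.
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)      # row potentials
    v = [0.0] * (n + 1)      # column potentials
    match = [0] * (n + 1)    # match[j] = row currently matched to column j
    way = [0] * (n + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], inf, 0
            for j in range(1, n + 1):
                if used[j]:
                    continue
                cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                if cur < minv[j]:
                    minv[j], way[j] = cur, j0
                if minv[j] < delta:
                    delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:   # reached a free column
                break
        while j0:                # flip the augmenting path
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    assign = [0] * n
    for j in range(1, n + 1):
        assign[match[j] - 1] = j - 1
    return assign


def best_pairing(mips):
    """Pair CPU i with L2 die assign[i] so that the total MIPS is maximal."""
    return hungarian_min([[-x for x in row] for row in mips])


def brute_force_best_total(mips):
    """Exhaustive check, only usable for very small N."""
    n = len(mips)
    return max(sum(mips[i][p[i]] for i in range(n)) for p in permutations(range(n)))


if __name__ == "__main__":
    # Hypothetical MIPS of pairing CPU i (row) with L2 die j (column).
    mips = [[2100, 1900, 2000],
            [1800, 1700, 1750],
            [2200, 1950, 2100]]
    pairing = best_pairing(mips)
    total = sum(mips[i][j] for i, j in enumerate(pairing))
    print(pairing, total, total == brute_force_best_total(mips))
```

Negating the matrix turns the maximization into the minimization form the method expects. The same routine would optimize revenue instead if the matrix held prices rather than MIPS.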
16 Experimental setup
CPU parameters: (1) 2-way, 3-cycle L1 cache of 16 KB; (2) 8-way 2 MB L2 cache; (3) main memory latency of 50 cycles; (4) issue width of 4.
17 Modeling the impact of process variations on the CPU and the L2 cache
[Figure: modeling flows for the CPU and the L2 cache. Labels: 9-stage critical path with gate length variations; L2 configuration; 70nm Berkeley model; PRACTICS; gate length variations; SPICE]
Inter-die variations for 100 L2 die: average 1.46 ns, STD 11.06
Inter-die variations for 100 CPU die: average 3.12 GHz, STD 10.33
[Unlabeled figure values: 48, 54]
18 Values obtained for latency and the results of SimpleScalar simulations
- SimpleScalar accepts the L2 latency, expressed in CPU cycles, as a configuration parameter
- No need to simulate each possible scenario (i.e., N×N)
- Runtime can be drastically cut down
- Example
- N=100 CPU cycle time values (ns) × N=100 L2 access time values (ns) → only 3, 4, 5, 6, 7, or 8 L2 latency cycles
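The runtime saving comes from the fact that many (CPU, L2) pairs map to the same integer latency. The sketch below shows one way to compute this; it assumes the latency in cycles is the L2 access time rounded up to a whole number of CPU cycles, which the slides do not state explicitly. The `simulate` argument is a caller-supplied stand-in for one simulator run, not a SimpleScalar API.

```python
import math


def l2_latency_cycles(access_time_ns: float, cpu_cycle_ns: float) -> int:
    """Assumed rule: round the L2 access time up to whole CPU cycles."""
    if access_time_ns <= 0 or cpu_cycle_ns <= 0:
        raise ValueError("times must be positive")
    return math.ceil(access_time_ns / cpu_cycle_ns)


def distinct_latencies(cpu_cycle_times_ns, l2_access_times_ns):
    """All integer latencies that occur over the N x N die combinations."""
    return sorted({
        l2_latency_cycles(access, cycle)
        for cycle in cpu_cycle_times_ns
        for access in l2_access_times_ns
    })


def cpi_table(latencies, simulate):
    """Run the expensive simulation once per distinct latency.

    `simulate` is any callable mapping a latency in cycles to a CPI.
    """
    return {latency: simulate(latency) for latency in latencies}


if __name__ == "__main__":
    cycles = [0.31, 0.35]   # illustrative CPU cycle times (ns)
    access = [1.0, 1.6]     # illustrative L2 access times (ns)
    latencies = distinct_latencies(cycles, access)
    print(latencies)
    # Placeholder CPI model, purely for demonstration.
    print(cpi_table(latencies, lambda lat: 1.0 + 0.05 * lat))
```

Every pair then looks up its CPI by latency instead of triggering its own simulation, so the number of simulator runs equals the number of distinct latencies rather than N×N.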
19 Binning strategy
- Define four bins (according to the MIPS) and characterize each bin with a price ($)
- Slow (cheapest)
- Medium
- Fast
- Extreme (very expensive)
- Label each chip with the price of the bin it belongs to.
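Binning by MIPS can be sketched as a threshold lookup. The bin boundaries and prices below are hypothetical placeholders chosen for illustration; the talk does not give the actual values.

```python
from bisect import bisect_right

BIN_NAMES = ["slow", "medium", "fast", "extreme"]
# Hypothetical lower edges (in MIPS) of the medium, fast, and extreme bins.
BIN_EDGES_MIPS = [1800.0, 2000.0, 2200.0]
# Hypothetical price in dollars for each of the four bins.
BIN_PRICES_USD = [100.0, 200.0, 400.0, 900.0]


def label_chip(mips: float):
    """Return (bin name, bin price) for a chip with the given MIPS.

    A chip exactly on an edge is placed in the higher bin.
    """
    index = bisect_right(BIN_EDGES_MIPS, mips)
    return BIN_NAMES[index], BIN_PRICES_USD[index]


def total_revenue(chip_mips):
    """Sum of bin prices over all produced chips."""
    return sum(label_chip(m)[1] for m in chip_mips)


if __name__ == "__main__":
    chips = [1700.0, 1950.0, 2100.0, 2300.0]
    for m in chips:
        print(m, label_chip(m))
    print("revenue:", total_revenue(chips))
```

Counting chips per bin, or summing their prices, gives the figures used to compare integration strategies.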
20 Results: Impact of proposed 3D integration strategies on performance and yield
OPT and FF yield almost twice the number of extreme processors compared to other strategies.
[Chart values, labels not recoverable: 38, 37, 36, 28, 21]
OPT reduces the number of processors in the slow bin by almost half compared to FF.
[Chart values, labels not recoverable: 14, 14, 12, 8, 3]
21 Results: Impact of proposed 3D integration strategies on performance and yield
FF produces the highest number of CPUs in the slow bin.
[Chart values, labels not recoverable: 48, 39, 21]
FS produces a large number of CPUs in the medium and fast bins, but the fewest CPUs in the extreme bin.
[Chart values, labels not recoverable: 17, 3, 12, 10]
22 Price models and binning
- Use a Price_Function to assign a price ($) according to the chip performance (MIPS)
- The Price_Function is obtained by interpolation of real market prices (e.g., Intel processors)
- We can optimize directly for price by interchanging $ and MIPS, or we can just solve for MIPS
Source: www.tomshardware.com
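A price function of this kind can be sketched as interpolation over a few (MIPS, price) points. The slides do not say which interpolation was used, so the sketch below assumes piecewise-linear interpolation with clamping at both ends; the data points are hypothetical and do not come from any real price list.

```python
from bisect import bisect_right

# Hypothetical (MIPS, price in dollars) points, sorted by MIPS.
PRICE_POINTS = [(1600.0, 80.0), (1900.0, 160.0), (2100.0, 300.0), (2300.0, 800.0)]


def price_function(mips: float, points=PRICE_POINTS) -> float:
    """Piecewise-linear interpolation of price over MIPS.

    Values outside the sampled range are clamped to the nearest end point.
    """
    xs = [x for x, _ in points]
    if mips <= xs[0]:
        return points[0][1]
    if mips >= xs[-1]:
        return points[-1][1]
    k = bisect_right(xs, mips)          # first sample strictly above mips
    (x0, y0), (x1, y1) = points[k - 1], points[k]
    return y0 + (y1 - y0) * (mips - x0) / (x1 - x0)


if __name__ == "__main__":
    for m in (1500.0, 1750.0, 2000.0, 2400.0):
        print(m, price_function(m))
```

Feeding the price of each candidate (CPU, L2) pair into the assignment step, instead of its MIPS, would optimize revenue directly.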
23 Results: Impact of proposed 3D integration strategies on yield and sales revenues
- The optimal variation-aware 3D integration technique improves profits by 12.5% (compared to RND matching)
- Optimizing the number of chips in the fastest components is very profitable.
24 Conclusions and Future Work
- We evaluated the impact of process variations on 3D ICs
- We modeled the parametric yield using realistic performance and price models
- We proposed a number of integration strategies to improve the parametric yield of 3D ICs
- Our results show that we are able to reshape the final distribution of produced ICs to maximize the performance and revenues of 3D ICs
- Current and future work will extend this work to cover leakage current and temperature-aware modeling