Title: Digital Integrated Circuits A Design Perspective
1Digital Integrated Circuits: A Design Perspective
Jan M. Rabaey, Anantha Chandrakasan, Borivoje Nikolic
Semiconductor Memories
December 20, 2002
2Chapter Overview
3Semiconductor Memory Classification
Read-Write Memory
- Random Access: SRAM, DRAM
- Non-Random Access: FIFO, LIFO, Shift Register, CAM
Non-Volatile Read-Write Memory
- EPROM, E2PROM, FLASH
Read-Only Memory
- Mask-Programmed, Programmable (PROM)
4Memory Timing Definitions
5Memory Architecture Decoders
Intuitive architecture for N x M memory (N words of M bits, select signals S_0 to S_N-1): too many select signals, since N words require N select signals.
With a decoder (address bits A_0 to A_K-1): K = log2 N.
6Array-Structured Memory Architecture
Problem: ASPECT RATIO or HEIGHT >> WIDTH
Figure annotations: amplify swing to rail-to-rail amplitude; selects appropriate word
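The aspect-ratio fix can be illustrated with a small sketch. It assumes that the lower address bits select a word within a row (column decoder) and the upper bits select the row (row decoder). The function names and the example sizes are hypothetical, chosen only for illustration.

```python
def split_address(address, row_bits, col_bits):
    """Split a word address into (row, column) for an array-structured memory."""
    if not 0 <= address < (1 << (row_bits + col_bits)):
        raise ValueError("address out of range")
    row = address >> col_bits                  # upper bits go to the row decoder
    column = address & ((1 << col_bits) - 1)   # lower bits go to the column decoder
    return row, column


def array_shape(n_words, word_bits, col_bits):
    """Physical (rows, columns) when 2**col_bits words share one row."""
    words_per_row = 1 << col_bits
    if n_words % words_per_row:
        raise ValueError("n_words must be a multiple of the words per row")
    return n_words // words_per_row, word_bits * words_per_row


if __name__ == "__main__":
    # Hypothetical 1M x 8 memory: flat organization versus 256 words per row.
    print("flat  :", array_shape(1 << 20, 8, 0))
    print("folded:", array_shape(1 << 20, 8, 8))
    print("row, column of address 0xB6 with 4+4 bits:", split_address(0xB6, 4, 4))
```

Folding several words into one row trades height for width, which is what brings the array closer to a square.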
7Hierarchical Memory Architecture
Advantages
1. Shorter wires within blocks
2. Block address activates only 1 block => power savings
8Block Diagram of 4 Mbit SRAM
Figure: 128 K array blocks (Block 0, Block 1, ..., Block 30, Block 31) with global row decoder, subglobal row decoders, and local row decoders (Hirose90)
9Contents-Addressable Memory
Figure labels: I/O Buffers, Commands, Validity Bits, Priority Encoder, Address Decoder, 2^9
10Memory Timing Approaches
11Read-Only Memory Cells
Figure: three cell types (Diode ROM, MOS ROM 1, MOS ROM 2), each shown storing a 1 and a 0 between word line WL and bit line BL; supply rails VDD and GND
12MOS OR ROM
Figure: 4 x 4 array with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], cell transistors tied to VDD, and pull-down loads biased by V_bias
13MOS NOR ROM
Figure: 4 x 4 array with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], pull-up devices to VDD, and cell transistors tied to GND
14MOS NOR ROM Layout
Cell (9.5λ x 7λ)
Programming using the Active Layer Only
Layers: Polysilicon, Metal1, Diffusion, Metal1 on Diffusion
15MOS NOR ROM Layout
Cell (11λ x 7λ)
Programming using the Contact Layer Only
Layers: Polysilicon, Metal1, Diffusion, Metal1 on Diffusion
16MOS NAND ROM
Figure: 4 x 4 array with pull-up devices to VDD, bit lines BL[0] to BL[3], and word lines WL[0] to WL[3] gating series transistor chains
All word lines are high by default, with the exception of the selected row.
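The difference between the NOR and NAND ROM read-out can be sketched with a purely logical model. It assumes ideal switches and ideal pull-ups: in the NOR array the selected word line goes high and a cell transistor pulls its bit line low; in the NAND array the selected word line goes low and a cell transistor breaks the series pull-down chain. The function names and the `has_transistor` matrix are hypothetical.

```python
def nor_rom_read(has_transistor, row):
    """NOR ROM: only the selected word line is high.

    A transistor at the selected row pulls the bit line to GND (reads 0);
    with no transistor the pull-up keeps the bit line high (reads 1).
    """
    return [0 if device else 1 for device in has_transistor[row]]


def nand_rom_read(has_transistor, row):
    """NAND ROM: all word lines high by default, selected word line low.

    The bit line is pulled low only if every position in the series chain
    conducts: either no transistor is present (shorted) or its gate is high.
    """
    n_rows = len(has_transistor)
    word_lines = [0 if r == row else 1 for r in range(n_rows)]
    outputs = []
    for col in range(len(has_transistor[0])):
        chain_conducts = all(
            (not has_transistor[r][col]) or word_lines[r] == 1
            for r in range(n_rows)
        )
        outputs.append(0 if chain_conducts else 1)
    return outputs


if __name__ == "__main__":
    layout = [[True, False], [False, True]]  # hypothetical 2 x 2 programming
    for r in range(2):
        print(r, nor_rom_read(layout, r), nand_rom_read(layout, r))
```

With the same transistor placement the two arrays read complementary data, so the programming pattern has to be chosen with the array type in mind.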
17MOS NAND ROM Layout
Cell (8λ x 7λ)
Programming using the Metal-1 Layer Only
Layers: Polysilicon, Diffusion, Metal1 on Diffusion
18NAND ROM Layout
Cell (5λ x 6λ)
Programming using Implants Only
Layers: Polysilicon, Threshold-altering implant, Metal1 on Diffusion
19Equivalent Transient Model for MOS NOR ROM
Model for NOR ROM
- Word line parasitics
- Wire capacitance and gate capacitance
- Wire resistance (polysilicon)
- Bit line parasitics
- Resistance not dominant (metal)
- Drain and Gate-Drain capacitance
20Equivalent Transient Model for MOS NAND ROM
Model for NAND ROM
Figure labels: VDD, BL, WL, C_L, r_bit, c_bit, r_word, c_word
- Word line parasitics
- Similar to NOR ROM
- Bit line parasitics
- Resistance of cascaded transistors dominates
- Drain/Source and complete gate capacitance
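The impact of the word-line parasitics can be estimated with the usual first-order result for a distributed RC line, t = 0.38 R C, where R and C are the total line resistance and capacitance. The sketch below is only an illustration: the per-cell resistance and capacitance and the array width are hypothetical values, and driving the line from both ends is shown as one assumed remedy (by symmetry each driver then sees half the line).

```python
def distributed_rc_delay(r_per_cell, c_per_cell, n_cells):
    """50% delay of a distributed RC line: 0.38 * R_total * C_total."""
    r_total = r_per_cell * n_cells
    c_total = c_per_cell * n_cells
    return 0.38 * r_total * c_total


# Hypothetical polysilicon word line: 20 ohm and 2 fF per cell, 1024 cells.
R_CELL = 20.0
C_CELL = 2e-15
N_CELLS = 1024

one_side = distributed_rc_delay(R_CELL, C_CELL, N_CELLS)
# Driven from both ends, the worst-case point is the middle of the line,
# so each driver effectively charges a line of half the length.
both_sides = distributed_rc_delay(R_CELL, C_CELL, N_CELLS // 2)

if __name__ == "__main__":
    print(f"driven from one side  : {one_side * 1e9:.2f} ns")
    print(f"driven from both sides: {both_sides * 1e9:.2f} ns")
```

Because the delay grows with the square of the line length, halving the effective length cuts the delay by a factor of four.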
21Decreasing Word Line Delay
22Precharged MOS NOR ROM
Figure: 4 x 4 NOR array with precharge devices to VDD clocked by φ_pre, word lines WL[0] to WL[3], bit lines BL[0] to BL[3], and cell transistors tied to GND
PMOS precharge device can be made as large as necessary, but the clock driver becomes harder to design.
23Non-Volatile Memories: The Floating-gate Transistor (FAMOS)
Figure: device cross-section (source, drain, gate, floating gate, oxide thickness t_ox, n+ regions, p substrate) and schematic symbol
24Floating-Gate Transistor Programming
25A Programmable-Threshold Transistor
26FLOTOX EEPROM
Figure: FLOTOX transistor (gate, floating gate, source, drain, n+ regions, p substrate; oxide dimensions 20-30 nm and 10 nm) and Fowler-Nordheim I-V characteristic (I versus V_GD, with -10 V and 10 V marked)
27EEPROM Cell
Figure labels: BL, WL
Absolute threshold control is hard. Unprogrammed transistor might be depletion => 2-transistor cell
28Flash EEPROM
Figure: cell cross-section with control gate, floating gate, thin tunneling oxide, n+ source, n+ drain, and p-substrate; arrows mark erasure and programming
Many other options
29Cross-sections of NVM cells
EPROM
Flash
Courtesy Intel
30Basic Operations in a NOR Flash Memory: Erase
31Basic Operations in a NOR Flash Memory: Write
32Basic Operations in a NOR Flash Memory: Read
33NAND Flash Memory
Word line(poly)
Unit Cell
Source line (Diff. Layer)
Courtesy Toshiba
34NAND Flash Memory
Courtesy Toshiba
35Characteristics of State-of-the-art NVM
36Read-Write Memories (RAM)
STATIC (SRAM)
- Data stored as long as supply is applied
- Large (6 transistors/cell)
- Fast
- Differential
DYNAMIC (DRAM)
- Periodic refresh required
- Small (1-3 transistors/cell)
- Slower
- Single Ended
37 6-transistor CMOS SRAM Cell
Figure: cell schematic with transistors M1 to M6, word line WL, supply VDD, storage nodes Q and Q-bar, bit lines BL and BL-bar
38CMOS SRAM Analysis (Read)
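Figure: read of a cell storing Q = 1 (Q-bar = 0), with WL raised, both bit lines precharged to VDD, bit-line capacitances C_bit, and transistors M1, M4, M5, M6 labelled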
39CMOS SRAM Analysis (Read)
Plot: voltage rise (V) versus cell ratio (CR); vertical axis 0 to 1.2 V, horizontal axis 0 to 3, with CR = 1.2 marked
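The shape of the voltage-rise curve can be reproduced with a first-order model. The sketch assumes that during a read the access transistor is velocity saturated and the pull-down transistor operates in its linear region, and that CR is the pull-down to access transistor size ratio. Equating the two currents gives a quadratic in the voltage rise ΔV at the node that stores 0. The parameter values (VDD = 2.5 V, VTn = 0.4 V, VDSATn = 0.63 V) are assumptions for illustration, not values taken from the slides.

```python
import math


def read_voltage_rise(cr, vdd=2.5, vtn=0.4, vdsat=0.63):
    """Voltage rise at the internal '0' node during a read.

    Solves  (vdd - dv - vtn) * vdsat - vdsat**2 / 2
            = cr * ((vdd - vtn) * dv - dv**2 / 2)
    for the smaller root dv.
    """
    overdrive = vdd - vtn
    root = math.sqrt(vdsat ** 2 * (1 + cr) + (cr * overdrive) ** 2)
    return (vdsat + cr * overdrive - root) / cr


if __name__ == "__main__":
    for cr in (0.5, 1.0, 1.2, 2.0, 3.0):
        print(f"CR = {cr:3.1f}  ->  voltage rise = {read_voltage_rise(cr):.3f} V")
```

Under these assumptions a larger cell ratio keeps the internal node closer to ground, which is why the pull-down device is sized stronger than the access device to avoid upsetting the cell on a read.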
40CMOS SRAM Analysis (Write)
41CMOS SRAM Analysis (Write)
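42 6T-SRAM Layout
43Resistance-load SRAM Cell
Figure: cell schematic with load resistors R_L to VDD, transistors M1 to M4, word line WL, storage nodes Q and Q-bar, bit lines BL and BL-bar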
44SRAM Characteristics
45 3-Transistor DRAM Cell
46 3T-DRAM Layout
47 1-Transistor DRAM Cell
Write: C_S is charged or discharged by asserting WL and BL.
Read: Charge redistribution takes place between bit line and storage capacitance.
Voltage swing is small, typically around 250 mV.
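The size of the read swing follows from charge conservation between the storage capacitance and the precharged bit line. The sketch below assumes ideal charge sharing with no leakage or coupling; the function name and the capacitance and voltage values are hypothetical.

```python
def bitline_swing(v_cell, v_pre, c_s, c_bl):
    """Bit-line voltage change after charge redistribution.

    Charge conservation gives
        dV = (v_cell - v_pre) * c_s / (c_s + c_bl)
    where v_cell is the stored cell voltage and v_pre the precharge level.
    """
    return (v_cell - v_pre) * c_s / (c_s + c_bl)


if __name__ == "__main__":
    # Assumed values: 30 fF storage cell, 270 fF bit line, 1.25 V precharge.
    C_S, C_BL, V_PRE = 30e-15, 270e-15, 1.25
    print("reading a stored 2.5 V:", bitline_swing(2.5, V_PRE, C_S, C_BL), "V")
    print("reading a stored 0 V  :", bitline_swing(0.0, V_PRE, C_S, C_BL), "V")
```

Because the bit-line capacitance is much larger than the storage capacitance, the swing is only a small fraction of the stored voltage, and its sign carries the data.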
48DRAM Cell Observations
- 1T DRAM requires a sense amplifier for each bit line, due to charge redistribution read-out.
- DRAM memory cells are single ended, in contrast to SRAM cells.
- The read-out of the 1T DRAM cell is destructive; read and refresh operations are necessary for correct operation.
- Unlike the 3T cell, the 1T cell requires the presence of an extra capacitance that must be explicitly included in the design.
- When writing a 1 into a DRAM cell, a threshold voltage is lost. This charge loss can be circumvented by bootstrapping the word lines to a higher value than VDD.
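The threshold loss in the last observation can be made concrete with a one-line model. It assumes an NMOS access transistor that stops conducting once the storage node reaches the word-line voltage minus one threshold, and it ignores the body effect. The function name and the voltages are hypothetical.

```python
def stored_high_level(v_bl, v_wl, v_t):
    """Highest voltage an NMOS access transistor can write onto the cell.

    The storage node charges until it reaches the bit-line voltage or
    until the transistor turns off at v_wl - v_t, whichever comes first.
    """
    return min(v_bl, v_wl - v_t)


if __name__ == "__main__":
    VDD, VT = 2.5, 0.5  # assumed supply and threshold voltage
    print("word line at VDD     :", stored_high_level(VDD, VDD, VT), "V")
    print("word line bootstrapped:", stored_high_level(VDD, 3.2, VT), "V")
```

Raising the word line by at least one threshold above VDD lets the full VDD level reach the storage node.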
49Sense Amp Operation
50 1-T DRAM Cell
Figure: cross-section and layout (capacitor, transistor M1, word line)
Uses Polysilicon-Diffusion Capacitance
Expensive in Area
51SEM of poly-diffusion capacitor 1T-DRAM
52Advanced 1T DRAM Cells
Figure: Trench Cell and Stacked-capacitor Cell cross-sections
Labels: word line, transfer gate, isolation, insulating layer, cell plate, cell plate Si, capacitor dielectric layer, capacitor insulator, storage electrode, storage node poly, refilling poly, 2nd field oxide, Si substrate
53Static CAM Memory Cell
54CAM in Cache Memory
Figure: CAM array beside SRAM array, with hit logic, address decoder, input drivers, and sense amps / input drivers; signals Address, Tag, Hit, Data, R/W
55Periphery
- Decoders
- Sense Amplifiers
- Input/Output Buffers
- Control / Timing Circuitry
56Row Decoders
Collection of 2^M complex logic gates, organized in regular and dense fashion
(N)AND Decoder
NOR Decoder
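The two decoder styles compute the same one-hot function, which a short logical sketch can confirm. It assumes `address_bits[0]` is A0 (the least significant bit); the function names are hypothetical. The AND form combines true and complemented address literals, and the NOR form applies De Morgan's law to the inverted literals.

```python
from itertools import product


def and_decoder(address_bits):
    """WL_i is the AND of the address literals that spell out index i."""
    k = len(address_bits)
    word_lines = []
    for i in range(1 << k):
        literals = [
            address_bits[b] if (i >> b) & 1 else 1 - address_bits[b]
            for b in range(k)
        ]
        word_lines.append(int(all(literals)))
    return word_lines


def nor_decoder(address_bits):
    """Same function built as a NOR of the inverted literals (De Morgan)."""
    k = len(address_bits)
    word_lines = []
    for i in range(1 << k):
        inverted = [
            1 - address_bits[b] if (i >> b) & 1 else address_bits[b]
            for b in range(k)
        ]
        word_lines.append(int(not any(inverted)))
    return word_lines


if __name__ == "__main__":
    for bits in product((0, 1), repeat=2):
        print(bits, and_decoder(list(bits)), nor_decoder(list(bits)))
```

Each of the 2^M word lines needs its own M-input gate, which is why the decoder is laid out as a regular, dense array.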
57Hierarchical Decoders
Multi-stage implementation improves performance
Figure: NAND decoder using 2-input pre-decoders; address pairs A0, A1 and A2, A3 (true and complemented) are predecoded, and the predecoded lines are combined to generate WL0, WL1, ...
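The predecoding idea can be sketched for a 4-bit address. The model below is logical only: each pair of address bits is decoded into four one-hot lines, and every word line combines one line from each group with a 2-input gate (modeled here as an AND, standing in for a NAND followed by an inverter). The function names are hypothetical.

```python
def predecode_2bit(a_low, a_high):
    """Decode two address bits into four one-hot predecoded lines."""
    return [int(a_low == (i & 1) and a_high == (i >> 1)) for i in range(4)]


def hierarchical_decoder(a0, a1, a2, a3):
    """16 word lines built from two 2-bit predecoders and 2-input gates."""
    low = predecode_2bit(a0, a1)    # predecoded A1 A0 group
    high = predecode_2bit(a2, a3)   # predecoded A3 A2 group
    # Word line i combines line (i mod 4) of the low group with
    # line (i div 4) of the high group.
    return [high[i >> 2] & low[i & 3] for i in range(16)]


if __name__ == "__main__":
    address = 0b1001
    bits = [(address >> b) & 1 for b in range(4)]
    print(hierarchical_decoder(*bits))
```

Compared with sixteen 4-input gates, this arrangement uses eight predecoder gates and sixteen 2-input gates, so the final stage has a smaller fan-in and the address lines see less load.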
58Dynamic Decoders
Figure: 2-input NOR decoder and 2-input NAND decoder, each with precharge devices clocked by φ, address inputs A0, A1 and their complements, and outputs WL0 to WL3
59 4-input pass-transistor based column decoder
Figure label: 2-input NOR decoder
Advantages: speed (tpd does not add to overall memory access time); only one extra transistor in signal path
Disadvantage: large transistor count
60 4-to-1 tree based column decoder
Figure: bit lines BL0 to BL3 merged to output D through pass transistors gated by A0, A1 and their complements
Number of devices drastically reduced
Delay increases quadratically with # of sections; prohibitive for large decoders
Solutions:
- buffers
- progressive sizing
- combination of tree and pass transistor approaches
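The trade-off of the tree decoder can be sketched with a logical model plus a device count. The sketch assumes 2^K bit lines, with address bit A0 controlling the tree level nearest the bit lines; the function names are hypothetical. The count covers only the pass transistors of the tree itself.

```python
def tree_column_select(bit_lines, address_bits):
    """Route one of 2**K bit lines to the output through K tree levels."""
    if len(bit_lines) != 1 << len(address_bits):
        raise ValueError("need 2**K bit lines for K address bits")
    level = list(bit_lines)
    for a in address_bits:
        # Each pair of lines merges through two pass transistors, one gated
        # by the address bit and one by its complement.
        level = [level[i + a] for i in range(0, len(level), 2)]
    return level[0]


def tree_device_count(k):
    """Pass transistors in the tree: 2**K + 2**(K-1) + ... + 2."""
    return 2 * ((1 << k) - 1)


def tree_series_devices(k):
    """Pass transistors in series between a bit line and the output."""
    return k


if __name__ == "__main__":
    lines = ["BL0", "BL1", "BL2", "BL3"]
    print(tree_column_select(lines, [1, 0]))   # A0 = 1, A1 = 0
    for k in (2, 6, 10):
        print(k, tree_device_count(k), tree_series_devices(k))
```

The device count stays modest, but the number of series pass transistors grows with K, which is the source of the delay penalty noted above.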
61Decoder for circular shift-register
62Sense Amplifiers
Idea: Use Sense Amplifier
Figure: small transition at input -> s.a. -> output
63Differential Sense Amplifier
Figure: differential amplifier with transistors M1 to M5, inputs bit and bit-bar, enable SE, supply VDD, internal node y, output Out
Directly applicable to SRAMs
64Differential Sensing: SRAM
65Latch-Based Sense Amplifier (DRAM)
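Figure: cross-coupled inverter pair between BL and BL-bar, with equalization signal EQ, enable signals SE and SE-bar, and supply VDD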
Initialized in its meta-stable point with EQ
Once adequate voltage gap created, sense amp
enabled with SE
Positive feedback quickly forces output to a
stable operating point.
66Charge-Redistribution Amplifier
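Figure: concept (M1 gated by V_ref between a large capacitance C_large at node V_L and a small capacitance C_small at node V_S; transistors M2, M3) and transient response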
67Charge-Redistribution Amplifier: EPROM
Figure: EPROM array cell M1 (WL, BL, C_BL), column decoder device M2 (WLC, C_col), cascode device M3 (V_casc, C_out), load M4 (SE, VDD), output Out
68Single-to-Differential Conversion
How to make a good Vref?
69Open bitline architecture with dummy cells
Figure: sense amplifier (EQ, SE, VDD) between left bit line BLL and right bit line BLR; each side has word lines, storage capacitors C_S, and one dummy cell
70DRAM Read Process with Dummy Cell
Plots (V versus t in ns, both axes 0 to 3): BL and BL-bar when reading 0; BL and BL-bar when reading 1; control signals EQ, WL, SE
71Voltage Regulator
Figure: regulator (M_drive between VDD and V_DL, with V_REF and V_bias) and equivalent model (amplifier comparing V_REF and V_DL, driving M_drive)
72Charge Pump
73DRAM Timing
74RDRAM Architecture
Figure labels: Bus, Clocks, Data, memory array, network, mux/demux, Column packet dec. with demux, Row packet dec. with demux
75Address Transition Detection
Figure: each address bit A0, A1, ..., A(N-1) feeds a DELAY element (t_d); the resulting pulses are combined on a shared node pulled up to VDD to produce ATD
76Reliability and Yield
77Sensing Parameters in DRAM
Plot: sensing parameters versus memory capacity (bits/chip) from 4K to 64G, log vertical scale 10 to 1000; curves for C_D (fF), V_smax (mV), Q_S (fC), C_S (fF), and V_DD (V)
Q_S = C_S V_DD / 2
V_smax = Q_S / (C_S + C_D)
From Itoh01
78Noise Sources in 1T DRAM
Figure labels: substrate, BL, adjacent BL, C_WBL, α-particles, WL, leakage, C_S, electrode, C_cross
79Open Bit-line Architecture Cross Coupling
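Figure: sense amplifier with equalization EQ between BL and BL-bar; word lines WL0, WL1 and dummy word lines WL_D on each side; coupling capacitances C_WBL, bit-line capacitances C_BL, and cell capacitances C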
80Folded-Bitline Architecture
81Transposed-Bitline Architecture
82Alpha-particles (or Neutrons)
Figure: α-particle track through a cell (WL, BL, VDD, SiO2, n+ region), generating electron-hole pairs along its path
1 Particle ~ 1 Million Carriers
83Yield
Yield curves at different stages of process
maturity (from Veendrick92)
84Redundancy
Figure: memory array with redundant rows and redundant columns; row address into row decoder, column address into column decoder, and a fuse bank
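Row redundancy can be sketched as an address remapping step in front of the array. The model below is a behavioural illustration only: the class name, the dictionary used as a fuse bank, and the policy of assigning spare rows in order are all assumptions, not a description of any particular chip.

```python
class RowRedundantMemory:
    """Memory with spare rows selected through a programmable fuse bank."""

    def __init__(self, n_rows, n_spare_rows):
        self.rows = [None] * n_rows
        self.spare_rows = [None] * n_spare_rows
        self.fuse_bank = {}  # faulty row address -> spare row index

    def blow_fuse(self, faulty_row):
        """Permanently redirect a faulty row to the next free spare row."""
        if faulty_row in self.fuse_bank:
            return
        if len(self.fuse_bank) == len(self.spare_rows):
            raise RuntimeError("no spare rows left")
        self.fuse_bank[faulty_row] = len(self.fuse_bank)

    def _locate(self, row):
        if row in self.fuse_bank:
            return self.spare_rows, self.fuse_bank[row]
        return self.rows, row

    def write(self, row, word):
        storage, index = self._locate(row)
        storage[index] = word

    def read(self, row):
        storage, index = self._locate(row)
        return storage[index]


if __name__ == "__main__":
    mem = RowRedundantMemory(n_rows=8, n_spare_rows=2)
    mem.blow_fuse(3)        # row 3 failed at test time
    mem.write(3, 0xAB)
    print(hex(mem.read(3)), mem.rows[3], mem.spare_rows)
```

From the outside the address map is unchanged; the repair is limited by the number of spare rows provided.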
85Error-Correcting Codes
Example: Hamming Codes
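A minimal sketch of a Hamming code is the (7,4) variant, which protects 4 data bits with 3 parity bits and corrects any single-bit error. The bit ordering below (parity bits at positions 1, 2, and 4) is the conventional textbook layout and is assumed here; the function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits as [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(code):
    """Return (data bits, syndrome); a nonzero syndrome is the 1-based
    position of the single bit that was flipped and has been corrected."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[4] ^= 1                      # inject a single-bit error
    print(hamming74_correct(word))
```

The syndrome directly names the failing bit position, so correction is a single XOR.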
86Redundancy and Error Correction
87Sources of Power Dissipation in Memories
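P = V_DD I_DD
I_DD = Σ C_i ΔV_i f + Σ I_DCP
Figure: CHIP current components for an array of n rows and m columns
- ARRAY, selected cells: m i_act
- ARRAY, non-selected cells: m (n - 1) i_hld
- ROW DEC: n C_DE V_INT f
- COLUMN DEC: m C_DE V_INT f
- PERIPHERY: C_PT V_INT f and I_DCP
From Itoh00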
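The contributions labelled in the figure can be added up in a short sketch. It assumes the total supply current is simply the sum of the array, decoder, and periphery terms, with n rows and m columns, and it uses the symbol names from the figure (i_act, i_hld, C_DE, C_PT, V_INT, I_DCP). All numeric values are hypothetical placeholders.

```python
def supply_current(n, m, i_act, i_hld, c_de, c_pt, v_int, f, i_dcp):
    """Sum of the current components of the memory chip.

    array     : m selected cells plus m*(n-1) non-selected cells
    decoders  : row (n) and column (m) decoder switching current
    periphery : switched capacitance plus static current
    """
    array = m * i_act + m * (n - 1) * i_hld
    decoders = (n + m) * c_de * v_int * f
    periphery = c_pt * v_int * f + i_dcp
    return array + decoders + periphery


def chip_power(v_dd, i_dd):
    """P = V_DD * I_DD."""
    return v_dd * i_dd


if __name__ == "__main__":
    # Hypothetical operating point, for illustration only.
    i_dd = supply_current(
        n=1024, m=1024,
        i_act=10e-6, i_hld=1e-9,
        c_de=0.1e-12, c_pt=50e-12,
        v_int=1.8, f=100e6, i_dcp=1e-3,
    )
    print(f"I_DD = {i_dd * 1e3:.2f} mA, P = {chip_power(2.5, i_dd) * 1e3:.2f} mW")
```

Separating the terms this way makes it easy to see which component dominates for a given array size and cycle rate.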
88Data Retention in SRAM
SRAM leakage increases with technology scaling
89Suppressing Leakage in SRAM
Figure: two techniques applied to a group of SRAM cells
- Inserting Extra Resistance: low-threshold sleep transistors between VDD and V_DD,int and between V_SS,int and ground
- Reducing the supply voltage: sleep signal switches V_DD,int between VDD and V_DDL
90Data Retention in DRAM
From Itoh00
91Case Studies
- Programmable Logic Array
- SRAM
- Flash Memory
92PLA versus ROM
- Structured approach to random logic
- Two-level logic implementation: NOR-NOR (product of sums), NAND-NAND (sum of products)
IDENTICAL TO ROM!
- ROM: fully populated
- PLA: one element per minterm
Note: Importance of PLAs has drastically reduced
1. slow
2. better software techniques (multi-level logic synthesis)
But
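The two-plane structure of a PLA can be sketched with a small logical model. It assumes an AND-plane that forms product terms and an OR-plane that sums them, with both planes realized as NOR gates (the OR-plane followed by an output inverter). The data layout, the function names, and the example functions f0 and f1 are hypothetical.

```python
def nor(*inputs):
    """NOR gate: 1 only when no input is 1."""
    return int(not any(inputs))


def pla_evaluate(inputs, and_plane, or_plane):
    """Evaluate a PLA.

    and_plane: list of product terms, each {input index: required value}
    or_plane : list of outputs, each a list of product-term indices
    """
    # AND-plane as NOR: a term is 1 when none of its literals is violated.
    products = [
        nor(*[inputs[i] ^ required for i, required in term.items()])
        for term in and_plane
    ]
    # OR-plane as NOR followed by an inverter at the output.
    return [1 - nor(*[products[p] for p in terms]) for terms in or_plane]


# Hypothetical example: f0 = x0 x1 + not(x2), f1 = x0 x1
AND_PLANE = [{0: 1, 1: 1}, {2: 0}]
OR_PLANE = [[0, 1], [0]]

if __name__ == "__main__":
    for x in range(8):
        bits = [(x >> i) & 1 for i in range(3)]
        print(bits, pla_evaluate(bits, AND_PLANE, OR_PLANE))
```

A ROM holding the same two functions would store all 8 input combinations, while this PLA needs only 2 product-term rows.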
93Programmable Logic Array
Pseudo-NMOS PLA
Figure: AND-plane and OR-plane with pseudo-NMOS loads to VDD, pull-down transistors to GND, inputs X0, X1, X2 (true and complemented), outputs f0, f1
94Dynamic PLA
Figure: AND-plane clocked by φ_AND and OR-plane clocked by φ_OR (precharge devices to VDD, evaluation devices to GND), inputs X0, X1, X2 (true and complemented), outputs f0, f1
95Clock Signal Generation for self-timed dynamic PLA
(a) Clock signals: φ, φ_AND, φ_OR, with precharge time t_pre and evaluation time t_eval
(b) Timing generation circuitry: φ_AND and φ_OR derived from φ using a dummy AND row
96PLA Layout
97 4 Mbit SRAM: Hierarchical Word-line Architecture
98Bit-line Circuitry
Figure labels: Bit-line load, Block select, ATD, BEQ, Local WL, Memory cell, B/T, CD, I/O line, I/O, Sense amplifier
99Sense Amplifier (and Waveforms)
Figure labels: I/O, SEQ, Block select, ATD, BS, SA, DATA, Dei
100 1 Gbit Flash Memory
From Nakamura02
101Writing Flash Memory
Read level (4.5 V)
Number of cells
Final Distribution
Evolution of thresholds
From Nakamura02
102 125 mm2 1 Gbit NAND Flash Memory
32 word lines x 1024 blocks
10.7mm
2kB Page buffer cache
Charge pump
16896 bit lines
11.7mm
From Nakamura02
103 125 mm2 1 Gbit NAND Flash Memory
- Technology: 0.13 µm p-sub CMOS triple-well, 1 poly, 1 polycide, 1 W, 2 Al
- Cell size: 0.077 µm2
- Chip size: 125.2 mm2
- Organization: 2112 x 8b x 64 page x 1k block
- Power supply: 2.7 V-3.6 V
- Cycle time: 50 ns
- Read time: 25 µs
- Program time: 200 µs / page
- Erase time: 2 ms / block
From Nakamura02
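The organization figures can be cross-checked with a few lines of arithmetic. The sketch takes the page, block, and timing numbers from the list above; the split of each 2112-byte page into 2048 data bytes plus 64 spare bytes is an assumption (a common NAND convention) and is not stated on the slide.

```python
PAGE_BYTES = 2112          # from the organization line
BITS_PER_BYTE = 8
PAGES_PER_BLOCK = 64
BLOCKS = 1024
DATA_BYTES_PER_PAGE = 2048  # assumption: remaining 64 bytes are spare area
PROGRAM_TIME_S = 200e-6     # per page
READ_TIME_S = 25e-6         # per page

bits_per_page = PAGE_BYTES * BITS_PER_BYTE
total_bits = bits_per_page * PAGES_PER_BLOCK * BLOCKS
data_bits = DATA_BYTES_PER_PAGE * BITS_PER_BYTE * PAGES_PER_BLOCK * BLOCKS

# Upper bounds that ignore the serial data transfer at the 50 ns cycle time.
program_rate_bytes_per_s = PAGE_BYTES / PROGRAM_TIME_S
read_rate_bytes_per_s = PAGE_BYTES / READ_TIME_S

if __name__ == "__main__":
    print("bits per page      :", bits_per_page)
    print("total bits         :", total_bits)
    print("data bits          :", data_bits, "= 2**30" if data_bits == 2**30 else "")
    print(f"page program rate  : {program_rate_bytes_per_s / 1e6:.2f} MB/s")
    print(f"page read rate     : {read_rate_bytes_per_s / 1e6:.2f} MB/s")
```

The bits per page match the 16896 bit lines shown on the die photo slide, and under the assumed data/spare split the data portion comes to exactly 1 Gbit.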
104Semiconductor Memory Trends (up to the 90s)
Memory size as a function of time: x 4 every three years
105Semiconductor Memory Trends (updated)
From Itoh01
106Trends in Memory Cell Area
From Itoh01
107Semiconductor Memory Trends
Technology feature size for different SRAM
generations