Title: Some Results on Codes for Flash Memory
1. Some Results on Codes for Flash Memory
- Michael Mitzenmacher
- Includes work with Hilary Finucane, Zhenming Liu,
Flavio Chierichetti
2. Flash Memory
- Now becoming the standard for many products and devices.
- Even flash hard drives becoming a standard.
- But flash memory works differently than traditional memories.
- New, interesting questions.
3. Basics of Flash
- Data organized into cells
- Can write at the cell level
- Cells contain electrons
- Can ADD electrons at the cell level
- Typical range is 2-4 possible states per cell, but may increase (256 someday?)
- Cells organized into blocks
- Can only ERASE at the block level
- Blocks can be thousands/hundreds of thousands of cells
4. The Problem with Erasures
- Erasing a block is expensive
- In terms of time: solve by preemptive moves of data.
- In terms of wear.
- Limited life cycles imply that minimizing block erasures is an important goal.
5. Basics of Flash
- Reading and one-way writing (adding electrons) is easy.
- Writing general values is hard.
- What should our data representation look like in such a setting?
[Figure: cell states 0 2 3 1 -> 2 2 3 1 (a cell value increases); cell states 0 2 3 1 -> 0 2 1 1 (a cell value decreases)]
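The one-way write constraint can be sketched in a few lines. The function name `can_write_without_erase` is hypothetical and used only for illustration; the cell values are the ones shown on the slide.

```python
def can_write_without_erase(current, target, q):
    """Return True if `target` is reachable from `current` by only adding charge.

    Each cell holds a level in 0..q-1. A write can only raise a level;
    lowering any cell requires erasing the whole block.
    """
    if len(current) != len(target):
        raise ValueError("states must have the same number of cells")
    if any(not 0 <= level < q for level in list(current) + list(target)):
        raise ValueError("cell levels must lie in 0..q-1")
    return all(t >= c for c, t in zip(current, target))


start = [0, 2, 3, 1]
print(can_write_without_erase(start, [2, 2, 3, 1], q=4))  # first cell only rises
print(can_write_without_erase(start, [0, 2, 1, 1], q=4))  # third cell would have to drop
```

Any target that fails this check forces a block erase, which is the expensive operation the rest of the talk tries to postpone.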
6. Big Underlying Question
- How should flash change our underlying algorithms, data structures, data representation?
- Memory structure, hierarchy has big impact on performance.
- Algorithmists should care!
- Here focusing on basic question of data representation.
7. Some History
- Write-once memories (WOMs)
- Introduced by Rivest and Shamir, early 1980s.
- Punch cards, optical disks.
- Can turn 0s to 1s, but not back again.
- Question: How many punch card bits do you need to represent t rewrites of a k-bit value?
- Starting point for this kind of analysis.
- Better schemes than the naïve kt bits.
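As an illustration of beating the naïve kt bits, here is a sketch of the classic Rivest-Shamir example. It stores a k = 2 bit value t = 2 times in 3 write-once bits rather than 4. The helper names (`FIRST`, `complement`, `wom_decode`, `wom_write`) are chosen here for illustration and do not come from the talk.

```python
# First-write codewords: each 2-bit value maps to a word of weight <= 1.
FIRST = {0b00: (0, 0, 0), 0b01: (1, 0, 0), 0b10: (0, 1, 0), 0b11: (0, 0, 1)}


def complement(word):
    return tuple(1 - b for b in word)


def wom_decode(cells):
    """Recover the 2-bit value from a 3-bit write-once state."""
    for value, word in FIRST.items():
        # Second-write codewords are the complements of the first-write ones.
        if cells in (word, complement(word)):
            return value
    raise ValueError("expected a 3-bit state")


def wom_write(cells, value):
    """Return a state storing `value`, only ever turning 0s into 1s."""
    if wom_decode(cells) == value:
        return cells  # value unchanged, nothing to punch
    if sum(cells) == 0:
        return FIRST[value]
    if sum(cells) == 1:
        return complement(FIRST[value])
    raise ValueError("both writes already used")


state = (0, 0, 0)
state = wom_write(state, 0b01)  # first write
state = wom_write(state, 0b10)  # second write, no bit goes from 1 back to 0
print(state, wom_decode(state))
```

The second write works because the complement of a different weight-1 word always covers the single bit already set.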
8. Floating Codes
- Data representation for flash memory.
- State is a length-n sequence of q-ary numbers.
- Represents a block of n cells; each cell holds an electric charge, q states.
- State mapped to variable values.
- Gives a length-k sequence of l-ary numbers.
- State changes by increasing one or more cell values, or resetting the entire block.
- Resets are expensive!
9. Floating Codes: The Problem
- As variable values change, need state to track variables.
- How do we choose the mapping function from states to variables AND the transition function from variable changes to state changes to maximize the time between reset operations?
- These codes do not correct errors. Just data representation.
- Errors a separate issue.
10. Formal Model
- General Codes
- We usually consider limited variation: one variable changes per step.
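The formulas on this slide did not survive. The following is a sketch of the usual floating-code formulation for the limited-variation case, consistent with the D and R notation used elsewhere in the talk. The exact form on the original slide, and the symbol \bot for a reset, are assumptions.

```latex
\begin{align*}
D &: \{0,\dots,q-1\}^n \to \{0,\dots,l-1\}^k
  && \text{(decoding: cell state to variable values)} \\
R &: \{0,\dots,q-1\}^n \times \{1,\dots,k\} \times \{0,\dots,l-1\}
     \to \{0,\dots,q-1\}^n \cup \{\bot\}
  && \text{(update: set variable } i \text{ to value } v\text{)}
\end{align*}
% If R(s, i, v) = s' \neq \bot, then s'_j \ge s_j for every cell j,
% and D(s') equals D(s) except that variable i now has value v.
% \bot (an assumption of this sketch) stands for "reset the block".
```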
11. Example
Track k = 4 bits (so l = 2) with n = 8 cells having q = 4 states.
Cell state            D(state)
3 2 2 0 3 0 3 1  ->  1 0 1 0
  R: change bit 3
3 2 2 0 3 1 3 1  ->  1 0 0 0
  R: change bit 2
3 2 3 0 3 1 3 1  ->  1 1 0 0
  R: change bit 1
3 3 2 0 3 1 3 1  ->  0 1 0 0
  R: change bit 1
1 0 1 0 0 0 0 0  ->  1 1 0 0
12. History
- Floating codes introduced by Jiang, Bohossian, Bruck (ISIT 2007) as model for flash memory.
- Designed to maximize worst-case time between resets.
- New multidimensional flash codes suggested by Yaakobi, Vardy, Siegel, Wolf in Allerton 2008.
- Average case studied by Finucane, Liu, Mitzenmacher in Allerton 2008.
13. Contribution 1: New Worst-Case Codes
- Hilary Finucane's senior thesis.
- Similar codes also found simultaneously by Yaakobi et al.
- Simple construction, best known performance.
- Tracks k bits of data, for even k.
- Performance measured by deficiency.
- Max possible updates is n(q-1).
- Deficiency is smallest t such that n(q-1)-t updates are always possible.
14. Mod-Based Codes
- Break block into groups of k cells.
- Each group will represent 1 bit.
- And at most one active group per bit.
- Parity of group determines value of bit.
- Increase a cell by 1 each time the bit changes.
- How do we know which bit for each group?
- Start with jth cell within a group to represent bit j.
- As cells fill go right, moving back to first cell at end.
- Either last empty cell is j - 1, or only non-full cell is j - 1; either way, can figure out which bit.
- Maximum deficiency k^2 q. Independent of n!
15. Examples
Track k = 8 bits with cells having q = 4 states.
0 0 0 0 3 0 0 0    Bit 5 is 1
0 0 0 0 3 3 2 0    Bit 5 is 0
3 3 3 3 3 3 2 0    Bit 1 is 0
3 3 1 3 3 3 3 3    Bit 4 is 0
0 0 0 0 0 0 0 0    Empty block, ignore
3 3 3 3 3 3 3 3    Full block, ignore
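The decoding rule for a single group can be sketched as follows, and checked against the examples on this slide. The function name `decode_group` and the 1-based bit numbering in its return value are assumptions for illustration. The sketch covers decoding only, not the update rule or the choice of active group.

```python
def decode_group(cells, q):
    """Decode one group of k cells under the mod-based scheme.

    Returns (bit_index, bit_value), with bit_index counted from 1, or None
    when the group is entirely empty or entirely full and is ignored.
    """
    k = len(cells)
    full = q - 1
    if all(c == 0 for c in cells) or all(c == full for c in cells):
        return None
    value = sum(cells) % 2  # parity of the group gives the bit value
    if 0 in cells:
        # The written run starts at cell j, so the last empty cell is j - 1:
        # find the non-empty cell whose cyclic predecessor is empty.
        start = next(i for i in range(k) if cells[i] != 0 and cells[i - 1] == 0)
    else:
        # No empty cell left: the only non-full cell is cell j - 1.
        start = (next(i for i in range(k) if cells[i] != full) + 1) % k
    return start + 1, value


for group in ([0, 0, 0, 0, 3, 0, 0, 0],
              [0, 0, 0, 0, 3, 3, 2, 0],
              [3, 3, 3, 3, 3, 3, 2, 0],
              [3, 3, 1, 3, 3, 3, 3, 3],
              [0, 0, 0, 0, 0, 0, 0, 0],
              [3, 3, 3, 3, 3, 3, 3, 3]):
    print(group, decode_group(group, q=4))
```

Indexing `cells[i - 1]` with `i = 0` wraps to the last cell, which matches the "moving back to first cell at end" rule.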
16. Further Improvements
- Can improve basic construction by being more careful as the number of available cells gets small.
- Can prove O(kq (log_2 k)(log_q k)) deficiency.
- Use smaller blocks of cells, but explicitly write which bit each one stores, when number of cells gets small.
17. Contribution 2: Average Case
- Argument: Worst-case time between resets is not the right design criterion.
- Many resets in a lifetime.
- Mass-produced product.
- Potential to model user behavior.
- Statistical performance guarantees more appropriate.
- Expected time between resets.
- Time with high probability.
- Given a model.
18. Specific Contributions
- Problem definition / model
- Codes for simple cases
19. Formal Model: Average Case
- Above when
- Cost is 0 when R moves to a cell state above the previous one, 1 otherwise.
- Assumption: variable changes given by a Markov chain.
- Example: ith bit changes with prob. p_i
- Given D, R, gives Markov chain on cell states.
- Let π be the equilibrium distribution on cell states.
- Goal is to minimize average cost
- Same as maximizing average time between resets.
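To make "average time between resets" concrete, here is a Monte Carlo sketch for a deliberately naive baseline code, not one of the codes from the talk. It uses two cells, where cell i simply counts the flips of bit i. The update model (exactly one bit flips per step, bit 0 with probability p, otherwise bit 1) and the names `updates_before_reset` and `mean_updates` are assumptions of this sketch.

```python
import random


def updates_before_reset(q, p, rng):
    """Count the updates a naive 2-cell code absorbs before a reset.

    Cell i is incremented each time bit i flips, so the parity of cell i
    is the value of bit i. A reset is needed once a flip hits a full cell.
    """
    cells = [0, 0]
    updates = 0
    while True:
        i = 0 if rng.random() < p else 1  # which bit flips this step
        if cells[i] == q - 1:
            return updates  # the cell is full: the block must be reset
        cells[i] += 1
        updates += 1


def mean_updates(q, p, trials=2000, seed=0):
    """Estimate the expected number of updates between resets."""
    rng = random.Random(seed)
    return sum(updates_before_reset(q, p, rng) for _ in range(trials)) / trials


for p in (0.1, 0.5):
    print(p, mean_updates(q=8, p=p))
```

When p is far from 1/2, this baseline fills one cell long before the other, which is the kind of waste an average-case design tries to avoid.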
20. Variations
- Many possible variations
- Multiple variables change per step
- More general random processes for values
- Rules limiting transitions
- General costs, optimizations
- Hardness results?
- Conjecture: some variations NP-hard or worse.
21. Building Block Code: n = 2, k = 2, l = 2
- 2 bit values.
- 2 cells.
- Code based on striped Gray code.
- Expected time/time with high probability before reset: 2q - o(q)
- Asymptotically optimal for all p, 0 < p < 1.
- Worst-case optimal is approx. 3q/2.
D(0,0) = 00, D(1,3) = 11, R((1,0),2,1) = (2,0)
22. Proof Sketch
- Even cells: down with probability p, right with probability 1-p.
- Odd cells: right with probability p, down with probability 1-p.
- Code hugs the diagonal.
- Right/down moves approximately balance for first 2q-o(q) steps.
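The walk described in this proof sketch can be simulated directly. The following is a sketch under stated assumptions: a cell (r, c) is taken to be "even" when r + c is even, the walk starts at (0, 0), and leaving the q x q grid is counted as a reset. The names `walk_length` and `mean_walk_length` are hypothetical.

```python
import random


def walk_length(q, p, rng):
    """Number of moves that stay inside the q x q grid of cell states.

    On even cells the walk moves down with probability p and right with
    probability 1 - p; on odd cells the two probabilities are swapped.
    """
    r = c = 0
    steps = 0
    while True:
        even = (r + c) % 2 == 0
        # Down with probability p on even cells, 1 - p on odd cells.
        down = (rng.random() < p) == even
        if down:
            r += 1
        else:
            c += 1
        if r == q or c == q:
            return steps  # this move would leave the grid: reset
        steps += 1


def mean_walk_length(q, p, trials=300, seed=0):
    rng = random.Random(seed)
    return sum(walk_length(q, p, rng) for _ in range(trials)) / trials


for p in (0.1, 0.5, 0.9):
    print(p, mean_walk_length(q=64, p=p))
```

Because the parity of r + c alternates on every move, the down and right probabilities swap each step. This is why the walk stays near the diagonal for any p strictly between 0 and 1.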
23. A Slightly Better Code
- Changing the final corner improves things.
24. Performance Results
Scheme   0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
2DWC,4   0.209  0.210  0.213  0.218  0.222  0.227  0.232  0.238  0.244
2DGC,4   0.212  0.215  0.217  0.218  0.218  0.218  0.216  0.216  0.212
2DGC,4   0.176  0.183  0.187  0.190  0.191  0.190  0.187  0.183  0.176
2DWC,8   0.092  0.093  0.094  0.094  0.095  0.096  0.097  0.098  0.100
2DGC,8   0.080  0.081  0.082  0.083  0.083  0.083  0.082  0.081  0.080
2DGC,8   0.075  0.077  0.078  0.079  0.079  0.079  0.078  0.077  0.075
25. Codes for k = l = 2
- Break into Gray code blocks for larger n.
- Each bit walks along diagonal of its own Gray code block.
- At the last block, behaves like n = 2, k = 2, l = 2.
- Expected deficiency O(sqrt(q)).
26. Example
Bit 1 changes recorded from the left
Meet somewhere in the middle, depending on rates
Bit 2 changes recorded from the right
27. Random Codes
- Average-case analysis looks at random data
- Natural also to look at random codes (Shannon-style arguments)
- We consider random codes in the setting of general transitions.
- All k bits can change simultaneously
- Give some insights into what may be possible.
- Results in paper.
28. Conclusions
- New questions arising from flash memory.
- How to store data to maximize lifetimes.
- How to code to deal with errors.
- How to optimize algorithms and data structures.
- How to optimize memory hierarchies and variable-type memory systems.
- Big question: is this a core science game-changer?
- How much should we be re-thinking?