Lecturer: Moni Naor
1 Foundations of Privacy: Informal Lecture
Anti-Persistence or History Independent Data Structures
2 Why hide your history?
- Core dumps
- Losing your laptop
  - The entire memory representation of data structures is exposed
- Emailing files
  - The editing history may be exposed (e.g. Word)
- Maintaining lists of people
  - Sports teams, party invitees
3Election Day
Carol
Alice
Alice
Bob
- Elections for class president
- Each student whispers in Mr. Drews ear
- Mr. Drew writes down the votes
Carol
- ProblemMr. Drews notebook leaks sensitive
information - First student voted for Carol
- Second student voted for Alice
Alice
Alice
Bob
3
4 Learning from history only what's necessary
- A data structure has
  - A legitimate interface: the set of operations allowed to be performed on it
  - A memory representation
- The memory representation should reveal no information that cannot be obtained from the legitimate interface
5 History of history independence
- The issue has been dealt with in both the Cryptography and Data Structures communities
- Micciancio (1997): history independent trees
  - Motivation: incremental cryptography
  - Based on the shape of the data structure, not including the memory representation
  - Stronger performance model!
- Uniquely represented data structures
  - Treaps (Seidel, Aragon), uniquely represented dictionaries
  - Ordered hash tables (Amble, Knuth 1974)
6 More history
- Persistent data structures: possible to reconstruct all previous states of the data structure (Sarnak and Tarjan)
- We want the opposite: anti-persistence
- Oblivious RAM (Goldreich and Ostrovsky)
7 Overview
- Definitions
- History independent open addressing hashing
- History independent dynamic perfect hashing
- Memory Management
- (Union Find)
- Open problems
8 Precise definitions
- A data structure is
  - history independent if any two sequences of operations S1 and S2 that yield the same content induce the same probability distribution on the memory representation
  - strongly history independent if, given any two sets of breakpoints along S1 and S2 such that corresponding points have identical contents, S1 and S2 induce the same probability distributions on the memory representation at those points
- Alternative definition: via transition probabilities
9 Relaxations
- Statistical closeness
- Computational indistinguishability
  - Example where helpful: erasing
- Allow some information to be leaked
  - Total number of operations
  - n-history independent: identical distributions if the last n operations were identical as well
- Under-defined data structures: the same query can yield several legitimate answers
  - e.g. an approximate priority queue
  - Define identical content: there is no suffix T such that the set of permitted results returned by S1∘T differs from the one returned by S2∘T
10 History independence is easy (sort of)
- If it is possible to decide the (lexicographically) first sequence of operations that produces a given content, just store the result of that sequence
- This gives a history independent version of a huge class of data structures
- Efficiency is the problem
11 Dictionaries
- Operations are insert(x), lookup(x) and possibly delete(x)
- The content of a dictionary is the set of elements currently inserted (those that have been inserted but not deleted)
- Elements x ∈ U, some universe
- Size of table/memory: N
12 Goal
- Find a history independent implementation of dictionaries with good provable performance
- Develop general techniques for history independence
13 Approaches
- Unique representation
  - e.g. array in sorted order
  - Yields strong history independence
- Secret randomness
  - e.g. array in random order
  - Yields only history independence (not strong)
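The sorted-array approach can be sketched directly; the function names and the use of Python lists below are illustrative:

```python
import bisect

# Sketch of a uniquely represented dictionary: the content is kept as a
# sorted array, so the memory image depends only on the *set* stored,
# giving strong history independence (at the cost of O(n) updates).

def insert(arr, x):
    i = bisect.bisect_left(arr, x)
    if i == len(arr) or arr[i] != x:
        arr.insert(i, x)

def delete(arr, x):
    i = bisect.bisect_left(arr, x)
    if i < len(arr) and arr[i] == x:
        arr.pop(i)

# Two different operation histories with the same final content:
a, b = [], []
for x in [3, 1, 2]:
    insert(a, x)
for x in [2, 3, 1, 9]:
    insert(b, x)
delete(b, 9)
assert a == b == [1, 2, 3]  # same content, identical representation
```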
14 Open addressing: traditional version
- Each element x has a probe sequence h1(x), h2(x), h3(x), ...
  - Linear probing: h2(x) = h1(x)+1, h3(x) = h1(x)+2, ...
  - Double hashing
  - Uniform hashing
- An element is inserted into the first free cell in its probe sequence
- Search ends unsuccessfully at a free cell
- Efficient space utilization: almost all of the table can be full
15 Open addressing: traditional version
- Not history independent: later-inserted elements move further along in their probe sequences
[Figure: y probes a cell occupied by x; x arrived before y, so y moves on. No clash at the next cell, so insert y]
16 History independent version
- At each cell i, decide elements' priorities independently of insertion order
  - Call the priority function pi(x,y)
- If there is a clash, move the element of lower priority
- At each cell, priorities must form a total order
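A minimal sketch of this scheme in Python; the table size, the linear probe sequence, and the global rule "smaller key wins" are all illustrative assumptions (any per-cell total order works):

```python
N = 13  # table size (illustrative; assumes the table never fills up)

def probe(x, i):
    # Illustrative probe sequence: linear probing from cell x mod N.
    return (x + i) % N

def insert(table, x):
    i = 0  # x's position in its probe sequence
    while True:
        cell = probe(x, i)
        if table[cell] is None:
            table[cell] = (x, i)
            return
        y, j = table[cell]
        if x < y:  # assumed global priority: the smaller key wins the cell
            table[cell] = (x, i)
            x, i = y, j  # the evicted element resumes its own sequence
        i += 1

def lookup(table, x):
    i = 0
    while True:
        cell = probe(x, i)
        if table[cell] is None:
            return False  # unsuccessful search ends at a free cell
        y, _ = table[cell]
        if y == x:
            return True
        if x < y:  # x would have evicted y, so x cannot be further along
            return False
        i += 1

# The final configuration is independent of the insertion order:
t1, t2 = [None] * N, [None] * N
for x in [5, 18, 31]:
    insert(t1, x)
for x in [31, 5, 18]:
    insert(t2, x)
assert t1 == t2 and lookup(t1, 18) and not lookup(t1, 6)
```

The early exit in `lookup` is the unsuccessful-search shortcut from the Search slide: stopping at the first lower-priority element is sound precisely because the final configuration respects the priorities.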
17 Insertion
[Figure: y probes cell 2, which holds x; p2(x,y)? No, so x is evicted and moves along its probe sequence while y takes the cell]
18 Search
- Same as in the traditional algorithm
- In an unsuccessful search, can quit as soon as you find a lower-priority element

No deletions
- Problematic in open addressing
- A possible way out: clusters
19 Strong history independence
- Claim: for all hash functions and priority functions, the final configuration of the table is independent of the order of insertion
- Conclusion: strongly history independent
20 Proof of history independence
- A static insertion algorithm (clearly history independent): in each round, every unsettled element probes its next cell and the highest-priority element at each cell stays; gather up the rejects and restart
[Figure: a run of the static algorithm on x1..x6, round by round: p1(x2,x1), so insert x2; p3(x4,x5) and p3(x4,x6), so insert x4 and remove x5; insert x5; p6(x6,x4) and p6(x3,x6), so insert x3]
21 Proof of history independence
- Nothing moves further in the static algorithm than in the dynamic one
  - By induction on rounds of the static algorithm
- Vice versa
  - By induction on the steps of the dynamic algorithm
- Hence strongly history independent
- Alternative view: Blelloch-Golovin stable matching
22 Some priority functions
- Global
  - A single priority function, independent of the cell
- Random
  - Choose a random order at each cell
- Youth-rules
  - Call an element younger if it has moved less far along its probe sequence; younger elements get higher priority
23 Youth-rules
[Figure: x probes cell 2, which holds y; p2(x,y) because x has taken fewer steps than y, so x takes the cell and y moves on]
- Use a tie-breaker if the number of steps is the same
- This is a priority function
24 Specifying a scheme
- Priority rule
  - Choice of priority functions
  - In Youth-rules: determined by the probe sequence
- Probe functions
  - How are they chosen, maintained and computed?
25 Implementing Youth-rules
- Let each hi be chosen from a pairwise independent collection
  - For any two x and y, the random variables hi(x) and hi(y) are uniform and independent
- Let h1, h2, h3, ... be chosen independently
- Example: hi(x) = ((ai·x + bi) mod U) mod N, with U prime
  - Space: two elements per function
- Need only log N functions
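A sketch of such a family; the prime, the table size, and the number of functions are illustrative, and Python's `random` stands in for the secret coins:

```python
import random

P = 2**31 - 1  # a prime at least the size of the universe U (assumed)
N = 1024       # table size (illustrative)

def make_h():
    # Pairwise independent family h(x) = ((a*x + b) mod P) mod N;
    # each function costs only two stored values, a and b.
    a = random.randrange(1, P)
    b = random.randrange(P)
    return lambda x: ((a * x + b) % P) % N

M = 10  # about log2(N) probe functions suffice for Youth-rules
hs = [make_h() for _ in range(M)]

assert all(0 <= h(x) < N for h in hs for x in (0, 12345, P - 1))
```

The final mod N makes the values only approximately uniform on [0, N), which is the usual trade-off for this two-word construction.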
26 Performance analysis
- Based on worst-case insertion sequences
- The important parameter is α, the fraction of the table that is used: αN elements
- Analysis of expected insertion time and search time (number of probes to the table)
- Have to distinguish successful and unsuccessful search
27 Analysis via the static algorithm
- For insertions, the total number of probes in the static and dynamic algorithms is identical
- Easier to analyze the static algorithm
- Key point for Youth-rules: in phase i, all unsettled elements are at the ith probe of their sequence
  - This assures fresh randomness of hi(x)
28 Performance
- For Youth-rules, implemented as specified:
  - For any sequence of insertions, the expected probe-time for insertion is at most 1/(1-α)
  - For any sequence of insertions, the expected probe-time for successful or unsuccessful search is at most 1/(1-α)
- Analysis based on the static algorithm
- α is the fraction of the table that is used
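As a heuristic sanity check (not the worst-case proof via the static algorithm): under idealized uniform probing, each probe hits an occupied cell independently with probability α, so the number of probes until a free cell is geometric:

```latex
\mathbb{E}[\text{probes}] \;=\; \sum_{k \ge 1} k\,(1-\alpha)\,\alpha^{k-1} \;=\; \frac{1}{1-\alpha}
```

The slide's claim is stronger: the same bound holds in expectation for any insertion sequence, with the pairwise independent functions above.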
29 Comparison to double hashing
- Analysis of double hashing with truly random functions: Guibas-Szemeredi, Lueker-Molodowitch
- These can be replaced by log n-wise independent functions: Schmidt-Siegel
- log n-wise independence is relatively expensive: either a lot of space or log n time
- Youth-rules is a simple and provably efficient scheme with very little extra storage
- An extra benefit of considering history independence
30 Other priority functions
- Amble-Knuth: log(1/(1-α)) for the global priority rule
  - With truly random hash functions
- Experiments show about log(1/(1-α)) for most priority functions tried
- Performance is for amortized search
31 Other types of data structures
- Memory management (dealing with pointers)
- Memory Allocation
- Other state-related issues
32 Dynamic perfect hashing: the FKS scheme, dynamized
- n elements to be inserted
- Top-level table: O(n) space, hashed by h
- Low-level tables: O(n) space total; the table for bucket i gets about si^2 cells for its si elements
[Figure: h maps elements x1, ..., x6 into buckets; bucket i, of size si, is resolved by its own function hi]
- The hi are perfect on their respective sets
- Rechoose h or some hi to maintain perfection and linear space
33 A subtle problem: the intersection bias problem
- Suppose we have
  - a set of states σ1, σ2, ...
  - a set of objects h1, h2, ...
  - a way to decide whether hi is good for σj
- Keep a current h as states change
  - Change h only if it is no longer good
  - When rechoosing, choose uniformly from the objects good for the current state σ
- Then this is not history independent
  - h is biased towards the intersection of the objects good for the current σ and those good for previous states
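The bias is easy to see in a toy simulation; the states, objects, and goodness predicate below are all invented for illustration:

```python
import random
from collections import Counter

OBJECTS = range(10)   # candidate objects h in {0, ..., 9}

def good(h, S):       # h is good for state S iff h is not in S
    return h not in S

def run(states):
    # Keep a current h; rechoose uniformly among good ones only when bad.
    h = random.choice([o for o in OBJECTS if good(o, states[0])])
    for S in states[1:]:
        if not good(h, S):
            h = random.choice([o for o in OBJECTS if good(o, S)])
    return h

random.seed(0)
S1, S2 = {0, 1, 2, 3, 4}, {5, 6, 7, 8}
trials = 50_000
direct = Counter(run([S2]) for _ in range(trials))      # reach S2 directly
via_S1 = Counter(run([S1, S2]) for _ in range(trials))  # pass through S1 first
# h = 9 is the only object good for *both* states; passing through S1
# roughly doubles its probability (1/3 instead of 1/6), leaking history.
assert via_S1[9] > 1.5 * direct[9]
```

Both runs end with the same content (state S2), yet the distribution of h differs, which is exactly the history independence failure.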
34 Dynamized FKS is not history independent
- It does not erase upon deletion
- It uses history-dependent memory allocation
- The hash functions (h, h1, h2, ...) are changed whenever they cease to be good
  - Hence they suffer from the intersection bias problem: they are biased towards functions that were good for previous sets of elements
  - Hence they leak information about past sets of elements
35 Making it history independent
- Use history independent memory allocation
- Upon deletion, erase the element and rechoose the appropriate hi; this solves the low-level intersection bias problem
- Some other minor changes
- Solve the top-level intersection bias problem...
36 Solving the top-level intersection bias problem
- Can't afford a top-level rehash on every deletion
- Generate two potential h's, φ1 and φ2, at the beginning
- Always use the first good one
- If neither is good, rehash at every deletion
- If not using φ1, keep a top-level table for it for easy goodness checking (likewise for φ2)
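A sketch of the two-candidate rule; `phi1`, `phi2` and `is_good` are abstract placeholders for the top-level functions and the goodness test:

```python
def top_level(phi1, phi2, is_good, S):
    # Candidates are drawn once at initialization; always use the first
    # good one, so the choice depends only on the current content S
    # (plus the fixed candidates), never on the path that produced S.
    if is_good(phi1, S):
        return phi1
    if is_good(phi2, S):
        return phi2
    return None  # neither is good: fall back to rehashing every operation

# Toy goodness predicate for illustration: phi is good iff phi not in S.
is_good = lambda phi, S: phi not in S
assert top_level(1, 2, is_good, {1}) == 2
assert top_level(1, 2, is_good, {3}) == 1
assert top_level(1, 2, is_good, {1, 2}) is None
```

Because the returned function is a deterministic function of the fixed candidates and the current set, it avoids the intersection bias of "keep the old one while it stays good".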
37 Proof of history independence
- The table's state is defined by
  - The current set of elements
  - The top-level hash function
    - Always the first good φi, or rechosen at each step
  - The low-level hash functions
    - Uniformly chosen from the perfect functions
  - The arrangement of sub-tables in memory
    - Use history-independent memory allocation
  - Some other history independent things
38 Performance
- Lookup takes two steps
- Insertion and deletion take expected amortized O(1) time
- There is a 1/poly chance that they will take more
39 SHI and unique representation
- Theorem (Hartline et al.): for a reversible data structure to be SHI, a canonical (unique) representation for each state must be determined during the data structure's initialization
40 SHI with deletions
- Blelloch and Golovin: a dictionary based on linear probing
- Goal: search in O(1) time (guaranteed)
- Each cluster is of size O(log n)
  - Can be obtained using 5-wise independence (Pagh et al., STOC 2007)
- Needs a random oracle for the high-level intersection bias
41 Open problems
- Better analysis for Youth-rules, as well as other priority functions, with no random oracles
- Efficient memory allocation: ours is O(s log s)
- Separations
  - Between strong and weak history independence (Buchbinder-Petrank)
  - Between history independent and traditional versions, e.g. for Union-Find
- Can persistence and (computational) history independence co-exist efficiently?
42 References
- Moni Naor and Vanessa Teague, Anti-persistence: History Independent Data Structures, STOC 2001
- Hartline, Hong, Mohr, Pentney and Rocke, Characterizing History Independent Data Structures, Algorithmica 2005
- Buchbinder and Petrank, Lower and Upper Bounds on Obtaining History Independence, Information and Computation 2006
- Guy Blelloch and Daniel Golovin, Strongly History-Independent Hashing with Applications, FOCS 2007
- Tal Moran, Moni Naor and Gil Segev, Deterministic History-Independent Strategies for Storing Information in Write-Once Memories, ICALP 2007