Title: Private Information Retrieval
1. Private Information Retrieval
These slides are available at www.dziemowski.net/Slides
2. AOL search data scandal (2006)
Searches by user No. 4417749:
- clothes for age 60
- 60 single men
- best retirement city
- jarrett arnold
- jack t. arnold
- jaylene and jarrett arnold
- gwinnett county yellow pages
- rescue of older dogs
- movies for dogs
- sinus infection
Thelma Arnold, a 62-year-old widow from Lilburn, Georgia
3. Observation
- The owners of databases know a lot about their users!
- This poses a risk to the users' privacy.
- E.g., consider a database with stock prices.
- Can we do something about it?
- Yes, we can
- trust them to protect our secrecy (problematic!),
- or
- use cryptography! (Disclaimer: not yet practical...)
4. How can crypto help?
(Figure: user U and database D.)
- Note: this problem has nothing to do with secure communication!
5. Our setting
(Figure: user U and database D, connected by a secure link.)
A new primitive: Private Information Retrieval (PIR)
6. Plan
- Definition of PIR
- An ideal PIR doesn't exist
- Construction of a computational PIR
- Open problems
- Literature
- B. Chor, O. Goldreich, E. Kushilevitz and M. Sudan, Private Information Retrieval, Journal of the ACM, 1998
- [KO97] E. Kushilevitz and R. Ostrovsky, Replication Is NOT Needed: SINGLE Database, Computationally-Private Information Retrieval, FOCS 1997
7. Question
- How to protect the privacy of queries?
user U: wants to retrieve some data from D
database D: shouldn't learn what U retrieved
8. Let's make things simple!
database: B = (B_1, B_2, ..., B_w), each B_i ∈ {0,1}
user: index i ∈ {1,...,w}
The user should learn B_i (he may also learn other B_i's).
9. Trivial solution
The database simply sends everything (B_1, B_2, ..., B_w) to the user!
10. Non-triviality
- The previous solution has a drawback:
- the communication complexity is huge!
- Therefore we introduce the following requirement:
- Non-triviality:
- the number of bits communicated between U and D has to be smaller than w.
11. Private Information Retrieval (PIR)
U and D are polynomial-time randomized interactive algorithms.
input of the database: B = (B_1, B_2, ..., B_w)
input of the user: index i ∈ {1,...,w}
- correctness: at the end the user learns B_i
- secrecy of the user: the database does not learn i (this property needs to be defined more formally!)
- non-triviality: the total communication is < w
- Note: secrecy of the database is not required
12. How to define secrecy of the user? (1/2)
Def. T(i,B) = the transcript of the conversation when the user's input is i and the database's input is B.
13. How to define secrecy of the user? (2/2)
Secrecy of the user: for every i, j ∈ {1,...,w}:
- single-round case: it is impossible to distinguish between Q(i) and Q(j), where Q(i) is the user's query on input i
- multi-round case: it is impossible to distinguish between T(i,B) and T(j,B), even if the adversary (the database) is malicious
What does "impossible to distinguish" mean? For now, say: the distributions of Q(i) and Q(j) are the same.
14. PIR doesn't exist (1/4)
- We now show that correctness, non-triviality and secrecy cannot be satisfied simultaneously.
- Def. A transcript T is possible for (i,B) if P(T(i,B) = T) > 0.
- Take some T, and look where it is possible.
(Figure: a grid indexed by databases B and indices i; the cells (i,B) for which T is possible are marked with T.)
15. PIR doesn't exist (2/4)
- Observation:
- secrecy ⇒ if
- T is possible for some B and i
- then
- it is possible for B and all the other i's.
(Figure: in the grid of databases B and indices i, every database B for which T is possible has all of its cells (i,B) marked with T.)
16. PIR doesn't exist (3/4)
- non-triviality ⇒ length(transcript) < length(database)
- ⇒
- #transcripts < #databases
- ⇒ (every database has at least one possible transcript, so by the pigeonhole principle)
- there has to exist T that is possible for
- two different databases B^0 and B^1.
(Figure: in the grid of databases B and indices i, the two lines of B^0 and B^1 are both filled with T.)
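17. PIR doesn't exist (4/4)
- B^0 and B^1 differ on at least one index i.
- So, if i is the input of the user, then T is possible both for (i,B^0) and for (i,B^1), although the i-th bits of B^0 and B^1 are different:
- correctness ⇒ contradiction
(Figure: the grid of databases B and indices i, with the column of index i crossing the T-filled lines of B^0 and B^1.)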
18. So PIR doesn't exist!
- How to bypass the impossibility result?
- Two ideas:
- limit the computing power of a cheating database,
- use a larger number of independent databases.
19. Computationally-secure PIR
computational secrecy:
For every i, j ∈ {1,...,w} it is impossible to distinguish efficiently between T(i,B) and T(j,B).
Formally: for every polynomial-time probabilistic algorithm A the value
|P(A(T(i,B)) = 0) - P(A(T(j,B)) = 0)|
should be negligible.
20. Computational security in cryptography
- The security of most of the constructions in cryptography
- implies that
- P ≠ NP.
- So, the best we can hope for is to
- construct protocols with a conjectured computational security.
- Two approaches to cryptography:
- construct protocols that look secure,
- base security on some well-known hardness assumption. This is sometimes called provable security.
21. Hardness assumptions?
- A great source of hard problems is
- number theory.
- [KO97] construct PIR based on the
- Quadratic Residuosity Assumption.
- We describe it on the next slides.
22. Algebraic preliminaries: Z_m
- Fact: Z_m = {0,1,...,m-1} with addition modulo m is a group.
- Is it a group also with multiplication modulo m?
- No. Suppose that x ∈ Z_m is not relatively prime to m, and let
- d = gcd(x,m) > 1.
- Then for every i ∈ Z_m the value i·x mod m is divisible by d (because both i·x and m are), and hence
- i·x ≠ 1 mod m.
- So, x does not have an inverse!
- But every x ∈ Z_m relatively prime to m has an inverse, which can be computed using the extended Euclidean algorithm.
- Hence the set Z_m^* = {x ∈ Z_m such that x is relatively prime to m} is a multiplicative group!
- Fact: for any prime p the group Z_p^* = {1,...,p-1} is cyclic.
Multiplication in Z_12 (row a, column b: a·b mod 12):
  ·   0  1  2  3  4  5  6  7  8  9 10 11
  0   0  0  0  0  0  0  0  0  0  0  0  0
  1   0  1  2  3  4  5  6  7  8  9 10 11
  2   0  2  4  6  8 10  0  2  4  6  8 10
  3   0  3  6  9  0  3  6  9  0  3  6  9
  4   0  4  8  0  4  8  0  4  8  0  4  8
  5   0  5 10  3  8  1  6 11  4  9  2  7
  6   0  6  0  6  0  6  0  6  0  6  0  6
  7   0  7  2  9  4 11  6  1  8  3 10  5
  8   0  8  4  0  8  4  0  8  4  0  8  4
  9   0  9  6  3  0  9  6  3  0  9  6  3
 10   0 10  8  6  4  2  0 10  8  6  4  2
 11   0 11 10  9  8  7  6  5  4  3  2  1
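The claims about inverses can be checked with a few lines of Python. This is only a minimal sketch: the helper names `ext_gcd`, `inverse_mod` and `units` are illustrative and do not come from the slides.

```python
from math import gcd

def ext_gcd(a, b):
    """Extended Euclid: return (g, s, t) with g = gcd(a, b) and s*a + t*b == g."""
    if b == 0:
        return a, 1, 0
    g, s, t = ext_gcd(b, a % b)
    return g, t, s - (a // b) * t

def inverse_mod(x, m):
    """Multiplicative inverse of x modulo m, or None if gcd(x, m) != 1."""
    g, s, _ = ext_gcd(x % m, m)
    return s % m if g == 1 else None

def units(m):
    """The elements of Z_m that are relatively prime to m, i.e. Z_m^*."""
    return [x for x in range(1, m) if gcd(x, m) == 1]

print(units(12))           # [1, 5, 7, 11]
print(inverse_mod(5, 12))  # 5, because 5*5 = 25 = 1 mod 12
print(inverse_mod(4, 12))  # None, because gcd(4, 12) = 4
```

In the table for Z_12, the rows that contain a 1 are exactly the rows of 1, 5, 7 and 11.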
23. Favourite cryptographers' group
- p, q: large random primes (pq ≈ 2^1024, say)
- RSA group:
- Z_n^*, where n = pq
- How to select a random prime?
- Just take a random number and test if it is prime!
- Testing primality is easy: Rabin-Miller test.
- By the prime number theorem,
- P(random x of length t bits is prime) ≈ 1/ln(2^t) = 1/(t·ln 2),
- so primes are dense.
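The "take a random number and test it" recipe can be sketched as follows. This is an illustration, not the authors' code: the function names and the number of rounds are assumptions, and Python's `random` module is not cryptographically secure, so a real key generator would draw its randomness from a secure source instead.

```python
import random

def is_probable_prime(n, rounds=40):
    """Rabin-Miller test: False means n is composite, True means n is
    prime except with negligible probability."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    # Write n - 1 = d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True

def random_prime(bits):
    """Sample random odd numbers of exactly `bits` bits until one passes the test."""
    while True:
        candidate = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate

p = random_prime(64)
print(p, p.bit_length())
```

Because primes are dense, the loop in `random_prime` needs only a number of attempts proportional to the bit length on average.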
24. Chinese remainder theorem (1/3)
Z_15:
i         0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
i mod 5   0  1  2  3  4  0  1  2  3  4  0  1  2  3  4
i mod 3   0  1  2  0  1  2  0  1  2  0  1  2  0  1  2
The same numbers i, placed in a table indexed by (i mod 3, i mod 5):
             i mod 5:  0   1   2   3   4
i mod 3 = 0            0   6  12   3   9
i mod 3 = 1           10   1   7  13   4
i mod 3 = 2            5  11   2   8  14
25. It's not always like this!
Consider p = 4 and q = 6 (Z_24):
             i mod 6:  0      1      2      3      4      5
i mod 4 = 0            0,12          8,20          4,16
i mod 4 = 1                   1,13          9,21          5,17
i mod 4 = 2            6,18          2,14          10,22
i mod 4 = 3                   7,19          3,15          11,23
26. Chinese remainder theorem (3/3)
- Chinese remainder theorem (CRT):
- For n = pq (where p and q are distinct primes) the function φ: Z_n → Z_p × Z_q
- defined as
- φ(i) = (i mod p, i mod q)
- is a bijection.
- Proof:
- Both sets have pq elements, so it is enough to show that φ is injective. If φ(i) = φ(j) then
- i mod p = j mod p ⇒ p divides i-j
- and i mod q = j mod q ⇒ q divides i-j
- ⇒ (because p and q are distinct primes) n divides i-j
- ⇒ i = j mod n
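27. φ is an isomorphism
- Moreover, φ: Z_n → Z_p × Z_q is an isomorphism!
- Proof:
- φ(a + b) = (a + b mod p, a + b mod q)
- = ((a mod p) + (b mod p), (a mod q) + (b mod q))
- = φ(a) + φ(b)
(In the middle line the first addition is an operation in Z_p and the second one in Z_q. The same calculation works for multiplication.)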
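A brute-force check of the last slides for small moduli, written as a sketch: `phi` stands for the map i → (i mod p, i mod q), and the function names are illustrative only.

```python
def phi(i, p, q):
    """The CRT map i -> (i mod p, i mod q)."""
    return (i % p, i % q)

def is_bijection(p, q):
    """True iff phi maps Z_{pq} one-to-one onto Z_p x Z_q."""
    n = p * q
    return len({phi(i, p, q) for i in range(n)}) == n

def respects_operations(p, q):
    """Check phi(a+b) = phi(a)+phi(b) and phi(a*b) = phi(a)*phi(b), componentwise."""
    n = p * q
    for a in range(n):
        for b in range(n):
            (ap, aq), (bp, bq) = phi(a, p, q), phi(b, p, q)
            if phi((a + b) % n, p, q) != ((ap + bp) % p, (aq + bq) % q):
                return False
            if phi(a * b % n, p, q) != (ap * bp % p, aq * bq % q):
                return False
    return True

print(is_bijection(3, 5))         # True
print(is_bijection(4, 6))         # False: e.g. 0 and 12 have the same image
print(respects_operations(3, 5))  # True
```

The second call reproduces the Z_24 example: with p = 4 and q = 6 every occupied cell of the table holds two numbers, so the map is not injective.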
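28. Z_n vs. Z_n^*
φ(i) = (i mod p, i mod q)
What if we restrict φ to Z_n^*?
Observation 1: φ is also an isomorphism Z_n^* → Z_p^* × Z_q^*.
Observation 2: |Z_n^*| = (p-1)(q-1)
Example: Z_15 indexed by (i mod 3, i mod 5); Z_15^* consists of the entries whose coordinates lie in Z_3^* and Z_5^*, i.e. are both nonzero.
             i mod 5:  0   1   2   3   4
i mod 3 = 0            0   6  12   3   9
i mod 3 = 1           10   1   7  13   4
i mod 3 = 2            5  11   2   8  14
Z_15^* = {1, 2, 4, 7, 8, 11, 13, 14}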
29. How does it look for large p and q?
(Figure: Z_n and Z_n^* drawn as a rectangle with the coordinates "mod p" and "mod q".)
30. Quadratic Residues
Def. x is a quadratic residue modulo m if there exists a ∈ Z_m^* such that x = a^2 mod m.
QR(m) = the set of all quadratic residues modulo m.
QNR(m) = Z_m^* \ QR(m)
Example (Z_13^*):
a     1  2  3  4  5  6  7  8  9 10 11 12
a^2   1  4  9  3 12 10 10 12  3  9  4  1
QR(13) = {1, 4, 9, 3, 12, 10}
Observation: every quadratic residue modulo 13 has exactly 2 square roots, and hence |QR(13)| = |Z_13^*| / 2.
31. A lemma about QRs modulo a prime p
- Lemma:
- For every odd prime p we have |QR(p)| = (p-1)/2.
- Proof:
- We show that
- every quadratic residue has exactly 2 square roots in Z_p^*.
- Suppose that a^2 = b^2 mod p, where a, b ∈ Z_p^*.
- Thus p divides a^2 - b^2 = (a - b)(a + b).
- Hence either
- p divides a - b ⇒ a = b, or
- p divides a + b ⇒ a = p - b.
- Remark:
- Let g be a generator of Z_p^*.
- Then QR(p) = {g^0, g^2, g^4, g^6, ..., g^(p-3)},
- QNR(p) = {g^1, g^3, g^5, g^7, ..., g^(p-2)}.
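The lemma and the remark can be verified by brute force for p = 13. A minimal sketch: the function names are illustrative, and the choice g = 2 (which the code checks to be a generator of Z_13^*) is an assumption made for the example.

```python
def quadratic_residues(p):
    """QR(p) by brute force: all squares of elements of Z_p^* (p prime)."""
    return sorted({a * a % p for a in range(1, p)})

def square_roots(x, p):
    """All a in Z_p^* with a^2 = x mod p."""
    return [a for a in range(1, p) if a * a % p == x]

def is_generator(g, p):
    """True iff the powers of g cover all of Z_p^*."""
    return len({pow(g, k, p) for k in range(p - 1)}) == p - 1

p, g = 13, 2
assert is_generator(g, p)
even_powers = sorted(pow(g, k, p) for k in range(0, p - 1, 2))
odd_powers = sorted(pow(g, k, p) for k in range(1, p - 1, 2))

print(quadratic_residues(p))  # [1, 3, 4, 9, 10, 12]
print(even_powers)            # [1, 3, 4, 9, 10, 12], the same set
print(odd_powers)             # [2, 5, 6, 7, 8, 11], i.e. QNR(13)
print(square_roots(3, p))     # [4, 9], and 9 = 13 - 4
```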
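32. QRs modulo pq
Example (Z_15):
a           0  1  2  3  4  5  6  7  8  9 10 11 12 13 14
a^2 mod 15  0  1  4  9  1 10  6  4  4  6 10  1  9  4  1
Only the elements of Z_15^* = {1, 2, 4, 7, 8, 11, 13, 14} count, and their squares are 1 and 4:
QR(15) = {1, 4}
Observation: every quadratic residue modulo 15 has exactly 4 square roots, and hence |QR(15)| = |Z_15^*| / 4.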
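33. A lemma about QRs modulo pq
- Fact: For n = pq (p and q distinct odd primes) we have |QR(n)| = |Z_n^*| / 4.
- Proof:
- x ∈ QR(n)
- iff
- x = a^2 mod n, for some a ∈ Z_n^*
- iff (by CRT)
- x = a^2 mod p and x = a^2 mod q, for some a ∈ Z_n^*
- iff (by CRT again, combining a square root modulo p with a square root modulo q)
- x mod p ∈ QR(p) and x mod q ∈ QR(q).
- Half of Z_p^* lies in QR(p) and half of Z_q^* lies in QR(q), so QR(n) is a quarter of Z_n^*.
(Figure: Z_n^* as a rectangle with the coordinates "mod p" and "mod q"; QR(n) is the part where the first coordinate is in QR(p) and the second one is in QR(q).)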
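34. QRs modulo pq: an example
Z_15^* indexed by (i mod 3, i mod 5):
             i mod 5:  1   2   3   4
i mod 3 = 1            1   7  13   4
i mod 3 = 2           11   2   8  14
QR(5) = {1, 4}: 1^2 mod 5 = 4^2 mod 5 = 1 and 2^2 mod 5 = 3^2 mod 5 = 4.
QR(3) = {1}: 1^2 mod 3 = 2^2 mod 3 = 1.
QR(15) = {1, 4}: the entries with i mod 3 ∈ QR(3) and i mod 5 ∈ QR(5).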
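The example and the lemma about QRs modulo pq can be checked by brute force for small primes. A sketch with illustrative function names:

```python
from math import gcd

def units(n):
    """Z_n^*: the elements of Z_n relatively prime to n."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

def qr(n):
    """QR(n): the squares of the elements of Z_n^* (brute force)."""
    return {a * a % n for a in units(n)}

def qr_via_crt(p, q):
    """Elements x of Z_{pq}^* with x mod p in QR(p) and x mod q in QR(q)."""
    qr_p, qr_q = qr(p), qr(q)
    return {x for x in units(p * q) if x % p in qr_p and x % q in qr_q}

print(sorted(qr(15)))                 # [1, 4]
print(qr(15) == qr_via_crt(3, 5))     # True
print(len(units(15)) // len(qr(15)))  # 4
```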
35. Homomorphism of QR(pq)
- Res(n,a) = 1 if a ∉ QR(n), 0 otherwise
- Homomorphism: for all a, b ∈ Z_n^+ (the elements of Z_n^* that are QRs either modulo both p and q, or modulo neither of them):
- Res(n, a·b) = Res(n,a) xor Res(n,b)
- Proof:
- It is enough to show it modulo a prime p.
- g: a generator of Z_p^*.
- Recall that
- a ∈ QR(p) iff a = g^v mod p where v is even.
- Hence
- a·b is a QR iff
- a·b is an even power of g
- iff
- ((a is an even power of g) AND (b is an even power of g)), i.e. both a and b are QRs,
- OR ((a is an odd power of g) AND (b is an odd power of g)), i.e. both a and b are QNRs.
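A brute-force sanity check of the homomorphism, as a sketch under two assumptions that are spelled out in the code: Res(n,a) is taken to be 0 for quadratic residues and 1 otherwise, and the identity is tested on Z_n^+, the elements of Z_n^* that are QRs modulo both primes or modulo neither. The function names are illustrative.

```python
from math import gcd

def qr(m):
    """QR(m) by brute force."""
    return {a * a % m for a in range(1, m) if gcd(a, m) == 1}

def res(n, a):
    """Assumed convention: 0 if a is a quadratic residue modulo n, 1 otherwise."""
    return 0 if a % n in qr(n) else 1

def z_plus(p, q):
    """Z_n^+: units that are QRs modulo both p and q, or modulo neither."""
    n = p * q
    return [a for a in range(1, n) if gcd(a, n) == 1
            and (a % p in qr(p)) == (a % q in qr(q))]

def is_homomorphic(p, q, elements):
    """Check Res(n, a*b) == Res(n, a) xor Res(n, b) for all pairs from `elements`."""
    n = p * q
    return all(res(n, a * b % n) == res(n, a) ^ res(n, b)
               for a in elements for b in elements)

print(z_plus(3, 5))                        # [1, 2, 4, 8]
print(is_homomorphic(3, 5, z_plus(3, 5)))  # True
print(is_homomorphic(3, 5, [2, 7]))        # False: 7 is not in Z_15^+
```

The last line shows why the restriction matters: 2 and 7 are both non-residues modulo 15, but so is their product 14.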
36. Algorithmic questions about QR
- Suppose n = pq.
- Is it easy to test membership in QR(n)?
- Fact: if one knows p and q: yes!
- What if one doesn't know p and q?
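One way to see why knowing p and q helps: modulo an odd prime, membership in QR can be decided with a single modular exponentiation (Euler's criterion, a standard fact that the slides do not state), and by the CRT an element of Z_n^* is in QR(n) iff it is a QR modulo both p and q. The sketch below uses illustrative function names and toy primes.

```python
from math import gcd

def is_qr_mod_prime(x, p):
    """Euler's criterion: for an odd prime p and p not dividing x,
    x is a QR modulo p iff x^((p-1)/2) = 1 mod p."""
    return pow(x, (p - 1) // 2, p) == 1

def is_qr_given_factors(x, p, q):
    """Membership of x (an element of Z_n^*) in QR(pq), using the factors."""
    return is_qr_mod_prime(x, p) and is_qr_mod_prime(x, q)

def is_qr_brute_force(x, n):
    """Reference test without the factors; takes time exponential in the bit length of n."""
    return any(a * a % n == x % n for a in range(1, n) if gcd(a, n) == 1)

p, q = 7, 11
n = p * q
print(all(is_qr_given_factors(x, p, q) == is_qr_brute_force(x, n)
          for x in range(1, n) if gcd(x, n) == 1))  # True
```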
37. Quadratic Residuosity Assumption (QRA)
- n = pq, where p and q are large primes
Z_n^+ := all a ∈ Z_n^* such that a mod p ∈ QR(p) iff a mod q ∈ QR(q)
Note: Z_n^+ is a group!
Question: given a ∈ Z_n^+, is a ∈ QR(n)?
(Figure: Z_n^* divided by QR(p) / QNR(p) along one coordinate and by QR(q) / QNR(q) along the other; Z_n^+ consists of QR(n) and of the part that is a QNR modulo both p and q.)
Quadratic Residuosity Assumption (QRA): For a random a ∈ Z_n^+ it is computationally hard to determine if a ∈ QR(n).
Formally: for every polynomial-time probabilistic algorithm G the value
|P(G(n,a) = Res(n,a)) - 0.5|
(where a ∈ Z_n^+ is random) is negligible.
38. We are ready to construct PIR!
- Our PIR will work in the group Z_n^+, where n = pq.
- What's so good about this group?
- testing membership in QR(n) is hard for random elements of Z_n^+, unless one knows p and q,
- homomorphism of Res!
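39. First (wrong) idea
input of the database: B_1 B_2 ... B_(i-1) B_i B_(i+1) ... B_(w-1) B_w
input of the user: i
1. The user sends X_1, ..., X_w, where X_i is a random NQR and all the other X_j's are random QRs:
   QR X_1, QR X_2, ..., QR X_(i-1), NQR X_i, QR X_(i+1), ..., QR X_(w-1), QR X_w
2. From every X_j and B_j the database computes Y_j. Every Y_j with j ≠ i is a QR, and Y_i is a QR iff B_i = 0:
   QR Y_1, QR Y_2, ..., QR Y_(i-1), Y_i, QR Y_(i+1), ..., QR Y_(w-1), QR Y_w
3. The database sets M = Y_1 · Y_2 · ... · Y_w and sends M to the user. M is a QR iff B_i = 0.
4. The user checks if M is a QR.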
40. Problems!
- PIR from the previous slide:
- correctness ✓
- security? ✓
- To learn i the database would need to distinguish NQR from QR among X_1, ..., X_w.
- non-triviality? doesn't hold!
- communication user → database: |B| · |Z_n|
- database → user: |Z_n|
Call it a (|B|, 1)-PIR.
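The (|B|, 1)-PIR can be sketched in Python as follows. Two things are assumptions rather than content of the slides: the rule Y_j = X_j^2 if B_j = 0 and Y_j = X_j if B_j = 1 (it makes every Y_j with j ≠ i a QR and Y_i a QR iff B_i = 0, as required), and the toy primes, which are far too small to give any security. Indices are 0-based, and all names are illustrative.

```python
import random
from math import gcd

P, Q = 1009, 1013   # toy primes, known only to the user (assumption)
N = P * Q           # the modulus, known to the database

def is_qr_mod(x, p):
    """Euler's criterion for an odd prime p and x not divisible by p."""
    return pow(x, (p - 1) // 2, p) == 1

def is_qr(x):
    """User-side test (needs P and Q): is x a QR modulo N?"""
    return is_qr_mod(x, P) and is_qr_mod(x, Q)

def random_unit():
    while True:
        r = random.randrange(2, N)
        if gcd(r, N) == 1:
            return r

def random_qr():
    return pow(random_unit(), 2, N)

def random_nqr():
    """A random element of Z_N^+ that is not a QR (non-residue mod P and mod Q)."""
    while True:
        x = random_unit()
        if not is_qr_mod(x, P) and not is_qr_mod(x, Q):
            return x

def user_query(i, w):
    """w group elements: an NQR at position i, QRs everywhere else."""
    return [random_nqr() if j == i else random_qr() for j in range(w)]

def database_answer(bits, query):
    """M = product of Y_j, with Y_j = X_j if B_j = 1 and X_j^2 if B_j = 0."""
    m = 1
    for b, x in zip(bits, query):
        y = x if b == 1 else x * x % N
        m = m * y % N
    return m

def user_decode(m):
    """M is a QR iff the requested bit is 0."""
    return 0 if is_qr(m) else 1

bits = [1, 0, 0, 1, 1, 0, 1, 0]
i = 3
print(user_decode(database_answer(bits, user_query(i, len(bits)))) == bits[i])  # True
```

The query consists of one group element per database bit, which is exactly why non-triviality fails for this version.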
41. How to fix it?
Idea: given the (|B|, 1)-PIR, construct a PIR with smaller communication.
Suppose that |B| = v^2 and present B as a v×v matrix, e.g. for v = 4:
B_13 B_14 B_15 B_16
B_9  B_10 B_11 B_12
B_5  B_6  B_7  B_8
B_1  B_2  B_3  B_4
42. Idea that works
Run the (|B|, 1)-PIR on every row of the v×v matrix:
B_1  B_2  B_3  B_4
B_5  B_6  B_7  B_8
B_9  B_10 B_11 B_12
B_13 B_14 B_15 B_16
Looks even worse: communication user → database: v^2 · |Z_n|, database → user: v · |Z_n|.
The method:
Let j be the column where B_i is. In every row the user asks for the jth element. So, instead of sending v queries the user can send one!
Observe: in this way the user learns all the elements in the jth column!
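43. Putting things together
input of the database: the v×v matrix with the rows (B_1 ... B_(j-1) B_j B_(j+1) ... B_v), ..., (... B_vv); B_i lies in the jth column
input of the user: i
1. The user sends one query X_1, ..., X_v, where X_j is a random NQR and all the other X's are random QRs:
   QR X_1, ..., QR X_(j-1), NQR X_j, QR X_(j+1), ..., QR X_v
2. The database uses the same X_1, ..., X_v for every row k = 1, ..., v: it computes the Y's of the kth row from the X's and the bits of that row, and multiplies the elements in each row, obtaining M_k.
3. The database sends M_1, ..., M_v to the user.
4. For every k: the bit in the kth row and the jth column is 0 iff M_k is a QR. The user checks this for the row that contains B_i.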
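The matrix version can be sketched in the same way. As before, this is an illustration under stated assumptions: the database squares X_c where the bit is 0 and keeps X_c where the bit is 1, the primes are toy-sized, indices are 0-based, and all names are made up for the example.

```python
import random
from math import gcd, isqrt

P, Q = 1009, 1013   # toy primes, known only to the user (assumption)
N = P * Q

def is_qr_mod(x, p):
    """Euler's criterion for an odd prime p and x not divisible by p."""
    return pow(x, (p - 1) // 2, p) == 1

def is_qr(x):
    return is_qr_mod(x, P) and is_qr_mod(x, Q)

def random_unit():
    while True:
        r = random.randrange(2, N)
        if gcd(r, N) == 1:
            return r

def random_qr():
    return pow(random_unit(), 2, N)

def random_nqr():
    while True:
        x = random_unit()
        if not is_qr_mod(x, P) and not is_qr_mod(x, Q):
            return x

def user_query(column, v):
    """v group elements: an NQR in the wanted column, QRs elsewhere."""
    return [random_nqr() if c == column else random_qr() for c in range(v)]

def database_answer(matrix, query):
    """One product M_k per row, reusing the same query for every row."""
    answers = []
    for row in matrix:
        m = 1
        for bit, x in zip(row, query):
            m = m * (x if bit == 1 else x * x) % N
        answers.append(m)
    return answers

def retrieve(bits, i):
    """Run the whole protocol for a database of v*v bits and index i."""
    v = isqrt(len(bits))
    assert v * v == len(bits), "the database size must be a perfect square"
    matrix = [bits[k * v:(k + 1) * v] for k in range(v)]
    row, column = divmod(i, v)
    answers = database_answer(matrix, user_query(column, v))
    return 0 if is_qr(answers[row]) else 1

bits = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1]
print(all(retrieve(bits, i) == bits[i] for i in range(len(bits))))  # True
```

Both the query and the answer consist of v group elements, so the total communication is 2v elements of Z_n instead of v^2 + 1.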
44. So we are done!
- PIR from the previous slide:
- correctness ✓
- non-triviality: communication complexity = 2√|B| · |Z_n| ✓
- security?
- To learn i the database would need to distinguish NQR from QR.
- Formally:
- from any adversary that breaks our scheme we can construct an algorithm that breaks QRA (the algorithm simulates the scheme for the adversary).
45. Improvements
Observation: the database sends M_1, ..., M_v, but the user is interested in just one of them.
Idea: apply PIR recursively!
46. Complexity of PIRs: overview of the results
- Communication:
- recursive PIR of [KO97]:
- for every c > 0: O(|B|^c)
- Cachin, Micali, Stadler, 1999:
- poly-logarithmic in |B|
- Lipmaa, 2005:
- O(log^2 |B|)
- For practical analysis see:
- Sion, Carbunar,
- On the Computational Practicality of Private Information Retrieval.
Their conclusion: it is the time complexity that matters. In real life it is still more practical to transmit the entire database.
47. Extensions
- Symmetric PIR (also protects the privacy of the database).
- Gertner, Ishai, Kushilevitz, Malkin, 1998
- Searching by keywords
- Chor, Gilboa, Naor, 1997
- Public-key encryption with keyword search
- Boneh, Di Crescenzo, Ostrovsky, Persiano
48. Open problems
- Improve efficiency.
- Construct new extensions.
What was the key property that we used? The homomorphism of QR.
Holy grail: fully-homomorphic encryption.
49. Fully-homomorphic encryption
Observe that we constructed a 1-bit probabilistic public-key encryption scheme:
Enc(X) = a random QR if X = 0, a random NQR if X = 1,
which is homomorphic with respect to xor:
Enc(X xor Y) = Enc(X) · Enc(Y)
It would be really useful to have an encryption scheme that is homomorphic with respect to conjunction and negation simultaneously.
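The 1-bit scheme can be sketched as below. To make encryption possible without the factors, the sketch publishes a fixed non-residue `Y` next to the modulus and encrypts a bit as Y^bit · r^2, as in the Goldwasser-Micali cryptosystem. This packaging is an assumption that goes beyond the slide, and so are the toy primes and all the names.

```python
import random
from math import gcd

P, Q = 1009, 1013   # secret key: toy primes (assumption, far too small for real use)
N = P * Q           # public modulus

def is_qr_mod(x, p):
    """Euler's criterion for an odd prime p and x not divisible by p."""
    return pow(x, (p - 1) // 2, p) == 1

def random_unit():
    while True:
        r = random.randrange(2, N)
        if gcd(r, N) == 1:
            return r

def find_public_nqr():
    """An element that is a non-residue modulo both P and Q."""
    while True:
        y = random_unit()
        if not is_qr_mod(y, P) and not is_qr_mod(y, Q):
            return y

Y = find_public_nqr()   # published together with N

def enc(bit):
    """A random QR for bit 0, a random NQR from Z_N^+ for bit 1."""
    r = random_unit()
    return pow(Y, bit, N) * r * r % N

def dec(c):
    """Needs P and Q: 0 if c is a QR modulo N, 1 otherwise."""
    return 0 if is_qr_mod(c, P) and is_qr_mod(c, Q) else 1

x, y = 1, 1
product = enc(x) * enc(y) % N
print(dec(product) == x ^ y)  # True: multiplying ciphertexts xors the plaintexts
```

Multiplying ciphertexts gives xor only. There is no corresponding operation for conjunction here, which is the gap that fully-homomorphic encryption is meant to close.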
50. Thank you!
Questions?