Title: Practical Techniques for Searches on Encrypted Data
1Practical Techniques for Searches on Encrypted Data
- Yongdae Kim
- kyd_at_cs.umn.edu
- Paper written by Song, Wagner, and Perrig
2Contents
- Introduction
- Basic Cryptography
- Schemes
- Basic search
- Controlled Search
- Hidden query
- Final scheme
- Discussions
- Conclusion and open problems
3Introduction
- IEEE Symp. on Security and Privacy 2000
- I'm not an expert in databases, but
- Desirable features
- Encrypted data
- Encrypted query
- Encrypted result
- Untrusted server
4Example
- Mail Server
- Fully trusted, i.e. the sys admin can read my e-mail
- Can build secure storage
- But need to sacrifice functionality
- Moving the computation to the data storage seems to be very difficult
- For example, how to search encrypted data?
5Nice Features
- Provably secure
- Controlled searching: the untrusted server cannot search for a word without the owner's authorization
- Hidden queries: the user may ask the untrusted server to search for a secret word without revealing the word
- Fast and efficient
- Do not rely on public key algorithm
- Based on stream cipher
6Other Features
- Each document is divided up into words
- Assume all words have the same length
- Otherwise, pad or split them
- Certain computation on the ciphertext
- Search method
- Indexing
- advantageous for read-only data
- But faster search
- Sequential scan
7Basics
- Cryptography
- the study of mathematical techniques related to aspects of information security
- such as confidentiality, data integrity, entity authentication, and data origin authentication.
8Taxonomy of Cryptographic Primitives
Security Primitives
- Unkeyed Primitives
  - Arbitrary length hash functions
  - One-way permutations
  - Random sequences
- Symmetric-key Primitives
  - Symmetric-key ciphers
    - Block ciphers
    - Stream ciphers
  - Arbitrary length hash functions (MACs)
  - Signatures
  - Pseudorandom sequences
  - Identification primitives
- Public-key Primitives
  - Public-key ciphers
  - Signatures
  - Identification primitives
9Symmetric Key Encryption
- Encryption key and decryption key are the same (mostly)
- $E_K(M) = C$
- $D_K(C) = M$
- Ex. DES, AES, IDEA, ...
- Fast
- Based on simple operations (XOR, shift, substitute, rotate, ...)
- How to share a key?
10Block/Stream ciphers
- Block cipher
- breaks up the plaintext into blocks of a fixed length,
- and then encrypts one block at a time.
- Stream cipher
- takes the plaintext string and produces a ciphertext string using a keystream
- $M \oplus S = C$, $C \oplus S = M$
- where $S$ is a keystream and $\oplus$ is bitwise exclusive-or
- $S$ is generated by a keystream generator or pseudo-random function
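The XOR relation on this slide can be tried out with a minimal Python sketch. The keystream generator here is an assumption made for illustration (HMAC-SHA256 run in counter mode); the slides do not say which generator is used, and the names `keystream` and `stream_encrypt` are hypothetical.

```python
import hashlib
import hmac


def keystream(key, length):
    """Assumed generator: HMAC-SHA256 over a counter, truncated to `length` bytes."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def xor_bytes(a, b):
    """Bitwise exclusive-or of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def stream_encrypt(key, message):
    """C = M xor S. Applying the same function to C gives back M."""
    return xor_bytes(message, keystream(key, len(message)))


key = b"demo key, not for real use"
message = b"attack at dawn"
ciphertext = stream_encrypt(key, message)
recovered = stream_encrypt(key, ciphertext)  # C xor S = M
```

Because encryption and decryption are the same XOR, a keystream must never be reused for two different messages under the same key.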
11Hash function/MAC
- Hash function
- computationally efficient function
- mapping binary strings of arbitrary length to binary strings of some fixed length
- Cryptographic hash function
- One-way, collision-resistant
- MAC (Message authentication code)
- Keyed hash function
- Parties that share a key can check the integrity of data
- $\mathrm{MAC}_K(M) = H(K_1, H(K_2, M))$
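One well-known instance of the nested form $H(K_1, H(K_2, M))$ is HMAC, where both sub-keys are derived from a single key $K$ by XORing it with fixed pads. The sketch below assumes SHA-256 as $H$ (the slides do not name a hash function) and checks the hand-written nesting against Python's standard `hmac` module; `nested_mac` is a hypothetical helper name.

```python
import hashlib
import hmac

BLOCK = 64  # SHA-256 input block size in bytes


def nested_mac(key, message):
    """HMAC written out by hand: H(K1 || H(K2 || M))."""
    if len(key) > BLOCK:
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK, b"\x00")
    k1 = bytes(b ^ 0x5C for b in key)  # outer key
    k2 = bytes(b ^ 0x36 for b in key)  # inner key
    inner = hashlib.sha256(k2 + message).digest()
    return hashlib.sha256(k1 + inner).digest()


key = b"shared secret (demo)"
message = b"transfer 100 to bob"
tag = nested_mac(key, message)

# The receiver recomputes the tag and compares in constant time.
valid = hmac.compare_digest(tag, hmac.new(key, message, hashlib.sha256).digest())
tampered = hmac.compare_digest(tag, nested_mac(key, b"transfer 900 to bob"))
```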
12Notations
- $S_i$: i-th stream block from stream cipher $G$, $n-m$ bits
- $W_i$: i-th word, $n$ bits
- $C_i$: i-th ciphertext, $n$ bits
- $\oplus$: bitwise exclusive-or
- $F_k(x)$: MAC of $x$ using key $k$, $m$ bits output
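13Scheme I: Basic scheme
[Figure: the plaintext word $W_i$ is XORed with $\langle S_i, F_{k_i}(S_i) \rangle$, where $S_i$ comes from the stream cipher, to give the ciphertext $C_i$]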
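- To search for $W$
- Alice reveals $W$ and the keys $k_i$ for the positions where $W$ may occur
- Bob checks whether $W \oplus C_i$ is of the form $\langle s, F_{k_i}(s) \rangle$ for some $s$
- For positions with unknown $k_i$, Bob learns nothing
- Drawback: to search for $W$, either
- Alice reveals all $k_i$, or
- Alice has to know in advance where $W$ may occur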
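The basic scheme can be sketched in a few lines of Python. Everything concrete here is an assumption for illustration: byte sizes `N` and `M` stand in for the bit lengths $n$ and $m$, truncated HMAC-SHA256 plays the role of both $F$ and the stream cipher, and the helper names are hypothetical.

```python
import hashlib
import hmac

N, M = 16, 4  # word length n and check length m, in bytes (illustrative sizes)


def prf(key, data, length):
    """Keyed function: HMAC-SHA256 truncated to `length` bytes."""
    return hmac.new(key, data, hashlib.sha256).digest()[:length]


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def stream_block(seed, i):
    """S_i: the i-th (n - m)-byte block of the keystream."""
    return prf(seed, b"S" + i.to_bytes(8, "big"), N - M)


def encrypt_word(seed, k_i, i, word):
    """C_i = W_i xor <S_i, F_{k_i}(S_i)>."""
    s_i = stream_block(seed, i)
    return xor_bytes(word, s_i + prf(k_i, s_i, M))


def bob_matches(c_i, word, k_i):
    """Bob's test: is W xor C_i of the form <s, F_{k_i}(s)>?"""
    t = xor_bytes(c_i, word)
    s, check = t[:N - M], t[N - M:]
    return hmac.compare_digest(check, prf(k_i, s, M))


def pad(word):
    return word.encode().ljust(N, b"\x00")


seed = b"stream cipher seed (demo)"
keys = [prf(b"per-position keys (demo)", bytes([i]), 32) for i in range(3)]
words = [pad(w) for w in ["alice", "meets", "bob"]]
cipher = [encrypt_word(seed, keys[i], i, w) for i, w in enumerate(words)]

# Alice reveals k_1 only, so Bob can test position 1 and nothing else.
hit = bob_matches(cipher[1], pad("meets"), keys[1])
miss = bob_matches(cipher[1], pad("bob"), keys[1])
```

With an $m$-bit check, a wrong word passes the test with probability about $2^{-m}$, so $m$ trades ciphertext space against false positives.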
14Scheme II: Controlled search
- Replace $k_i$ with $k_i = f_k(W_i)$, where
- $k$ is secret and never revealed
- $f$ is another MAC whose output has the size of a key $k_i$
- Alice reveals only $W$ and $f_k(W)$
- Bob identifies only the locations where $W$ occurs
- But this reveals nothing about the locations $i$ where $W \ne W_i$
- Still does not support hidden search
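A minimal sketch of controlled search, under the same illustrative assumptions as before (byte sizes `N` and `M`, truncated HMAC-SHA256 standing in for $f$, $F$ and the stream cipher, hypothetical helper names). The point is that the per-position key now depends only on the word, so one revealed value $f_k(W)$ lets Bob scan the whole document for $W$ and nothing else.

```python
import hashlib
import hmac

N, M = 16, 4  # word length n and check length m, in bytes (illustrative sizes)


def prf(key, data, length):
    return hmac.new(key, data, hashlib.sha256).digest()[:length]


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def word_key(k, word):
    """k_i = f_k(W_i): the same word always yields the same key."""
    return prf(k, b"f" + word, 32)


def encrypt_document(k, seed, words):
    cipher = []
    for i, w in enumerate(words):
        s_i = prf(seed, b"S" + i.to_bytes(8, "big"), N - M)
        cipher.append(xor_bytes(w, s_i + prf(word_key(k, w), s_i, M)))
    return cipher


def bob_search(cipher_words, word, k_w):
    """Bob scans sequentially, knowing only W and f_k(W)."""
    hits = []
    for i, c_i in enumerate(cipher_words):
        t = xor_bytes(c_i, word)
        if hmac.compare_digest(t[N - M:], prf(k_w, t[:N - M], M)):
            hits.append(i)
    return hits


def pad(word):
    return word.encode().ljust(N, b"\x00")


k = b"alice secret k (demo)"
seed = b"stream cipher seed (demo)"
doc = [pad(w) for w in ["budget", "report", "budget", "draft"]]
cipher = encrypt_document(k, seed, doc)

# Alice authorizes a search for "budget" by revealing W and f_k(W).
hits = bob_search(cipher, pad("budget"), word_key(k, pad("budget")))
```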
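15Scheme III: Hidden Searches
[Figure: the plaintext word $W_i$ is pre-encrypted to $E_k(W_i)$, which is XORed with $\langle S_i, F_{k_i}(S_i) \rangle$, where $S_i$ comes from the stream cipher, to give the ciphertext]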
16Scheme III (Cont'd)
- Let $X_i = E_k(W_i)$
- After the pre-encryption, Alice has $X_1, \ldots, X_l$
- Same as before, $C_i = X_i \oplus T_i$ where
- $X_i = E_k(W_i)$
- $T_i = \langle S_i, F_{k_i}(S_i) \rangle$
- To search for $W$, Alice sends the query $(X, k_X)$ such that
- $X = E_k(W)$ and $k_X = f_k(X)$
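The hidden-search query can be sketched as follows. The deterministic pre-encryption $E_k$ is modeled by a toy 4-round Feistel permutation built from HMAC-SHA256; that choice, the byte sizes, the key-separation labels and the helper names are all assumptions for illustration, not the construction used in the paper.

```python
import hashlib
import hmac

N, M = 16, 4  # word size n and check size m, in bytes (illustrative sizes)


def prf(key, data, length):
    return hmac.new(key, data, hashlib.sha256).digest()[:length]


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def pre_encrypt(k, word):
    """E_k: toy Feistel permutation on n-byte words (assumed stand-in)."""
    half = N // 2
    left, right = word[:half], word[half:]
    for r in range(4):
        left, right = right, xor_bytes(left, prf(k, b"E" + bytes([r]) + right, half))
    return left + right


def word_key(k, x):
    """Key derived from the pre-encrypted word: f_k(X)."""
    return prf(k, b"f" + x, 32)


def encrypt_document(k, words):
    cipher = []
    for i, w in enumerate(words):
        x_i = pre_encrypt(k, w)
        s_i = prf(k, b"S" + i.to_bytes(8, "big"), N - M)
        cipher.append(xor_bytes(x_i, s_i + prf(word_key(k, x_i), s_i, M)))
    return cipher


def make_query(k, word):
    """Alice's query: X = E_k(W) and f_k(X). W itself is never sent."""
    x = pre_encrypt(k, word)
    return x, word_key(k, x)


def bob_search(cipher_words, x, k_x):
    hits = []
    for i, c_i in enumerate(cipher_words):
        t = xor_bytes(c_i, x)
        if hmac.compare_digest(t[N - M:], prf(k_x, t[:N - M], M)):
            hits.append(i)
    return hits


def pad(word):
    return word.encode().ljust(N, b"\x00")


k = b"alice master secret (demo)"
cipher = encrypt_document(k, [pad(w) for w in ["pay", "salary", "on", "salary", "day"]])
x, k_x = make_query(k, pad("salary"))
hits = bob_search(cipher, x, k_x)
```

Bob runs exactly the same test as in the controlled-search scheme, but on $X$ instead of $W$, so he learns the matching positions without learning the word.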
17A problem of Scheme III
- Scheme III has a problem. Guess what?
- If Alice generates $k_i = f_k(E_k(W_i))$, she cannot recover the plaintext from the ciphertext.
- $C_i = X_i \oplus T_i$ where $T_i = \langle S_i, F_{k_i}(S_i) \rangle$
- To compute $X_i$ from $C_i$, we have to know $T_i$
- $S_i$ can be computed easily
- How about $F_{k_i}(S_i)$?
- The problem is $k_i$
- To compute this, we have to know $E_k(W_i)$ for all $i$
- Oops! If you know all of these, why do you need search?
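Written out with the symbols from this slide, the circular dependency looks like this (a restatement of the bullets above, not an additional result):

```latex
\begin{align*}
X_i &= C_i \oplus T_i, & T_i &= \langle S_i, F_{k_i}(S_i) \rangle \\
k_i &= f_k(X_i) & &\text{(needs the very } X_i \text{ that decryption is trying to recover)}
\end{align*}
```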
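18Scheme IV: The Final Scheme
- Fix
- $X_i = E_k(W_i) = \langle L_i, R_i \rangle$, where $L_i$ is $n-m$ bits
- $T_i = \langle S_i, F_{k_i}(S_i) \rangle$, where $k_i = f_k(L_i)$ instead of $f_k(X_i)$
- To decrypt, Alice recovers $L_i$ from the first $n-m$ bits of $C_i$ using $S_i$, then computes $k_i$ and $F_{k_i}(S_i)$ to recover $R_i$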
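The final scheme, including decryption, can be sketched end to end. As a hedge: $E_k$ is again modeled by a toy Feistel permutation over HMAC-SHA256, sizes are in bytes rather than bits, sub-keys are separated by labels, and all function names are hypothetical.

```python
import hashlib
import hmac

N, M = 16, 4   # word size n and check size m, in bytes (illustrative sizes)
LEFT = N - M   # size of L_i and of S_i


def prf(key, data, length):
    return hmac.new(key, data, hashlib.sha256).digest()[:length]


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def feistel(k, block, decrypt=False):
    """E_k and its inverse: toy 4-round Feistel permutation (assumed stand-in)."""
    half = N // 2
    left, right = block[:half], block[half:]
    for r in (reversed(range(4)) if decrypt else range(4)):
        if decrypt:
            left, right = xor_bytes(right, prf(k, b"E" + bytes([r]) + left, half)), left
        else:
            left, right = right, xor_bytes(left, prf(k, b"E" + bytes([r]) + right, half))
    return left + right


def stream_block(k, i):
    return prf(k, b"S" + i.to_bytes(8, "big"), LEFT)


def word_key(k, left):
    """k_i = f_k(L_i): depends only on the left part of X_i."""
    return prf(k, b"f" + left, 32)


def encrypt_word(k, i, word):
    x = feistel(k, word)
    s = stream_block(k, i)
    return xor_bytes(x, s + prf(word_key(k, x[:LEFT]), s, M))


def decrypt_word(k, i, c):
    s = stream_block(k, i)
    left = xor_bytes(c[:LEFT], s)                                  # recover L_i
    right = xor_bytes(c[LEFT:], prf(word_key(k, left), s, M))      # recover R_i
    return feistel(k, left + right, decrypt=True)


def make_query(k, word):
    x = feistel(k, word)
    return x, word_key(k, x[:LEFT])


def bob_search(cipher_words, x, k_x):
    hits = []
    for i, c in enumerate(cipher_words):
        t = xor_bytes(c, x)
        if hmac.compare_digest(t[LEFT:], prf(k_x, t[:LEFT], M)):
            hits.append(i)
    return hits


def pad(word):
    return word.encode().ljust(N, b"\x00")


k = b"alice master secret (demo)"
plain = [pad(w) for w in ["meet", "at", "noon", "meet", "again"]]
cipher = [encrypt_word(k, i, w) for i, w in enumerate(plain)]
hits = bob_search(cipher, *make_query(k, pad("meet")))
recovered = [decrypt_word(k, i, c) for i, c in enumerate(cipher)]
```

Decryption works position by position with only the master secret, which is exactly what Scheme III could not do.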
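19Scheme IV: The Final Picture
[Figure: the plaintext word $W_i$ is encrypted to $E_k(W_i)$; its left part $L_i$ is fed to $f_k$ to derive $k_i$; the stream cipher outputs $S_i$, which is fed to $F_{k_i}$; $E_k(W_i)$ is XORed with $\langle S_i, F_{k_i}(S_i) \rangle$ to give the ciphertext]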
20Practical Considerations
- Alice needs to remember only one password $k$
- Supporting more advanced queries
- Boolean operations ($W$ and $W'$)
- Proximity queries ($W$ near $W'$)
- Phrase searches ($W$ immediately precedes $W'$)
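One plausible way to support these queries is to run one single-word search per word and then combine the returned match positions. The sketch below assumes the server already holds the hit lists for $W$ and $W'$ within one document; the function names and the sample positions are made up for illustration.

```python
def boolean_and(hits_w, hits_w2):
    """W and W': both words occur somewhere in the document."""
    return bool(hits_w) and bool(hits_w2)


def phrase(hits_w, hits_w2):
    """W immediately precedes W': positions i with W at i and W' at i + 1."""
    later = set(hits_w2)
    return [i for i in hits_w if i + 1 in later]


def near(hits_w, hits_w2, distance):
    """W near W': pairs of positions at most `distance` words apart."""
    return [(i, j) for i in hits_w for j in hits_w2 if i != j and abs(i - j) <= distance]


# Hypothetical hit lists, as returned by two single-word searches.
hits_w = [0, 7]
hits_w2 = [1, 20]

both = boolean_and(hits_w, hits_w2)
adjacent = phrase(hits_w, hits_w2)
close = near(hits_w, hits_w2, 3)
```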
21Dealing with variable length words
- Pick a long enough fixed-size block
- A fixed padding is required
- Inefficient in space
- Support variable-length words by including the word length
- Instead of $W$, use $\langle l_W, W \rangle$, where $l_W$ is the length of $W$
- Move the pointer bit by bit
- Longer scan time, but space-efficient
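Both encodings can be illustrated on plaintext words before any encryption is applied. The sketch assumes zero-byte padding, a one-byte length field and byte (not bit) granularity; these details and the function names are illustrative choices, not taken from the slides.

```python
def to_fixed_blocks(word, n):
    """Option 1: pad short words and split long ones into n-byte blocks."""
    chunks = [word[i:i + n] for i in range(0, len(word), n)] or [b""]
    return [chunk.ljust(n, b"\x00") for chunk in chunks]


def with_length_prefix(word):
    """Option 2: encode <l_W, W> with a one-byte length field."""
    if len(word) > 255:
        raise ValueError("word too long for a one-byte length field")
    return bytes([len(word)]) + word


def split_stream(data):
    """Parse a concatenation of <l_W, W> records back into words."""
    words, pos = [], 0
    while pos < len(data):
        length = data[pos]
        words.append(data[pos + 1:pos + 1 + length])
        pos += 1 + length
    return words


short = to_fixed_blocks(b"cat", 8)
long_word = to_fixed_blocks(b"cryptography", 8)
stream = b"".join(with_length_prefix(w) for w in [b"cat", b"cryptography"])
parsed = split_stream(stream)
```

In the example, fixed blocks use 24 bytes for the two words while the length-prefixed stream uses 17. The price is that a server scanning the encrypted stream no longer knows where words start, so it has to try every offset.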
22Index-based Search
- For large database applications
- Index contains a list of keywords
- each keyword points to documents containing it
- Methods
- Encrypt keyword and leave pointers unencrypted
- Encrypt pointers also
- Alice queries the encrypted keyword, and Bob returns the encrypted pointers
- Alice needs to spend an extra round
- Updates are expensive
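The second method (encrypted keywords and encrypted pointers) might look like the following sketch. It assumes HMAC-SHA256 as the keyword-hiding function, an HMAC-based keystream for the pointer lists, and JSON for serializing document ids; none of these choices come from the slides, and the function names are hypothetical.

```python
import hashlib
import hmac
import json


def prf(key, data):
    return hmac.new(key, data, hashlib.sha256).digest()


def keystream(key, length):
    out, counter = b"", 0
    while len(out) < length:
        out += prf(key, counter.to_bytes(8, "big"))
        counter += 1
    return out[:length]


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def build_index(k, documents):
    """Alice builds the index: encrypted keyword -> encrypted list of document ids."""
    postings = {}
    for doc_id, words in documents.items():
        for word in set(words):
            postings.setdefault(word, []).append(doc_id)
    index = {}
    for word, doc_ids in postings.items():
        plain = json.dumps(sorted(doc_ids)).encode()
        pad = keystream(prf(k, b"ptr" + word.encode()), len(plain))
        index[prf(k, b"kw" + word.encode())] = xor_bytes(plain, pad)
    return index


def alice_lookup(k, index, word):
    """Round 1: Bob returns the encrypted pointers. Alice decrypts them locally."""
    blob = index.get(prf(k, b"kw" + word.encode()))
    if blob is None:
        return []
    pad = keystream(prf(k, b"ptr" + word.encode()), len(blob))
    return json.loads(xor_bytes(blob, pad))  # round 2 would fetch these documents


k = b"alice master secret (demo)"
documents = {1: ["budget", "report"], 2: ["budget"], 3: ["memo"]}
index = build_index(k, documents)
budget_docs = alice_lookup(k, index, "budget")
missing = alice_lookup(k, index, "salary")
```

Adding a document means re-encrypting the pointer list of every keyword it contains, which is one way to see why updates are costly.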
23Conclusion and Open Problems
- Pretty efficient
- No public key operation
- Small message expansion
- Interesting, and useful ?
- Open problems
- Searching Record > 13 ?@!
- Searching a[a-z]b needs 26 queries
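The 26-query count follows because each search matches one exact word, so a one-letter wildcard has to be expanded on the client side. A small sketch, with a hypothetical helper name:

```python
import string


def expand_single_wildcard(prefix, suffix):
    """Enumerate every word matching prefix + [a-z] + suffix."""
    return [prefix + letter + suffix for letter in string.ascii_lowercase]


queries = expand_single_wildcard("a", "b")  # one exact-word search per entry
```

Each entry would then be turned into its own search query, and the cost multiplies with every additional wildcard position.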