Title: StrangerDB: Safe Data Management with Untrusted Servers
1 StrangerDB: Safe Data Management with Untrusted Servers
- Dennis Shasha (shasha_at_cs.nyu.edu)
2 Goals
- Store private data in a public database: backup, concurrency control, and some query processing.
- Protect data from being observed (privacy).
- Make unauthorized modifications evident (safety).
- Force the server to deliver a consistent picture to all honest users or be discovered (consistency).
- Dishonest users have the same effect as users who enter bad data.
3 Methods
- Encryption per user/group for privacy.
- Signatures for tamper evidence.
- SUNDR-style [1] maintenance to detect inconsistent transaction orders.
- [1] "Building secure file systems out of Byzantine storage", David Mazieres and Dennis Shasha, Principles of Distributed Computing, 2002, pp. 108-117.
4 Database Setup for Privacy
- A record is a sequence of cleartext field values plus an encrypted part that may encompass one or more fields.
- The encryption key is known to one or more users.
- Decryption is done at the most public processor possible: the user's workstation or smartcard if the data is private to a user, or the workstation of a group member if decryption pertains to group-owned information.
- Encryption can be private key encryption.
5 Privacy-Related Optimization Problems
- There are classical optimization problems to be solved in this framework.
- Example: if some data is private to a user and other data belongs to a group, do I do the group processing first and then bring the result to the private workstation, or do I bring all the data to the private workstation right away?
6 Work Related to Privacy Considerations
- Hakan Hacigumus, Bala Iyer, Chen Li and Sharad Mehrotra, "Executing SQL over Encrypted Data in the Database-Service-Provider Model", ACM SIGMOD 2002, advocates a field-by-field encryption idea: they map queries to encrypted values. Sometimes encryption preserves order and sometimes not.
- Matthias Fischmann and Oliver Günther, "Privacy Tradeoffs in Database Service Architectures" (BIZSEC'03), points to security leaks in this model if the adversary can ask queries. Even if not, encrypted fields yield information about the number of distinct values and their distribution.
7 Work Related to Privacy Considerations
- "Hippocratic Databases", Rakesh Agrawal, Jerry Kiernan, Ramakrishnan Srikant, Yirong Xu, VLDB 2002. Argues that databases should provide mechanisms to preserve privacy, including properties like consent of the information donor, limited use, limited retention, etc.
- Encryption gives consent of the information donor only. Once I give you my key, I have little further control. Changing the key does prevent the recipient from learning new data.
8 Our Take on Privacy Considerations
- We are agnostic: you encrypt what you want and issue queries and updates to achieve your privacy goals.
- For purposes of this talk, a database designer knows exactly which information is revealed to non-owners: everything that is in the clear.
- Non-owners may not issue queries to or modify your data.
9 Tamper Evidence (safety)
- Every data item can be modified by exactly one user or group. A modifier signs a collision-resistant hash of the encrypted result of the data after modification.
- Note: this needs a trusted public repository of public signature keys (e.g. provided by a company security officer).
10 Collision-Resistant Hash Setup Inspired by Merkle Trees
[Figure: a hash tree. The root holds sgn_user(HASH(root), ptr); interior nodes hold HASH1, ptrs and HASH2, ptrs; the leaves are DATA RECORD 1, DATA RECORD 2, ..., DATA RECORD n.]
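A minimal sketch of the hash setup in the figure, assuming SHA-256 as the collision-resistant hash and a binary tree over the records. The function names and the leaf/interior prefix bytes are illustrative assumptions, not part of StrangerDB; in the scheme above the owner would sign the resulting root.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Collision-resistant hash (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).digest()

def merkle_root(records: list[bytes]) -> bytes:
    """Hash each record, then hash pairs of child hashes up to a single root."""
    if not records:
        return h(b"")
    level = [h(b"\x00" + r) for r in records]          # leaf hashes
    while len(level) > 1:
        if len(level) % 2:                              # odd count: duplicate the last hash
            level.append(level[-1])
        level = [h(b"\x01" + level[i] + level[i + 1])   # interior hashes
                 for i in range(0, len(level), 2)]
    return level[0]

records = [b"DATA RECORD 1", b"DATA RECORD 2", b"DATA RECORD 3"]
root = merkle_root(records)
# Changing any record changes the root, so a signature over the root covers all records.
tampered = merkle_root([b"DATA RECORD 1", b"DATA RECORD X", b"DATA RECORD 3"])
print(root != tampered)
```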
11 Malicious servers and forks
- If a user u accesses a database at time t, the user wants to be sure that the database is current as of time t.
- A malicious server might give u a database state reflecting only some previous updates.
- Users cannot prevent such forking attacks but would like to discover them quickly.
12 Underlying Strategy to ensure inter-user consistency
- Periodically, every pair of users exchange their ideas of global history. If one member of the pair has missed an update done by the other, the histories won't be consistent.
- For this to work in practice, we need some encoding of that global history.
13 Underlying Strategy: intuition
Bob: Alice, do you agree that Mary said X?
Alice: No way. I never heard that!
Bob: But here is Mary's signed statement to that effect.
Alice: Well, I guess the server is messing with history then.
14 Strawman Implementation: Log of Global Operations
- Imagine that we have a sequential log consisting of every transaction that ever hit the database and that this sequential log is signed by the user who issued the last transaction.
- Ex: order of transactions is T1 T2 T3 (done by users u1 u2 u3). Log after T3:
- sgn_u3(T3 sgn_u2(T2 sgn_u1(T1)))
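The nesting sgn_u3(T3 sgn_u2(T2 sgn_u1(T1))) can be sketched as follows. This is only an illustration: HMAC with made-up per-user keys stands in for a real public-key signature, and the "|" separator and the function names are assumptions.

```python
import hashlib
import hmac

# Hypothetical per-user keys; HMAC stands in for a real public-key signature.
KEYS = {"u1": b"key-1", "u2": b"key-2", "u3": b"key-3"}

def sgn(user: str, payload: bytes) -> bytes:
    """Stand-in for sgn_user(payload): the payload followed by a 32-byte tag."""
    tag = hmac.new(KEYS[user], payload, hashlib.sha256).digest()
    return payload + tag

def append(log: bytes, user: str, txn: bytes) -> bytes:
    """New log = sgn_user(txn followed by the previous signed log)."""
    return sgn(user, txn + b"|" + log)

def verify_outer(log: bytes, user: str) -> bool:
    """Check the outermost signature, i.e. that of the last transaction's user."""
    payload, tag = log[:-32], log[-32:]
    expected = hmac.new(KEYS[user], payload, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)

log = b""
for user, txn in [("u1", b"T1"), ("u2", b"T2"), ("u3", b"T3")]:
    log = append(log, user, txn)

print(verify_outer(log, "u3"))
```

Because each signature covers the whole earlier log, changing any earlier transaction invalidates every later signature.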
15 Ensuring Individual Consistency
- Log is held by the untrusted server.
- Every time a user appends to and signs the log, the user first checks that the log he/she previously signed is a prefix of the log to be signed.
- Ex: if u is about to commit transaction T' and has previously committed T, then u makes sure that the log now contains as a prefix the log from the time T was committed.
16 Individual Log Consistency
Bob: In my previous update, the log had (in left-to-right order) Talice1, Tbob1. Now it has Talice1, Tbob1, Tmary1. So my previous view of the log was a prefix of the current one. OK, so I'll append my new transaction: Talice1, Tbob1, Tmary1, Tbob2.
Alice hears nothing of all this.
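Bob's check can be sketched in a few lines, representing a log simply as a list of transaction names (signatures are left out here, and the function names are illustrative assumptions):

```python
def is_prefix(old: list[str], new: list[str]) -> bool:
    """True if the log a user previously signed is a prefix of the current log."""
    return new[:len(old)] == old

def append_if_consistent(previously_signed: list[str],
                         current: list[str], txn: str) -> list[str]:
    """Append txn only when the earlier view is a prefix of what the server shows now."""
    if not is_prefix(previously_signed, current):
        raise ValueError("server presented a log that drops or reorders history")
    return current + [txn]

bob_previous = ["Talice1", "Tbob1"]
server_now = ["Talice1", "Tbob1", "Tmary1"]
bob_new = append_if_consistent(bob_previous, server_now, "Tbob2")
print(bob_new)
```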
17 Ensuring Global Consistency by detecting forking attacks
- Periodically, users exchange their ideas of the global log. Each user verifies:
- The signatures of all the users.
- Whether one global history is a prefix of the other or not.
- If there has been a forking attack and u1 has not seen a transaction of u2, then neither user's log will be a prefix of the other.
18 Global Log Consistency
Bob: Alice, here is the log of all transactions as I see it: Talice1, Tbob1, Tmary1, Tbob2.
Alice: That's funny. Here is my log: Talice1, Tbob1, Tjill1. It is not a prefix of yours because I have Jill's transaction, but yours is not a prefix of mine because you have Mary's transaction.
Bob: Are all the signatures good?
Alice: Absolutely. See for yourself. Server is being naughty again.
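The comparison Bob and Alice perform can be sketched as below, assuming the signatures on both logs have already been verified. The function names are illustrative.

```python
def is_prefix(a: list[str], b: list[str]) -> bool:
    """True if log a is a prefix of log b."""
    return b[:len(a)] == a

def fork_detected(log_a: list[str], log_b: list[str]) -> bool:
    """A fork is evident when neither signed log is a prefix of the other."""
    return not is_prefix(log_a, log_b) and not is_prefix(log_b, log_a)

bob_log = ["Talice1", "Tbob1", "Tmary1", "Tbob2"]
alice_log = ["Talice1", "Tbob1", "Tjill1"]
print(fork_detected(bob_log, alice_log))
```

A user who is merely behind (their log is a prefix of the other) does not trigger the check; only diverging histories do.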
19 Semantic Objection
- Server might fail to update the data but assert that the transactions are executed in the same global order.
- Fix: associate with each transaction a collision-resistant hash of the state of the whole database. Call these h1, h2, h3:
- sgn_u3(T3 h3 sgn_u2(T2 h2 sgn_u1(T1 h1)))
- Transactions verify the global hash upon data access.
20 Hashes of all the data
Bob: Alice, the log says you were the last to execute, yet when I perform a collision-resistant hash of the database, the result is not consistent with that hash.
Alice: It's lucky I signed the hash that I placed in the log. That shows the state of the database I think is present.
Bob: Darn server has been changing the data again!
21 Practical objection: space grows without bound
- In this log-based (strawman) implementation, each user keeps a log of all transactions ever done!
- Alternative is to have each user update his/her version for every access (even read-only access).
- A version structure is basically a set of user-version pairs plus a hash of the data of the signer.
- Space per version structure is proportional to the number of users N. Because each user needs to keep the latest version structure for each user, the total space per user is N^2.
22 Version Structure Detail
- Suppose user u creates the last version structure. Then u increments his/her version number (and no other) and signs the structure, which contains sgn_u(hash of data owned by u, (u1, n1), (u2, n2), ...) where (ui, ni) means ui is at version ni.
- From now on, call the hash of the data owned by u hash(udata).
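One possible in-memory shape for a signed version structure is sketched below. HMAC with made-up keys stands in for a real signature scheme, and the class, field and function names are assumptions for illustration only.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass

KEYS = {"Bob": b"bob-key", "Alice": b"alice-key"}  # hypothetical signing keys

@dataclass(frozen=True)
class VersionStructure:
    signer: str
    data_hash: str     # hash(udata) of the signer's data
    versions: tuple    # ((user, n), ...) sorted by user name
    signature: str

def _payload(signer: str, data_hash: str, versions: dict) -> bytes:
    """Deterministic byte encoding of everything the signature covers."""
    return json.dumps([signer, data_hash, sorted(versions.items())]).encode()

def sign_vs(signer: str, data: bytes, versions: dict) -> VersionStructure:
    data_hash = hashlib.sha256(data).hexdigest()
    sig = hmac.new(KEYS[signer], _payload(signer, data_hash, versions),
                   hashlib.sha256).hexdigest()
    return VersionStructure(signer, data_hash, tuple(sorted(versions.items())), sig)

def verify_vs(vs: VersionStructure) -> bool:
    expected = hmac.new(KEYS[vs.signer],
                        _payload(vs.signer, vs.data_hash, dict(vs.versions)),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(vs.signature, expected)

vs = sign_vs("Bob", b"bob rows", {"Bob": 6, "Alice": 12, "Bill": 4})
print(verify_vs(vs))
```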
23 Basic Properties of version structures
- Because of the signature, the server cannot forge a version structure.
- Because of the collision-resistant hash, each user's data can be verified to be what that user intended.
- Each user maintains a version structure list of the most recent version structure from each user.
24 Use of Version Structure List
Bob: Alice, according to the version structure list you were the last to execute, yet when I perform a collision-resistant hash of the database, the result is not consistent with that hash.
Alice: It's lucky I signed the hash that I placed in the version structure list. That shows the state of my data I think is present.
Bob: Darn server has been changing the data again!
25 Three incrementally related version structures
Bob: sgn_Bob(hash(Bobdata), (Bob, 6), (Alice, 12), (Bill, 4))
Alice: sgn_Alice(hash(Alicedata), (Bob, 6), (Alice, 13), (Bill, 4))
Bob: sgn_Bob(hash(Bobdata), (Bob, 7), (Alice, 13), (Bill, 4))
26 Ordering Properties of version structures
- Define a partial order on version structures: vs1 < vs2 if the users in vs1 are a subset of the users in vs2 and for every user u in vs1, the version of u in vs1 (denoted vs1[u]) is less than or equal to vs2[u], and for at least one user v, vs1[v] < vs2[v].
- We say vs1 is incrementally less than vs2 if there exists a u such that u signs vs2, vs2[u] = vs1[u] + 1, and for all v, if v != u then vs2[v] = vs1[v].
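The two relations can be written down directly, treating a version structure as a dict from user to version number. This is a sketch: the function names are illustrative, the "at least one user v" is taken over the users of vs1, and the incremental relation assumes both structures list the same users.

```python
def lt(vs1: dict, vs2: dict) -> bool:
    """vs1 < vs2 in the partial order on version structures."""
    if not set(vs1) <= set(vs2):                 # users of vs1 must be a subset
        return False
    if any(vs1[u] > vs2[u] for u in vs1):        # every version <= its counterpart
        return False
    return any(vs1[v] < vs2[v] for v in vs1)     # and at least one strictly smaller

def incrementally_less(vs1: dict, vs2: dict, signer_of_vs2: str) -> bool:
    """Only the signer of vs2 advanced, and by exactly one version."""
    u = signer_of_vs2
    return (set(vs1) == set(vs2) and u in vs2
            and vs2[u] == vs1[u] + 1
            and all(vs2[v] == vs1[v] for v in vs1 if v != u))

bob6 = {"Bob": 6, "Alice": 12, "Bill": 4}
alice13 = {"Bob": 6, "Alice": 13, "Bill": 4}
bob7 = {"Bob": 7, "Alice": 13, "Bill": 4}
print(incrementally_less(bob6, alice13, "Alice"), incrementally_less(alice13, bob7, "Bob"))
```

Note that bob6 < bob7 holds, but bob6 is not incrementally less than bob7, because Alice's version also changed in between.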
27 Version Structure Construction
- User u forms its new version structure vs_u as follows: u first examines the previous version structure that u signed, vs_u_old, and sets vs_u[u] = vs_u_old[u] + 1.
- Next u examines the last version structure vs_v signed by each other user v and sets vs_u[v] = vs_v[v].
- In this way u creates a version structure that reflects the last signed version of every user.
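The construction can be sketched as a small function over dicts from user to version number. The function and parameter names are illustrative, and Bill's structure in the example is made up.

```python
def build_version_structure(u: str, vs_u_old: dict, last_signed: dict) -> dict:
    """last_signed maps each other user v to the last version structure v signed."""
    vs_u = {u: vs_u_old[u] + 1}        # u advances its own version by one
    for v, vs_v in last_signed.items():
        if v != u:
            vs_u[v] = vs_v[v]          # copy v's own version from v's last structure
    return vs_u

bob_old = {"Bob": 6, "Alice": 12, "Bill": 4}
last_signed = {
    "Alice": {"Bob": 6, "Alice": 13, "Bill": 4},
    "Bill": {"Bob": 5, "Alice": 11, "Bill": 4},   # hypothetical older structure
}
bob_new = build_version_structure("Bob", bob_old, last_signed)
print(bob_new)
```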
28 Signing Verification Protocol Part I
- When a user u is ready to sign the version structure vs_u constructed as above, u checks that:
1. The highest version number for every user v is in the last version structure vs_v signed by v. (Other version structures may have the same highest version number for v as well, but they may not exceed vs_v[v].)
2. There is some ordering such that each version structure in the list is incrementally less than the next one on the list and vs_u is the greatest.
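A sketch of the two checks, under simplifying assumptions: the structures the server shows are given as (signer, versions) pairs, signatures are already verified, and all structures list the same users. Since each incremental step raises the sum of version numbers by exactly one, sorting by that sum recovers the only ordering that could work. The names are illustrative.

```python
def incrementally_less(vs1: dict, vs2: dict, signer2: str) -> bool:
    """Only signer2's version differs between vs1 and vs2, and it grew by one."""
    return (set(vs1) == set(vs2) and signer2 in vs2
            and vs2[signer2] == vs1[signer2] + 1
            and all(vs2[v] == vs1[v] for v in vs1 if v != signer2))

def part_one_ok(shown: list, u: str, vs_u: dict) -> bool:
    """shown: (signer, versions) pairs presented by the server; vs_u: u's candidate."""
    chain = sorted(shown + [(u, vs_u)], key=lambda sv: sum(sv[1].values()))
    if chain[-1] != (u, vs_u):
        return False                                   # vs_u must be the greatest
    for (_, a), (s2, b) in zip(chain, chain[1:]):      # check 2: incremental chain
        if not incrementally_less(a, b, s2):
            return False
    last_by = {signer: vs for signer, vs in chain}     # later entries overwrite earlier
    return all(n <= last_by[v][v]                      # check 1: nobody exceeds vs_v[v]
               for _, vs in chain for v, n in vs.items() if v in last_by)

bob6 = {"Bob": 6, "Alice": 12, "Bill": 4}
alice13 = {"Bob": 6, "Alice": 13, "Bill": 4}
bob7 = {"Bob": 7, "Alice": 13, "Bill": 4}
print(part_one_ok([("Bob", bob6), ("Alice", alice13)], "Bob", bob7))
```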
29 Signing Verification Protocol Part II
- The set of all data belonging to a user v is hashed to a value hash(vdata) as of the last signed version structure of v, vs_v.
- When user u reads v's data, it checks v's data against hash(vdata) to verify that v's data hasn't been changed since the signing of vs_v.
- If both parts of the protocol succeed, then u signs the version structure vs_u and commits the transaction.
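The read-time check in Part II amounts to recomputing the hash of v's data and comparing it with the signed value. A minimal sketch, assuming SHA-256 over length-prefixed rows (a flat hash rather than a tree, for brevity); the function names are illustrative.

```python
import hashlib

def hash_data(rows: list[bytes]) -> str:
    """hash(vdata): a hash over all of v's rows, each prefixed by its length."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(len(row).to_bytes(8, "big"))
        digest.update(row)
    return digest.hexdigest()

def read_verified(rows_from_server: list[bytes], signed_hash: str) -> list[bytes]:
    """Return the rows only if they match the hash in v's last signed version structure."""
    if hash_data(rows_from_server) != signed_hash:
        raise ValueError("data changed since the version structure was signed")
    return rows_from_server

alice_rows = [b"row-1", b"row-2"]
signed_hash = hash_data(alice_rows)       # what Alice placed in vs_Alice
print(read_verified(alice_rows, signed_hash) == alice_rows)
```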
30 Signing Verification Protocol
Bob: Alice, here's the drill. You issue a transaction. It accesses data from many people. You check that the data you have read from each person is consistent with the signed hash on his/her last version structure.
Alice: How do I know it's that person's last version structure?
Bob: Good question. You check that all the version structures are incrementally related to one another. You are checking that the server is consistent in what it tells you.
Alice: That's not enough, is it?
Bob: No, but forking will leave traces of guilt.
31 Forking attacks on honest clients create incomparable version structures
- If the server fails to show user v the version structure vs_u produced by user u, the version structure that v will sign, call it vs_v, will have the property vs_v[u] < vs_u[u]. Once v signs, vs_v[v] > vs_u[v].
- So vs_u and vs_v will be unordered by <.
- The signing verification protocol will still succeed. So we need a global protocol.
32 Forking creates incomparable version structures
Bob: sgn_Bob(hash(Bobdata), (Bob, 6), (Alice, 12), (Bill, 4))
Alice: sgn_Alice(hash(Alicedata), (Bob, 6), (Alice, 13), (Bill, 4)). Server forks and doesn't show Bob this.
Bob: sgn_Bob(hash(Bobdata), (Bob, 7), (Alice, 12), (Bill, 4)). Now Bob's and Alice's last version structures are incomparable, i.e. unordered by <.
33 Version Structure Exchange I
- Users periodically perform a global version structure exchange protocol. Let us say that such a protocol begins at global time t. Every user u sends the most recent version structure that u signed before time t to every other user. Call that vs_u.
- When a user v receives vs_u from user u, then user v performs a well-formedness test: v compares its most recent version structure signed before t, call it vs_v, with vs_u. They should be ordered by <, with vs_v[v] >= vs_u[v] and vs_v[u] <= vs_u[u].
34 Version Structure Exchange II
- If v performs a well-formedness test for every user in U and the version structures from those users are all ordered by <, then v declares those version structures to be all well-formed.
- If every user v in some set of users U declares the version structures it receives from users U to be well-formed, then the global structure exchange is said to succeed for U.
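A sketch of the exchange check, treating version structures as dicts from user to version number and assuming signatures are already verified. It reads the per-user comparisons in the well-formedness test as non-strict (each user knows its own version at least as well as anyone else does), which is an interpretation rather than something stated explicitly; the function names are illustrative.

```python
def lt(a: dict, b: dict) -> bool:
    """a < b: same or fewer users, no version larger, at least one smaller."""
    return (set(a) <= set(b) and all(a[k] <= b[k] for k in a)
            and any(a[k] < b[k] for k in a))

def well_formed(v: str, vs_v: dict, u: str, vs_u: dict) -> bool:
    """Test performed by v on the structure vs_u received from u."""
    ordered = lt(vs_v, vs_u) or lt(vs_u, vs_v)
    return ordered and vs_v[v] >= vs_u[v] and vs_v[u] <= vs_u[u]

def exchange_succeeds(latest: dict) -> bool:
    """latest maps each user in U to the structure it signed before time t."""
    return all(well_formed(v, latest[v], u, latest[u])
               for v in latest for u in latest if u != v)

no_fork = {"Alice": {"Bob": 6, "Alice": 13, "Bill": 4},
           "Bob": {"Bob": 7, "Alice": 13, "Bill": 4}}
forked = {"Alice": {"Bob": 6, "Alice": 13, "Bill": 4},
          "Bob": {"Bob": 7, "Alice": 12, "Bill": 4}}
print(exchange_succeeds(no_fork), exchange_succeeds(forked))
```

In the forked case Bob's structure is missing Alice's version 13 while Alice's is missing Bob's version 7, so neither is less than the other and the exchange fails.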
35 Global Version Exchange Protocol
Bob: Alice, from time to time, a global version exchange protocol begins. Let's say an instance of the protocol starts at time t. Every user sends its latest version structure preceding t to all other users. Sending is done without mediation by the server.
Alice: Then what?
Bob: Each user checks that the version structures are well-formed.
Alice: What if some user does not send?
Bob: No problem. Validate the ones that do send.
36 Version Structures and Serializability
- Serializability will be based on version structure order.
- That is, transactions will serialize in the < order of version structures.
37 Role of concurrency control in correctness
- Locking is merely a heuristic that the server uses to delay transactions and therefore to give a serial order to version structures.
- If the server cheats and allows accesses that violate locks, then the version structures won't be ordered by <.
- This is later caught by the signing verification protocol or the global exchange.
38 The interesting case of multiversion read consistency
- Effectively, a multiversion read consistent transaction should make its version structure reflect its start time. So, the user associated with such a transaction signs its version structure when it starts, then starts reading.
- If that transaction never commits (because some data has changed and the transaction detects this by looking at a hash), there is no damage because the database won't change and the application issuing the transaction will receive a failure, as it should.
39 Proof Strategy
- If all users are honest, but the server may not be, then the theorems are not that hard to prove.
- If some users could be dishonest, then we could have major problems, e.g. they could corrupt the data. But this is like any data corruptor.
- So, we quarantine them in our proofs: we concern ourselves only with honest users having no data dependency on dishonest ones.
- We call those virtuous users.
40 Serializability Lemma
- Lemma: If T1 -> T2 (conflict edge from T1 to T2), vs1 is the version structure signed by user u1 for T1, vs2 is the version structure signed by user u2 for T2, and all version structures among some set of virtuous users U including u1 and u2 are ordered, then vs1 < vs2.
- Proof: Suppose user u1 issues T1 and user u2 issues T2. For any conflict, there is some data item x such that op1(x) precedes op2(x).
41 Lemma continued
- write-read: there is an x such that W1(x) precedes R2(x). Therefore R2(x) must occur after vs1 has been signed by u1, because u2 will verify hash(u1data), so vs1[u1] <= vs2[u1]. Moreover, the temporal ordering implies that vs1[u2] < vs2[u2]. Finally, because vs1 and vs2 are ordered, vs1 < vs2.
- write-write: very similar to the write-read case.
- read-write: there exists an x such that R1(x) precedes W2(x). Therefore vs1[u2] < vs2[u2]. Otherwise R1(x) would either read from W2(x) or from some later value. Because version structures are ordered, vs1 < vs2. Done.
42 Total Ordering Lemma
- Lemma: Suppose the global version structure exchange begins at t and ends successfully at t' for some set of virtuous users U. Assuming every user in U has been following the signing verification protocol up to time t, then all version structures among U are ordered up to time t.
43 Total Ordering Lemma I
- Prove the contrapositive: consider version structures vs1 signed by user u1 and vs2 signed by user u2 before time t, where both u1 and u2 belong to U, such that vs1[u1] > vs2[u1] and vs1[u2] < vs2[u2].
- That is, the version structures are incomparable.
- Then either the signing verification protocol of some user or the global version structure exchange that begins at time t will be unsuccessful.
44 Total Ordering Lemma II
- Consider the next version structure signed by u1, call it vs1'. At that moment u1 will know that there has been a fork if u1 sees vs2 during the signing verification protocol (because vs1[u2] < vs2[u2]). So, assume the server will not show vs2 to u1 and hence vs1' > vs2 will be false.
45 Total Ordering Lemma III
- If the server's forking is not yet discovered, then vs1' and vs2 are unordered, so the argument of the last slide holds for any subsequent version structure vs1'' signed by u1. Symmetrically, no subsequent version structure signed by u2, such as vs2', will have the property that vs2' > vs1.
- Therefore when the global version structure exchange occurs, u1 and u2 will discover a lack of well-formedness. Done.
46 Notes on Implementation
- Server avoids being framed
- Concurrency control
- Version structure commits: how to make them efficient?
- Supporting cryptographic assumptions and global verification protocol
- View maintenance
- Read-write asymmetry
- Indexing
47 Server is framed
Bob (good): sgn_Bob(hash(Bobdata), (Bob, 6), (Alice, 12), (Bill, 4))
Alice (good): sgn_Alice(hash(Alicedata), (Bob, 6), (Alice, 13), (Bill, 4)). Server shows this to Bob. No fork.
Bob (bad): sgn_Bob(hash(Bobdata), (Bob, 7), (Alice, 12), (Bill, 4)). Bob pretends server has forked. Upon global exchange, server is framed.
48 Server can avoid being framed
- Server signs the version structures from users if it agrees they are legitimate. In the case of the previous figure, the server will refuse to sign Bob's second version structure.
- Bob and the server can present their evidence before the security officer.
49 Server Proves Innocence
Bob (good): sgn_server(sgn_Bob(hash(Bobdata), (Bob, 6), (Alice, 12), (Bill, 4)))
Alice (good): sgn_server(sgn_Alice(hash(Alicedata), (Bob, 6), (Alice, 13), (Bill, 4))). Server shows this to Bob. No fork.
Bob (bad): sgn_Bob(hash(Bobdata), (Bob, 7), (Alice, 12), (Bill, 4)). Server refuses to sign. Shows that Bob is bad.
50 Concurrency Control
- Accessing v's data can be done by locking hash(vdata).
- To increase concurrency, partition v's rows into k parts, each with its own hash. The user u would then write k hash values, one per part of its data, in the version structure.
- Transactions will lock the appropriate hash values.
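A sketch of the partitioned hashing, assuming rows are keyed by integer row id and assigned to a part by row id modulo k. Both the assignment rule and the function name are assumptions made for illustration.

```python
import hashlib

def partition_hashes(rows: dict[int, bytes], k: int) -> list[str]:
    """Hash a user's rows in k independent parts; part index = row id mod k."""
    parts = [hashlib.sha256() for _ in range(k)]
    for row_id in sorted(rows):                       # fixed order within each part
        digest = parts[row_id % k]
        digest.update(row_id.to_bytes(8, "big"))
        digest.update(len(rows[row_id]).to_bytes(8, "big"))
        digest.update(rows[row_id])
    return [p.hexdigest() for p in parts]

rows = {0: b"a", 1: b"b", 2: b"c", 3: b"d"}
before = partition_hashes(rows, k=2)
rows[3] = b"d-updated"                                # touches part 1 only
after = partition_hashes(rows, k=2)
print(before[0] == after[0], before[1] != after[1])
```

A transaction that updates row 3 would then only need to lock the hash of part 1, leaving part 0 available to others.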
51 Making Version Structures Fast
- Verifying, signing, and sending in a version structure may take time.
- The verification protocol involves many version structures, so it is the expensive part.
- To make this faster, the server sends signed version structures as they appear to users that have subscribed to this list, so most computation can be done asynchronously.
- A user sends in its signed version structure when the user is ready and all is verified.
52 Controlling Version Structure Size
- As it stands, as the number of users increases, the size of storage increases as the square.
- Fortunately, it is seldom the case that so many people trust one another.
- It is fine to have several subpopulations, each with shared version structure lists.
- Subpopulations could overlap but at some cost to commits.
53 Implementation of Cryptographic Assumptions
- Public key infrastructure to hold public signature keys.
- Smartcards are used to authenticate the user and perhaps as a final private processing node.
- If attached to a cell phone, smartcards can be used for the server-independent communication in the global version exchange protocol.
54 View Maintenance
- Deferred view maintenance (or equivalently triggers) is usually done by the server on behalf of a transaction. In this case it must be done by a user that has the right to modify the data.
- If user u changes private data but sends a summary of those updates to a user v who maintains that view, then u must use a public key of v to do so.
55 Read-Write Asymmetry Using Public Key Cryptography
- Current use case is that each user or group wants to implement its own data.
- Sometimes, as in view maintenance, one user wants to share some data with another (e.g. between patient and hospital).
- Use an ssh-style protocol with public keys. Data is owned by the patient but read by the hospital.
56 Indexes
- Suppose a user wants not all of his/her data but only the people in some subset (e.g. in his/her department).
- Indexes are simply Merkle-tree-like structures. So start at the root, get the next page, verifying the contents based on the hash. Decrypt that page and proceed.
- Modifications propagate back up the tree.
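A sketch of the verified index walk, under simplifying assumptions: pages are JSON-serializable dicts held in an untrusted store, each interior page lists (upper bound key, child id, child hash) entries, and the decryption step is omitted. The page layout and all names are hypothetical.

```python
import hashlib
import json

def page_hash(page: dict) -> str:
    """Hash of a page's canonical JSON encoding."""
    return hashlib.sha256(json.dumps(page, sort_keys=True).encode()).hexdigest()

def lookup(store: dict, root_id: str, root_hash: str, key: str):
    """Walk root to leaf, checking each fetched page against the hash its parent holds."""
    page_id, expected = root_id, root_hash
    while True:
        page = store[page_id]                      # fetched from the untrusted server
        if page_hash(page) != expected:
            raise ValueError(f"page {page_id} fails its hash check")
        if page["leaf"]:
            return page["rows"].get(key)           # row ids for this key, or None
        for bound, child_id, child_hash in page["children"]:
            if key <= bound:                       # first child whose range covers key
                page_id, expected = child_id, child_hash
                break
        else:
            return None                            # key beyond every child's range

leaf_a = {"leaf": True, "rows": {"ann": [1], "bob": [2]}}
leaf_b = {"leaf": True, "rows": {"eve": [3]}}
root = {"leaf": False, "children": [["c", "A", page_hash(leaf_a)],
                                    ["z", "B", page_hash(leaf_b)]]}
store = {"R": root, "A": leaf_a, "B": leaf_b}
trusted_root_hash = page_hash(root)                # would come from a signed version structure
print(lookup(store, "R", trusted_root_hash, "eve"))
```

After a modification, the changed leaf's hash must be rewritten in its parent, and so on up to the root, which is why updates travel back up the tree.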
57 StrangerDB achieves
- Use virtues of servers:
- Reliability, availability, historical backup.
- But avoid their vices: they might cheat.
- Encryption protects privacy.
- Signatures make tampering evident.
- Histories/version structures plus global exchange make forking evident.
- Result: serializability, no forks, privacy.
58 Questions
- Questions and criticism welcome.
- If I've missed a significant reference, please let me know now or by email.
- Thank you!
59 How to Access Data
- The data available to a user u is the data for which that user has a decryption key.
- For each user v whose data u has access to, u fetches that data using the index, beginning with the pointer associated with hash(vdata).
- Decrypt the part of the index needed (i.e. decrypt the root, then the necessary child, etc.).
- This gives a set of row ids.
- Fetch the rows with those row ids from the database.
60 Modifying data
- User u modifies certain rows that u owns.
- Reflects modification in the indexes.
- Encrypts.
- Updates hash(udata) as appropriate.
- Does signing verification protocol.
- Commits the transaction.