Title: SUNDR: Secure Untrusted Data Repository
1 SUNDR: Secure Untrusted Data Repository
- Jinyuan Li N.Y.U.
- Maxwell Krohn M.I.T.
- David Mazières N.Y.U.
- Dennis Shasha N.Y.U.
- Slides modified for CS739
OSDI 2004
2 Motivation
- File system integrity is critical
- sourceforge.net: 90,000 projects, including kernel projects
3 Goal
- Prevent undetected tampering with your files!
4 Current approaches
- Trust system administrator to do a good job
- Keep up with latest security patches
- Restrict accesses as much as possible
5 Not always reliable
6 SUNDR's approach
- SUNDR is a network file system designed to run on an untrusted, or even compromised, server
- Place trust in the users authorized to modify particular files, not in the server or the admins maintaining the server
- A malicious user who gains control of the server can't convince clients to accept altered contents, because that user lacks write permission
- SUNDR properties
- Unauthorized operations will be immediately detected
- By whom??
- If the server drops operations, it can be caught eventually
- What doesn't SUNDR protect against?
7 Talk Outline
- Motivation
- Design
- A strawman file system
- SUNDR design
- Implementation
- Evaluation
8 Setting (New Slide)
- Separates administration of servers from administration of file systems
- Single server hosts multiple file systems
- Create file system
- Create superuser with public/private key pair
- Server knows only the public key
- Users
- Public/private key pair
- When moving between clients, remember private key and last operation
9 Ideal File system semantics
- File system calls can be mapped to fetch/modify operations
- Fetch: client downloads new data
- Modify: client makes a new change visible to others
- In their implementation, when does the client call fetch and modify?
- Fetch-modify consistency: a fetch reflects exactly the modifications that happen before it
- Impossible without online trusted parties
- Goal: get as close as possible to fetch-modify consistency without online trusted parties
10 Strawman File System
A: echo A was here >> /share/aaa
File server
A Modify f2 sig3
B: cat /share/aaa
B Fetch f2 sig4
11 An ordering relation
A's latest log: LogA
B's latest log: LogB
We define the relation LogA ≤ LogB iff LogA is a prefix of LogB
12 Detecting attacks by the server
File server
A Modify f2 sig3
B: cat /share/aaa
(stale result!)
B Fetch f2 sig3
13 Detecting attacks by the server
A's latest log
LogA
sig1
sig2
A's log and B's log can no longer be ordered: LogA ≰ LogB, LogB ≰ LogA
sig3a
B's latest log
A Modify f1 sig1
B Fetch f4 sig2
B Fetch f2 sig3b
LogB
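The prefix relation on logs, and the fork it exposes, can be sketched in a few lines. This is only an illustration: the `Entry` tuple layout and the function names are assumptions, and signatures are left out entirely.

```python
from typing import Sequence, Tuple

# Hypothetical log entry: (user, operation, file). Signatures are omitted.
Entry = Tuple[str, str, str]


def is_prefix(log_a: Sequence[Entry], log_b: Sequence[Entry]) -> bool:
    """Return True iff log_a <= log_b, i.e. log_a is a prefix of log_b."""
    return len(log_a) <= len(log_b) and list(log_b[:len(log_a)]) == list(log_a)


def comparable(log_a: Sequence[Entry], log_b: Sequence[Entry]) -> bool:
    """Two views of one honest history must be ordered one way or the other."""
    return is_prefix(log_a, log_b) or is_prefix(log_b, log_a)


shared = [("A", "modify", "f1"), ("B", "fetch", "f4")]
log_a = shared + [("A", "modify", "f2")]
# Honest server: B's fetch is appended after A's modify.
log_b_honest = log_a + [("B", "fetch", "f2")]
# Malicious server: A's modify of f2 is hidden from B, so the logs diverge.
log_b_forked = shared + [("B", "fetch", "f2")]

print(comparable(log_a, log_b_honest))
print(comparable(log_a, log_b_forked))
```

Once the two logs diverge, neither is a prefix of the other, so A and B can detect the fork as soon as they compare logs.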
14 Properties of Strawman FS
- What problems exist with the strawman approach?
- High overhead (bandwidth to send history), no concurrency
- What can't a bad server do?
- A bad server can't make up operations users didn't perform
- Can't corrupt data
- What can a bad server do?
- A bad server can conceal users' operations from each other; however, this will be detected if users check with each other
- Call this property fork consistency
- Can refuse to serve data
15 Fork Consistency: A tale of two worlds
File Server
A's view
B's view
16 Implications of fork consistency
- Closest possible consistency to fetch-modify without online trusted parties
- Can be leveraged with online trusted parties to detect violations of fetch-modify consistency
- What are different ways this detection can be done?
- Users periodically gossip to check violations
- How do you find out about other users?
- Or deploy a trusted online timestamp box
- Periodically writes to one globally readable file
- Possible?
- ABTATA
- ABTBTB
17 Talk Outline
- Motivation
- Design
- Strawman FS
- SUNDR approach
- Implementation
- Evaluation
18 SUNDR architecture
SUNDR server-side
SUNDR client
userA
Untrusted Network
block server
userB
consistency server
SUNDR client
- block server stores blocks retrievable by content-hash
- consistency server orders all events
19 SUNDR data structures (I)
- Strawman FS problem
- High bandwidth and storage consumption
- Need to reconstruct the whole file system
- Don't ship the entire history; instead ship a digest of the fs state
- Each file is writable by one user or group
- Partition inodes by allowed writer
- Hash each partition down to a 20-byte digest (I-handle)
- Fast way to check that all files covered by the hash are correct
20 Hash tree (1): File handle
- Each file is hashed into a 20-byte value using a hash tree
- Blocks are stored and indexed by their content-hash on the block server
data1
i-node
data2
Metadata
20-byte File Handle (I-hash)
H(data1)
data3
iblk1
H(data2)
H(data3)
H(iblk1)
data4
H(data4)
What is this?
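A content-addressed block store and a one-level hash tree can be sketched as follows. This is a simplification under stated assumptions: SHA-1 is chosen only because it yields 20-byte digests, the i-node is reduced to metadata plus a flat list of block hashes (no indirect blocks such as iblk1), and `BlockStore`, `store_file`, and `read_file` are hypothetical names.

```python
import hashlib
from typing import Dict

BLOCK_SIZE = 4   # hypothetical, tiny block size to keep the example readable
HASH_LEN = 20    # SHA-1 digests are 20 bytes


class BlockStore:
    """Toy block server: blocks are stored and fetched by content hash."""

    def __init__(self) -> None:
        self.blocks: Dict[bytes, bytes] = {}

    def put(self, block: bytes) -> bytes:
        h = hashlib.sha1(block).digest()
        self.blocks[h] = block
        return h

    def get(self, h: bytes) -> bytes:
        block = self.blocks[h]
        # The server is untrusted, so the client re-checks every block.
        if hashlib.sha1(block).digest() != h:
            raise ValueError("block does not match its content hash")
        return block


def store_file(store: BlockStore, metadata: bytes, data: bytes) -> bytes:
    """Store a file and return its 20-byte handle (the hash of its i-node)."""
    hashes = [store.put(data[i:i + BLOCK_SIZE])
              for i in range(0, len(data), BLOCK_SIZE)]
    inode = metadata + b"".join(hashes)  # simplified i-node, an assumption
    return store.put(inode)


def read_file(store: BlockStore, handle: bytes, metadata_len: int) -> bytes:
    """Fetch and verify the i-node, then every data block it names."""
    body = store.get(handle)[metadata_len:]
    return b"".join(store.get(body[i:i + HASH_LEN])
                    for i in range(0, len(body), HASH_LEN))


store = BlockStore()
handle = store_file(store, b"meta", b"data1data2")
print(read_file(store, handle, 4))
```

Because the handle covers the i-node and the i-node covers each data block, a client holding only the 20-byte handle can verify everything the server returns.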
21 Hash tree (2): FS digest
- Hash all files writable by each user/group to a 20-byte digest
- From this digest, client can retrieve and verify any block of any file (SFSRO, CFS, Pond)
i-node 2
i-table
i-num
I-handle
20-byte digest
i-node 3
i-node 4
22 Hash tree (3): Directories (New Slide)
- What do directory blocks contain?
- Maps name to <user/group owner, I-number>
- Not I-hash --> must use I-table
- Why not point to the I-node directly?
- Would have to modify directories (recursively) when file data changes
- If a data block changes, what structures change?
- Contents of one I-node, contents of I-table, I-handle (not directory)
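The effect of the I-table indirection can be shown with a small sketch. The byte encodings, the flat hash over the I-table (a real system would likely use a hash tree), and the helper names `H` and `i_handle` are all assumptions made for illustration.

```python
import hashlib
from typing import Dict


def H(data: bytes) -> bytes:
    """20-byte hash; SHA-1 is an assumption chosen for its digest size."""
    return hashlib.sha1(data).digest()


def i_handle(i_table: Dict[int, bytes]) -> bytes:
    """Digest of an I-table, hashed flat over sorted (I-number, I-hash) pairs."""
    return H(b"".join(inum.to_bytes(8, "big") + ihash
                      for inum, ihash in sorted(i_table.items())))


def dir_hash(directory: Dict[str, tuple]) -> bytes:
    """Hash of a directory block that maps name -> (owner, I-number)."""
    return H(repr(sorted(directory.items())).encode())


# The directory names the owner and I-number only; it holds no hashes.
directory = {"aaa": ("userA", 2)}
i_table = {2: H(b"i-node for old contents")}

dir_before, handle_before = dir_hash(directory), i_handle(i_table)

# A data block changes: the I-node hash changes, so the I-table entry changes.
i_table[2] = H(b"i-node for new contents")

dir_after, handle_after = dir_hash(directory), i_handle(i_table)

print(dir_before == dir_after)        # directory block is untouched
print(handle_before != handle_after)  # I-handle reflects the change
```

The directory only needs rewriting when names or I-numbers change, not on every write to file data.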
23 Hash tree (4): Groups (New Slide)
- Why have groups?
- Allow multiple users to write the same file
- Need digest guaranteed to represent file
- What does the group I-table do?
- Maps I-num to (user, I-num)
- Updates group table to point to last writer
- Why not point to the I-node directly?
- Performance: fewer changes to the group I-handle when the same user modifies the file
24 SUNDR FS
Superuser
SUNDR State
UserA
digest
How to fetch /share/aaa?
UserB
25 SUNDR data structures (II)
- Want server to order users' fetch/modify operations
- Goal: expose server's failure to order operations properly
- Sign version vector along with digest
- Version vectors will expose ordering failures
26 Version structure
Version vector
A
A - 1 B - 1 G - 1
B
A - 1 B - 2 G - 2
VSTA
VSTB
Digest A
Digest B
I-handle
Signature A
Signature B
- Each user has its own version structure (VST)
- Consistency server keeps latest VSTs of all users
- Clients fetch all other users' VSTs (VSL) from the server before each operation and cache them
- We order VSTA ≤ VSTB iff all the version numbers in VSTA are less than or equal to the corresponding ones in VSTB
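The ordering on version structures amounts to an element-wise comparison of version vectors. A minimal sketch, assuming vectors are plain dictionaries and that a missing entry counts as 0; the names `leq` and `totally_ordered` are hypothetical, and the forked pair is an illustrative example rather than one taken from the slides.

```python
from typing import Dict, List

VersionVector = Dict[str, int]


def leq(x: VersionVector, y: VersionVector) -> bool:
    """x <= y iff every version number in x is <= the matching one in y."""
    return all(x.get(k, 0) <= y.get(k, 0) for k in set(x) | set(y))


def totally_ordered(vectors: List[VersionVector]) -> bool:
    """True iff every pair of vectors is comparable under leq."""
    return all(leq(a, b) or leq(b, a)
               for i, a in enumerate(vectors)
               for b in vectors[i + 1:])


# The two vectors shown on the slide (A's VST and B's VST).
vst_a = {"A": 1, "B": 1, "G": 1}
vst_b = {"A": 1, "B": 2, "G": 2}
print(leq(vst_a, vst_b))

# Illustrative fork: A advanced without B seeing it, and B advanced anyway.
forked = [{"A": 1, "B": 1}, {"A": 0, "B": 2}]
print(totally_ordered(forked))
```

Unlike integers, version vectors are only partially ordered, which is what lets two incomparable VSTs serve as evidence of a server fork.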
27 Version structure (New Slide)
Version vector
A
A - 1 B - 1 G - 1
B
A - 1 B - 2 G - 2
VSTA
VSTB
Digest A
Digest B
I-handle
Signature A
Signature B
- Get lock
- Get VSL from server
- Update your VST
- Compute new handles of anything that changes (self and groups)
- Update version vector to match last updates according to VSL
- Increment version appropriately
- Perform consistency check
- Check old VST in VSL
- Ensure total global ordering for VSL
- Sign VST and send back to server
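The version-vector part of the steps above can be sketched as a single client-side function. Locking, digest computation, and signing are left out, and taking the element-wise maximum over the VSL is an assumption about how "match last updates" is realized; `update_vst` and its parameters are hypothetical names.

```python
from typing import Dict

VersionVector = Dict[str, int]


def leq(x: VersionVector, y: VersionVector) -> bool:
    """x <= y iff every version number in x is <= the matching one in y."""
    return all(x.get(k, 0) <= y.get(k, 0) for k in set(x) | set(y))


def update_vst(user: str, vsl: Dict[str, VersionVector],
               last: VersionVector) -> VersionVector:
    """Return the user's next version vector, or raise if the VSL is suspect.

    vsl  maps each user to the vector in that user's latest VST.
    last is the vector this client remembers signing most recently.
    """
    # Check that the client's own old VST appears in the VSL.
    if vsl.get(user) != last:
        raise ValueError("server returned a stale or altered VST for " + user)
    # Ensure a total ordering over all VSTs in the VSL.
    vectors = list(vsl.values())
    for i, a in enumerate(vectors):
        for b in vectors[i + 1:]:
            if not (leq(a, b) or leq(b, a)):
                raise ValueError("fork detected: VSL is not totally ordered")
    # Match the latest updates seen in the VSL, then increment own entry.
    names = set().union(*vectors)
    new = {k: max(v.get(k, 0) for v in vectors) for k in sorted(names)}
    new[user] = new.get(user, 0) + 1
    return new  # the real client would now sign this and send it back


vsl = {"A": {"A": 1, "B": 0}, "B": {"A": 0, "B": 0}}
print(update_vst("B", vsl, {"A": 0, "B": 0}))
```

The resulting vector dominates every vector in the VSL, so an honest run keeps all signed VSTs totally ordered.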
28 Updating VST: An example
Consistency Server
A-0 B-0
A
A
DigA
A: echo A was here >> /share/aaa
B
A-0 B-1
B
DigB
B: cat /share/aaa
29 Detecting attacks
A
Consistency Server
A-0 B-0
A
DigA
A-0 B-1
B
DigA
A-1 B-1
A
A: echo A was here >> /share/aaa
B
DigA
A-0 B-1
B
A-1 B-1
A
DigB
DigA
B: cat /share/aaa (stale!)
30 Supporting concurrent operations
- Two clients may issue operations concurrently
- Problem if two clients increment version numbers concurrently
- Cannot order VSTs!
- How does the second client know what vector to sign?
- If operations don't conflict, can just include the first user's forthcoming version number in VST
- But how do you know if concurrent operations conflict?
- Solution: pre-declare operations in signed updates
- Server returns latest VSTs and all pending updates, thereby ordering them before the current operation
- User computes new VST including pending updates
- User signs and commits new VST
- User signs and commits new VST
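The pre-declaration idea can be illustrated with a toy, honest consistency server. Everything here is an assumption made for the sketch: the class and method names, the shape of a pending update, and the omission of signatures and conflict checks.

```python
from typing import Dict, List, Tuple

VersionVector = Dict[str, int]
Pending = Tuple[str, int, str]  # (user, forthcoming version, declared operation)


class ConsistencyServer:
    """Toy server that keeps the latest vectors and a list of pending updates."""

    def __init__(self, users: List[str]) -> None:
        self.vsl: Dict[str, VersionVector] = {
            u: {v: 0 for v in users} for u in users}
        self.pending: List[Pending] = []

    def declare(self, user: str, op: str):
        """Record a pre-declared update; return the VSL and earlier pending updates."""
        earlier = list(self.pending)
        self.pending.append((user, self.vsl[user][user] + 1, op))
        return {u: dict(v) for u, v in self.vsl.items()}, earlier

    def commit(self, user: str, vector: VersionVector) -> None:
        self.vsl[user] = dict(vector)
        self.pending = [p for p in self.pending if p[0] != user]


def next_vector(user: str, vsl: Dict[str, VersionVector],
                earlier: List[Pending]) -> VersionVector:
    """New vector = latest versions in the VSL, plus pending ones, plus own increment."""
    new = {u: max(v.get(u, 0) for v in vsl.values()) for u in vsl}
    for other, version, _op in earlier:
        new[other] = max(new[other], version)  # forthcoming version of a pending update
    new[user] += 1
    return new


server = ConsistencyServer(["A", "B"])
vsl_a, earlier_a = server.declare("A", "modify /share/aaa")
vsl_b, earlier_b = server.declare("B", "fetch /share/bbb")  # before A commits
vec_a = next_vector("A", vsl_a, earlier_a)
vec_b = next_vector("B", vsl_b, earlier_b)
server.commit("B", vec_b)  # B does not have to wait for A
server.commit("A", vec_a)
print(vec_a, vec_b)
```

Because B's vector already accounts for A's forthcoming version, the two vectors stay ordered even though the operations overlapped.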
31 Concurrent update of VSTs
Consistency Server
A
B
A-0 B-0
A
DigA
A-0 B-1
B
DigB
A: echo A was here >> /share/aaa
B: cat /share/bbb
32 Talk Outline
- Motivation
- Design
- Straw-man FS
- SUNDR approach
- Implementation
- Evaluation
33 SUNDR Implementation
SUNDR server-side setup
Client Machine Domain
SUNDR client daemon
FS operations
User
block server
Kernel
xfs.ko redirector
VFS
consistency server
34 Block server implementation
bstore
Index System
Stripe 1
Stripe 2
Stripe 3
SCSI
Permanent Log
EIDE
Temporary Log
(cf. Venti)
35 Evaluation
- Running on FreeBSD 4.9
- Pentium IV 3 GHz, 3 GB RAM, LAN
- Two configurations
- SUNDR: writes updates to disk synchronously
- SUNDR/NVRAM: simulates the effects of NVRAM
- Block server benchmark
- STORE: 18.4 MB/s (peak), 11.9 MB/s (sustained)
- FETCH: 1.2 MB/s (random), 25.5 MB/s (sequential)
- Esign cryptographic overhead
- Sign: 155 µs
- Verify: 100 µs
36 LFS small file benchmark
[Figure: y-axis in seconds]
37 LFS multiple clients: CREATE
[Figure: seconds vs. number of users]
38 Emacs installation performance
[Figure: y-axis in seconds]
39 Emacs multiple clients: untar
[Figure: seconds vs. number of users]
40 CVS Experiments (New Slide)
Phase      SUNDR   SUNDR/NVRAM   NFSv3   SSH
Import     13.0    10.0          4.9     7.0
Checkout   13.5    11.5          11.6    18.2
Commit     38.9    32.8          15.7    11.5
Update     19.1    15.9          13.3    11.5
Total      84.5    70.2          45.5    48.2
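To put the CVS numbers in proportion, the totals can be recomputed and expressed relative to the NFS column. The figures are copied from the table; the dictionary layout and variable names are just for illustration.

```python
# Per-phase times (Import, Checkout, Commit, Update) copied from the table.
times = {
    "SUNDR":       [13.0, 13.5, 38.9, 19.1],
    "SUNDR/NVRAM": [10.0, 11.5, 32.8, 15.9],
    "NFSv3":       [4.9, 11.6, 15.7, 13.3],
    "SSH":         [7.0, 18.2, 11.5, 11.5],
}

# Column totals, rounded to one decimal like the table.
totals = {name: round(sum(phases), 1) for name, phases in times.items()}

# Total time of each system as a multiple of the NFS total.
relative = {name: round(totals[name] / totals["NFSv3"], 2) for name in times}

for name in times:
    print(f"{name:12s} total={totals[name]:5.1f}  vs NFS={relative[name]:.2f}x")
```

On these totals SUNDR takes roughly 1.9 times as long as NFS and SUNDR/NVRAM roughly 1.5 times, with the Commit phase contributing the largest gap.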
41 Conclusion
- SUNDR provides file system integrity with untrusted servers
- Users detect unauthorized operations immediately
- Users can detect consistency violations eventually
- How does this compare to using replication given untrusted (Byzantine) components?
- Yes, SUNDR is a practical file system
- Performance is close to NFS. Agree?
- Useful techniques to remember
- Merkle hash trees
- Version vectors
42 A working example (3)
Server
/share
A-1 B-1
A
A-1 B-1
B
A-2 B-1
A-2 B-2