Title: A Framework for the Analysis of Mix-Based Steganographic File Systems
1. A Framework for the Analysis of Mix-Based Steganographic File Systems
- Claudia Diaz, Carmela Troncoso, Bart Preneel
- K.U.Leuven / COSIC
- Cambridge, January 28, 2009
2. Motivation
- Problem: we want to keep stored information secure (confidential)
- Encryption protects against the unwanted disclosure of information, but it reveals the fact that hidden information exists!
- The user can be threatened / tortured / coerced into disclosing the decryption keys (coercion attack)
- We need to hide the very existence of files
- Property: plausible deniability
- Allows users to believably deny that any further encrypted data is located on the storage device
- If the password is not known, it is not possible to determine the existence of hidden files
3. Attacker model: one snapshot
- The attacker has never inspected the user's computer before coercion
- Ability to coerce the user at any point in time
- The user produces some keys
- The attacker inspects the user's computer
- Game: if the attacker is able to determine that the user has not provided all her keys, the attacker wins
4. Anderson, Needham & Shamir (1998)
- Use cover files such that a linear combination (XOR) of them reveals the hidden information (see the sketch below)
- Password: the subset of files to combine
- Hierarchy (various levels of security)
- The user can show some low security levels while hiding the high security levels
- It is not possible to know whether she has revealed the keys to all existing levels
- Drawbacks:
- File read operations have a high cost
- Needs a lot of cover files to be secure (it must be computationally infeasible to try all combinations)
- Assumes the adversary knows nothing about the plaintext
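A minimal sketch of the cover-file idea. The file size, number of covers, and the subset-derivation rule are illustrative assumptions of mine, not the paper's exact construction:

```python
import os
import hashlib

BLOCK = 64      # illustrative hidden-file size in bytes
N_COVERS = 8    # illustrative number of cover files

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def subset_from_password(password: str, n: int) -> list:
    """Derive a pseudo-random, non-empty subset of cover indices."""
    digest = hashlib.sha256(password.encode()).digest()
    subset = [i for i in range(n) if digest[i] & 1]
    return subset or [0]

# Covers start as random data; the store looks the same before and after hiding.
covers = [bytearray(os.urandom(BLOCK)) for _ in range(N_COVERS)]

def hide(secret: bytes, password: str) -> None:
    """Fix one cover so that the password's subset XORs to `secret`."""
    subset = subset_from_password(password, N_COVERS)
    acc = secret
    for i in subset[1:]:
        acc = xor(acc, covers[i])
    covers[subset[0]][:] = acc

def reveal(password: str) -> bytes:
    subset = subset_from_password(password, N_COVERS)
    acc = bytes(BLOCK)
    for i in subset:
        acc = xor(acc, covers[i])
    return acc

hide(b"secret".ljust(BLOCK, b"\0"), "level-1 password")
assert reveal("level-1 password").rstrip(b"\0") == b"secret"
```

Note how even a single read requires XORing every cover in the subset, which is the high read cost listed among the drawbacks.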
5. Anderson, Needham & Shamir (1998)
- Real files are hidden in encrypted form at pseudo-random locations amongst random data
- The location is derived from the name of the file and a password (see the sketch below)
- Collisions (birthday paradox) silently overwrite data
- Use only a small part of the storage capacity
- Replication:
- All copies of a block need to be overwritten to lose the data
- Linear hierarchy: higher security levels need more replication
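A sketch of the location derivation for this second construction. The hash-based derivation, store size, and replica count are illustrative assumptions, not the paper's exact scheme:

```python
import hashlib

STORE_BLOCKS = 1 << 16   # illustrative store size, in blocks
REPLICAS = 4             # illustrative replication factor

def replica_locations(filename: str, password: str) -> list:
    """Pseudo-random block addresses for each replica of a file."""
    locs = []
    for r in range(REPLICAS):
        h = hashlib.sha256(f"{filename}|{password}|{r}".encode()).digest()
        locs.append(int.from_bytes(h[:8], "big") % STORE_BLOCKS)
    return locs

# Unrelated files can land on the same block (birthday paradox) and silently
# overwrite each other; data is lost only once ALL replicas are overwritten,
# which is why higher security levels need more replication.
print(replica_locations("diary.txt", "level-2 password"))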
6. StegFS: McDonald & Kuhn (1999)
- Implemented as an extension of the Linux file system (Ext2fs)
- Hidden files are placed into unused blocks of a normal partition
- Normal files are overwritten with random data when deleted
- The attacker cannot distinguish a deleted normal file from an encrypted hidden file
- Block allocation table with one entry per block on the partition (see the sketch below)
- Used blocks: entry encrypted with the same key as the data block
- Unused blocks: random data
- The table helps locate data and detect corrupted blocks (lower security levels can still overwrite higher ones)
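A sketch of the block allocation table. The toy one-block pad stands in for a real cipher, and the entry layout is an assumption of mine:

```python
import os
import hashlib

ENTRY = 32  # illustrative table-entry size (bytes)

def pad(key: bytes, blk: int) -> bytes:
    # Toy one-block pad; a real system would use a proper cipher (e.g. AES).
    return hashlib.sha256(key + blk.to_bytes(8, "big")).digest()

def enc_entry(key: bytes, blk: int, meta: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(meta.ljust(ENTRY, b"\0"), pad(key, blk)))

level1_key = hashlib.sha256(b"level-1 password").digest()
hidden = {7: (level1_key, b"inode:diary"), 42: (level1_key, b"inode:notes")}

table = []
for blk in range(100):                   # 100-block partition, illustrative
    if blk in hidden:
        key, meta = hidden[blk]
        table.append(enc_entry(key, blk, meta))   # encrypted like its block
    else:
        table.append(os.urandom(ENTRY))  # unused: indistinguishable randomness
```

Without level1_key, the encrypted entries and the random entries look identical, which is exactly what makes the trial decryption at login (slide 13) both possible and deniable.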
7. Attacker model: continuous observation
- What if the attacker can observe accesses to the store?
- Remote or shared semi-trusted store
- Distributed P2P system
- Same game as before:
- Ability to coerce the user at any point in time
- The user produces keys to some security levels
- The attacker inspects the user's computer
- If the attacker is able to determine that the user has not provided all her keys, the attacker wins
- BUT now the adversary has prior information (which blocks have been accessed/modified)
- Previous systems do not provide plausible deniability against this adversary model
8. Previous work where this adversary is relevant: P2P
- Distributed (P2P) steganographic file systems
- Mnemosyne: Hand and Roscoe (2002)
- Mojitos: Giefer and Letchner (2002)
- They propose dummy traffic to hide access patterns (no details provided)
9. Previous work where this adversary is relevant: semi-trusted remote store
- Semi-trusted remote store: Zhou et al. (2004)
- Use of constant-rate cover traffic (dummy accesses) to disguise file accesses (see the sketch below)
- Every time a block location is accessed, it is overwritten with different data (re-encrypted with a different IV)
- Block updates no longer indicate file modifications
- Every time a file block is accessed, it is moved to another (empty) location
- Protects against simple access frequency analysis
- But the relocations are low-entropy
- Broken by Troncoso et al. (2007) with traffic analysis attacks that find correlations between sets of accesses
- Multi-block files are found prior to coercion if they are accessed twice
- One-block files are found if they are accessed a few times
10. How it is broken (simplified version)
[Figure: example trace in which blocks at locations 10, 20, 30, 40 are accessed and relocated to 100, 200, 300, 400; a later access to the same file touches exactly the new locations, linking the two sets of accesses. A sketch of the correlation step follows.]
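A sketch of that correlation step. The trace, the grouping into bursts, and the threshold are illustrative: locations written in one burst of activity that are later read back together almost certainly belong to one multi-block file, since dummy traffic that picks locations uniformly is vanishingly unlikely to reproduce the set.

```python
# Hypothetical observed trace: for each burst of store activity, the set of
# locations read and the set of (previously empty) locations freshly written.
bursts = [
    {"read": {10, 20, 30, 40}, "written": {100, 200, 300, 400}},  # file access
    {"read": {7}, "written": {7}},                                # dummy tick
    {"read": {100, 200, 300, 400}, "written": {5, 14, 88, 91}},   # same file
]

for i, early in enumerate(bursts):
    for late in bursts[i + 1:]:
        overlap = early["written"] & late["read"]
        if len(overlap) >= 3:      # illustrative significance threshold
            print("linked blocks of one file:", sorted(overlap))
```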
11. Can we provide plausible deniability against an adversary who monitors the store prior to coercion?
12. System model
- Files are stored in fixed-size blocks
- Blocks containing (encrypted) file data are indistinguishable from empty blocks containing random data
- Several levels of security (we assume they are hierarchical)
- The user discloses the keys to some of these levels while keeping others hidden
- Data persistence: erasure codes for redundancy (with an impact on plausible deniability)
- Traffic analysis resistance:
- Constant-rate dummy traffic
- High-entropy block relocation
- The agent processes user file requests and generates (uniform) dummy traffic (see the sketch below)
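A sketch of the agent's outer loop under this model. The queue discipline and timing are my assumptions; the access cycle itself is sketched under slide 15:

```python
import random
from collections import deque

STORE = 1024            # illustrative store size, in blocks
pending = deque()       # user file requests, broken into per-block requests

def access_cycle(location: int) -> None:
    """Read-remix-write cycle on one location; see the slide 15 sketch."""
    pass

def tick() -> None:
    # Exactly one access per tick, so the access rate itself leaks nothing;
    # dummy targets are drawn uniformly over the whole store.
    if pending:
        access_cycle(pending.popleft())
    else:
        access_cycle(random.randrange(STORE))
```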
13. User Login
- The user logs in with security level s by providing key uk_s
- The agent trial-decrypts every entry in the table (see the sketch below)
- Files in security levels s or lower can be found in the table
- Files in higher security levels are indistinguishable from random (empty) entries
- The agent starts making block accesses (either dummy accesses or accesses to retrieve files requested by the user)
- For each block, the agent performs an access cycle
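A sketch of login-time trial decryption. The toy pad and the plaintext validity tag are my assumptions for how the agent recognizes a successfully decrypted entry:

```python
import os
import hashlib

ENTRY, TAG = 32, b"ENT!"   # illustrative entry size and validity tag

def pad(key: bytes, blk: int) -> bytes:
    return hashlib.sha256(key + blk.to_bytes(8, "big")).digest()

def xor32(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

uk = {s: hashlib.sha256(f"password-{s}".encode()).digest() for s in (1, 2)}

table = {  # toy table: one level-1 entry, one level-2 entry, one empty entry
    3: xor32((TAG + b"inode:diary").ljust(ENTRY, b"\0"), pad(uk[1], 3)),
    9: xor32((TAG + b"inode:plans").ljust(ENTRY, b"\0"), pad(uk[2], 9)),
    14: os.urandom(ENTRY),
}

def login(level: int) -> dict:
    """Trial-decrypt every entry with the keys of levels <= `level`."""
    found = {}
    for blk, entry in table.items():
        for s in range(1, level + 1):
            plain = xor32(entry, pad(uk[s], blk))
            if plain.startswith(TAG):        # recognizably valid entry
                found[blk] = plain[len(TAG):].rstrip(b"\0")
    return found

print(login(1))  # only the level-1 file; level-2 and empty entries look random
```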
14. Block encryption
- Block containing a file in security level s:
- Encrypted using the user key uk_s and a (one-time) block key bk_i; with these keys it decrypts to the file data (a sketch follows)
- Empty block, or block containing a file in a security level higher than s:
- Indistinguishable from random data
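A sketch of the two cases. The toy pad stands in for a real cipher, and placing bk_i inside the table entry is an assumption of mine, chosen to be consistent with slides 6 and 13:

```python
import os
import hashlib

def pad(key: bytes, n: int) -> bytes:
    out, i = b"", 0
    while len(out) < n:   # toy keystream; a real system would use a cipher
        out += hashlib.sha256(key + i.to_bytes(4, "big")).digest()
        i += 1
    return out[:n]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

uk_s = hashlib.sha256(b"level-s password").digest()
bk_i = os.urandom(32)                    # one-time block key

data = b"file contents".ljust(64, b"\0")
block = xor(data, pad(bk_i, 64))         # block encrypted under bk_i
entry = xor(bk_i, pad(uk_s, 32))         # bk_i kept in the table under uk_s
empty = os.urandom(64)                   # empty or higher-level block

# With uk_s: recover bk_i from the table entry, then the file data.
assert xor(block, pad(xor(entry, pad(uk_s, 32)), 64)) == data
# Without uk_s, `block` and `empty` are both uniform random bytes.
```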
15. Access cycle
[Diagram: the access cycle between the agent's pool, the block allocation table, and the store]
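The slides leave the cycle to a diagram, but given the mix analogy in the title (and the pool mentioned on slide 20), a minimal sketch might look like this; the pool size and the uniform output choice are my assumptions:

```python
import os
import random

STORE, POOL = 1024, 8   # illustrative sizes
store = [os.urandom(16) for _ in range(STORE)]
pool = [os.urandom(16) for _ in range(POOL)]   # blocks held by the agent

def access_cycle(loc: int) -> None:
    """Read `loc` into the pool; write a random pool block back out."""
    incoming = store[loc]            # in reality: decrypted, and used if it
    j = random.randrange(POOL)       # is a requested file block
    store[loc] = pool[j]             # in reality: re-encrypted before writing
    pool[j] = incoming
```

Like a mix, the cycle makes the block written back to a location independent of the block just read from it, providing the high-entropy relocation the system model calls for.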
16. Attack methodology
- The attacker profiles the system to extract:
- Typical access sequences when the user is idle (dummy traffic)
- Typical access sequences when the user is accessing a file
- The attacker monitors accesses and looks for sequences that look like file accesses
- The attacker coerces the user when a sequence indicates a possible file access (worst-case scenario)
- The attacker obtains some user keys and inspects the computer
- The attacker combines the evidence obtained before and after coercion to try to determine whether there are more user keys the user has not provided (one way to formalize this follows below)
- If the probability of undisclosed keys is high, deniability is low, and vice versa.
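One natural way to formalize that final evidence-combination step (my notation, not the slides'): with $E_{\text{pre}}$ the access trace observed before coercion and $E_{\text{post}}$ the inspection of the computer after it,

\[ \Pr[\text{undisclosed keys} \mid E_{\text{pre}}, E_{\text{post}}] \;\propto\; \Pr[E_{\text{pre}}, E_{\text{post}} \mid \text{undisclosed keys}] \cdot \Pr[\text{undisclosed keys}], \]

so the two bodies of evidence reinforce each other through the joint likelihood term.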
17. Extracting information from the sequence of accesses to the store I
- The attacker profiles the system to extract typical access sequences when the user is accessing a file
[Figure: example sequence of block locations accessed by MixSFS, with the blocks of the requested file (marked x) interleaved among the other accesses]
18. Extracting information from the sequence of accesses to the store II
- The attacker profiles the system to extract typical access sequences when the user is idle (dummy traffic)
- This establishes a baseline for dummy traffic
- Analyze the accesses to the store and find strong correlations (unlikely to be generated by dummy traffic); a sketch follows below
- For big files, the area that goes over the baseline is much bigger than for dummy traffic (i.e., they are distinguishable)
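A sketch of the baseline test. The per-access statistic, window length, and threshold are illustrative:

```python
def excess_area(scores, baseline, window):
    """Per-window sum of the statistic's mass above the dummy baseline."""
    return [
        (start, sum(max(0.0, s - baseline) for s in scores[start:start + window]))
        for start in range(len(scores) - window + 1)
    ]

# `scores` would come from the profiling phase (e.g. correlation statistics
# per observed access). A large file keeps the statistic above the baseline
# for many consecutive accesses, so its excess area dwarfs anything that
# dummy traffic produces.
trace = [0.1, 0.2, 0.1, 0.9, 1.1, 1.0, 0.8, 0.2, 0.1, 0.1]
hits = [w for w in excess_area(trace, baseline=0.3, window=4) if w[1] > 1.5]
print(hits)   # windows flagged as probable file accesses
```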
19. Security metrics: unobservability
- Prior to coercion:
- We define unobservability (U) as the probability of a file operation being undetectable by the adversary, i.e., the probability that the sequence of store accesses generated by a file operation is considered dummy traffic by the adversary
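One way to write the definition down (my notation, not from the slides): letting $o$ be the access sequence generated by a file operation and $\mathcal{A}$ the adversary's classifier,

\[ U = \Pr[\mathcal{A}(o) = \mathsf{dummy} \mid o \text{ generated by a file operation}]. \]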
20. Security metrics: deniability
- After coercion:
- The percentage of empty blocks in the pool is compared to the percentage in the whole store
- Worst-case scenario: coercion occurs immediately after a hidden file access, when there is a large number of empty blocks in the pool
- We define deniability (D) as the probability that the evidence collected by the adversary (before and after coercion) has been generated by dummy traffic (i.e., there is no evidence of hidden files).
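In the same notation (again mine): with $E$ the combined evidence gathered before and after coercion,

\[ D = \Pr[E \text{ generated by dummy traffic} \mid E], \]

so $D$ close to 1 means the transcript gives the adversary no reason to believe that undisclosed levels exist.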
21. Conclusions and open questions
- Conclusions:
- It is hard to protect against traffic analysis, even using constant-rate dummy traffic
- It is hard to conceal file accesses with dummy traffic that selects locations uniformly at random
- When files occupy more blocks, accesses to them are harder to conceal
- Open questions:
- More sophisticated pattern recognition algorithms may extract more information from the sequence of accesses
- Design of smarter traffic analysis strategies
- Can such a system be implemented in practice?