Data Collection With Self-Enforcing Privacy

1
Data Collection With Self-Enforcing Privacy
  • Philippe Golle, PARC
  • Frank McSherry, Microsoft Research
  • Ilya Mironov, Microsoft Research

2
Roadmap
  • Problem
  • Solutions
  • Scheme 1: no data disclosure
  • Scheme 2: randomized response
  • Scheme 3: accurate data, interactive process
  • Future research directions

3
A pollster conducts a survey
7
Better perception of privacy means better data quality: accuracy and participation
8
From the horse's mouth
  • Survey of the Census Bureau's field staff
  • 18% believe that ACCIDENTAL release of
    confidential data may occur within the next 5-10
    years
  • 19% believe that MALICIOUS release of
    confidential data may occur within the next 5-10
    years
  • Source: T. Mayer, Interviewer attitudes about
    privacy and confidentiality, 2001

9
Solutions?
  • Encryption (TLS): stops an eavesdropper, not the pollster
  • Solemn pledge (aka privacy policy): why should we believe it?
  • Privacy-preserving data mining and disclosure: assumes an
    honest pollster
  • Randomized response (aka lying): may hurt utility, only
    limited privacy

10
Threat Model
  • Pollster may be corrupt!
  • Privacy goal: deter a corrupt pollster from releasing any
    sensitive information submitted by individual
    respondents

11
Solution: self-enforcing privacy
  • Basic idea: punish the pollster if it leaks
    sensitive information
  • A mechanism for submitting data to an
    untrustworthy pollster such that:
  • PRIVACY FOR RESPONDENTS: leakage of sensitive data can be
    caught and publicly verified
  • SECURITY FOR THE POLLSTER: if sensitive data is not leaked,
    the probability of wrongly indicting the pollster is negligible

12
Self-enforcing privacy solutions?
  • Auditors to check the pollster's compliance with
    its privacy policy
  • But audits are expensive and incomplete
  • Audits do not help with post-mortem or forensic
    evidence
  • Tainted data: users submit data they can easily recognize
  • e.g., use a unique e-mail address and monitor it
  • But cannot prove misbehavior to a third party
  • Our approach: a kind of publicly verifiable tainted data
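
The tainted-data idea from this slide (a unique, recognizable e-mail address per recipient) could be sketched as follows. This is only an illustration and not part of the original slides: the function names `tagged_address` and `identify_leaker`, the plus-addressing format, and the example domain are all assumptions.

```python
import hashlib
import hmac

def tagged_address(secret, recipient, user="alice", domain="example.org"):
    """Derive a unique e-mail address to hand to one recipient (hypothetical format)."""
    # The tag is a keyed hash of the recipient's name, so only the holder
    # of `secret` can regenerate or recognize it.
    tag = hmac.new(secret, recipient.encode("utf-8"), hashlib.sha256).hexdigest()[:12]
    return f"{user}+{tag}@{domain}"

def identify_leaker(secret, leaked_address, recipients):
    """Return the recipient whose tagged address equals the leaked one, else None."""
    for recipient in recipients:
        if tagged_address(secret, recipient) == leaked_address:
            return recipient
    return None
```

A match convinces the respondent, but a third party cannot rule out that the respondent published the address herself, which is the limitation noted on the slide.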

13
(Diagram: bond, pollster, bounty-hunter, respondents)
14
(Diagram: bond, bait, pollster, bounty-hunter, respondents)
15
Homomorphic encryption
  • Public-key encryption
  • E(M), E(N) → E(M + N)
  • E(M), a → E(aM)
  • ElGamal encryption of g^M


E(M) → E(M)
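
The two homomorphic operations listed on this slide can be illustrated with a toy version of ElGamal applied to g^M. This is a sketch under stated assumptions, not the presenters' implementation: the group parameters `P`, `Q`, `G` are tiny values chosen for readability, the function names are hypothetical, and decryption recovers M by brute force, which only works for small messages.

```python
import random

# Toy parameters (assumption): P = 2*Q + 1 with P and Q prime.
# Far too small for real use.
P = 2039
Q = 1019
G = 4  # 4 = 2^2 generates the subgroup of order Q modulo P

def keygen(rng):
    x = rng.randrange(1, Q)            # secret key
    return x, pow(G, x, P)             # (secret, public)

def encrypt(h, m, rng):
    r = rng.randrange(1, Q)            # fresh randomness per ciphertext
    return pow(G, r, P), (pow(G, m, P) * pow(h, r, P)) % P

def add(c1, c2):
    """E(M), E(N) -> E(M + N): multiply ciphertexts componentwise."""
    return (c1[0] * c2[0]) % P, (c1[1] * c2[1]) % P

def scale(c, a):
    """E(M), a -> E(aM): raise both components to the power a."""
    return pow(c[0], a, P), pow(c[1], a, P)

def decrypt(x, c):
    # c[0] has order dividing Q, so c[0]^(Q - x) equals c[0]^(-x).
    g_m = (c[1] * pow(c[0], Q - x, P)) % P
    for m in range(Q):                 # brute-force discrete log of g^M
        if pow(G, m, P) == g_m:
            return m
    raise ValueError("message not found")
```

For example, with `rng = random.Random(0)` and `x, h = keygen(rng)`, decrypting `add(encrypt(h, 3, rng), encrypt(h, 4, rng))` yields 7, and decrypting `scale(encrypt(h, 5, rng), 6)` yields 30.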
16
Scheme 1: Self-enforcing privacy
(Diagram: bit strings)
17
Scheme 1: Self-enforcing privacy
(Diagram: Alice, Bob, Charlie, David with bits 0 1 1 1)
18
Scheme 1: Privacy for respondents
(Diagram: Alice, Bob, Charlie, David with bits 0 1 1 1)
19
Theorem
  • If
  • k secret bits
  • pollster adds ½ − ε noise
  • α-fraction are baits
  • Then
  • with more than k/(αε²) leaked bits, the bond can
    be claimed

20
Example
  • If
  • 160 secret bits
  • pollster adds 10% noise
  • 10%-fraction are baits
  • Then
  • with more than 8,000 leaked bits, the bond can be
    claimed

21
Security for the pollster
(Diagram: Alice, Bob, Charlie, David with bits 0 1 1 1)
22
1600 Pennsylvania Ave, DC
23
Scheme 1
  • OK, except NO meaningful release of data

24
Randomized response [Warner '65]
  • A method for getting honest responses to
    sensitive questions
  • Assume the respondent must answer a binary
    (Yes/No) question
  • Example: Did you cut the cherry tree?
  • The respondent flips a biased coin
  • Answers truthfully with probability p > 0.5
  • George Washington: Yes
  • Lies with probability 1 - p
  • George Washington: No
  • The respondent does not reveal the outcome of the
    coin flip
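
The coin-flip procedure above can be simulated in a few lines. This is a minimal sketch, not taken from the slides: the function names are hypothetical, and the estimator is the standard unbiased correction for randomized response, which the slide itself does not show. It relies on the fact that the expected fraction of reported "Yes" answers is π(2p − 1) + (1 − p), where π is the true fraction.

```python
import random

def randomized_response(truth, p, rng):
    """Report the true answer with probability p, the opposite answer otherwise."""
    return truth if rng.random() < p else not truth

def estimate_yes_fraction(answers, p):
    """Recover the true 'Yes' fraction from noisy answers by inverting
    observed = pi * (2p - 1) + (1 - p)."""
    observed = sum(answers) / len(answers)
    return (observed - (1 - p)) / (2 * p - 1)
```

Individual answers stay deniable because nobody sees the coin, yet the aggregate can still be estimated; the closer p is to 0.5, the noisier that estimate becomes.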

25
Randomized response
(Diagram: Alice, Bob, Charlie, David; bit rows 0 0 0 1 and 0 0 1 1)
26
Differential privacy
  • Differential privacy: a privacy definition that
  • guarantees uncertainty about any one record
  • permits disclosure of aggregate information

Details: Cynthia Dwork, Differential Privacy,
ICALP 2006
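
To make "uncertainty about any one record" concrete, the sketch below checks the guarantee for a single randomized-response bit, as described on the Warner slide. It is an illustration under assumptions: the function names are hypothetical, and it uses the standard fact that a bit reported truthfully with probability p is ε-differentially private with ε = ln(p / (1 − p)).

```python
import math

def report_probability(report, truth, p):
    """Probability of seeing `report` when the true bit is `truth`."""
    return p if report == truth else 1.0 - p

def rr_epsilon(p):
    """Privacy parameter of one randomized-response bit."""
    if not 0.5 < p < 1.0:
        raise ValueError("p must lie strictly between 0.5 and 1")
    return math.log(p / (1.0 - p))

def worst_case_ratio(p):
    """Largest factor by which changing the true bit changes any report's probability."""
    return max(
        report_probability(r, t, p) / report_probability(r, 1 - t, p)
        for r in (0, 1)
        for t in (0, 1)
    )
```

With p = 0.75, any report is at most 3 times more likely under one true bit than under the other, so ε = ln 3.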
27
Randomized response
(Diagram: Alice, Bob, Charlie, David; bit rows 0 0 0 1 and 0 0 1 1; labels: f(ε)-differential privacy, ε-differential privacy)
28
Scheme 2 (randomized response)
  • release of aggregate data
  • imprecise responses

29
Precise answers
(Diagram: Alice, Bob, Charlie, David with bits 0 1 1 1; labels: f(r), dense (RSA))
30
Indictment process
Your honor, Exhibit 1: r1, f(r1) decrypts to 0.
Exhibit 100: r100, f(r100) decrypts to 1.
No! f(r1) decrypts to 1!
Guilty as charged
Not guilty
No contest
31
Analysis
  • Privacy of the respondents
  • Security for the pollster
  • differential privacy definition

32
Scheme 3
  • release of aggregate data
  • accurate responses
  • interactive indictment process

33
Three schemes
34
Research directions
  • Achieve all three properties:
  • release of aggregate data
  • accurate responses
  • non-interactive indictment process
  • Better schemes: assume some coordination?
  • Tighter analysis of disclosure policies: variants of or
    alternatives to differential privacy?
  • Rational adversary: game theory connection