Title: Efficient Privacy Preserving Protocols for Visual Computation
1Efficient Privacy Preserving Protocols for Visual Computation
- Maneesh Upmanyu
- Advisors: C. V. Jawahar, Anoop M. Namboodiri, Kannan Srinathan
- Center for Visual Information Technology
- Center for Security, Theory and Algorithmic Research
- IIIT Hyderabad
2Security and Privacy of Visual Data
Broad Objective
- Development of secure computational algorithms in computer vision and related areas.
- To develop highly-secure solutions
- To develop computationally efficient solutions
- To develop solutions to problems with immediate impact
Project Web-Page: http://cvit.iiit.ac.in/projects/SecureVision
3Research Directions
- Private Content Based Image Retrieval (PCBIR)
- Blind Authentication: A Secure Crypto-Biometric Verification Protocol
- Efficient Privacy Preserving Video Surveillance
Publication: Maneesh Upmanyu, Anoop M. Namboodiri, K. Srinathan and C.V. Jawahar, "Efficient Privacy Preserving Video Surveillance", Proceedings of the 12th International Conference on Computer Vision (ICCV 2009)
Publication: Maneesh Upmanyu, Anoop M. Namboodiri, K. Srinathan and C.V. Jawahar, "Blind Authentication - A Secure Crypto-Biometric Verification Protocol", appears in IEEE Transactions on Information Forensics and Security (IEEE-TIFS), June 2010
Publication: Shashank J, Kowshik P, Kannan Srinathan and C.V. Jawahar, "Private Content Based Image Retrieval", in Proceedings of Computer Vision and Pattern Recognition (CVPR 2008)
4Our Security Goal
- What is meant by Privacy?
  - Design protocols that limit information leakage, that is, anything learned in addition to the designated output.
- What is the Adversary Model?
  - Semi-honest vs. malicious adversary
- Analysis outline
  - Correctness
  - Security
  - Complexity
5Assumptions
- Reliable and secure communication channel
- Players are passively corrupt, that is, honest but curious.
- Players are computationally bounded.
- Players do not collude.
6Thesis Objective
- Traditional approaches use highly interactive protocols.
  - Limitation: massive datasets
  - Example: Blind Vision
- Paradigm shift
  - Compute directly in the encrypted domain.
  - Encrypt -> Communicate -> Compute -> Decrypt
- Domain specific encryption schemes.
  - PKC is data independent and generic.
  - Can the paradigm be generic yet efficient?
7Contribution of Thesis
- A method that provides provable security while allowing efficient computation for generic vision algorithms has remained elusive.
- We show that one can exploit certain properties inherent to visual data to break this seemingly impenetrable barrier.
8Dilemma of Privacy vs. Accuracy
9What is Blind Authentication?
- A biometric authentication protocol that does not reveal any
  - information about the biometric samples to the authenticating server.
  - information regarding the classifier, employed by the server, to the user or client.
10Biometric Authentication System
11Primary Concerns in a Biometric System
- Template Protection
- Non-Repudiable
- Network and Client-side Security
- Revocability
12Previous Work
"A template protection scheme with provable security and acceptable recognition performance has thus far remained elusive." (A.K. Jain, EURASIP 2008)
13Homomorphic Encryption
- An encryption scheme in which some algebraic operation, like addition or multiplication, can be performed directly on the ciphertext.
Let $x_1 = 20$ and $x_2 = 22$; the goal is to compute $x_1 + x_2 = 42$.
Use an encryption scheme, for example $E(x) = e^x$.
Server stores $E(x_1) = e^{20}$ and $E(x_2) = e^{22}$.
Compute using encrypted data: $y = E(x_1) \cdot E(x_2) = e^{20} \cdot e^{22} = e^{42}$.
Decrypt: $z = D(y) = \ln(y) = \ln(e^{42}) = 42$.
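The toy example on this slide can be run as a minimal Python sketch. It only illustrates the homomorphic property (a product of ciphertexts decrypts to a sum of plaintexts); $E(x) = e^x$ has no key and is not a secure scheme. The function names `encrypt` and `decrypt` are illustrative, not taken from the thesis.

```python
import math

def encrypt(x: float) -> float:
    """Toy 'encryption' from the slide: E(x) = e^x. Not secure."""
    return math.exp(x)

def decrypt(y: float) -> float:
    """Inverse map from the slide: D(y) = ln(y)."""
    return math.log(y)

# The server stores only the encrypted values.
c1 = encrypt(20)
c2 = encrypt(22)

# Multiplying ciphertexts corresponds to adding plaintexts:
# e^20 * e^22 = e^(20 + 22).
c_sum = c1 * c2

# Decryption recovers the sum, up to floating-point error.
recovered = decrypt(c_sum)
print(round(recovered))
```

Floating-point arithmetic is used here only for brevity; practical homomorphic schemes work over integers modulo a large number.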
14User Enrollment
15Authentication using a Linear Kernel
16Extensions to Kernels & Neural Networks
- A kernel based classifier uses a discriminating function like
- Similarly, in a neural network the basic units are, for example, the perceptron or sigmoid
- Model the above functions as arithmetic circuits consisting of addition and multiplication gates over a finite domain.
- Consider two encryptions E and E
17Implementation and Analysis
- Experiments designed to evaluate the efficiency and accuracy of the proposed approach.
- For evaluation, an SVM based verifier on a client-server architecture was implemented.
- Accuracy: as no assumptions are made, accuracy remains the same.
  - Verified this on various public domain (UCI, Statlog) datasets.
18- Case study shows that matching using fixed length
feature representation is comparable to variable
length methods such as dynamic warping.
19Security, Privacy and Trust
- Server Security
- Template database security
- Hacker sitting in server
- Client Security
- Hacker has the user's key or biometric
- Passive attacks at client end
- Network Security
- Network is susceptible to snooping attacks
20Advantages of Blind Authentication
- Fast and provably secure authentication without trading off accuracy.
- Supports generic classifiers such as Neural Networks and SVMs.
- Useful with a wide variety of fixed-length biometric traits.
- Ideal for applications such as biometric ATMs and login from public terminals.
21Proposed Surveillance System
How do we carry out surveillance on randomized images?
22Motivation
Can we do surveillance without seeing the original video?
23Paradigm Shift
We use the paradigm of secret sharing to achieve
private and efficient surveillance.
24Protocol in a nutshell
Propose a Cloud-Computing based solution using k > 2 non-colluding servers
25Secret Sharing
- A method of distributing a secret among a group of servers, such that
  - Each server on its own has no meaningful information
  - Secret is reconstructed only when all shares combine together
- Existing methods are highly inefficient
- Asmuth-Bloom overcomes this limitation by working in a Residue Number System (RNS).
26Example to do Addition in RNS
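RNS: $m_1 = 37$, $m_2 = 49$, $M = m_1 \times m_2 = 1813$
$X = 973 \rightarrow (x_1, x_2) = (11, 42)$ under $(m_1, m_2)$
$Y = 678 \rightarrow (y_1, y_2) = (12, 41)$ under $(m_1, m_2)$
Shatter: $f(x) = (x \cdot S + \eta) \bmod m_i$
Merge: $m(x_i, m_i) = \mathrm{CRT}(x_i, m_i) / S$
$Z = X + Y = 1651$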
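The addition example can be checked with a short Python sketch (Python 3.8+ for `math.prod` and modular inverse via `pow`). It uses the moduli and operands from the slide and leaves out the scale factor and randomness, so it shows only that residue-wise addition followed by CRT yields the ordinary sum. The helper names `to_rns`, `add_rns` and `crt` are illustrative.

```python
from math import prod

MODULI = (37, 49)  # pairwise coprime moduli from the slide, M = 1813

def to_rns(x, moduli=MODULI):
    """Represent x by its residues modulo each m_i."""
    return tuple(x % m for m in moduli)

def add_rns(a, b, moduli=MODULI):
    """Add two RNS values residue by residue; no carries cross moduli."""
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, moduli))

def crt(residues, moduli=MODULI):
    """Recover the integer in [0, M) from its residues."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # pow(.., -1, m) is the modular inverse
    return total % M

x = to_rns(973)    # (11, 42)
y = to_rns(678)    # (12, 41)
z = add_rns(x, y)  # (23, 34)
print(crt(z))      # 1651
```

Each residue channel could be handled by a different server, since the additions never mix residues of different moduli.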
27Data Properties
- While general purpose secure computation appears inherently complex and oftentimes impractical, we show that certain properties of the data can be used to ensure efficiency while ensuring privacy.
- The following properties are of interest to us:
  - Limited and Fixed Range
  - Scale Invariant
  - Approximate Nature
  - Non-General Operands
28Characteristics of the System
29Implementation Challenges
- Representation of negative numbers: use an implicit sign representation.
  - Use (0, M/2) as positive and the rest as negative.
  - Sign conversion is carried out using additive inversion of Z.
- Overflow and underflow: operations are valid and correct as long as the range of the data is (-M/2, M/2).
- Integer division and thresholding: the RNS domain is finite and hence not all divisions are defined.
  - Dividing integer A by B is defined as $A/B = (a_i \cdot b_i^{-1}) \bmod m_i$
- Defining equivalent operations: for every $f(x)$, we need to define an equivalent operation on the shares, $f'(x_i)$, such that merging $f'(x_i)$ would give $f(x)$.
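A minimal Python sketch of two of these points, under stated assumptions: the moduli (37, 49) are borrowed from the earlier addition example, the helper names are hypothetical, and the scale factor and randomization of the actual protocol are left out. It shows the implicit sign convention (merged values above M/2 are read as negative) and residue-wise division, which is only meaningful when B divides A exactly and every residue of B is invertible.

```python
from math import prod

MODULI = (37, 49)   # assumed pairwise coprime moduli
M = prod(MODULI)    # 1813

def to_rns(x):
    """Residues of x; Python's % maps negative x into [0, m)."""
    return tuple(x % m for m in MODULI)

def crt(residues):
    """Recover the integer in [0, M) from its residues."""
    total = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)
    return total % M

def decode_signed(residues):
    """Implicit sign: values above M/2 are interpreted as negative."""
    v = crt(residues)
    return v - M if v > M // 2 else v

def negate(residues):
    """Sign conversion by additive inversion in each residue channel."""
    return tuple((-r) % m for r, m in zip(residues, MODULI))

def divide_exact(a, b):
    """Residue-wise A/B = a_i * b_i^{-1} mod m_i.

    pow raises ValueError when some b_i has no inverse modulo m_i.
    """
    return tuple((ai * pow(bi, -1, m)) % m for ai, bi, m in zip(a, b, MODULI))

print(decode_signed(to_rns(-5)))                          # -5
print(decode_signed(negate(to_rns(5))))                   # -5
print(decode_signed(divide_exact(to_rns(84), to_rns(12))))  # 7
```

If B does not divide A, the residue-wise quotient still decodes to some integer, but not to the rounded quotient, which is why the slide notes that not all divisions are defined.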
30Experimental Results
32Properties of the Protocol
- Servers are untrusted and the network may be insecure.
- Near loss-less data encoding (PSNR = 51).
- No compromise in accuracy.
- Inexpensive capture device, and a unidirectional data flow.
- Negligible overheads to make private computation practical.
- Secure as long as servers do not collude.
Contribution
Our approach shows that privacy and efficiency can co-exist in the domain of visual data.
33K-Means Clustering
- Data clustering is one of the most important techniques for discovery of patterns in a dataset.
- K-Means clustering is a simple and extensively used technique that automatically partitions a dataset into k clusters.
- The technique becomes more effective with larger amounts of data, such as when multiple businesses share their data to carry out the clustering together.
- However, the data may contain sensitive information.
34Secure K-Means Algorithms
- Trusted Third Party (TTP) based solutions
  - Dwork et al. (Crypto 2004)
  - Very efficient
  - No TTP in the real world, possible security compromise
- Data perturbation techniques
  - Stanley et al. (BSD 03), Kargupta et al. (ICDM 03)
  - Negligible communication overhead
  - Partial security, non-invertible transformations used
- Those employing Multiparty Computations
  - Vaidya et al. (KDD 03), Jha et al. (ESORICS 05), Wright et al. (KDD 05), Inan et al. (DKE 07)
  - Complete privacy
  - Highly inefficient
35Our Distributed Solution
- We simulate a TTP on a set of untrusted servers over an insecure network.
- Secret Sharing is a method of distributing a secret among a group of servers.
36Proposed Protocol
- Protocol consists of two phases
  - Phase One: Secure Data Distribution
  - Phase Two: Secure K-Means
- Phase One: Secure storage of data at servers
  - Selection of an optimal RNS.
  - Shattering of the user's private data.
  - Privacy: server stores only the shattered shares of data.
- Phase Two: Secure K-Means
  - Initialization
  - Lloyd Step
  - Knowledge Revelation
37Phase Two Secure K-Means
- Clusters are initialized using the shattered shares
- Lloyd Step involves iteratively computing the closest centers in a Euclidean space
  - Secure protocols for division and comparison
- Securely evaluate the termination criteria
- Send the shattered cluster centers to the users, who use the Merge function on them
- Privacy: no information is leaked to the servers
  - Data for operations such as division is secured using randomization
  - Randomization is done so as to secure against possible GCD and factorization based attacks
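As a rough illustration of why the algebraic part of the Lloyd step can run on shares, the Python sketch below has each server compute a squared Euclidean distance on its own residues, with the merged result equal to the plaintext distance. This is not the thesis protocol: the moduli are hypothetical, the scale factor and randomization are left out, and the comparison that picks the closest center (which needs the secure protocol mentioned above) is not shown.

```python
from math import prod

MODULI = (251, 509, 1021)  # hypothetical pairwise coprime moduli
M = prod(MODULI)

def crt(residues):
    """Recover the integer in [0, M) from its residues."""
    total = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)
    return total % M

def sq_dist_share(point_share, center_share, m):
    """Squared distance computed by one server on residues modulo m only."""
    return sum((p - c) ** 2 for p, c in zip(point_share, center_share)) % m

point = (3, 4)
center = (10, 20)

# Server i holds only the residues of each coordinate modulo MODULI[i].
per_server = [
    sq_dist_share(
        tuple(v % m for v in point),
        tuple(v % m for v in center),
        m,
    )
    for m in MODULI
]

# Merging the per-server results gives (3-10)^2 + (4-20)^2.
print(crt(per_server))  # 305
```

The result is correct only while the true distance stays below M, which is where the limited, fixed range of the data matters.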
38Overview of the Protocol
(Figure: protocol overview, showing User 1 and User 2)
39Analysis
- Overheads calculated over the naïve TTP based protocol.
- Division and comparison operations introduce communication overhead.
  - Limited to one round per operation
- Traditional approaches use SMC for this.
  - Based on OT, a communication-intensive protocol.
  - $O(n^2)$ communication overhead to multiply two vectors (length n)
- Limited data expansion
  - E.g. 32-bit data shattered into 5 shares requires 54 bits, while traditional SS requires 160 bits.
40Algorithm Properties
- We have proposed a highly secure framework using the paradigm of secret sharing.
- Negligible overheads in simulating algebraic operations.
- Achieve efficiency by exploiting the data properties.
- Solution does not demand any trust, and the clustering is carried out directly on the encrypted data.
41Conclusion
Broad Objective
- Development of secure computational algorithms in computer vision and related areas.
- To develop highly-secure solutions
- To develop computationally efficient solutions
- To develop solutions to problems with immediate impact
- The traditional methods of ensuring privacy are expensive in both communication and computation.
- We show that domain specific knowledge can be incorporated to ensure efficiency while retaining privacy.
- Moreover, our methods do not trade off accuracy.
42Related Publications
Maneesh Upmanyu, Anoop M. Namboodiri, K. Srinathan and C.V. Jawahar
- Blind Authentication - A Secure Crypto-Biometric Verification Protocol. In IEEE Transactions on Information Forensics and Security (IEEE-TIFS), June 2010
- Efficient Biometric Verification in Encrypted Domain. In Proceedings of 3rd International Conference on Biometrics (ICB 2009)
- Efficient Privacy Preserving Video Surveillance. Proceedings of the 12th International Conference on Computer Vision (ICCV 2009)
- Efficient Privacy Preserving K-Means Clustering. Proceedings of the Pacific Asia Workshop on Intelligence and Security Informatics (PAISI 2010)
43Thank you for your attention
44RNS & CRT
- Residue Number System (RNS) represents an integer using a set of smaller integers.
  - RNS is defined by a set of k integer constants: $m_1, m_2, m_3, \ldots, m_k$
  - Secret A is represented by k smaller integers: $a_1, a_2, a_3, \ldots, a_k$, where $a_i = A \bmod m_i$
  - This representation is valid as long as $0 < A < M$, where M is the LCM of the $m_i$'s
- Chinese Remainder Theorem (CRT) is the method of recovering the integer value from a given set of smaller integers.
  - Define $M_i = M / m_i$
  - Compute $c_i = M_i \times (M_i^{-1} \bmod m_i)$
  - The above equation is always valid in our system, therefore a unique solution exists
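As a worked instance of the CRT quantities defined above, the derivation below reuses the numbers from the RNS addition example ($m_1 = 37$, $m_2 = 49$, and the residues $(11, 42)$ of 973). The final recombination step, $A = (\sum_i a_i c_i) \bmod M$, is the standard CRT formula and is assumed here, since the slide text does not state it explicitly.

```latex
\begin{align*}
M &= 37 \times 49 = 1813, \qquad M_1 = M / m_1 = 49, \qquad M_2 = M / m_2 = 37 \\
M_1^{-1} \bmod m_1 &= 49^{-1} \bmod 37 = 12^{-1} \bmod 37 = 34
  \qquad (12 \times 34 = 408 = 11 \times 37 + 1) \\
M_2^{-1} \bmod m_2 &= 37^{-1} \bmod 49 = 4
  \qquad (37 \times 4 = 148 = 3 \times 49 + 1) \\
c_1 &= 49 \times 34 = 1666, \qquad c_2 = 37 \times 4 = 148 \\
A &= (a_1 c_1 + a_2 c_2) \bmod M = (11 \times 1666 + 42 \times 148) \bmod 1813 \\
  &= (18326 + 6216) \bmod 1813 = 24542 \bmod 1813 = 973
\end{align*}
```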
45Shatter & Merge Functions
- Shatter function: compute and store the secret shares of the private data.
  - Where $x_i$ is the i-th secret share, and $\eta$ is a uniform randomness
- Merge function: reconstruct the secret.
  - Given the shares $x_i$ for different primes $P_i$, the secret is recovered using CRT
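A minimal Python sketch of the two functions, assuming the shatter map has the form $x_i = (x \cdot S + \eta) \bmod P_i$ and that merging is CRT followed by division by $S$. The primes, the scale factor and the range of $\eta$ are hypothetical choices made for illustration and are not taken from the thesis.

```python
import random
from math import prod

PRIMES = (1009, 1013, 1019)  # hypothetical primes P_i
SCALE = 1000                 # hypothetical scale factor S
M = prod(PRIMES)

def shatter(x, rng=random):
    """Split x into one share per prime: (x * S + eta) mod P_i.

    Assumption: eta is drawn from [0, S/2) so that the sum of two
    shattered values still merges exactly; x * S + eta must stay below M.
    """
    eta = rng.randrange(SCALE // 2)
    masked = x * SCALE + eta
    return tuple(masked % p for p in PRIMES)

def crt(shares):
    """Recover the integer in [0, M) from its residues."""
    total = 0
    for r, p in zip(shares, PRIMES):
        Mi = M // p
        total += r * Mi * pow(Mi, -1, p)
    return total % M

def merge(shares):
    """Reconstruct the secret: CRT, then integer division by the scale."""
    return crt(shares) // SCALE

a = shatter(973)
b = shatter(678)
# Servers add shares locally, one residue channel each.
s = tuple((ai + bi) % p for ai, bi, p in zip(a, b, PRIMES))

print(merge(a))  # 973
print(merge(s))  # 1651
```

Dividing by the scale discards the random term, which ties in with the approximate-nature and scale-invariance properties listed earlier: small perturbations of the scaled value do not change the merged result.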