Secure and Highly-Available Aggregation Queries via Set Sampling

1
Secure and Highly-Available Aggregation Queries
via Set Sampling

Haifeng Yu
National University of Singapore

2
Secure Aggregation Queries in Sensor Networks

Multi-hop sensor network with a trusted base station
Malicious (Byzantine) sensors are present
Goal: Count the # of sensors sensing smoke (i.e., satisfying a certain predicate)
Sum, Avg, and other aggregates are similar; see paper
Type-1 attack: Malicious sensors report fake readings
If the # of malicious sensors is small, damage is limited
Not the focus of our work

3
Secure Aggregation Queries in Sensor Networks

Type-2 attack: Malicious sensors (indirectly) corrupt the readings of other sensors; much larger damage
E.g., in tree-based aggregation
Focus of most research on secure aggregation; our focus too

[Figure: tree-based aggregation toward the base station, with one sensor marked malicious]
4
State-of-Art and Our Goal

Active area in recent years (e.g., Chan et al. '06, Frikken et al. '08, Roy et al. '06, Nath et al. '09)
All these approaches focus on detection (i.e., safety only)
Will detect if the result is corrupted
But will not produce a correct result when under attack

Our Goal
Detecting attacks → Tolerating attacks
Safety only → Safety + Liveness
System made harmless → System made useful
5
Our Approach to Tolerating Attacks

Previous approaches: Fix the security holes in tree-based aggregation
Dilemma in in-network processing
Our novel approach: Use sampling
With MACs on each sample, security comes almost automatically

6
Our Approach to Tolerating Attacks

Cannot modify the result
[Figure: the sampled sensor floods the sample result (with a MAC)]
Challenge with sampling: Potentially large overhead
7
Background: Estimate Count via Sampling

n sensors, b sensors sensing smoke (called black sensors)
Goal: Output an (ε, δ)-approximation b̂ of b, i.e., Pr[|b̂ − b| ≤ εb] ≥ 1 − δ
E.g., Sample 10 sensors and 5 are black ⇒ b̂ = 0.5n
Classic result: # of sensors needed to sample is [bound not preserved in transcript]
(Prohibitively) expensive for small b
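
A minimal Python sketch of the naïve sampling estimator on this slide. The function names, the list-of-booleans model of the network, and the fixed seed are illustrative assumptions, not part of the talk. A standard Chernoff-style argument suggests on the order of (n/b)·(1/ε²)·log(1/δ) samples for relative error ε, which is why small b is costly.

```python
import random

def count_from_hits(hits, num_samples, n):
    """Scale the sampled black fraction up to the whole network."""
    return n * hits / num_samples

def estimate_count(is_black, num_samples, rng):
    """Sample sensors uniformly with replacement and estimate b."""
    n = len(is_black)
    hits = sum(is_black[rng.randrange(n)] for _ in range(num_samples))
    return count_from_hits(hits, num_samples, n)

# Slide example: 5 of 10 sampled sensors are black, so the estimate is 0.5n.
slide_estimate = count_from_hits(5, 10, 1000)

# Hypothetical network: n = 1000 sensors, the first b = 500 are black.
sensors = [i < 500 for i in range(1000)]
b_hat = estimate_count(sensors, 2000, random.Random(0))
```

With b this large, a couple of thousand samples land close to the true count; with only a handful of black sensors, most samples would miss them entirely.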
8
Reduce the Overhead via Set Sampling

Challenges with small b
Need many samples to encounter black sensors
Set sampling: Sample a set of sensors together
Binary result will tell whether any sensor in the set is black (but not how many)
Efficient implementation in sensor networks: later
Should be easier to hit sets containing black sensors

How effective will this be? (How many sets do we need to sample to estimate count?)
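
To illustrate why sets are easier to hit than single sensors, here is a small Python sketch. The network size, the positions of the black sensors, the set size of 100, and the name `sample_set` are all assumptions made for the example.

```python
import random

def sample_set(is_black, members):
    """Binary set sample: True iff at least one member is black."""
    return any(is_black[s] for s in members)

# Hypothetical network: 1000 sensors, only sensors 7 and 42 are black.
is_black = [False] * 1000
is_black[7] = is_black[42] = True

rng = random.Random(1)
trials = 2000
# Sampling one sensor at a time rarely finds a black sensor ...
single_hits = sum(sample_set(is_black, [rng.randrange(1000)]) for _ in range(trials))
# ... while random sets of 100 sensors contain one far more often.
set_hits = sum(sample_set(is_black, rng.sample(range(1000), 100)) for _ in range(trials))
```

The answer is deliberately a single bit: the set sample reveals that some member is black, not how many.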
9
Our Results

Novel algorithm for estimating count using set sampling
Defines randomized and inter-related sets, and samples them adaptively
# of sets needed to sample: [bound not preserved in transcript]
Previously, without set sampling: [bound not preserved in transcript]

# of samples reduced from polynomial to polylogarithmic
(can be further reduced; see paper)
10
Our Results

Per-sensor msg complexity
Comparable to some detection-only protocols (Roy et al. '06)
Similar msg sizes
See paper for time complexity
See paper for other aggregates (sum, avg)
Set sampling + novel algorithms using set sampling → Enables secure aggregation queries despite adversarial interference

11
Outline of This Talk

Background, goal, and summary of results
Simple implementation of set sampling in sensor networks
Main technical results: Novel algorithm for estimating count via set sampling

12
Implementing Set Sampling: Non-Secure Version
Goal: O(1) per-sensor msg complexity for sampling a set

Example: sample the set {A, B, C, D}
Request: flooded from the base station, O(log n) bits
We use only O(n) (instead of O(2^n)) random sets → O(log n) bits to name a set
Reply: Single bit
Flooded back from all black sensors in the set, e.g., A and C
Each sensor only forwards the first message received
Base station sees binary answer
Multiple samples can be taken in one flooding
Our algorithm takes samples in O(log n) sequential stages → Only O(log n) rounds of flooding
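
The reply flooding can be sketched in Python as below. This is a simplified simulation under stated assumptions: the topology is a hypothetical line network, `flood_reply` is an illustrative name, and radio collisions and timing are ignored.

```python
from collections import deque

def flood_reply(adjacency, base, black_in_set):
    """Flood a one-bit reply outward from the black sensors in the sampled set.

    Each node forwards only the first reply it hears, so every node
    broadcasts at most once. Returns (answer_at_base, broadcasts_per_node).
    """
    sent = {node: 0 for node in adjacency}
    heard = set(black_in_set)
    queue = deque(black_in_set)
    while queue:
        node = queue.popleft()
        sent[node] += 1  # one local broadcast of the single bit
        for neighbor in adjacency[node]:
            if neighbor not in heard:
                heard.add(neighbor)
                queue.append(neighbor)
    return base in heard, sent

# Hypothetical line topology: base - A - B - C - D, with A and C black.
adjacency = {
    "base": ["A"],
    "A": ["base", "B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
}
answer, sent = flood_reply(adjacency, "base", ["A", "C"])
no_answer, idle = flood_reply(adjacency, "base", [])
```

Because later copies of the reply are dropped, the per-node broadcast count stays at one no matter how many set members are black.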

13
Implementing Set Sampling: Secure Design

Each set: Some distinct symmetric key K
Preload K onto all sensors in the set
Each sensor should only be in a small number of sets: O(log n) in our protocol
Request: ⟨name of K, nonce⟩
Reply: ⟨MAC_K(nonce)⟩
Only sensors holding K can generate it
DoS attacks possible
Can be avoided with improved design; see paper
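
A hedged Python sketch of the request/reply exchange, using the standard-library `hmac` module. HMAC-SHA256, the 16-byte nonce, the key name "K6", and the function names are assumptions chosen for the example; the talk does not say which MAC construction is used.

```python
import hashlib
import hmac
import os

def make_request(key_name):
    """Base station: name the key (i.e., the set) and attach a fresh nonce."""
    return key_name, os.urandom(16)

def make_reply(key, nonce):
    """Black sensor holding K: reply with MAC_K(nonce)."""
    return hmac.new(key, nonce, hashlib.sha256).digest()

def verify_reply(key, nonce, reply):
    """Base station: accept only a MAC that verifies under K."""
    expected = hmac.new(key, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(expected, reply)

key = os.urandom(16)        # hypothetical preloaded key K
wrong_key = os.urandom(16)  # a key held by a sensor outside the set
name, nonce = make_request("K6")
good = verify_reply(key, nonce, make_reply(key, nonce))
forged = verify_reply(key, nonce, make_reply(wrong_key, nonce))
replayed = verify_reply(key, os.urandom(16), make_reply(key, nonce))
```

The fresh nonce is what stops an old reply from being replayed for a new sample.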

14
Outline of This Talk

Background, goal, and summary of results
Implement set sampling in sensor networks
Main technical meat: Novel algorithm for estimating count via set sampling
For now, assume all sensors are honest
Security follows from the clean security guarantees of sampling, though some minor modifications are needed; see paper

15
Random Sets on the Sampling Tree

Basic approach
Construct (related) randomized sets of different sizes and adaptively sample them
Base station internally creates a sampling tree
A complete binary tree with 4n leaves
Each tree node ↔ a distinct symmetric key ↔ some set of sensors
Sampling tree is an internal data structure, not the network topology

16
[Figure: sampling tree with sensors A and B attached to leaves]
K1, K2, K5, K10 loaded onto sensor A
K1, K3, K6, K12 loaded onto sensor B
Each sensor is associated with a uniformly random leaf (independently)
Each tree node corresponds to a set containing all the sensors in its subtree
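
The key assignment above can be sketched in Python. The heap-style numbering (root K1, children of Ki are K2i and K2i+1) is inferred from the key names on this slide; the function names and the requirement that the number of leaves be a power of two are assumptions.

```python
import random

def keys_on_path(leaf_index):
    """Heap-style key indices from the root (K1) down to the given leaf."""
    path = []
    node = leaf_index
    while node >= 1:
        path.append(node)
        node //= 2
    return path[::-1]

def assign_leaves(num_sensors, num_leaves, rng):
    """Attach each sensor to an independent, uniformly random leaf."""
    # Leaves of a complete binary tree occupy indices num_leaves .. 2*num_leaves - 1.
    return [num_leaves + rng.randrange(num_leaves) for _ in range(num_sensors)]

# Slide example: sensor A sits at leaf 10 and sensor B at leaf 12.
keys_a = keys_on_path(10)
keys_b = keys_on_path(12)

# Hypothetical deployment: n = 2 sensors, 4n = 8 leaves.
leaves = assign_leaves(2, 8, random.Random(0))
```

Each sensor stores one key per tree level, which is where the O(log n) keys per sensor come from.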
17
Properties of the Sampling Tree

A sensor is black if it satisfies the predicate
A key is black iff the corresponding set contains a black sensor
Quantity of interest: the fraction of black keys at level i

18

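The fraction of black keys is monotonic as we go down the tree
Decreases by a factor of at most 2 per level
At the top: the fraction is 1 (assuming at least one black sensor)
At the bottom: the fraction is at most b/(4n) ≤ 1/4 (4n leaves!)
Lemma: There exists a level ℓ where the fraction is neither too small nor too large [exact bounds not preserved in transcript]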
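
The properties on this slide can be checked numerically with a short Python sketch. The tree depth, the number of black sensors, and the function name are assumptions for illustration; leaves use the same heap-style numbering as the key names K1, K2, ...

```python
import random

def black_fraction_per_level(black_leaves, depth):
    """Fraction of black keys on each level 0..depth of a complete binary tree.

    black_leaves holds heap-style leaf indices (2**depth .. 2**(depth+1) - 1)
    that have at least one black sensor attached.
    """
    fractions = [0.0] * (depth + 1)
    black = set(black_leaves)
    for level in range(depth, -1, -1):
        fractions[level] = len(black) / 2 ** level
        black = {node // 2 for node in black}  # parents of black keys are black
    return fractions

# Hypothetical instance: n = 256 sensors, 4n = 1024 leaves (depth 10), b = 20.
rng = random.Random(3)
depth, b = 10, 20
black_leaves = [2 ** depth + rng.randrange(2 ** depth) for _ in range(b)]
fractions = black_fraction_per_level(black_leaves, depth)
```

The fraction starts at 1 at the root, never increases, and at most halves from one level to the next, so it must pass through an intermediate range on some level.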

19
Why Level ℓ Helps

The fraction of black keys on level ℓ is not too small → Efficient estimation of the fraction via naïve sampling
[# not preserved in transcript] samples on level ℓ yield an (ε, δ)-approximation for the fraction
The fraction is not too large → Can potentially estimate the final count directly from the fraction
Chernoff-type occupancy tail bound for balls into bins
See paper for details

20
Additional Issues: Too Few Keys on Level ℓ

Challenge
To estimate the final count based on the fraction of black keys, the number of keys on level ℓ needs to be large enough
If not, need to track down to lower levels
Need to leverage other interesting properties of the sampling tree
See paper

21
Additional Issues: Finding Level ℓ

Binary search on the O(log(n)) levels
On each level i examined, sample a small number of random keys to roughly estimate the fraction of black keys
Extremely efficient
Challenges
The binary search operates on estimated values (with error, and possibly not monotonic)
When the fraction is small, the estimation only has an error guarantee on one side
See paper
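
A simplified Python sketch of the binary search over levels. It assumes exact, non-increasing fractions and an illustrative threshold of 1/2; the algorithm in the paper works on sampled estimates and must handle the challenges listed above, which this sketch does not attempt.

```python
def find_level(fraction_at, num_levels, threshold=0.5):
    """Binary search for the first level whose black-key fraction is <= threshold.

    fraction_at(i) is assumed non-increasing in i; the threshold value is an
    illustrative assumption, not the constant used in the paper.
    """
    low, high = 0, num_levels - 1
    while low < high:
        mid = (low + high) // 2
        if fraction_at(mid) <= threshold:
            high = mid
        else:
            low = mid + 1
    return low

# Hypothetical per-level fractions, root first.
fractions = [1.0, 1.0, 0.75, 0.5, 0.25, 0.125]
probes = []

def probe(i):
    probes.append(i)  # record which levels get examined
    return fractions[i]

level = find_level(probe, len(fractions))
```

Only a logarithmic number of levels is examined, and each examination would cost a small batch of set samples.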

22
Example Numerical Results

n = 10,000 and count result (b) ranges from 0 to 10,000
Overhead
5-15 sequential stages of sampling
Total 250-300 samples
Avg approximation error (10.08)
Hard to get better accuracy even in trusted environments (Nath et al. '09)
Naive sampling: 300 samples gives the same accuracy only when b > 2,000

23
Conclusions

Making aggregation queries secure is critical for many sensor network applications
Contribution: Detecting attacks → Tolerating attacks
Safety only → Safety + Liveness
Our approach
Abandon in-network processing and use sampling
Use novel set sampling to reduce the overhead
Polynomial overhead → Logarithmic overhead
24
Related Work to Set Sampling

Decision tree complexity for threshold-t functions (i.e., whether b ≥ t) (Ben-Asher and Newman '95; Aspnes '09)
Most results are for error-free deterministic protocols
Large lower bound Ω(t) (implying Ω(b) for count)
No prior results for general Monte Carlo randomized algorithms

25
Tolerating Attacks is Difficult

Example: Byzantine consensus
Detection substantially easier than tolerance
n ≥ 3f + 1 lower bound only applies to tolerance and not detection
Pinpointing / revoking malicious sensors is hard
E.g., due to lack of public-key authentication
Active research area by itself

26
System Model

Multi-hop sensor network with trusted base station
Performance metric: Time complexity; see paper
Performance metric: Per-sensor msg complexity
Max number of msgs sent/received by a single sensor (captures load balance)
msg size is either 8 bytes (size of a MAC) or log(n) bits
Collisions ignored, as in all prior work
Or one can apply existing algorithms

27
Implementing Set Sampling: Non-Secure Version
Goal: O(1) per-sensor msg complexity for sampling a set
Request flooding: every sensor sends/receives one msg

Request size: We use at most O(n) (random) sets → O(log(n)) bits to name a set

28
Implementing Set Sampling: Non-Secure Version
Goal: O(1) per-sensor msg complexity for sampling a set
[Figure: sensors A, B, C, D]
B, C, D satisfy the predicate, A does not
Reply flooding: Only the first reply is forwarded

Reply: Single bit

This is why set sampling is designed to be binary
29
(The overhead of sampling a set needs to be properly controlled; will discuss later.)
30
Translating the Fraction of Black Keys to b

We now have a good estimation for the fraction of black keys on level ℓ
Need to produce a good estimation for b
Let the number of keys on level ℓ be n
Throw b balls into n bins
The fraction of occupied bins has the same distribution as the fraction of black keys
This distribution is highly concentrated near its mean (Chernoff-type occupancy tail bound), assuming
the fraction is not too close to 1
n is not too small
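
One way to turn an occupied fraction back into a count is to invert the expected occupancy, sketched below in Python. The inversion formula follows from E[occupied fraction] = 1 − (1 − 1/m)^b for b balls in m bins; using it as the estimator, along with the names `count_from_occupancy` and `num_bins` (written m here to avoid clashing with the number of sensors), is an assumption rather than the paper's stated procedure.

```python
import math
import random

def count_from_occupancy(occupied_fraction, num_bins):
    """Invert E[occupied fraction] = 1 - (1 - 1/m)**b to estimate b."""
    return math.log(1.0 - occupied_fraction) / math.log(1.0 - 1.0 / num_bins)

def simulate_occupancy(num_balls, num_bins, rng):
    """Throw balls into bins uniformly at random; return the occupied fraction."""
    occupied = {rng.randrange(num_bins) for _ in range(num_balls)}
    return len(occupied) / num_bins

# Hypothetical level with m = 1024 keys (bins) and b = 400 black sensors (balls).
m, b = 1024, 400
exact = count_from_occupancy(1.0 - (1.0 - 1.0 / m) ** b, m)
b_hat = count_from_occupancy(simulate_occupancy(b, m, random.Random(5)), m)
```

The inversion breaks down as the occupied fraction approaches 1, where the logarithm diverges, which matches the slide's condition that the fraction not be too close to 1.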

31
Summary of Techniques to Achieve the Results