Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks

Provided by: Sam34
Learn more at: https://db.csail.mit.edu

1
Supporting Aggregate Queries Over Ad-Hoc Wireless
Sensor Networks
  • Samuel Madden
  • UC Berkeley

With Robert Szewczyk, Michael Franklin, and David
Culler
WMCSA June 21, 2002
2
Motivation: Sensor Nets and In-Network Query
Processing
  • Many Sensor Network Applications are Data
    Oriented
  • Queries: Natural and Efficient Data Processing
    Mechanism
  • Easy (unlike embedded C code)
  • Enable optimizations through abstraction
  • Aggregates: Common Case
  • E.g. Which rooms are in use?
  • In-network processing a must
  • Sensor networks are power and bandwidth constrained
  • Communication dominates power cost
  • Not subject to Moore's law!

3
Overview
  • Background
  • Sensor Networks
  • Our Approach: Tiny Aggregation (TAG)
  • Overview
  • Expressiveness
  • Illustration
  • Optimizations
  • Grouping
  • Current Status & Future Work

5
Background: Sensor Networks
  • A collection of small, radio-equipped, battery
    powered, networked microprocessors
  • Typically Ad-hoc & Multihop Networks
  • Single devices unreliable
  • Very low power: tiny batteries provide power for months
  • Apps: Environment Monitoring, Personal Nets,
    Object Tracking
  • Data processing plays a key role!

6
Berkeley Mica Motes and TinyOS
  • TinyOS operating system (services)
  • 4 MHz Processor
  • 4 KB RAM, 512 KB EEPROM, 128 KB code space
  • Single channel CSMA half-duplex radio @ 40 kbit/s
  • Lossy: 20% loss @ 5 ft in Ganesan et al.
  • Communication Very Expensive: 800 instrs/bit

8
The Tiny Aggregation (TAG) Approach
  • Push declarative queries into network
  • Impose a hierarchical routing tree onto the
    network
  • Divide time into epochs
  • Every epoch, sensors evaluate query over local
    sensor data and data from children
  • Aggregate local and child data
  • Each node transmits just once per epoch
  • Pipelined approach increases throughput
  • Depending on aggregate function, various
    optimizations can be applied

9
SQL Primer
SELECT AVG(light) FROM sensors WHERE sound <
100 GROUP BY roomNo HAVING AVG(light) < 50
  • SQL is an established declarative language; we are
    not wedded to it
  • Some extensions clearly necessary, e.g. for
    sample rates
  • We adopt a basic subset
  • sensors relation (table) has:
  • One column for each reading-type, or attribute
  • One row for each externalized value
  • May represent an aggregation of several
    individual readings

SELECT agg_n(attr_n), attrs FROM sensors
WHERE selPreds
GROUP BY attrs
HAVING havingPreds
EPOCH DURATION s
10
Aggregation Functions
  • Standard SQL supports the basic 5
  • MIN, MAX, SUM, AVERAGE, and COUNT
  • We support any function conforming to:

Agg_n = {f_merge, f_init, f_evaluate}
f_merge(<a1>, <a2>) → <a12>
f_init(a0) → <a0>
f_evaluate(<a1>) → aggregate value
(Merge must be associative and commutative!)
Each <a> is a partial aggregate.

Example: Average
AVG_merge(<S1, C1>, <S2, C2>) → <S1 + S2, C1 + C2>
AVG_init(v) → <v, 1>
AVG_evaluate(<S1, C1>) → S1/C1
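
The AVERAGE decomposition above can be sketched in a few lines of Python. This is only an illustration of the merge / init / evaluate interface, not the TinyOS implementation; the function names and the sample readings are assumptions made for the example.

```python
from functools import reduce

def avg_init(v):
    """Turn one raw reading into a partial aggregate <sum, count>."""
    return (v, 1)

def avg_merge(a, b):
    """Combine two partial aggregates; associative and commutative."""
    return (a[0] + b[0], a[1] + b[1])

def avg_evaluate(a):
    """Turn the final partial aggregate into the answer."""
    return a[0] / a[1]

readings = [10, 20, 60]  # hypothetical light readings from three sensors
state = reduce(avg_merge, (avg_init(v) for v in readings))
result = avg_evaluate(state)
```

Because only the fixed-size pair travels up the tree, a parent never needs to see its children's individual readings.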
11
Query Propagation
  • TAG is propagation agnostic
  • Works with any algorithm that can:
  • Deliver the query to all sensors
  • Provide all sensors with one or more duplicate-free
    routes to some root
  • Paper describes simple flooding approach
  • Query introduced at a root; rebroadcast by all
    sensors until it reaches leaves
  • Sensors pick parent and level when they hear
    query
  • Reselect parent after k silent epochs

Figure: the query enters at node 1; each node is labeled with its parent (P) and level (L).
Node   Parent (P)   Level (L)
1      0            1
2      1            2
3      1            2
4      2            3
6      3            3
5      4            4
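
A minimal sketch of the flooding approach, assuming each sensor adopts the first node it hears the query from as its parent. The `LINKS` adjacency list is hypothetical (the slides do not give radio connectivity); it was chosen so that the output matches the P/L labels in the figure, with the root's missing parent written as 0.

```python
from collections import deque

def flood(links, root):
    """Return {node: (parent, level)} after flooding a query from root."""
    assigned = {root: (0, 1)}  # root: no parent (written 0), level 1
    queue = deque([root])
    while queue:
        sender = queue.popleft()
        level = assigned[sender][1]
        for neighbor in links[sender]:
            if neighbor not in assigned:  # first time this node hears the query
                assigned[neighbor] = (sender, level + 1)
                queue.append(neighbor)
    return assigned

# Hypothetical radio connectivity between the six nodes.
LINKS = {1: [2, 3], 2: [1, 4], 3: [1, 6], 4: [2, 5], 5: [4], 6: [3]}
routes = flood(LINKS, root=1)
```

Re-selecting a parent after k silent epochs is left out of this sketch.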
12
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Depth = d
13
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 1 (count at sensor 1: 1)

Partial counts, one row per epoch:
Epoch   Sensor 1   Sensor 2   Sensor 3   Sensor 4   Sensor 5
1       1          1          1          1          1
14
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 2 (count at sensor 1: 3)

Partial counts, one row per epoch:
Epoch   Sensor 1   Sensor 2   Sensor 3   Sensor 4   Sensor 5
1       1          1          1          1          1
2       3          1          2          2          1
15
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 3 (count at sensor 1: 4)

Partial counts, one row per epoch:
Epoch   Sensor 1   Sensor 2   Sensor 3   Sensor 4   Sensor 5
1       1          1          1          1          1
2       3          1          2          2          1
3       4          1          3          2          1
16
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 4 (count at sensor 1: 5)

Partial counts, one row per epoch:
Epoch   Sensor 1   Sensor 2   Sensor 3   Sensor 4   Sensor 5
1       1          1          1          1          1
2       3          1          2          2          1
3       4          1          3          2          1
4       5          1          3          2          1
17
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 5 (count at sensor 1: 5)

Partial counts, one row per epoch:
Epoch   Sensor 1   Sensor 2   Sensor 3   Sensor 4   Sensor 5
1       1          1          1          1          1
2       3          1          2          2          1
3       4          1          3          2          1
4       5          1          3          2          1
5       5          1          3          2          1
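
The epoch table can be reproduced with a short simulation. This is a sketch under one assumption: the routing tree (sensor 1 is the root with children 2 and 3, sensor 4 is the child of 3, sensor 5 is the child of 4) is inferred from the counts in the table, since the figure itself is not in the transcript. In each epoch every sensor reports 1 for itself plus whatever its children reported in the previous epoch.

```python
def pipelined_count(parent, epochs):
    """Simulate pipelined COUNT(*); returns one row of per-sensor counts per epoch."""
    nodes = sorted(parent)
    children = {n: [] for n in nodes}
    for node, p in parent.items():
        if p is not None:
            children[p].append(node)
    previous = {n: 0 for n in nodes}  # nothing heard before the first epoch
    history = []
    for _ in range(epochs):
        # Own reading (1) plus the children's partial counts from the last epoch.
        current = {n: 1 + sum(previous[c] for c in children[n]) for n in nodes}
        history.append([current[n] for n in nodes])
        previous = current
    return history

# Assumed tree, inferred from the table: 1 <- {2, 3}, 3 <- 4, 4 <- 5.
PARENT = {1: None, 2: 1, 3: 1, 4: 3, 5: 4}
rows = pipelined_count(PARENT, epochs=5)
for epoch, row in enumerate(rows, start=1):
    print(epoch, row)
```

With this tree the depth is 4, and the count at sensor 1 first reaches the full value of 5 in epoch 4, after which it stays there.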
18
Discussion
  • Result is a stream of values
  • Ideal for monitoring scenarios
  • One communication / node / epoch
  • Symmetric power consumption, even at root
  • New value on every epoch
  • After d-1 epochs, complete aggregation
  • Given a single loss, network will recover after
    at most d-1 epochs
  • With time synchronization, nodes can sleep
    between epochs, except during small communication
    window

19
Simulation Result
  • Simulation Results
  • 2500 Nodes
  • 50x50 Grid
  • Depth 10
  • Neighbors 20

Some aggregates require dramatically more state!
20
Optimization: Channel Sharing
  • Insight: Shared channel enables optimizations
  • Suppress messages that won't affect aggregate
  • E.g., in a MAX query, sensor with value v hears a
    neighbor with value ≥ v, so it doesn't report
  • Applies to all such exemplary aggregates
  • Learn about query advertisements it missed
  • If a sensor shows up in a new environment, it can
    learn about queries by looking at neighbors'
    messages.
  • Root doesn't have to explicitly rebroadcast query!
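
The MAX suppression rule can be sketched as a single predicate. The function name and the example values are hypothetical; the rule itself (stay silent once an overheard value is at least as large as your own) is the one described on the slide.

```python
def should_report_max(own_value, overheard):
    """True if this sensor's reading could still change a MAX aggregate.

    overheard: values snooped from neighbors' messages this epoch.
    """
    return all(own_value > v for v in overheard)

report_a = should_report_max(40, [35, 38])  # nothing overheard beats 40
report_b = should_report_max(40, [35, 40])  # a neighbor already reported 40
```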

21
Optimization: Hypothesis Testing
  • Insight: Root can provide information that will
    suppress readings that cannot affect the final
    aggregate value.
  • E.g. Tell all the nodes that the MIN is
    definitely < 50; nodes with value ≥ 50 need not
    participate.
  • Works for any linear aggregate function
  • How is hypothesis computed?
  • Blind guess
  • Statistically informed guess
  • Observation over first few levels of tree /
    rounds of aggregate
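
A minimal sketch of the MIN example, assuming the root broadcasts a single bound and each node filters itself locally. The readings and the function name are hypothetical.

```python
def participants(readings, hypothesis):
    """Nodes that still need to report for a MIN query.

    hypothesis: bound announced by the root ("MIN is definitely below this").
    """
    return {node: value for node, value in readings.items() if value < hypothesis}

READINGS = {1: 72, 2: 48, 3: 50, 4: 31, 5: 66}  # hypothetical sensor values
active = participants(READINGS, hypothesis=50)
```

Here only two of the five nodes transmit, and the MIN over the participants equals the MIN over all readings. If the hypothesis were wrong (no reading below the bound), no node would report, so the root would need a fallback; that case is not handled in this sketch.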

22
Optimization: Use Multiple Parents
  • For duplicate-insensitive (e.g. MAX), or
    partitionable (e.g. COUNT) aggregates,
  • Send (part of) aggregate to all parents
  • Decreases variance
  • Dramatically, when there are lots of parents
  • No extra cost, since all messages broadcast
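
The variance claim can be checked with a small exact calculation for a partitionable aggregate such as COUNT. The model is an assumption made for illustration: the value is split evenly across the parents and each link loses its message independently with the same probability.

```python
from fractions import Fraction
from itertools import product

def delivered_stats(count, parents, loss):
    """Exact (mean, variance) of the COUNT that survives one hop."""
    share = Fraction(count, parents)  # equal part sent to each parent
    mean = Fraction(0)
    second_moment = Fraction(0)
    for outcome in product([0, 1], repeat=parents):  # 1 = message delivered
        prob = Fraction(1)
        for delivered in outcome:
            prob *= (1 - loss) if delivered else loss
        received = share * sum(outcome)
        mean += prob * received
        second_moment += prob * received * received
    return mean, second_moment - mean * mean

LOSS = Fraction(1, 10)  # hypothetical per-link loss probability
one_parent = delivered_stats(10, 1, LOSS)
two_parents = delivered_stats(10, 2, LOSS)
```

Under these assumptions the expected delivered count is the same in both cases, while the variance with two parents is half of the single-parent variance.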

23
Grouping
  • Value-based, complete partitioning of records
  • If query is grouped, sensors apply predicate to
    local readings on each epoch
  • Aggregate records tagged with group
  • When a child record (with group) is received
  • If it belongs to a stored group, merge with
    existing record for that group
  • If not, just store it
  • At the end of each epoch, transmit one record per
    group
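
The per-group merge step might look like the sketch below, where a node keeps one partial record per group in a dictionary. The function name, the use of MAX as the merge function, and the sample values are assumptions for the example.

```python
def merge_child_record(table, group, partial, merge):
    """Fold a child's (group, partial aggregate) record into this node's table."""
    if group in table:
        table[group] = merge(table[group], partial)  # known group: merge
    else:
        table[group] = partial                        # new group: just store it
    return table

table = {2: 45}                        # node's own record: group 2, MAX(light) = 45
merge_child_record(table, 2, 27, max)  # child record for an existing group
merge_child_record(table, 3, 66, max)  # child record for a new group
records_to_send = sorted(table.items())  # one record per group at end of epoch
```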

25
Status & Future Work
  • Status
  • Simple simulator
  • Complete set of experiments, including behavior
    of algorithms in the face of loss
  • Generalization of algorithms beyond complete
    pipelining
  • Taxonomy of aggregates to allow optimizations on
    functional properties
  • Basic implementation (shown in demo)
  • Future work
  • Expressiveness issues
  • Aggregates over temporal data
  • Nested queries, e.g. MAX(AVG(1000 readings) @ each
    node)
  • Correctness Issues in The Face Of Loss
  • How does the user know which nodes are and are
    not included in an aggregate?

26
Summary
  • Declarative queries for aggregates
  • Straightforward, familiar interface
  • Enables optimizations
  • Snooping techniques for exemplary aggregates
  • Multiple parents for partitionable aggregates
  • Pipelined, epoch based algorithm
  • Streaming Results
  • Symmetric communication
  • Low-power friendly

27
Questions?
28
Grouping
  • GROUP BY expr
  • expr is an expression over one or more attributes
  • Evaluation of expr yields a group number
  • Each reading is a member of exactly one group
  • Example: SELECT max(light) FROM sensors
    GROUP BY TRUNC(temp/10)

Sensor ID   Light   Temp   Group
1           45      25     2
2           27      28     2
3           66      34     3
4           68      37     3

Result
Group   max(light)
2       45
3       68
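
The example result can be reproduced directly from the four readings. This sketch evaluates the query centrally rather than in-network, and uses integer division as a stand-in for TRUNC(temp/10), which is valid for the non-negative temperatures shown.

```python
# (sensor_id, light, temp) rows taken from the example table.
READINGS = [(1, 45, 25), (2, 27, 28), (3, 66, 34), (4, 68, 37)]

def group_max_light(readings):
    """SELECT max(light) FROM sensors GROUP BY TRUNC(temp/10)."""
    result = {}
    for _sensor_id, light, temp in readings:
        group = temp // 10  # TRUNC(temp/10) for non-negative temps
        result[group] = max(result.get(group, light), light)
    return result

groups = group_max_light(READINGS)
```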
29
Having
  • HAVING preds
  • preds filters out groups that do not satisfy
    predicate
  • versus WHERE, which filters out tuples that do
    not satisfy predicate
  • Example:
    SELECT max(temp) FROM sensors
    GROUP BY light
    HAVING max(temp) < 100
  • Yields all groups with maximum temperature under 100
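
A small sketch of the difference between filtering groups and filtering tuples. The (light, temp) readings are hypothetical, and the query is evaluated centrally for clarity.

```python
def max_temp_by_light(readings, limit):
    """SELECT max(temp) FROM sensors GROUP BY light HAVING max(temp) < limit."""
    per_group = {}
    for light, temp in readings:
        per_group[light] = max(per_group.get(light, temp), temp)
    # HAVING is applied to whole groups, after aggregation.
    return {group: m for group, m in per_group.items() if m < limit}

# Hypothetical (light, temp) readings.
READINGS = [(10, 80), (10, 95), (20, 99), (20, 120), (30, 60)]
kept = max_temp_by_light(READINGS, limit=100)
```

Group 20 is dropped entirely because its maximum is 120. A WHERE temp < 100 clause would instead discard only the 120 reading and keep group 20 with a maximum of 99.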

30
Group Eviction
  • Problem: Number of groups in any one iteration
    may exceed available storage on sensor
  • Solution: Evict!
  • Choose one or more groups to forward up tree
  • Rely on nodes further up tree, or root, to
    recombine groups properly
  • What policy to choose?
  • Intuitively: least popular group, since we don't
    want to evict a group that will receive more
    values this epoch.
  • Experiments suggest
  • Policy matters very little
  • Evicting as many groups as will fit into a single
    message is good
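
One way to sketch the "least popular group" policy is a bounded table that evicts the group with the fewest merged values when a new group arrives. The names, the capacity of two groups, and the sample records are all assumptions; the slides report that the choice of policy matters very little in practice.

```python
def add_record(table, counts, group, partial, merge, capacity):
    """Merge a record into a bounded group table.

    Returns the list of (group, partial) records evicted to make room;
    these would be forwarded up the tree for recombination.
    """
    evicted = []
    if group in table:
        table[group] = merge(table[group], partial)
        counts[group] += 1
        return evicted
    if len(table) >= capacity:
        victim = min(table, key=lambda g: counts[g])  # least popular group
        evicted.append((victim, table.pop(victim)))
        del counts[victim]
    table[group] = partial
    counts[group] = 1
    return evicted

table, counts = {}, {}
sent_up = []
for group, value in [(1, 5), (2, 7), (1, 9), (3, 4)]:  # hypothetical records
    sent_up += add_record(table, counts, group, value, max, capacity=2)
```

In this run group 2 is evicted when group 3 arrives, because group 1 has already received two values.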

31
Simulation Environment
  • Java-based simulation & visualization for
    validating algorithms, collecting data.
  • Coarse-grained, event-based simulation
  • Sensors arranged on a grid, radio connectivity by
    Euclidean distance
  • Communication model
  • Lossless: All neighbors hear all messages
  • Lossy: Messages lost with probability that
    increases with distance
  • Symmetric links
  • No collisions, hidden terminals, etc.

32
Simulation Screenshot
33
Experimental Results
  • Experiments with simulator
  • Performance of basic TAG
  • Benefits of hypothesis testing
  • Effect of loss
  • Most experiments in terms of bytes or messages
    sent, since message transmission is the dominant
    cost
  • Depends on radio being turned off between epochs
    and aggregation functions being cheap

34
Experiment: Basic TAG
  • Dense Packing, Ideal Communication

35
Experiment: Hypothesis Testing
  • Uniform Value Distribution, Dense Packing, Ideal
    Communication

36
Experiment: Effects of Loss
37
Experiment: Benefit of Cache
38
Pipelined Aggregates
  • After query propagates, during each epoch:
  • Each sensor samples local sensors once
  • Combines them with PSRs (partial state records)
    from children
  • Outputs PSR representing aggregate state in the
    previous epoch.
  • After (d-1) epochs, PSR for the whole tree output
    at root
  • d = depth of the routing tree
  • If desired, partial state from top k levels could
    be output in kth epoch
  • To avoid combining PSRs from different epochs,
    sensors must cache values from children

Value from 2 produced at time t arrives at 1 at
time (t+1)
Value from 5 produced at time t arrives at 1 at
time (t+3)
39
Pipelining Example
(Three empty tables, each with columns SID, Epoch, Agg.)
40
Pipelining Example
Epoch 0

SID  Epoch  Agg.
1    0      1

SID  Epoch  Agg.
2    0      1
4    0      1

SID  Epoch  Agg.
3    0      1
5    0      1

Messages: <4,0,1>, <5,0,1>
41
Pipelining Example
Epoch 1

SID  Epoch  Agg.
1    0      1
1    1      1
2    0      2

SID  Epoch  Agg.
2    0      1
4    0      1
2    1      1
4    1      1
3    0      2

SID  Epoch  Agg.
3    0      1
5    0      1
3    1      1
5    1      1

Messages: <2,0,2>, <4,1,1>, <3,0,2>, <5,1,1>
42
Pipelining Example
Epoch 2

SID  Epoch  Agg.
1    0      1
1    1      1
2    0      2
1    2      1
2    0      4

SID  Epoch  Agg.
2    0      1
4    0      1
2    1      1
4    1      1
3    0      2
2    2      1
4    2      1
3    1      2

SID  Epoch  Agg.
3    0      1
5    0      1
3    1      1
5    1      1
3    2      1
5    2      1

Messages: <1,0,3>, <2,0,4>, <4,2,1>, <3,1,2>, <5,2,1>
43
Pipelining Example
Epoch 3

SID  Epoch  Agg.
1    0      1
1    1      1
2    0      2
1    2      1
2    0      4

SID  Epoch  Agg.
2    0      1
4    0      1
2    1      1
4    1      1
3    0      2
2    2      1
4    2      1
3    1      2

SID  Epoch  Agg.
3    0      1
5    0      1
3    1      1
5    1      1
3    2      1
5    2      1

Messages: <1,0,5>, <2,1,4>, <4,3,1>, <3,2,2>, <5,3,1>
44
Pipelining Example
Epoch 4

Messages: <1,1,5>, <2,2,4>, <4,4,1>, <3,3,2>, <5,4,1>
45
Optimization: Delta Compression
  • If a sensor's reading is unchanged from previous
    epoch, it need not transmit.
  • Parents assume value is unchanged
  • Leverage child value cache
  • Periodic heartbeats to handle disconnection
  • Extension: if a sensor's reading has not changed by
    more than some threshold, it need not transmit
  • Similar to hypothesis testing with AVERAGE
  • Really future work. See C. Olston, Best-Effort
    Cache Synchronization, SIGMOD 2002.
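
A minimal sketch of delta compression with a parent-side cache. The function and class names, the threshold of 1, and the readings are hypothetical, and the periodic heartbeats mentioned above are left out.

```python
def next_message(reading, last_sent, threshold=0):
    """Value to transmit this epoch, or None to stay silent."""
    if last_sent is not None and abs(reading - last_sent) <= threshold:
        return None  # change is within the threshold: suppress
    return reading

class ParentCache:
    """Parent-side cache of the last value heard from each child."""
    def __init__(self):
        self.values = {}

    def update(self, child, message):
        if message is not None:
            self.values[child] = message
        return self.values.get(child)  # silent child: assume unchanged

cache = ParentCache()
last_sent = None
seen_by_parent = []
for reading in [20, 20, 21, 25]:  # hypothetical readings, one per epoch
    message = next_message(reading, last_sent, threshold=1)
    if message is not None:
        last_sent = message
    seen_by_parent.append(cache.update(child=4, message=message))
```

The child transmits only in the first and last epochs; in between, the parent keeps using the cached value of 20.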

46
Taxonomy of Aggregates
  • TAG insight: classify aggregates according to
    various functional properties
  • Yields a general set of optimizations that can
    automatically be applied

Property | Examples | Affects
Partial State | MEDIAN: unbounded; MAX: 1 record | Effectiveness of TAG
Duplicate Sensitivity | MIN: dup. insensitive; AVG: dup. sensitive | Routing Redundancy
Exemplary vs. Summary | MAX: exemplary; COUNT: summary | Applicability of Sampling, Effect of Loss
Monotonic | COUNT: monotonic; AVG: non-monotonic | Hypothesis Testing, Snooping