Querying Sensor Networks

Provided by: Sam34

Learn more at: https://db.csail.mit.edu

1
Querying Sensor Networks

Sam Madden
UC Berkeley
October 2, 2002 @ UCLA

2
Introduction

Programming Sensor Networks Is Hard
Especially if you want to build a real
application
Declarative Queries Are Easy
And, can be faster and more robust than most
applications!

3
Overview

Overview of Declarative Systems
TinyDB
Features
Demo
Challenges & Research Issues
Language
Optimizations
The Next Step

5
Declarative Queries: SQL

SQL is the traditional declarative language used
in databases
SELECT sel-list
FROM tables
WHERE pred
GROUP BY pred
HAVING pred

SELECT dept.name, AVG(emp.salary)
FROM emp, dept
WHERE emp.dno = dept.dno
AND (dept.name = 'Accounting' OR dept.name = 'Marketing')
GROUP BY dept.name
6
Declarative Queries for Sensor Networks

Examples
SELECT nodeid, light
FROM sensors
WHERE light > 400
SAMPLE PERIOD 1s

7
General Declarative Advantages

Data Independence
Not required to specify how or where, just what.
Of course, can specify specific addresses when
needed
Transparent Optimization
System is free to explore different algorithms,
locations, orders for operations

8
Data Independence In Sensor Networks

Vastly simplifies execution for large networks
Since locations are described by predicates
Operations are over groups
Enables tolerance to faults
Since system is free to choose where and when
operations happen

9
Optimization In Sensor Networks

Optimization Goal: Power!
Where to process data
In network
Outside network
Hybrid
How to process data
Predicate & Join Ordering
Index Selection
How to route data
Semantically Driven Routing

10
Overview

Overview of Declarative Systems
TinyDB
Features
Demo
Challenges & Research Issues
Language
Optimizations
The Next Step

11
TinyDB

A distributed query processor for networks of
Mica motes
Available today!
Goal: Eliminate the need to write C code for
most TinyOS users
Features
Declarative queries
Temporal & spatial operations
Multihop routing
In-network storage

12
TinyDB @ 10,000 Ft
(Almost) All Queries are Continuous and Periodic

Written in SQL
With Extensions For
Sample rate
Offline delivery
Temporal Aggregation

13
TinyDB Demo
14
Applications & Early Adopters

Some demo apps
Network monitoring
Vehicle tracking
Real future deployments
Environmental monitoring @ GDI (and James
Reserve?)
Generic Sensor Kit
Parking Lot Monitor

Demo!
15
TinyDB Architecture (Per node)

[Diagram: SelOperator, AggOperator, TupleRouter, Schema, TinyAlloc, Radio Stack, Network]

TupleRouter
Fetches readings (for ready queries)
Builds tuples
Applies operators
Delivers results (up tree)

AggOperator
Combines local & neighbor readings

SelOperator
Filters readings

Schema
Catalog of commands & attributes (more later)

TinyAlloc
Reusable memory allocator!

16
TinyAlloc

Handle Based Compacting Memory Allocator
For Catalog, Queries

Handle h;
call MemAlloc.alloc(&h, 10);
(*h)[0] = "Sam";
call MemAlloc.lock(h);
tweakString(*h);
call MemAlloc.unlock(h);
call MemAlloc.free(h);

[Diagram: User Program, Compaction]
17
Schema

Attribute & Command IF
At INIT(), components register attributes and
commands they support
Commands implemented via wiring
Attributes fetched via accessor command
Catalog API allows local and remote queries over
known attributes / commands.
Demo of adding an attribute, executing a command.

18
Overview

Overview of Declarative Systems
TinyDB
Features
Demo
Challenges & Research Issues
Language
Optimizations
Quality

19
3 Questions

Is this approach expressive enough?
Can this approach be efficient enough?
Are the answers this approach gives good enough?

20
Q1: Expressiveness

Simple data collection satisfies most users
How much of what people want to do is just simple
aggregates?
Anecdotally, most of it
EE people want filters + simple statistics
(unless they can have signal processing)
However, we'd like to satisfy everyone!

21
Query Language

New Features
Joins
Event-based triggers
Via extensible catalog
In-network nested queries
Split-phase (offline) delivery
Via buffers

22
Sample Query 1

Bird counter
CREATE BUFFER birds(uint16 cnt)
SIZE 1
ON EVENT bird-enter()
SELECT b.cnt+1
FROM birds AS b
OUTPUT INTO b
ONCE

23
Sample Query 2

Birds that entered and left within time t of each
other
ON EVENT bird-leave AND bird-enter WITHIN t
SELECT bird-leave.time, bird-leave.nest
WHERE bird-leave.nest = bird-enter.nest
ONCE

24
Sample Query 3

Delta compression
SELECT s.light
FROM buf, sensors AS s
WHERE |s.light - buf.light| > t
OUTPUT INTO buf
SAMPLE PERIOD 1s

25
Sample Query 4

Offline Delivery + Event Chaining
CREATE BUFFER equake_data(uint16 loc, uint16 xAccel, uint16 yAccel)
SIZE 1000
PARTITION BY NODE

SELECT xAccel, yAccel
FROM SENSORS
WHERE xAccel > t OR yAccel > t
SIGNAL shake_start()
SAMPLE PERIOD 1s

ON EVENT shake_start()
SELECT loc, xAccel, yAccel
FROM sensors
OUTPUT INTO BUFFER equake_data(loc, xAccel, yAccel)
SAMPLE PERIOD 10ms

26
Event Based Processing

Enables internal and chained actions
Language Semantics
Events are inter-node
Buffers can be global
Implementation plan
Events and buffers must be local
Since n-to-n communication not (well) supported
Next: operator expressiveness

27
Operator Expressiveness: Aggregate Framework

Standard SQL supports the basic 5:
MIN, MAX, SUM, AVERAGE, and COUNT
We support any function conforming to:

Agg_n = {f_merge, f_init, f_evaluate}
f_merge{<a1>, <a2>} -> <a12>
f_init{a0} -> <a0>
f_evaluate{<a1>} -> aggregate value
(Merge associative, commutative!)
(<a> denotes a Partial Aggregate)

Example: Average
AVG_merge{<S1, C1>, <S2, C2>} -> <S1 + S2, C1 + C2>
AVG_init{v} -> <v, 1>
AVG_evaluate{<S1, C1>} -> S1/C1

From Tiny AGgregation (TAG), Madden, Franklin,
Hellerstein, Hong. OSDI 2002 (to appear).
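
A minimal Python sketch of the merge / init / evaluate interface described on this slide, using AVERAGE as the example. The function names (avg_init, avg_merge, avg_evaluate) are illustrative, not TinyDB identifiers, and the partial aggregate is modeled as a (sum, count) tuple.

```python
from functools import reduce

def avg_init(v):
    # Turn one raw reading into a partial aggregate (sum, count).
    return (v, 1)

def avg_merge(a, b):
    # Combine two partial aggregates; associative and commutative.
    return (a[0] + b[0], a[1] + b[1])

def avg_evaluate(state):
    # Turn the final partial aggregate into the answer.
    s, c = state
    return s / c

readings = [10, 20, 30, 40]
states = [avg_init(v) for v in readings]

# Merging in either order gives the same partial aggregate.
forward = reduce(avg_merge, states)
backward = reduce(avg_merge, reversed(states))
average = avg_evaluate(forward)
```

Because merge is associative and commutative, each node can fold its children's partial aggregates in whatever order they arrive.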
28
Isobar Finding
29
Temporal Aggregates

TAG was about spatial aggregates
Inter-node, at the same time
Want to be able to aggregate across time as well
Two types:
Windowed: AGG(size, slide, attr)
Decaying: AGG(comb_func, attr)
Demo!

[Figure: window with size = 4, slide = 2 over readings R1 R2 R3 R4 R5 R6]
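
A small Python sketch of a windowed temporal aggregate, assuming size and slide are measured in number of readings and that only full windows produce output. Both are assumptions made for illustration; the slide does not spell out these details, and the helper name windowed is hypothetical.

```python
def windowed(agg, size, slide, readings):
    """Apply agg to each full window of `size` readings, advancing by `slide`."""
    results = []
    for start in range(0, len(readings) - size + 1, slide):
        results.append(agg(readings[start:start + size]))
    return results

# Stand-ins for readings R1..R6.
readings = [1, 2, 3, 4, 5, 6]

# size = 4, slide = 2: windows are R1..R4 and R3..R6.
window_sums = windowed(sum, 4, 2, readings)
window_maxes = windowed(max, 4, 2, readings)
```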
30
Expressiveness Review

Internal nested queries
With logging of results for offline delivery
Event based processing
Extensible aggregates
Spatial & temporal
On to Question 2: What about efficiency?

31
Q2: Efficiency

Metric: power consumption
Goal: reduce communication, which dominates cost
800 instrs/bit!
Standard approach: in-network processing,
sleeping whenever you can

32
But that's not good enough

What else can we do to bring down costs?
Sleep Even More?
Events are key
Apply automatic optimization!
Semantically driven routing
and topology construction
Operator placement & ordering
Adaptive data delivery

33
TAG

In-network processing
Reduces costs depending on type of aggregates
Exploitation of operator semantics

Tiny AGgregation (TAG), Madden, Franklin,
Hellerstein, Hong. OSDI 2002 (to appear).
34
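Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
[Figure: routing tree of depth d]
35
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 1
[Figure: routing tree with per-sensor partial counts; transcript values: 1, 1, 1, 1]
36
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 2
[Figure: routing tree with per-sensor partial counts; transcript values: 3, 1, 2, 2]
37
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 3
[Figure: routing tree with per-sensor partial counts; transcript values: 4, 1, 3, 2]
38
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 4
[Figure: routing tree with per-sensor partial counts; transcript values: 5, 1, 3, 2]
39
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 5
[Figure: routing tree with per-sensor partial counts; transcript values: 5, 1, 3, 2]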
40
Simulation Result

Simulation Results
2500 Nodes
50x50 Grid
Depth 10
Neighbors 20

Some aggregates require dramatically more state!
41
Taxonomy of Aggregates

TAG insight: classify aggregates according to
various functional properties
Yields a general set of optimizations that can
automatically be applied

42
Optimization: Channel Sharing

Insight: Shared channel enables optimizations
Suppress messages that won't affect aggregate
E.g., in a MAX query, sensor with value v hears a
neighbor with value ≥ v, so it doesn't report
Applies to all exemplary, monotonic aggregates
Learn about query advertisements it missed
If a sensor shows up in a new environment, it can
learn about queries by looking at neighbors'
messages.
Root doesn't have to explicitly rebroadcast query!
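
A toy Python sketch of message suppression for a MAX query on a shared channel. It assumes every node hears every transmission (a single radio cell) and that nodes transmit one at a time in list order. Both simplifications, and the function name, are introduced here for illustration only.

```python
def max_with_suppression(values):
    """Return (max, messages_sent) when nodes stay silent after hearing a value >= theirs."""
    heard = []  # values actually transmitted on the shared channel so far
    for v in values:
        # Report only if no overheard value already matches or beats ours.
        if all(v > h for h in heard):
            heard.append(v)
    return max(heard), len(heard)

# Five nodes; assumes at least one node is present.
answer, messages = max_with_suppression([12, 7, 30, 30, 5])
```

Here only two of the five nodes transmit, yet the root still learns the correct maximum.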

43
Optimization: Hypothesis Testing

Insight: Root can provide information that will
suppress readings that cannot affect the final
aggregate value.
E.g., tell all the nodes that the MIN is
definitely below some bound, so nodes with values
at or above that bound need not participate.
Depends on monotonicity
How is hypothesis computed?
Blind guess
Statistically informed guess
Observation over first few levels of tree /
rounds of aggregate
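
A short Python sketch of hypothesis testing for a MIN query. The bound of 50 and the sensor values are made up for illustration, and the retry behavior when the hypothesis is wrong is an assumption, not something the slide specifies.

```python
def min_with_hypothesis(values, bound):
    """Only nodes whose value is below `bound` report. Returns (answer, messages)."""
    reporters = [v for v in values if v < bound]
    if not reporters:
        # Hypothesis was too optimistic: nobody reported, so the root
        # would have to retry with a looser bound (assumed behavior).
        return None, 0
    return min(reporters), len(reporters)

values = [55, 42, 90, 47, 61]
answer, messages = min_with_hypothesis(values, bound=50)
missed, no_messages = min_with_hypothesis(values, bound=10)
```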

44
Optimization: Use Multiple Parents

For duplicate insensitive aggregates
Or aggregates that can be expressed as a linear
combination of parts
Send (part of) aggregate to all parents
Decreases variance
Dramatically, when there are lots of parents

45
TAG Summary

In Query Processing A Win For Many Aggregate
Functions
By exploiting general functional properties of
operators, many optimizations are possible
Requires new aggregates to be tagged with their
properties
Up next: non-aggregate query processing
optimizations, a flavor of things to come!

46
Attribute Driven Topology Selection

Observation: internal queries often over local
area
Or some other subset of the network
E.g. regions with light value in [10,20]
Idea: build topology for those queries based on
values of range-selected attributes
Requires range attributes, connectivity to be
relatively static

Heidemann et al., Building Efficient Wireless
Sensor Networks with Low-Level Naming. SOSP, 2001.
47
Attribute Driven Query Propagation
SELECT * WHERE a > 5 AND a < …
Precomputed intervals = Query Dissemination Index
[Figure: node 4 with neighbors 1, 2, 3 and intervals [1,10], [20,40], [7,15]]
48
Attribute Driven Parent Selection
Even without intervals, expect that sending to
parent with closest value will help
[Figure: node 4 with interval [3,6] and candidate parents 1, 2, 3 with intervals [1,10], [20,40], [7,15]]
[3,6] ∩ [1,10] = [3,6]
[3,7] ∩ [7,15] = ø
[3,7] ∩ [20,40] = ø
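
A Python sketch of the interval test behind attribute driven parent selection. It assumes closed integer intervals and assumes that candidate parents 1, 2, 3 carry the intervals [1,10], [20,40], [7,15] in that order; that pairing is read off the figure labels and may not match the original diagram. The helper intersect is hypothetical.

```python
def intersect(a, b):
    """Intersection of two closed intervals, or None if they do not overlap."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

# Assumed pairing of parent id to attribute interval.
parents = {1: (1, 10), 2: (20, 40), 3: (7, 15)}
node_range = (3, 6)  # interval of the node choosing a parent

overlaps = {p: intersect(node_range, r) for p, r in parents.items()}
# Keep only parents whose interval overlaps the node's own.
candidates = [p for p, o in overlaps.items() if o is not None]
```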
49
Hot off the press
50
Operator Placement & Ordering

Observation: Nested queries, triggers, and joins
can often be re-ordered
Ordering can dramatically affect the amount of
work you do
Lots of standard database tricks here

51
Operator Ordering Example 1

SELECT light, mag
FROM sensors
WHERE pred1(mag)
AND pred2(light)
SAMPLE INTERVAL 1s

Cost (in J) of sampling mag >> cost of sampling
light
Correct ordering (unless pred1 is very
selective):
1. Sample light
2. Apply pred2
3. Sample mag
4. Apply pred1
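
A back-of-the-envelope Python sketch of why the ordering matters. The energy costs and selectivities below are made-up numbers, and the cost model (pay for the second sample only when the first predicate passes) is a simplifying assumption for illustration.

```python
def expected_cost(first_cost, first_selectivity, second_cost):
    """Expected energy per epoch: always sample the first attribute,
    sample the second only when the first predicate passes."""
    return first_cost + first_selectivity * second_cost

# Hypothetical per-sample costs: mag is far more expensive than light.
COST_MAG, COST_LIGHT = 100.0, 1.0

# Case 1: both predicates pass half the time.
light_first = expected_cost(COST_LIGHT, 0.5, COST_MAG)
mag_first = expected_cost(COST_MAG, 0.5, COST_LIGHT)

# Case 2: pred1 almost never passes and pred2 always passes.
light_first_sel = expected_cost(COST_LIGHT, 1.0, COST_MAG)
mag_first_sel = expected_cost(COST_MAG, 0.0, COST_LIGHT)
```

In the first case sampling light first costs about half as much. In the second case, where pred1 is extremely selective, sampling mag first is marginally cheaper, which is the exception the slide mentions.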

52
Operator Ordering Example 2
Every time an event occurs that satisfies pred1,
sample lights once every 5 seconds for 30 seconds
and report the samples that satisfy pred2

ON EVENT bird-enter()
WHERE pred1(event)
SELECT light
WHERE pred2(light)
FROM sensors
SAMPLE INTERVAL 5s
FOR 30s

Note: makes all samples in phase in sample window
Sample light once every 5 seconds. For every
sample that satisfies pred2, check and see if any
events that satisfy pred1 have occurred in the
last 30 seconds.

SELECT s.light
FROM bird-enter-events[30s] AS e, sensors AS s
WHERE e.time … pred1(e) AND pred2(s.light)
SAMPLE INTERVAL 5s
53
Adaptivity For Contention

Observation: Under high contention, radios
deliver fewer total packets than under low
contention.
Insight: Don't allow radios to be highly
contested. Drop or aggregate instead.
Higher throughput
Choice over what gets lost
Based on semantics!

54
Adaptivity for Power Conservation

For many applications, exact sample rate doesn't
matter
But network lifetime does!
Idea: adaptively adjust sample rate & extent of
aggregation based on lifetime goal and observed
power consumption
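
A minimal Python sketch of lifetime-driven rate adaptation. It assumes all energy is spent on sampling and reporting (idle and sleep power are ignored) and that the per-sample energy is re-measured periodically. The function name and all numbers are illustrative.

```python
def sample_period_for_lifetime(remaining_energy_j, remaining_lifetime_s,
                               energy_per_sample_j):
    """Longest-lived schedule: spread the affordable samples evenly
    over the remaining lifetime goal."""
    affordable_samples = remaining_energy_j / energy_per_sample_j
    return remaining_lifetime_s / affordable_samples

# Initially: 1000 J left, 10000 s to go, 0.5 J observed per sample.
period_initial = sample_period_for_lifetime(1000.0, 10000.0, 0.5)

# Later: observed cost per sample has doubled, so the period must grow.
period_later = sample_period_for_lifetime(500.0, 5000.0, 1.0)
```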

55
Efficiency Summary

Power is the important metric
TAG
In-network processing
Exploit semantics of network and operators
Channel sharing
Hypothesis testing
Using multiple parents
Indexing for dissemination & collection of data
Placement and Operator Ordering
Adaptive Sampling

56
Q3: Answer Quality

Lots of possibilities for improving quality
Multi-path routing
When applicable
Transactional delivery
a.k.a. custody transfer
Link-layer retransmission
Caching
Failure still possible in all modes
Open question: what's the right quality metric?

57
Diffusion as TinyDB Foundation?

Claim: diffusion is an infrastructure upon which
TinyDB could be built
Via declarative language, TinyDB is able to
provide semantic guarantees and transparent
optimization
Operators can be reordered
Any tuple can be routed to any operator
No (important) duplicates will be produced
At what cost? Diffusion can:
Adjust better to loss
Exploit well-connected networks
Provide n-m routing, instead of n-1 routing
Might allow global buffers, events, etc.

58
Summary

Declarative queries are the right interface for
data collection in sensor nets!
In-network processing and optimization make
approach viable
Big query language improvements coming soon
Event driven internal queries
Adaptive sampling & query indexes for
performance!
TinyDB Available Today
http://telegraph.cs.berkeley.edu/tinydb

59
Questions?
60
Grouping

GROUP BY expr
expr is an expression over one or more attributes
Evaluation of expr yields a group number
Each reading is a member of exactly one group
Example: SELECT max(light) FROM sensors
GROUP BY TRUNC(temp/10)

Result
61
Having

HAVING preds
preds filters out groups that do not satisfy
predicate
versus WHERE, which filters out tuples that do
not satisfy predicate
Example
SELECT max(temp) FROM sensors
GROUP BY light
HAVING max(temp) < 100
Yields all groups with temperature under 100
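
A small Python sketch contrasting HAVING with WHERE on the example above. The (light, temp) readings are invented for illustration; the point is that HAVING drops whole groups after aggregation, while WHERE drops individual tuples before it.

```python
# Hypothetical (light, temp) readings.
readings = [(1, 70), (1, 120), (2, 80), (2, 95), (3, 60)]

def max_temp_by_light(rows):
    """GROUP BY light, computing max(temp) per group."""
    groups = {}
    for light, temp in rows:
        groups[light] = max(temp, groups.get(light, temp))
    return groups

# HAVING max(temp) < 100: filter groups after aggregating.
having_result = {g: m for g, m in max_temp_by_light(readings).items() if m < 100}

# WHERE temp < 100: filter tuples before aggregating.
where_result = max_temp_by_light([r for r in readings if r[1] < 100])
```

Group 1 disappears under HAVING because its maximum is 120, but survives under WHERE with a maximum of 70.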

62
Group Eviction

Problem: Number of groups in any one iteration
may exceed available storage on sensor
Solution: Evict!
Choose one or more groups to forward up tree
Rely on nodes further up tree, or root, to
recombine groups properly
What policy to choose?
Intuitively: least popular group, since we don't
want to evict a group that will receive more
values this epoch.
Experiments suggest:
Policy matters very little
Evicting as many groups as will fit into a single
message is good
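
A Python sketch of the least-popular eviction policy for a node with a fixed-size group table. The table layout, the use of MAX as the aggregate, and the function name are assumptions for illustration. Evicted groups are returned so the caller can forward them up the tree.

```python
def add_reading(table, counts, group, value, capacity):
    """Merge a reading into its group (MAX aggregate), evicting the
    least popular group first if the table is full. Returns evictions."""
    evicted = []
    if group not in table and len(table) >= capacity:
        # Least popular = group that has received the fewest readings so far.
        victim = min(table, key=lambda g: counts[g])
        evicted.append((victim, table.pop(victim)))
        del counts[victim]
    table[group] = max(table[group], value) if group in table else value
    counts[group] = counts.get(group, 0) + 1
    return evicted

table, counts, forwarded = {}, {}, []
for group, value in [("a", 5), ("a", 7), ("b", 3), ("c", 9)]:
    forwarded += add_reading(table, counts, group, value, capacity=2)
```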

63
Simulation Environment

Java-based simulation & visualization for
validating algorithms, collecting data.
Coarse grained event based simulation
Sensors arranged on a grid, radio connectivity by
Euclidean distance
Communication model
Lossless: All neighbors hear all messages
Lossy: Messages lost with probability that
increases with distance
Symmetric links
No collisions, hidden terminals, etc.

64
Simulation Screenshot
65
Experiment: Basic TAG

Dense Packing, Ideal Communication

66
Experiment: Hypothesis Testing

Uniform Value Distribution, Dense Packing, Ideal
Communication

67
Experiment: Effects of Loss
68
Experiment: Benefit of Cache
69
Pipelined Aggregates

After query propagates, during each epoch:
Each sensor samples local sensors once
Combines them with partial state records (PSRs)
from children
Outputs PSR representing aggregate state in the
previous epoch.
After (d-1) epochs, PSR for the whole tree output
at root
d = Depth of the routing tree
If desired, partial state from top k levels could
be output in kth epoch
To avoid combining PSRs from different epochs,
sensors must cache values from children

Value from 2 produced at time t arrives at 1 at
time (t+1)
Value from 5 produced at time t arrives at 1 at
time (t+3)
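
A Python sketch that simulates pipelined COUNT over a small routing tree. The topology (1 is the root with children 2 and 3, 4 is a child of 3, 5 is a child of 4) is an assumption chosen to be consistent with the timing note above. Each epoch, every node counts itself plus whatever its children sent in the previous epoch.

```python
# Assumed routing tree: node -> list of children.
children = {1: [2, 3], 2: [], 3: [4], 4: [5], 5: []}

def pipelined_count(children, root, epochs):
    """Return the COUNT reported at the root in each epoch."""
    prev = {n: 0 for n in children}  # PSR each node sent in the previous epoch
    root_outputs = []
    for _ in range(epochs):
        # Own reading (1) plus children's PSRs from the previous epoch.
        cur = {n: 1 + sum(prev[c] for c in kids) for n, kids in children.items()}
        root_outputs.append(cur[root])
        prev = cur
    return root_outputs

counts_at_root = pipelined_count(children, root=1, epochs=5)
```

Under this assumed tree the root's count climbs epoch by epoch and settles at 5 once the deepest node's contribution has travelled all the way up.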
70
Pipelining Example
71
Pipelining Example
Epoch 0

72
Pipelining Example
Epoch 1

73
Pipelining Example

Epoch 2

74
Pipelining Example

Epoch 3

75
Pipelining Example
Epoch 4

76
Our Stream Semantics

One stream, sensors
We control data rates
Joins between that stream and buffers are allowed
Joins are always landmark, forward in time, one
tuple at a time
Result of queries over sensors is either a single
tuple (at time of query) or a stream
Easy to interface to more sophisticated systems
Temporal aggregates enable fancy window
operations

77
Formal Spec.

ON EVENT ... WITHIN
SELECT agg() temporalagg()
FROM sensors events
WHERE
GROUP BY
HAVING
ACTION WHERE BUFFER SIGNAL () (SELECT ... ) INTO BUFFER
SAMPLE PERIOD FOR INTERPOLATE COMBINE temporal_agg()
ONCE

78
Buffer Commands