Querying Sensor Networks - PowerPoint PPT Presentation

Slides: 79
Provided by: Sam34
Learn more at: https://db.csail.mit.edu

Transcript and Presenter's Notes

1
Querying Sensor Networks
  • Sam Madden
  • UC Berkeley
  • October 2, 2002 @ UCLA

2
Introduction
  • Programming Sensor Networks Is Hard
  • Especially if you want to build a real
    application
  • Declarative Queries Are Easy
  • And, can be faster and more robust than most
    applications!

3
Overview
  • Overview of Declarative Systems
  • TinyDB
  • Features
  • Demo
  • Challenges & Research Issues
  • Language
  • Optimizations
  • The Next Step

5
Declarative Queries: SQL
  • SQL is the traditional declarative language used
    in databases
  • SELECT sel-list
  • FROM tables
  • WHERE pred
  • GROUP BY expr
  • HAVING pred

SELECT dept.name, AVG(emp.salary)
FROM emp, dept
WHERE emp.dno = dept.dno
  AND (dept.name = 'Accounting' OR dept.name = 'Marketing')
GROUP BY dept.name
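The SQL example on this slide can be tried out directly. The following is a minimal sketch using Python's built-in sqlite3 module; the table contents are made-up sample data, and the ORDER BY clause is an addition so the output order is deterministic.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not from the talk).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (dno INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE emp (name TEXT, salary REAL, dno INTEGER);
INSERT INTO dept VALUES (1, 'Accounting'), (2, 'Marketing'), (3, 'Research');
INSERT INTO emp VALUES ('a', 50000, 1), ('b', 70000, 1),
                       ('c', 40000, 2), ('d', 90000, 3);
""")

# The query says what is wanted; the engine decides how to compute it.
rows = conn.execute("""
SELECT dept.name, AVG(emp.salary)
FROM emp, dept
WHERE emp.dno = dept.dno
  AND (dept.name = 'Accounting' OR dept.name = 'Marketing')
GROUP BY dept.name
ORDER BY dept.name
""").fetchall()

print(rows)
```

Nothing in the query mentions join order, indexes, or where the data lives, which is the property the rest of the talk relies on.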
6
Declarative Queries for Sensor Networks
  • Examples
  • SELECT nodeid, light
  • FROM sensors
  • WHERE light > 400
  • SAMPLE PERIOD 1s

7
General Declarative Advantages
  • Data Independence
  • Not required to specify how or where, just what.
  • Of course, can specify specific addresses when
    needed
  • Transparent Optimization
  • System is free to explore different algorithms,
    locations, orders for operations

8
Data Independence In Sensor Networks
  • Vastly simplifies execution for large networks
  • Since locations are described by predicates
  • Operations are over groups
  • Enables tolerance to faults
  • Since system is free to choose where and when
    operations happen

9
Optimization In Sensor Networks
  • Optimization Goal: Power!
  • Where to process data
  • In network
  • Outside network
  • Hybrid
  • How to process data
  • Predicate & Join Ordering
  • Index Selection
  • How to route data
  • Semantically Driven Routing

10
Overview
  • Overview of Declarative Systems
  • TinyDB
  • Features
  • Demo
  • Challenges & Research Issues
  • Language
  • Optimizations
  • The Next Step

11
TinyDB
  • A distributed query processor for networks of
    Mica motes
  • Available today!
  • Goal: Eliminate the need to write C code for
    most TinyOS users
  • Features
  • Declarative queries
  • Temporal & spatial operations
  • Multihop routing
  • In-network storage

12
TinyDB @ 10,000 Ft
(Almost) All Queries are Continuous and Periodic
  • Written in SQL
  • With Extensions For
  • Sample rate
  • Offline delivery
  • Temporal Aggregation

13
TinyDB Demo
14
Applications & Early Adopters
  • Some demo apps
  • Network monitoring
  • Vehicle tracking
  • Real future deployments
  • Environmental monitoring @ GDI (and James
    Reserve?)
  • Generic Sensor Kit
  • Parking Lot Monitor

Demo!
15
TinyDB Architecture (Per node)
(Diagram components: SelOperator, AggOperator, TupleRouter, Network, Radio Stack, Schema, TinyAlloc)
  • TupleRouter
  • Fetches readings (for ready queries)
  • Builds tuples
  • Applies operators
  • Delivers results (up tree)
  • AggOperator
  • Combines local & neighbor readings
  • SelOperator
  • Filters readings
  • Schema
  • Catalog of commands & attributes (more later)
  • TinyAlloc
  • Reusable memory allocator!

16
TinyAlloc
  • Handle Based Compacting Memory Allocator
  • For Catalog, Queries

Handle h;
call MemAlloc.alloc(&h, 10);
(*h)[0] = "Sam";
call MemAlloc.lock(h);
tweakString(*h);
call MemAlloc.unlock(h);
call MemAlloc.free(h);
(Diagram: user program handles and compaction)
17
Schema
  • Attribute & Command IF
  • At INIT(), components register attributes and
    commands they support
  • Commands implemented via wiring
  • Attributes fetched via accessor command
  • Catalog API allows local and remote queries over
    known attributes / commands.
  • Demo of adding an attribute, executing a command.

18
Overview
  • Overview of Declarative Systems
  • TinyDB
  • Features
  • Demo
  • Challenges & Research Issues
  • Language
  • Optimizations
  • Quality

19
3 Questions
  • Is this approach expressive enough?
  • Can this approach be efficient enough?
  • Are the answers this approach gives good enough?

20
Q1: Expressiveness
  • Simple data collection satisfies most users
  • How much of what people want to do is just simple
    aggregates?
  • Anecdotally, most of it
  • EE people want filters + simple statistics
    (unless they can have signal processing)
  • However, we'd like to satisfy everyone!

21
Query Language
  • New Features
  • Joins
  • Event-based triggers
  • Via extensible catalog
  • In-network nested queries
  • Split-phase (offline) delivery
  • Via buffers

22
Sample Query 1
  • Bird counter
  • CREATE BUFFER birds(uint16 cnt)
  • SIZE 1
  • ON EVENT bird-enter()
  • SELECT b.cnt+1
  • FROM birds AS b
  • OUTPUT INTO b
  • ONCE

23
Sample Query 2
  • Birds that entered and left within time t of each
    other
  • ON EVENT bird-leave AND bird-enter WITHIN t
  • SELECT bird-leave.time, bird-leave.nest
  • WHERE bird-leave.nest = bird-enter.nest
  • ONCE

24
Sample Query 3
  • Delta compression
  • SELECT light
  • FROM buf, sensors
  • WHERE |s.light - buf.light| > t
  • OUTPUT INTO buf
  • SAMPLE PERIOD 1s

25
Sample Query 4
  • Offline Delivery + Event Chaining
  • CREATE BUFFER equake_data( uint16 loc, uint16
    xAccel, uint16 yAccel)
  • SIZE 1000
  • PARTITION BY NODE
  • SELECT xAccel, yAccel
  • FROM SENSORS
  • WHERE xAccel > t OR yAccel > t
  • SIGNAL shake_start()
  • SAMPLE PERIOD 1s
  • ON EVENT shake_start()
  • SELECT loc, xAccel, yAccel
  • FROM sensors
  • OUTPUT INTO BUFFER equake_data(loc, xAccel,
    yAccel)
  • SAMPLE PERIOD 10ms

26
Event Based Processing
  • Enables internal and chained actions
  • Language Semantics
  • Events are inter-node
  • Buffers can be global
  • Implementation plan
  • Events and buffers must be local
  • Since n-to-n communication not (well) supported
  • Next: operator expressiveness

27
Operator Expressiveness: Aggregate Framework
  • Standard SQL supports the basic 5
  • MIN, MAX, SUM, AVERAGE, and COUNT
  • We support any function conforming to

Agg_n = {f_merge, f_init, f_evaluate}
f_merge(<a1>, <a2>) -> <a12>
f_init(a0) -> <a0>
f_evaluate(<a1>) -> aggregate value
(<a> is a partial aggregate; merge must be associative and commutative!)
Example: Average
AVG_merge(<S1, C1>, <S2, C2>) -> <S1 + S2, C1 + C2>
AVG_init(v) -> <v, 1>
AVG_evaluate(<S1, C1>) -> S1/C1
From Tiny AGgregation (TAG), Madden, Franklin,
Hellerstein, Hong. OSDI 2002 (to appear).
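The init / merge / evaluate decomposition on this slide can be sketched in a few lines of Python. This is an illustration only, not TinyDB code: the Aggregate class, the run helper, and the sample readings are hypothetical names and data chosen for the example.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Generic, Iterable, TypeVar

S = TypeVar("S")  # type of the partial aggregate (partial state)


@dataclass(frozen=True)
class Aggregate(Generic[S]):
    init: Callable[[float], S]        # reading -> partial state
    merge: Callable[[S, S], S]        # must be associative and commutative
    evaluate: Callable[[S], float]    # partial state -> final value


# AVERAGE keeps <sum, count> as its partial state.
AVG = Aggregate(
    init=lambda v: (v, 1),
    merge=lambda a, b: (a[0] + b[0], a[1] + b[1]),
    evaluate=lambda s: s[0] / s[1],
)

# MAX needs only the running maximum.
MAX = Aggregate(init=lambda v: v, merge=max, evaluate=lambda s: s)


def run(agg: Aggregate, readings: Iterable[float]) -> float:
    """Merge the partial states of all readings, then evaluate once."""
    return agg.evaluate(reduce(agg.merge, (agg.init(v) for v in readings)))


print(run(AVG, [10, 20, 60]), run(MAX, [10, 20, 60]))
```

Because merge is associative and commutative, any node in the routing tree can combine whatever partial states it has received, in any order, and forward a single fixed-size record.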
28
Isobar Finding
29
Temporal Aggregates
  • TAG was about spatial aggregates
  • Inter-node, at the same time
  • Want to be able to aggregate across time as well
  • Two types
  • Windowed AGG(size,slide,attr)
  • Decaying AGG(comb_func, attr)
  • Demo!

(Figure: window with size = 4 and slide = 2 over readings R1 R2 R3 R4 R5 R6)
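A windowed temporal aggregate with a size and a slide can be sketched as below. This is a guess at the semantics for illustration: the windowed function is a hypothetical helper, it assumes a result is produced only once a window is full, and the six readings stand in for R1 through R6.

```python
from typing import Callable, Iterator, Sequence


def windowed(agg: Callable[[Sequence[float]], float],
             size: int, slide: int,
             readings: Sequence[float]) -> Iterator[float]:
    """Apply agg to each full window of `size` readings, advancing by `slide`."""
    for start in range(0, len(readings) - size + 1, slide):
        yield agg(readings[start:start + size])


readings = [1, 2, 3, 4, 5, 6]  # hypothetical values for R1..R6

# size = 4, slide = 2 gives two windows: R1..R4 and R3..R6.
window_max = list(windowed(max, 4, 2, readings))
window_avg = list(windowed(lambda w: sum(w) / len(w), 4, 2, readings))
print(window_max, window_avg)
```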
30
Expressiveness Review
  • Internal nested queries
  • With logging of results for offline delivery
  • Event based processing
  • Extensible aggregates
  • Spatial & temporal
  • On to Question 2: What about efficiency?

31
Q2: Efficiency
  • Metric: power consumption
  • Goal: reduce communication, which dominates cost
  • 800 instrs/bit!
  • Standard approach in-network processing,
    sleeping whenever you can

32
But that's not good enough
  • What else can we do to bring down costs?
  • Sleep Even More?
  • Events are key
  • Apply automatic optimization!
  • Semantically driven routing
  • and topology construction
  • Operator placement & ordering
  • Adaptive data delivery

33
TAG
  • In-network processing
  • Reduces costs depending on type of aggregates
  • Exploitation of operator semantics

Tiny AGgregation (TAG), Madden, Franklin,
Hellerstein, Hong. OSDI 2002 (to appear).
34
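Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
(Figure: routing tree of depth d)
35
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 1: count 1
(Figure: routing tree with a Sensor x Epoch table of partial counts)
36
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 2: count 3
(Figure: routing tree with a Sensor x Epoch table of partial counts)
37
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 3: count 4
(Figure: routing tree with a Sensor x Epoch table of partial counts)
38
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 4: count 5
(Figure: routing tree with a Sensor x Epoch table of partial counts)
39
Illustration: Pipelined Aggregation
SELECT COUNT(*) FROM sensors
Epoch 5: count 5
(Figure: routing tree with a Sensor x Epoch table of partial counts)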
40
Simulation Result
  • Simulation Results
  • 2500 Nodes
  • 50x50 Grid
  • Depth 10
  • Neighbors 20

Some aggregates require dramatically more state!
41
Taxonomy of Aggregates
  • TAG insight: classify aggregates according to
    various functional properties
  • Yields a general set of optimizations that can
    automatically be applied

42
Optimization: Channel Sharing
  • Insight: Shared channel enables optimizations
  • Suppress messages that won't affect aggregate
  • E.g., in a MAX query, sensor with value v hears a
    neighbor with value ≥ v, so it doesn't report
  • Applies to all exemplary, monotonic aggregates
  • Learn about query advertisements it missed
  • If a sensor shows up in a new environment, it can
    learn about queries by looking at neighbors'
    messages.
  • Root doesn't have to explicitly rebroadcast query!
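The MAX suppression rule can be illustrated with a toy model of one shared radio cell. This is a sketch under simplifying assumptions: every node hears every message that is actually sent, nodes take turns in list order, and both function names are hypothetical.

```python
from typing import Iterable, List


def should_report_max(own_value: float, overheard: Iterable[float]) -> bool:
    """Report only if no overheard message already carries a value >= ours."""
    return all(own_value > heard for heard in overheard)


def reports_in_shared_cell(values: Iterable[float]) -> List[float]:
    """Return the values actually transmitted when nodes snoop on each other."""
    sent: List[float] = []
    for v in values:
        # Only transmitted messages can be overheard, so `sent` is what
        # each later node has heard so far.
        if should_report_max(v, sent):
            sent.append(v)
    return sent


values = [12, 7, 30, 30, 5]  # hypothetical readings, in transmission order
sent = reports_in_shared_cell(values)
print(sent, max(sent) == max(values))
```

In this run only two of the five nodes transmit, yet the maximum of the transmitted values is still the true maximum.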

43
Optimization: Hypothesis Testing
  • Insight: Root can provide information that will
    suppress readings that cannot affect the final
    aggregate value.
  • E.g. Tell all the nodes that the MIN is
    definitely below some bound, so nodes with
    readings at or above it need not participate.
  • Depends on monotonicity
  • How is hypothesis computed?
  • Blind guess
  • Statistically informed guess
  • Observation over first few levels of tree /
    rounds of aggregate
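A small sketch of hypothesis testing for a MIN query follows. It is illustrative only: the function names, the node readings, and the fallback (re-running with all nodes when the guess turns out too low) are assumptions, not behavior described in the talk.

```python
from typing import Dict, Tuple


def participants_for_min(values: Dict[int, float],
                         hypothesis: float) -> Dict[int, float]:
    """Nodes whose reading is below the bound announced by the root."""
    return {node: v for node, v in values.items() if v < hypothesis}


def min_with_hypothesis(values: Dict[int, float],
                        hypothesis: float) -> Tuple[float, int]:
    """Return (MIN, number of nodes that had to report)."""
    reporting = participants_for_min(values, hypothesis)
    if not reporting:
        # Assumed fallback: the guess was too low, so every node reports.
        return min(values.values()), len(values)
    return min(reporting.values()), len(reporting)


values = {1: 70, 2: 42, 3: 55, 4: 90}  # hypothetical node id -> reading
good_guess = min_with_hypothesis(values, 50)
bad_guess = min_with_hypothesis(values, 10)
print(good_guess, bad_guess)
```

A well-chosen bound lets a single node answer the query here, while a bad bound costs at least a full round, which is why the slide asks how the hypothesis should be computed.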

44
Optimization: Use Multiple Parents
  • For duplicate-insensitive aggregates
  • Or aggregates that can be expressed as a linear
    combination of parts
  • Send (part of) aggregate to all parents
  • Decreases variance
  • Dramatically, when there are lots of parents

45
TAG Summary
  • In Query Processing: A Win For Many Aggregate
    Functions
  • By exploiting general functional properties of
    operators, many optimizations are possible
  • Requires new aggregates to be tagged with their
    properties
  • Up next: non-aggregate query processing
    optimizations, a flavor of things to come!

46
Attribute Driven Topology Selection
  • Observation: internal queries often over local
    area
  • Or some other subset of the network
  • E.g. regions with light value in [10,20]
  • Idea: build topology for those queries based on
    values of range-selected attributes
  • Requires range attributes, connectivity to be
    relatively static

Heidemann et al., Building Efficient Wireless
Sensor Networks with Low-Level Naming. SOSP, 2001.
47
Attribute Driven Query Propagation
SELECT ... WHERE a > 5 AND a < ...
Precomputed intervals: Query Dissemination Index
(Figure: node 4 and nodes 1, 2, 3 with precomputed intervals [1,10], [20,40], [7,15])
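The idea behind a query dissemination index can be sketched as an interval-overlap test: a query is forwarded only toward nodes whose precomputed attribute interval could contain matching readings. The code below is a hypothetical illustration. The intervals are taken from the slide, but their assignment to nodes 1, 2, and 3, the query range (5, 12), and the use of closed intervals are assumptions.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # closed interval (lo, hi)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def nodes_to_forward(query_range: Interval,
                     intervals: Dict[int, Interval]) -> List[int]:
    """Nodes whose interval of attribute values overlaps the query range."""
    return [n for n, iv in intervals.items() if overlaps(query_range, iv)]


# Assumed mapping of node id -> range of attribute `a` seen below that node.
intervals = {1: (1, 10), 2: (20, 40), 3: (7, 15)}

targets = nodes_to_forward((5, 12), intervals)
print(targets)
```

Node 2 never receives the query in this example, so its whole subtree stays idle.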
48
Attribute Driven Parent Selection
Even without intervals, expect that sending to
parent with closest value will help
(Figure: nodes 1, 2, 3 with intervals [1,10], [20,40], [7,15]; node 4 with interval [3,6])
[3,6] ∩ [1,10] = [3,6]
[3,7] ∩ [7,15] = ø
[3,7] ∩ [20,40] = ø
49
Hot off the press
50
Operator Placement & Ordering
  • Observation: Nested queries, triggers, and joins
    can often be re-ordered
  • Ordering can dramatically affect the amount of
    work you do
  • Lots of standard database tricks here

51
Operator Ordering Example 1
  • SELECT light, mag
  • FROM sensors
  • WHERE pred1(mag)
  • AND pred2(light)
  • SAMPLE INTERVAL 1s
  • Cost (in J) of sampling mag >> cost of sampling
    light
  • Correct ordering (unless pred1 is very
    selective)
  • 1. Sample light
  • 2. Apply pred2
  • 3. Sample mag
  • 4. Apply pred1

52
Operator Ordering Example 2
Every time an event occurs that satisfies pred1,
sample lights once every 5 seconds for 30 seconds
and report the samples that satisfy pred2
  • ON EVENT bird-enter()
  • WHERE pred1(event)
  • SELECT light
  • WHERE pred2(light)
  • FROM sensors
  • SAMPLE INTERVAL 5s
  • FOR 30s

Note: makes all samples in phase in sample window
Sample light once every 5 seconds. For every
sample that satisfies pred2, check and see if any
events that satisfy pred1 have occurred in the
last 30 seconds.
SELECT s.light
FROM bird-enter-events30s AS e, sensors AS s
WHERE e.time pred1(e) AND pred2(s.light)
SAMPLE INTERVAL 5s
53
Adaptivity For Contention
  • Observation: Under high contention, radios
    deliver fewer total packets than under low
    contention.
  • Insight: Don't allow radios to be highly
    contested. Drop or aggregate instead.
  • Higher throughput
  • Choice over what gets lost
  • Based on semantics!

54
Adaptivity for Power Conservation
  • For many applications, exact sample rate doesn't
    matter
  • But network lifetime does!
  • Idea: adaptively adjust sample rate & extent of
    aggregation based on lifetime goal and observed
    power consumption

55
Efficiency Summary
  • Power is the important metric
  • TAG
  • In-network processing
  • Exploit semantics of network and operators
  • Channel sharing
  • Hypothesis testing
  • Using multiple parents
  • Indexing for dissemination & collection of data
  • Placement and Operator Ordering
  • Adaptive Sampling

56
Q3: Answer Quality
  • Lots of possibilities for improving quality
  • Multi-path routing
  • When applicable
  • Transactional delivery
  • a.k.a. custody transfer
  • Link-layer retransmission
  • Caching
  • Failure still possible in all modes
  • Open question: what's the right quality metric?

57
Diffusion as TinyDB Foundation?
  • Claim: diffusion is an infrastructure upon which
    TinyDB could be built
  • Via declarative language, TinyDB is able to
    provide semantic guarantees and transparent
    optimization
  • Operators can be reordered
  • Any tuple can be routed to any operator
  • No (important) duplicates will be produced
  • At what cost? Diffusion can
  • Adjust better to loss
  • Exploit well-connected networks
  • Provide n-m routing, instead of n-1 routing
  • Might allow global buffers, events, etc.

58
Summary
  • Declarative queries are the right interface for
    data collection in sensor nets!
  • In-network processing and optimization make
    approach viable
  • Big query language improvements coming soon
  • Event driven internal queries
  • Adaptive sampling & query indexes for
    performance!
  • TinyDB Available Today
  • http://telegraph.cs.berkeley.edu/tinydb

59
Questions?
60
Grouping
  • GROUP BY expr
  • expr is an expression over one or more attributes
  • Evaluation of expr yields a group number
  • Each reading is a member of exactly one group
  • Example: SELECT max(light) FROM sensors
  • GROUP BY TRUNC(temp/10)

Result
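The grouping example can be mimicked in plain Python: each reading is assigned to exactly one group by evaluating TRUNC(temp/10), and MAX(light) is kept per group. This is a sketch with hypothetical (temp, light) readings, not TinyDB output.

```python
import math
from typing import Dict, Iterable, Tuple


def group_max_light(readings: Iterable[Tuple[float, float]],
                    bucket: float = 10) -> Dict[int, float]:
    """Emulate SELECT max(light) ... GROUP BY TRUNC(temp/bucket)."""
    groups: Dict[int, float] = {}
    for temp, light in readings:
        group = math.trunc(temp / bucket)  # group number for this reading
        groups[group] = max(groups.get(group, light), light)
    return groups


# Hypothetical (temp, light) pairs.
readings = [(21.5, 300), (27.0, 450), (33.2, 120), (38.9, 90)]
result = group_max_light(readings)
print(result)
```

Temperatures in the 20s fall into group 2 and temperatures in the 30s into group 3, so the result holds one MAX(light) value per ten-degree band.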
61
Having
  • HAVING preds
  • preds filters out groups that do not satisfy
    predicate
  • versus WHERE, which filters out tuples that do
    not satisfy predicate
  • Example
  • SELECT max(temp) FROM sensors
  • GROUP BY light
  • HAVING max(temp) < 100
  • Yields all groups with temperature under 100

62
Group Eviction
  • Problem: Number of groups in any one iteration
    may exceed available storage on sensor
  • Solution: Evict!
  • Choose one or more groups to forward up tree
  • Rely on nodes further up tree, or root, to
    recombine groups properly
  • What policy to choose?
  • Intuitively: least popular group, since we don't
    want to evict a group that will receive more
    values this epoch.
  • Experiments suggest
  • Policy matters very little
  • Evicting as many groups as will fit into a single
    message is good

63
Simulation Environment
  • Java-based simulation & visualization for
    validating algorithms, collecting data.
  • Coarse grained event based simulation
  • Sensors arranged on a grid, radio connectivity by
    Euclidean distance
  • Communication model
  • Lossless: All neighbors hear all messages
  • Lossy: Messages lost with probability that
    increases with distance
  • Symmetric links
  • No collisions, hidden terminals, etc.

64
Simulation Screenshot
65
Experiment: Basic TAG
  • Dense Packing, Ideal Communication

66
Experiment: Hypothesis Testing
  • Uniform Value Distribution, Dense Packing, Ideal
    Communication

67
Experiment: Effects of Loss
68
Experiment: Benefit of Cache
69
Pipelined Aggregates
  • After query propagates, during each epoch
  • Each sensor samples local sensors once
  • Combines them with PSRs (partial state records)
    from children
  • Outputs PSR representing aggregate state in the
    previous epoch.
  • After (d-1) epochs, PSR for the whole tree output
    at root
  • d = Depth of the routing tree
  • If desired, partial state from top k levels could
    be output in kth epoch
  • To avoid combining PSRs from different epochs,
    sensors must cache values from children

Value from 2 produced at time t arrives at 1 at
time (t+1)
Value from 5 produced at time t arrives at 1 at
time (t+3)
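The pipelined scheme can be simulated for SELECT COUNT(*) with a few lines of Python. This is a simplified sketch: the five-node tree is an assumed topology (chosen so that node 2 is one hop and node 5 three hops from root node 1), there is no loss, and each node combines its own current sample with the PSRs its children sent in the previous epoch.

```python
from typing import Dict, List


def pipelined_count(children: Dict[int, List[int]], root: int,
                    epochs: int) -> List[int]:
    """Return the COUNT the root outputs in each epoch."""
    # PSR each node transmitted in the previous epoch (nothing sent yet).
    last = {node: 0 for node in children}
    outputs = []
    for _ in range(epochs):
        # Own sample counts as 1; children contribute their previous PSRs.
        current = {node: 1 + sum(last[c] for c in kids)
                   for node, kids in children.items()}
        outputs.append(current[root])
        last = current
    return outputs


# Assumed routing tree: 1 is the root; 5 -> 4 -> 2 -> 1 and 3 -> 1.
children = {1: [2, 3], 2: [4], 3: [], 4: [5], 5: []}

counts = pipelined_count(children, root=1, epochs=5)
print(counts)
```

The root's output grows epoch by epoch as values from deeper levels arrive, and it stabilizes at 5 once node 5's contribution has travelled its three hops.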
70
Pipelining Example
71
Pipelining Example
Epoch 0


72
Pipelining Example
Epoch 1




73
Pipelining Example

Epoch 2




74
Pipelining Example

Epoch 3




75
Pipelining Example
Epoch 4





76
Our Stream Semantics
  • One stream, sensors
  • We control data rates
  • Joins between that stream and buffers are allowed
  • Joins are always landmark, forward in time, one
    tuple at a time
  • Result of queries over sensors is either a single
    tuple (at time of query) or a stream
  • Easy to interface to more sophisticated systems
  • Temporal aggregates enable fancy window
    operations

77
Formal Spec.
  • ON EVENT ... WITHIN
    SELECT agg() temporalagg()
    FROM sensors events
    WHERE
    GROUP BY
    HAVING
    ACTION WHERE BUFFER SIGNAL () (SELECT ...) INTO BUFFER
    SAMPLE PERIOD
    FOR
    INTERPOLATE
    COMBINE temporal_agg()
    ONCE

78
Buffer Commands
  • AT
  • CREATE BUFFER ()
  • PARTITION BY
  • SIZE ,
  • AS SELECT ...
  • SAMPLE PERIOD
  • DROP BUFFER