Title: Programming Sensor Networks
1. Programming Sensor Networks
- An amalgamation of slides from Indranil Gupta,
Robbert van Renesse, Kenneth Birman, Harold
Abelson, Don Allen, Daniel Coore, Chris Hanson,
George Homsy, Thomas F. Knight, Jr., Radhika
Nagpal, Erik Rauch, Gerald Jay Sussman, Ron
Weiss, Samuel Madden, Robert Szewczyk, Michael J.
Franklin, David Culler, Philippe Bonnet, Johannes
Gehrke, Praveen Seshadri
2. Outline
- High Level
- What can we learn from database research?
- How does it relate to sensor networks?
- Sensor Database Overview
- Distributed Computing Perspective
- Data-Centric Storage Approach
- Amorphous Computing
- My Research
3. Why databases?
- Sensor networks should be able to
- Accept queries for data
- Respond with results
- Users will need
- An abstraction that guarantees reliable results
- Largely autonomous, long lived network
4. Why databases?
- Sensor networks are capable of producing massive amounts of data
- Efficient organization of nodes and data will extend network lifetime
- Database techniques already exist for efficient data storage and access
5. Differences between databases and sensor networks
- Database
- Static data
- Centralized
- Failure is not an option
- Plentiful resources
- Administered
- Sensor Network
- Streaming data
- Large number of nodes
- Multi-hop network
- No global knowledge about the network
- Frequent node failure
- Energy is the scarce resource
- Limited memory
- Autonomous
6. Bridging the Gap
- What is needed to be able to treat a sensor network like a database?
- How should sensors be modeled?
- How should queries be formulated?
7. Sensor Database Overview
8. Traditional Approach: Warehousing
- Data is extracted from sensors and stored on a front-end server
- Query processing takes place on the front-end.
(Diagram: sensor nodes feed data to a front-end, which stores it in a warehouse.)
9. What We'd Like to Do: Sensor Database System
- Sensor Database System supports distributed query
processing over a sensor network
(Diagram: a SensorDB instance runs on each sensor node, alongside a front-end.)
10. Sensor Database System
- Characteristics of a sensor network: streams of data, uncertain data, large number of nodes, multi-hop network, no global knowledge about the network, failure is the rule, energy is the scarce resource, limited memory, no administration
- Can existing database techniques be reused in this new context? What are their limitations?
- What are the new problems? What are the new solutions?
11. Issues
- Representing sensor data
- Representing sensor queries
- Processing query fragments on sensor nodes
- Distributing query fragments
- Adapting to changing network conditions
- Dealing with site and communication failures
- Deploying and Managing a sensor database system
12. Performance Metrics
- High accuracy
- Distance between ideal answer and actual answer?
- Ratio of sensors participating in the answer?
- Low latency
- Time between when data is generated on the sensors and when the answer is returned
- Limited resource usage
- Energy consumption
13. Representing Sensor Data and Sensor Queries
- Sensor Data
- Output of signal processing functions
- Time-stamped values produced over a given duration
- Inherently distributed
- Sensor Queries
- Conditions on time and space
- Location-dependent queries
- Constraints on time stamps or aggregates over time windows
- Event notification
14. Early Work in Sensor Databases
- "Towards Sensor Database Systems"
- "Querying the Physical World"
- Philippe Bonnet, Johannes Gehrke, Praveen Seshadri
15. Fjording the Stream: An Architecture for Queries over Streaming Sensor Data
- How can existing database querying methods be applied to streaming data?
- How can we combine real-time sensor data with stored historical data?
- What architecture is appropriate for supporting simultaneous queries?
- How can we lower sensor power consumption, while still supporting a wide range of query types?
16. Traditional Database Operators
- Are implemented using pull mechanisms.
- Block on incoming data.
- Most require all of the data to be read first (e.g., sort, average).
- Optimized for classic I/O.
- Usually implemented as separate threads.
17. Hardware Architecture
- Centralized data processing.
- Sensor proxies read and configure sensors.
- Query processor interacts with proxies to request and get sensor data.
- Sensor proxies support multiple simultaneous queries, multiplexing the data.
18. Operators
- Implemented as state machines.
- Support a transition(state) method, which causes the operator to optionally read from its input queues, write to its output queue, and change state.
- Multiple operators run per thread, called by a scheduler (round-robin in the experiments).
- Allows fine-grained tuning of the processing time allocated to each operator (see the sketch below).
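- A minimal sketch of this operator model in Python (the class, method, and scheduler names here are illustrative assumptions, not the paper's actual API):

    from collections import deque

    class FilterOp:
        """State-machine operator: passes through tuples that satisfy a predicate."""
        def __init__(self, predicate, in_q, out_q):
            self.predicate, self.in_q, self.out_q = predicate, in_q, out_q
            self.state = "running"

        def transition(self):
            # Non-blocking: do a small unit of work, then return to the scheduler.
            if self.in_q:
                t = self.in_q.popleft()
                if self.predicate(t):
                    self.out_q.append(t)

    def run_round_robin(operators, rounds):
        # The scheduler controls how much processing time each operator receives.
        for _ in range(rounds):
            for op in operators:
                op.transition()

    # Sensors push readings into in_q; filtered tuples accumulate in out_q.
    in_q, out_q = deque([{"speed": 42}, {"speed": 71}]), deque()
    run_round_robin([FilterOp(lambda t: t["speed"] > 55, in_q, out_q)], rounds=2)
    print(list(out_q))  # [{'speed': 71}]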
19. Sensor-Sensitive Operators
- Certain operations (sum, average, sort) are impossible to perform on unbounded data streams.
- They can be performed on partial data windows.
- Joins can be implemented by hashing tuples (sketched below).
- Can provide aggregation based on current data, with continuous updates to parent operators.
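- One way to read the hashing idea is a symmetric hash join (a sketch under assumed names, not the Fjords implementation): each stream keeps a hash table on the join key, and every arriving tuple probes the other side's table, so results are emitted continuously.

    from collections import defaultdict

    class StreamHashJoin:
        """Symmetric hash join over two unbounded streams, joined on one key."""
        def __init__(self, key):
            self.key = key
            self.tables = (defaultdict(list), defaultdict(list))

        def insert(self, side, tup):
            # side is 0 or 1; returns the join results this tuple produces now.
            k = tup[self.key]
            self.tables[side][k].append(tup)
            return [(tup, other) if side == 0 else (other, tup)
                    for other in self.tables[1 - side][k]]

    j = StreamHashJoin(key="segment")
    j.insert(0, {"segment": 7, "speed": 63})
    print(j.insert(1, {"segment": 7, "limit": 65}))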
20. Sensor Proxy
- Responsible for configuring the sensors that belong to it: setting sensing frequency, aggregation policies, etc.
- To save power, each sensor only listens for commands from its proxy during short intervals.
- Handles incoming data from sensors and pushes it into the appropriate queues.
- Stores a copy to disk for historical queries.
- Performs data filtering, which it can sometimes offload to the sensors.
21. Building a Fjord
- For all sensor data sources, locate the proxy for the sensor and install a query on it to deliver tuples at a certain rate to a push queue.
- For non-sensor data sources, set up a pull queue to scan for data.
- Pipe the data through the operators specified by the query.
22. Query
- Find average car speeds during a time window (w), for all segments the user is interested in (knownSegments); see the sketch below.
- More complicated queries are possible, with joins of streaming sensor data and historical data stored in normal database fashion.
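- A rough Python rendering of that query (the field names segment, time, and speed are assumptions for illustration):

    from collections import defaultdict

    def avg_speed_per_segment(readings, known_segments, t_now, w):
        """Average speed per segment over the window [t_now - w, t_now]."""
        sums, counts = defaultdict(float), defaultdict(int)
        for r in readings:  # each r: {"segment": ..., "time": ..., "speed": ...}
            if r["segment"] in known_segments and t_now - w <= r["time"] <= t_now:
                sums[r["segment"]] += r["speed"]
                counts[r["segment"]] += 1
        return {s: sums[s] / counts[s] for s in counts}

    readings = [{"segment": 1, "time": 10, "speed": 60},
                {"segment": 1, "time": 12, "speed": 70},
                {"segment": 2, "time": 11, "speed": 30}]
    print(avg_speed_per_segment(readings, known_segments={1}, t_now=12, w=5))  # {1: 65.0}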
23. Dataflow for the Query
- Data is pushed from the sensors to the user, through the filter operator set up by the query.
- Multiple similar queries can be added to an existing fjord, instead of creating one fjord per query.
24. Experiment Sensors
- 32 CalTrans inductive loop sensors equipped with wireless radio links.
- The sensors consist of sixteen pairs (referred to as "upstream" and "downstream"), with one pair on either side of the freeway on eight distinct segments of I-80.
- Collect data at 60 Hz and relay it back to a server, where it is distributed to various database sources, such as the implemented Fjords.
25. A Traffic Application
- Traffic engineers want to know the speed and length of cars on a freeway.
- Two sensors are placed less than one car length apart.
- The pair of sensors will perform the computation together.
26. Cont'd
- Four time measurements are taken
- The speed and length of the car are deduced by the two sensors
- The results are relayed back to the proxy
27. Cont'd
- To measure a car's length to within 1 foot, assuming a maximum speed of 60 mph, sensors are sampled at 180 Hz
- Sensors collaborate locally to find the car's speed and length (see the sketch below)
- Results are sent to the base station
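- A sketch of the standard dual-loop arithmetic (the spacing, timestamps, and formulas here are illustrative assumptions, not taken from the paper):

    def speed_and_length(spacing_ft, up_on, up_off, down_on, down_off):
        """Four timestamps: when the car arrives at and leaves each loop."""
        speed = spacing_ft / (down_on - up_on)   # feet per second
        length = speed * (up_off - up_on)        # the car covers its own length
        return speed, length

    # Loops 20 ft apart; the car reaches the upstream loop at t = 0.00 s.
    speed, length = speed_and_length(20.0, 0.00, 0.17, 0.25, 0.42)
    print(round(speed, 1), "ft/s,", round(length, 1), "ft")  # 80.0 ft/s, 13.6 ft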
28. Power Usage
29. Conclusion
- Fjords allow sensors to be treated as database sources for querying, with little change in the overall architecture.
- Proxies can optimize the energy consumption of individual sensors based on user queries, and multiplex data from sensors to multiple queries.
- Processing is centralized, but can sometimes be offloaded to the sensors to lower the energy consumed by radio transmissions.
30. Aggregation
31. A Look at Aggregation
- "Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks" - Samuel Madden, Robert Szewczyk, Michael J. Franklin, David Culler
- Explores aggregation techniques that are application-independent:
- Count
- Min
- Max
- Sum
- Average
32. At a Glance
- Trying to minimize the number of messages sent
- All aggregation is done by building a tree
33. Tricks of the Trade
- How do you ensure an aggregate is correct?
- Compute it multiple times.
- How do you reduce the message overhead of redistributing queries?
- Piggyback the query along with data messages.
- Is there any way to further reduce the messaging overhead?
- Child nodes only report their aggregates if they've changed.
- Nodes can take advantage of multiple parents for redundancy reasons (see the sketch below).
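- A simplified illustration of tree-based in-network aggregation for Average, using (sum, count) partial states merged up a routing tree (a sketch of the general technique, not the authors' implementation):

    def aggregate(node, children, reading):
        """Return the (sum, count) partial state for the subtree rooted at node."""
        s, c = reading[node], 1
        for child in children.get(node, []):
            cs, cc = aggregate(child, children, reading)
            s, c = s + cs, c + cc
        return s, c  # one partial state record is forwarded to the parent

    children = {"root": ["a", "b"], "a": ["c"]}        # routing tree
    reading = {"root": 10, "a": 20, "b": 30, "c": 40}  # local sensor values
    s, c = aggregate("root", children, reading)
    print(s / c)  # 25.0 -- one message per node instead of flooding raw readings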
34. A Different Perspective on Aggregation
- "Scalable Fault-Tolerant Aggregation in Large Process Groups" - Indranil Gupta, Robbert van Renesse, Ken Birman
- Large process groups inherently need to communicate to accomplish a higher-level task
- Higher-level tasks are usually driven by aggregation
35. Goal
- Develop a protocol that allows accurate estimation of a global aggregate function
- Each group member should be able to calculate the global aggregate
36. Assumptions
- Asynchronous communication medium
- Unreliable message delivery
- Globally unique identifiers
- A routing layer capable of point-to-point communication
- The protocol is initiated at all members simultaneously
- No energy constraints
37. Metrics
- Protocol message complexity
- Protocol time complexity
- Completeness of the final result
38. A Note on Composable Functions
- If f is a composable global function, then
- f(W1 ∪ W2) = g( f(W1), f(W2) )
- where W1 and W2 are disjoint sets and g is a known function
- Example: let f and g be Max
- Max(W1 ∪ W2) = Max( Max(W1), Max(W2) )
39. Straw Man 1: Fully Distributed Solution
- Each member sends its vote to every other member
- O(N^2) message complexity
- O(N) time complexity
- Completeness of the final result will depend highly on the medium's loss rate.
40. Straw Man 2: Centralized Solution
- Each member sends its vote to a single leader, which calculates the aggregate and disseminates the result
- O(N) message complexity
- O(N) time complexity
- Additional overhead for election of leaders and coordination between them
41. Straw Man 3: Hierarchical Solution
- Grid Box Hierarchy
- Divide members into N/K grid boxes
- Assign each grid box a unique base-K identifier
- Grid boxes whose identifiers match in the first i base-K digits form a subtree of height i
42. Straw Man 3: Hierarchical Solution (continued)
- Global Aggregate Computation
- Performed bottom-up
- Requires log_K N phases
- Possible due to the composable nature of the global aggregate function
43. How is the Grid Box Hierarchy Built?
- Using a hash function
- Each member's ID is mapped into [0, 1)
- A member M belongs to grid box floor( H(M) * (N/K) ), written in base K
- Any member can calculate the grid box of another member (see the sketch below)
- The hash function can mirror the geographical/network topology
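- A sketch of that mapping (the specific hash function is an assumption; the scheme only requires one that every member can evaluate):

    import hashlib, math

    def grid_box(member_id, n, k):
        """Return the member's grid box identifier as a base-K digit string."""
        h = int(hashlib.sha1(member_id.encode()).hexdigest(), 16)
        frac = h / 16 ** 40            # SHA-1 has 40 hex digits, so frac is in [0, 1)
        boxes = n // k                 # roughly N/K grid boxes
        box = int(frac * boxes)
        digits = max(1, math.ceil(math.log(boxes, k)))
        out = []
        for _ in range(digits):        # write the box number in base K
            out.append(str(box % k))
            box //= k
        return "".join(reversed(out))

    # Any member can compute any other member's grid box from its ID alone;
    # members whose identifiers share the first i digits form a height-i subtree.
    print(grid_box("member-42", n=200, k=4))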
44. Hierarchy Approach with Leader Election
- Leader election occurs at all the internal nodes of the tree
- Leaders calculate the global aggregate for their subtree (recursively)
- The root then disseminates the result to all nodes
45. Hierarchy Approach with Leader Election (continued)
- Message complexity: O(N)
- Time complexity: O(log N)
- Completeness: this method is not fault-tolerant
46. The Gossiping Approach
- Adds fault tolerance to the Hierarchical Approach
- Gossiping is used to aggregate data instead of leader election
- The algorithm is started simultaneously at all members
- The algorithm requires log_K N phases
47. The Gossiping Approach (continued)
- Phase 1
- Every member M randomly selects members in its own grid box once per gossip round
- M then sends each selected member one randomly selected vote
- After K log N gossip rounds, M applies the aggregate function and moves to Phase 2
48. The Gossiping Approach (continued)
- Phase 2
- For i from 2 to (log_K N) + 1:
- Each member M randomly selects some members belonging to the same subtree of height i
- M then sends these selected members a randomly selected aggregate from an (i-1)-subtree
- After collecting enough of the (i-1)-subtree aggregates (or after a timeout), the loop continues
49. The Gossiping Approach (continued)
- Phase 3
- Each member M should now have an estimate of the global aggregate function (a single phase is sketched below)
- Time complexity: O(log^2 N)
- Message complexity: O(N log^2 N)
- Completeness: the probability that a random member's vote is included in the final aggregate is lower-bounded by (1 - 1/N)
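- A toy, single-grid-box illustration of one gossip phase with lossy delivery (the parameters and structure are simplified assumptions, not the full protocol):

    import random, math

    def gossip_phase(votes, rounds, fanout=2, loss_rate=0.25):
        """Members of one grid box push random known votes to random peers."""
        n = len(votes)
        known = [{i: votes[i]} for i in range(n)]   # each member knows its own vote
        for _ in range(rounds):
            for m in range(n):
                for target in random.sample(range(n), fanout):
                    if random.random() < loss_rate:
                        continue                    # unreliable message delivery
                    k = random.choice(list(known[m]))
                    known[target][k] = known[m][k]  # forward one random vote
        return [max(k.values()) for k in known]     # each member's local Max estimate

    votes = [random.randint(0, 100) for _ in range(50)]
    rounds = 2 * math.ceil(math.log2(len(votes)))
    print(max(votes), gossip_phase(votes, rounds)[:5])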
50. Simulation Results
- Scalability and fault-tolerance of the protocol
- Default parameters
- N = 200 members, K = 4
- 2 gossip targets per gossip round
- floor(log2 N) gossip rounds per phase
- 25% message loss rate
- 0.1% member failure rate per gossip round
- Metric
- Incompleteness = 1 - Completeness
- The risk of excluding a member's vote from the final aggregate estimate
(Slides 51-53: simulation result plots.)
54. Conclusion
- Aggregation of global properties in large process groups
- Time and message complexity, completeness
- Traditional solutions don't scale
- Hierarchical gossiping approach
- Scalability
- Good fault-tolerance
55. Data-Centric Storage in Sensornets (S. Ratnasamy, D. Estrin, R. Govindan, B. Karp, S. Shenker)
- Motivation for Data-Centric Storage
- In data-rich networks, data-centric algorithms seem to be energy efficient
- Data-centric routing has been shown to be energy efficient
- Data-centric storage could act as a companion to data-centric routing to save even more energy
56. Data-Centric Storage: Applicability
- Assumptions
- Ad-hoc deployment over a known area
- Nodes can communicate with several neighbors via
short range radio - Nodes know their own location
- Energy is scarce (Gasp!)
- Data enters/leaves the sensornet via access point(s)
- Network and communication topology is largely static
57. Data-Centric Storage: Applicability
- Definitions
- Observations: low-level readings from basic sensors
- e.g., temperature, light, humidity, et al.
- Event: an interesting collection of low-level observations
- May combine several modalities
- Event notifications contain the location of the event, making the observations available
58. Data-Centric Storage: Applicability
- More definitions
- Task: what a user specifies the sensornet to do
- Action: what a node should do upon observing an event
- Query: how a user specifies the data of interest
59. Data-Centric Storage: Applicability
- Three types of actions
- External Store
- Data is sent out of the network for processing
- Message Cost O( sqrt(n) )
- Local Store
- Data is stored at the event source
- Query Cost O( n )
- Response Cost O( sqrt(n) )
- Data-Centric Store
- Data is sent to a specific node
- Storage Cost O( sqrt(n) )
- Query / Response Cost O( sqrt(n) )
60. Data-Centric Storage: Applicability
- The Scenario
- Event locations are not known in advance
- Event locations are random
- Tasks are long-lived
- Only one access point
- Detecting events requires much more energy than ongoing monitoring of data
- Users may only be interested in event summaries
61. Data-Centric Storage: Applicability
- Scenario parameters
- n = number of nodes in the network
- T = number of unique event types
- Dtotal = total number of events detected
- Q = number of event types (out of T) for which queries are issued
- Dq = number of events detected for the queried event types
62. Data-Centric Storage: Applicability
- Costs (restated in code below)
- External Storage
- Total: Dtotal * sqrt(n)
- Hotspot: Dtotal
- Local Storage
- Total: Q * n + Dq * sqrt(n)
- Hotspot: Q + Dq
- Data-Centric Storage
- Total (list): Q * sqrt(n) + Dtotal * sqrt(n) + Dq * sqrt(n)
- Total (summary): Q * sqrt(n) + Dtotal * sqrt(n) + Q * sqrt(n)
- Hotspot (list): Q + Dq
- Hotspot (summary): 2 * Q
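- The cost expressions above, restated as code for quick comparison (asymptotic constants are ignored; the example numbers are arbitrary):

    from math import sqrt

    def total_messages(n, q, d_total, d_q):
        """Total message counts for the three storage schemes."""
        return {"external": d_total * sqrt(n),
                "local": q * n + d_q * sqrt(n),
                "dcs_list": (q + d_total + d_q) * sqrt(n),
                "dcs_summary": (q + d_total + q) * sqrt(n)}

    def hotspot_load(q, d_total, d_q):
        """Messages through the most loaded node (access point or home node)."""
        return {"external": d_total, "local": q + d_q,
                "dcs_list": q + d_q, "dcs_summary": 2 * q}

    # Example: many detected events, relatively few queries.
    print(total_messages(n=10_000, q=50, d_total=1_000, d_q=500))
    print(hotspot_load(q=50, d_total=1_000, d_q=500))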
63. Data-Centric Storage: Applicability
- Observations
- As n gets large, local storage costs the most
- External storage always incurs a lower total message count
- With summarized events, data-centric storage has the smallest load
- With listed events, local and data-centric storage have significantly lower access loads compared to external storage
64. Data-Centric Storage: Mechanisms
- Distributed hash-table
- Put( key, value )
- Get( key )
- Implementation details are left to one of several P2P computing schemes
65. Data-Centric Storage: Mechanisms
- Greedy Perimeter Stateless Routing (almost)
- In GPSR, a message is dropped if no node exists at the specified location
- Data-centric storage instead routes a message to the node closest to the specified location
- To find an event, the tuples that describe it are used as inputs to the hash function
- The query is then routed to the node corresponding to the hash function's output (see the sketch below)
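- A sketch of the put/get idea: hash a key to a location, then store or look up at the node closest to that location. The hash function and the brute-force closest-node search here are assumptions standing in for GPSR routing to the home node.

    import hashlib

    def hash_to_location(key, width, height):
        """Hash an event name to an (x, y) point in the sensor field."""
        h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
        return (h % 1000) / 1000 * width, (h // 1000 % 1000) / 1000 * height

    def closest_node(nodes, point):
        return min(nodes, key=lambda p: (p[0] - point[0]) ** 2 + (p[1] - point[1]) ** 2)

    storage = {}  # per-node key/value store, keyed by the node's coordinates

    def put(nodes, key, value):
        home = closest_node(nodes, hash_to_location(key, 100, 100))
        storage.setdefault(home, {})[key] = value

    def get(nodes, key):
        home = closest_node(nodes, hash_to_location(key, 100, 100))
        return storage.get(home, {}).get(key)

    nodes = [(10, 10), (90, 20), (50, 80), (30, 60)]
    put(nodes, "elephant-sighting", {"loc": (42, 17)})
    print(get(nodes, "elephant-sighting"))  # the query hashes to the same home node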
66. Data-Centric Storage: Mechanisms
- Robustness
- Refresh: periodically, the data cache sends a refresh to the event source
- If a node closer to the key receives a refresh, it becomes the new data cache
- Local replication
- Any node hearing a refresh caches the associated data
67. Data-Centric Storage: Mechanisms
- Scalability
- Structured Replication
- Events are stored at the closest mirror
- Reduces storage cost by a factor of 2^d
- d is dependent upon the number of mirrors
- Queries must be routed to all mirrors
68. The Future of Sensor Networks?
- Amorphous Computing
- Draws heavily from biological and physical metaphors
- The Setup
- Vast number of unreliable components
- Asynchronous
- Irregularly placed, but very dense
- Interconnects are unknown and/or unreliable
- The Goal
- How can we engineer coherent behavior?
69. Amorphous Computing
- Programming Paradigms
- Nodes are all identical
- Same program
- Can store local state
- Can generate random numbers
- No knowledge of position or orientation
- Can communicate with all nodes within a radius R
70. Amorphous Computing
- Wave Propagation
- Simulates chemical diffusion amongst cells
- Chemicals alter the state of nodes
- Growing-Point Language
71. Amorphous Computing Example
- A growing point diffuses pheromone
- Pheromone is specified to diffuse for H hops
72. Amorphous Computing Example
- A growing point diffuses pheromone
- Pheromone is specified to diffuse for H hops
- Because of dense deployment, a circle of radius R * H is created (simulated in the sketch below)
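- A toy simulation of that claim: flood a hop-count "pheromone" from a central growing point over densely, randomly placed nodes and check how far it reaches (all parameters here are assumptions):

    import random, math
    from collections import deque

    random.seed(1)
    R, H = 0.1, 5                                  # radio radius and hop budget
    nodes = [(random.random(), random.random()) for _ in range(3000)]
    source = min(range(len(nodes)),                # growing point near the center
                 key=lambda i: (nodes[i][0] - 0.5) ** 2 + (nodes[i][1] - 0.5) ** 2)

    hops = {source: 0}
    frontier = deque([source])
    while frontier:                                # breadth-first flood, hop by hop
        i = frontier.popleft()
        if hops[i] == H:
            continue
        for j in range(len(nodes)):
            if j not in hops and math.dist(nodes[i], nodes[j]) <= R:
                hops[j] = hops[i] + 1
                frontier.append(j)

    reached = [math.dist(nodes[i], nodes[source]) for i in hops]
    print(len(hops), "nodes reached; max distance ~", round(max(reached), 2), "vs R*H =", R * H)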
73. Wave Propagation Example
- Growing a line
- One GP diffuses a pheromone (blue)
74. Wave Propagation Example
- Growing a line
- One GP diffuses a pheromone (blue)
- The green GP diffuses a pheromone that is accepted only by nodes that have a higher red pheromone concentration than the previous hop.
75. Amorphous Computing
- Proven to be able to produce any planar graph.
- Global behavior emerges from local interaction
- Proposes models for using biological components
as computational elements
76. Amorphous Computing
- Fault Tolerance
- Redundancy?
- Abstractly structuring systems to produce the
right answer with high probability
77. A Little Bit About My Work
- I was born in Santa Monica, CA
- The ultimate goal is Complex Tasking of Sensor Networks
- Currently
- Efficient, in-network algorithms for identifying
contours, gradients, and regions of interest
78. Contour/Gradient/Region Finding
- A first stab at in-network processing
- Useful to many applications
- Topology
- Marine biology
- Geology
- Chemical Concentrations
- and much much more!
79. In the Future
- Sensor networks should be autonomous
- Questions
- What sort of infrastructure makes pattern finding (and in-network processing) more efficient?
- Goal
- To program or task the system efficiently
80. My Class Project
- Using the Mica testbench to collect real sensornet data
- With this data I plan to perform simulations with the goal of algorithmic development
81. Acknowledgements
- DARPA SensIT Program
- http://www.darpa.mil/ito/research/sensit/
- Many thanks to Steve Beck, Richard Brooks, Jason
Hill, Bill Kaiser, Donald Kossman, Sri Kumar,
Tobias Mayr, Kris Pister, Joe Paradiso
82. Acknowledgements
- "Scalable Fault-Tolerant Aggregation in Large Process Groups" - Indranil Gupta, Robbert van Renesse, Kenneth Birman
- "Fjording the Stream: An Architecture for Queries over Streaming Sensor Data" - Samuel Madden, Michael J. Franklin
- "Supporting Aggregate Queries Over Ad-Hoc Wireless Sensor Networks" - Samuel Madden, Robert Szewczyk, Michael J. Franklin, David Culler
- "Amorphous Computing" - Harold Abelson, Don Allen, Daniel Coore, Chris Hanson, George Homsy, Thomas F. Knight, Jr., Radhika Nagpal, Erik Rauch, Gerald Jay Sussman, Ron Weiss