1
Programming Sensor Networks
  • An amalgamation of slides from Indranil Gupta,
    Robbert van Renesse, Kenneth Birman, Harold
    Abelson, Don Allen, Daniel Coore, Chris Hanson,
    George Homsy, Thomas F. Knight, Jr., Radhika
    Nagpal, Erik Rauch, Gerald Jay Sussman, Ron
    Weiss, Samuel Madden, Robert Szewczyk, Michael J.
    Franklin, David Culler, Philippe Bonnet, Johannes
    Gehrke, Praveen Seshadri

2
Outline
  • High Level
  • What can we learn from database research?
  • How does it relate to sensor networks?
  • Sensor Database Overview
  • Distributed Computing Perspective
  • Data-Centric Storage Approach
  • Amorphous Computing
  • My Research

3
Why databases?
  • Sensor networks should be able to
  • Accept queries for data
  • Respond with results
  • Users will need
  • An abstraction that guarantees reliable results
  • Largely autonomous, long lived network

4
Why databases?
  • Sensor networks are capable of producing massive
    amounts of data
  • Efficient organization of nodes and data will
    extend network lifetime
  • Database techniques already exist for efficient
    data storage and access

5
Differences between databases and sensor networks
  • Database
  • Static data
  • Centralized
  • Failure is not an option
  • Plentiful resources
  • Administrated
  • Sensor Network
  • Streaming data
  • Large number of nodes
  • Multi-hop network
  • No global knowledge about the network
  • Frequent node failure
  • Energy is the scarce resource
  • Limited memory
  • Autonomous

6
Bridging the Gap
  • What is needed to be able to treat a sensor
    network like a database?
  • How should sensors be modeled?
  • How should queries be formulated?

7
Sensor Database Overview
8
Traditional Approach: Warehousing
  • Data is extracted from sensors and stored on a
    front-end server
  • Query processing takes place on the front-end.

(Diagram: sensor nodes feed data to a front-end server and warehouse)
9
What We'd Like to Do: Sensor Database System
  • Sensor Database System supports distributed query
    processing over a sensor network

(Diagram: a front-end connects to a network of sensor nodes, each running a SensorDB component)
10
Sensor Database System
  • Characteristics of a Sensor Network: streams of
    data, uncertain data, large number of nodes,
    multi-hop network, no global knowledge about the
    network, failure is the rule, energy is the
    scarce resource, limited memory, no
    administration
  • Can existing database techniques be reused in
    this new context? What are their limitations?
  • What are the new problems? What are the new
    solutions?

11
Issues
  • Representing sensor data
  • Representing sensor queries
  • Processing query fragments on sensor nodes
  • Distributing query fragments
  • Adapting to changing network conditions
  • Dealing with site and communication failures
  • Deploying and Managing a sensor database system

12
Performance Metrics
  • High accuracy
  • Distance between ideal answer and actual answer?
  • Ratio of sensors participating in answer?
  • Low latency
  • Time between when data is generated on the
    sensors and when the answer is returned
  • Limited resource usage
  • Energy consumption

13
Representing Sensor Data and Sensor Queries
  • Sensor Data
  • Output of signal processing functions
  • Time Stamped values produced over a given
    duration
  • Inherently distributed
  • Sensor Queries
  • Conditions on time and space
  • Location dependent queries
  • Constraints on time stamps or aggregates over
    time windows
  • Event notification

14
Early Work in Sensor Databases
  • Towards Sensor Database Systems
  • Querying the Physical World
  • Philippe Bonnet, Johannes Gehrke, Praveen Seshadri

15
Fjording the Stream: An Architecture for Queries
over Streaming Sensor Data
  • How can existing database querying methods be
    applied to streaming data?
  • How can we combine real-time sensor data with
    stored historical data?
  • What architecture is appropriate for supporting
    simultaneous queries?
  • How can we lower sensor power consumption, while
    still supporting a wide range of query types?

16
Traditional Database Operators
  • Are implemented using pull mechanisms.
  • Block on incoming data.
  • Most require all the data to be read first (e.g.,
    sort, average).
  • Optimized for classic IO.
  • Usually implemented as separate threads.

17
Hardware Architecture
  • Centralized data processing.
  • Sensor proxies read and configure sensors.
  • Query processor interacts with proxies to request
    and get sensor data.
  • Sensor proxies support multiple simultaneous
    queries, multiplexing the data.

18
Operators
  • Implemented as state machines.
  • Support transition(state) method, which causes
    the operator to optionally read from input
    queues, write to output queue, and change state.
  • Multiple operators per thread, called by a
    scheduler. (Round robin in the experiments)
  • Allows fine-grained tuning of processing time
    allocated to each operator.
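As a rough sketch of this state-machine style (not the Fjords implementation; the FilterOperator class, its fields, and the round_robin scheduler are illustrative inventions), a non-blocking operator driven by a scheduler might look like this:

```python
from collections import deque

class FilterOperator:
    """Hypothetical push-style operator: non-blocking, driven by a scheduler."""
    def __init__(self, predicate, in_queue, out_queue):
        self.predicate = predicate
        self.in_queue = in_queue      # queue filled by a sensor proxy
        self.out_queue = out_queue    # queue read by a parent operator
        self.state = "IDLE"

    def transition(self):
        """One scheduling quantum: optionally read input, write output, change state."""
        if not self.in_queue:
            self.state = "IDLE"       # nothing to do; yield to the scheduler
            return
        tuple_ = self.in_queue.popleft()
        if self.predicate(tuple_):
            self.out_queue.append(tuple_)
        self.state = "BUSY"

def round_robin(operators, steps):
    """Minimal scheduler: give each operator one transition per round."""
    for _ in range(steps):
        for op in operators:
            op.transition()

# Usage sketch: filter out readings at or below a threshold.
source, sink = deque([3, 7, 1, 9]), deque()
round_robin([FilterOperator(lambda t: t > 2, source, sink)], steps=4)
print(list(sink))   # [3, 7, 9]
```

Because transition() never blocks, many such operators can share one thread, which is the fine-grained scheduling the slide refers to.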

19
Sensor Sensitive Operators
  • Certain operations are impossible to perform on
    continuous data streams. (sum, average, sort)
  • Can be performed on partial data windows.
  • Joins can be implemented by hashing tuples.
  • Can provide aggregation based on current data,
    with continuous updates to parent operators.

20
Sensor Proxy
  • Responsible for configuring the sensors that
    belong to it: setting sensing frequency,
    aggregation policies, etc.
  • To save power, each sensor only listens for
    commands from proxy during short intervals.
  • Handles incoming data from sensors, and pushes it
    into appropriate queues.
  • Stores a copy to disk for historical queries.
  • Performs data filtering, which it can sometimes
    offload to the sensors.

21
Building a Fjord
  • For all sensor data sources, locate the proxy for
    the sensor, and install a query on it to deliver
    tuples at a certain rate to a push queue.
  • For non-sensor data sources, set up a pull queue
    to scan for data.
  • Pipe the data through the operators specified by
    the query.

22
Query
  • Find average car speeds during time window (w),
    for all segments the user is interested in
    (knownSegments)
  • More complicated queries are possible, with joins
    of streaming sensor data and historical data
    stored in a normal database fashion.
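A rough Python rendering of the query's logic (the Fjords work expresses this declaratively over sensor streams; the tuple layout, segment names, and function name here are assumptions):

```python
from collections import defaultdict

def window_average_speeds(readings, known_segments, window_start, window_end):
    """Average speed per segment over a time window.
    `readings` is an iterable of (timestamp, segment_id, speed) tuples."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, segment, speed in readings:
        if segment in known_segments and window_start <= ts < window_end:
            sums[segment] += speed
            counts[segment] += 1
    return {seg: sums[seg] / counts[seg] for seg in counts}

# Usage sketch with fabricated readings.
readings = [(0, "I80-E-1", 55.0), (1, "I80-E-1", 65.0), (1, "I80-W-3", 40.0)]
print(window_average_speeds(readings, {"I80-E-1"}, 0, 10))  # {'I80-E-1': 60.0}
```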

23
Dataflow for the Query
  • Data is pushed from the sensors to the user,
    through the filter operator set up by the query.
  • Multiple similar queries can be added to an
    existing fjord, instead of creating one per query.

24
Experiment Sensors
  • 32 of CalTrans' inductive loop sensors equipped
    with wireless radio links.
  • The sensors form sixteen pairs (referred to as
    upstream and downstream), with one pair on
    either side of the freeway on eight distinct
    segments of I-80.
  • Collect data at 60Hz and relay it back to a
    server, where it is distributed to various
    database sources, such as the implemented Fjords.

25
A Traffic Application
  • Traffic engineers want to know the speed and
    length of cars on a freeway.
  • Two sensors are placed less than 1 car length
    apart
  • The pair of sensors will perform computation
    together

26
Contd.
  • Four time measurements are taken
  • The speed and length of the car are deduced by
    the two sensors
  • The results are relayed back to the proxy

27
Contd.
  • To measure a car's length to within 1 foot,
    assuming a maximum speed of 60 mph, sensors are
    sampled at 180 Hz (60 mph is 88 ft/s, so each
    1/180 s sample period covers roughly 0.5 ft)
  • Sensors collaborate locally to find car speed and
    length
  • Results are sent to the base station

28
Power Usage
29
Conclusion
  • Fjords allow sensors to be treated as database
    sources for querying, with little change in the
    overall architecture.
  • Proxies can optimize energy consumption of
    individual sensors based on user queries, and
    multiplex data from sensors to multiple queries.
  • Processing is centralized, but can sometimes be
    offloaded to the sensors, to lower energy
    consumed by radio transmissions.

30
Aggregation
31
A Look at Aggregation
  • Supporting Aggregate Queries Over Ad-Hoc
    Wireless Sensor Networks - Samuel Madden, Robert
    Szewczyk, Michael J. Franklin, David Culler
  • Explores aggregation techniques that are
    application independent
  • Count
  • Min
  • Max
  • Sum
  • Average

32
At A Glance
  • Trying to minimize the number of messages sent
  • All aggregation is done by building a tree
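A toy sketch of tree-based in-network aggregation (not the paper's TinyOS implementation; the tree layout and function are made up): each node merges its own reading with its children's partial records and forwards a single (sum, count) pair to its parent, which is how an average can be computed with one message per link.

```python
def aggregate_tree(node, children, reading):
    """Recursively combine a node's reading with its children's (sum, count) records."""
    total, count = float(reading[node]), 1
    for child in children.get(node, []):
        c_sum, c_count = aggregate_tree(child, children, reading)
        total += c_sum
        count += c_count
    return total, count          # one partial record sent to the parent

# Usage sketch: root 0 with children 1 and 2; node 2 has child 3.
children = {0: [1, 2], 2: [3]}
reading = {0: 10.0, 1: 20.0, 2: 30.0, 3: 40.0}
s, n = aggregate_tree(0, children, reading)
print(s / n)   # 25.0 -- the network-wide average
```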

33
Tricks of the Trade
  • How do you ensure an aggregate is correct?
  • Compute it multiple times.
  • How do you reduce the message overhead of
    redistributing queries?
  • Piggyback the query along with data messages.
  • Is there any way to further reduce the messaging
    overhead?
  • Child nodes only report their aggregates if
    they've changed.
  • Nodes can take advantage of multiple parents for
    redundancy reasons.

34
A Different Perspective on Aggregation
  • Scalable Fault-Tolerant Aggregation in Large
    Process Groups - Indranil Gupta, Robbert van
    Renesse, Ken Birman
  • Large process groups inherently need to
    communicate to accomplish a higher level task
  • Higher level tasks are usually driven by
    aggregation

Kenneth Birman, Cornell University
35
Goal
  • Develop a protocol that allows accurate
    estimation of global aggregate function
  • Each group member should be able to calculate the
    global aggregate

Kenneth Birman, Cornell University
36
Assumption
  • Asynchronous communication medium
  • Unreliable message delivery
  • Globally unique identifiers
  • A routing layer capable of point-to-point
    communication
  • The protocol is initiated at all members
    simultaneously
  • No energy constraints

Kenneth Birman, Cornell University
37
Metrics
  • Protocol message complexity
  • Protocol time complexity
  • Completeness of the final result

Kenneth Birman, Cornell University
38
A Note on Composable Functions
  • If f is a composable global function then
  • f(W1 ∪ W2) = g( f(W1), f(W2) )
  • where W1 and W2 are disjoint sets and g is a
  • known function
  • Example: Let f and g be Max
  • Max(W1 ∪ W2) = Max( Max(W1), Max(W2) )
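The same idea in a short illustrative Python sketch (function names are ours, not the paper's): Max composes with itself, and Average becomes composable once each partition reports a (sum, count) pair.

```python
def f_max(values):
    return max(values)

def g_max(a, b):            # g for Max is simply Max again
    return max(a, b)

def f_avg(values):          # report (sum, count) so partitions can be merged
    return sum(values), len(values)

def g_avg(a, b):            # merge two partial (sum, count) records
    return a[0] + b[0], a[1] + b[1]

W1, W2 = [3, 9, 4], [7, 2]
assert g_max(f_max(W1), f_max(W2)) == f_max(W1 + W2)     # 9 == 9
s, n = g_avg(f_avg(W1), f_avg(W2))
assert s / n == sum(W1 + W2) / len(W1 + W2)              # 5.0 == 5.0
```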

Kenneth Birman, Cornell University
39
Straw Man 1: Fully Distributed Solution
  • Each member sends their vote to each other member
  • O(N²) message complexity
  • O(N) time complexity
  • Completeness of final result will depend highly
    on the medium's loss rate.

Kenneth Birman, Cornell University
40
Straw Man 2: Centralized Solution
  • Each member sends its vote to a single leader
    which calculates the aggregate and disseminates
    the result
  • O(N) message complexity
  • O(N) time complexity
  • Additional overhead for election of leaders and
    coordination between them

Kenneth Birman, Cornell University
41
Straw Man 3: Hierarchical Solution
  • Grid Box Hierarchy
  • Divide members into N/K grid boxes
  • Assign each grid box a unique base-K identifier
  • Those grid boxes with identifiers matching the
    first i base-K digits form a subtree of height i

Kenneth Birman, Cornell University
42
Straw Man 3: Hierarchical Solution Continued
  • Global Aggregate Computation
  • Performed bottom up
  • Requires log_K(N) phases
  • Possible due to the composable nature of the
    global aggregate function

Kenneth Birman, Cornell University
43
How is the Grid Box Hierarchy Built?
  • Using a hash function
  • Member IDs are mapped into [0, 1)
  • A member M would belong to grid box
    floor( H(M) × (N/K) ), written in base-K
  • Any member can calculate the grid box of another
    member
  • Hash function can mirror the geographical/network
    topology
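A small illustrative sketch of that mapping (the choice of SHA-1 and the fixed-width base-K formatting are assumptions, not the paper's implementation):

```python
import hashlib

def to_base_k(x, k, width):
    """Write a non-negative integer as a base-K digit string of fixed width."""
    digits = []
    for _ in range(width):
        digits.append(str(x % k))
        x //= k
    return "".join(reversed(digits))

def grid_box(member_id, n_members, k):
    """Map a member ID to one of N/K grid boxes via a hash into [0, 1)."""
    digest = hashlib.sha1(member_id.encode()).digest()
    h = int.from_bytes(digest[:8], "big") / 2**64   # H(M) in [0, 1)
    boxes = n_members // k                          # N/K grid boxes
    box = int(h * boxes)                            # floor(H(M) * N/K)
    width = 1
    while k ** width < boxes:                       # digits needed in base K
        width += 1
    return to_base_k(box, k, width)

# Usage: with N = 64 members and K = 4 there are 16 grid boxes (2 base-4 digits each).
print(grid_box("member-42", 64, 4))
```

Because the hash is deterministic, any member can compute any other member's grid box locally, as the slide notes.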

Kenneth Birman, Cornell University
44
Hierarchy Approach with Leader Election
  • Leader election occurs at all the internal nodes
    of the tree
  • Leaders calculate the global aggregate for their
    sub-tree (recursive)
  • The root then disseminates the result to all nodes

Kenneth Birman, Cornell University
45
Hierarchy Approach with Leader Election Continued
  • Message complexity: O(N)
  • Time complexity: O(log N)
  • Completeness: this method is not fault tolerant

Kenneth Birman, Cornell University
46
The Gossiping Approach
  • Adds fault tolerance to Hierarchical Approach
  • Gossiping is used to aggregate data instead of
    leader election
  • Algorithm is started simultaneously at all
    members
  • Algorithm requires log_K(N) phases

Kenneth Birman, Cornell University
47
The Gossiping Approach Continued
  • Phase 1
  • Every member M randomly selects members in its
    own grid box once per gossip round
  • M then sends each selected member one randomly
    selected vote
  • After K · log(N) gossip rounds, M applies the
    aggregate function and moves to Phase 2
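A very rough simulation sketch of Phase 1 only (assumptions: a single grid box, lossless delivery, push-only gossip, and Max as the aggregate; this is not the paper's protocol code):

```python
import math
import random

def gossip_phase1(votes, k, rounds=None, targets=2, aggregate=max):
    """Simulate push gossip inside one grid box: each round, every member sends
    a randomly chosen vote it knows to a few random peers."""
    n = len(votes)
    known = [{i: votes[i]} for i in range(n)]       # each member starts with its own vote
    rounds = rounds or k * int(math.log(n) + 1)     # roughly K * log(N) rounds
    for _ in range(rounds):
        for m in range(n):
            for peer in random.sample(range(n), targets):
                src, val = random.choice(list(known[m].items()))
                known[peer][src] = val              # a lossy medium could drop this
    return [aggregate(kn.values()) for kn in known]  # each member's local estimate

# Usage: 20 members in one grid box, K = 4; most members should estimate the true max.
random.seed(0)
estimates = gossip_phase1([random.randint(0, 100) for _ in range(20)], k=4)
print(estimates)
```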

Kenneth Birman, Cornell University
48
The Gossiping Approach Continued
  • Phase 2
  • For i from 2 to log_K(N) − 1
  • Each member M randomly selects some members
    belonging to the same subtree of height i
  • M then sends these selected members a randomly
    selected aggregate from an (i-1) subtree
  • After collecting enough of the (i-1) subtree
    aggregates (or after a timeout), the loop moves
    on to the next value of i

Kenneth Birman, Cornell University
49
The Gossiping Approach Continued
  • Phase 3
  • Each member M should now have an estimate of the
    global aggregate function
  • Time complexity: O(log²N)
  • Message complexity: O(N log²N)
  • Completeness: the probability that a random
    member's vote is included in the final
    aggregation is lower bounded by (1 - 1/N)

Kenneth Birman, Cornell University
50
Simulation Results
  • Scalability and fault-tolerance of protocol
  • Default Parameters
  • N = 200 members, K = 4
  • 2 gossip targets per gossip round
  • floor(log2(N)) gossip rounds per phase
  • 25% message loss rate
  • .1 member failure rate per gossip round
  • Metric
  • Incompleteness = 1 − Completeness
  • The risk of excluding a member's vote from the
    final aggregate estimate

Kenneth Birman, Cornell University
51-53
(Simulation result graphs)
Kenneth Birman, Cornell University
54
Conclusion
  • Aggregation of global properties in large process
    groups
  • Time and message complexity, Completeness
  • Traditional solutions don't scale
  • Hierarchical gossiping approach
  • Scalability
  • Good fault-tolerance

Kenneth Birman, Cornell University
55
Data-Centric Storage in Sensornets - S.
Ratnasamy, D. Estrin, R. Govindan, B. Karp, S.
Shenker
  • Motivation for Data-Centric Storage
  • In data-rich networks data-centric algorithms
    seem to be energy efficient
  • Data-Centric routing has been shown to be energy
    efficient
  • Data-Centric storage could act as a companion to
    data-centric routing to save even more energy

56
Data-Centric Storage Applicability
  • Assumptions
  • Ad-hoc deployment over a known area
  • Nodes can communicate with several neighbors via
    short range radio
  • Nodes know their own location
  • Energy is scarce (Gasp!)
  • Data enters/leaves the sensornet via access
    point(s)
  • Network and communication topology is largely
    static

57
Data-Centric Storage Applicability
  • Definitions
  • Observations: low-level readings of basic sensors
  • Ex: temperature, light, humidity, et al.
  • Event: an interesting collection of low-level
    observations
  • May combine several modalities
  • Event notifications contain the location of the
    event, making observations available

58
Data-Centric Storage Applicability
  • More Definitions
  • Task: what a user specifies the sensornet to do
  • Action: what a node should do upon observing an
    event
  • Query: how a user specifies data of interest

59
Data-Centric Storage Applicability
  • Three types of actions
  • External Store
  • Data is sent out of the network for processing
  • Message cost: O( sqrt(n) )
  • Local Store
  • Data is stored at the event source
  • Query cost: O( n )
  • Response cost: O( sqrt(n) )
  • Data-Centric Store
  • Data is sent to a specific node
  • Storage cost: O( sqrt(n) )
  • Query/Response cost: O( sqrt(n) )

60
Data-Centric Storage Applicability
  • The Scenario
  • Event locations are not known in advance
  • Event locations are random
  • Tasks are long-lived
  • Only one access point
  • Detecting events requires much more energy than
    ongoing monitoring of data
  • Users may only be interested in event summaries

61
Data-Centric Storage Applicability
  • Scenario Parameters
  • n: the number of nodes in the network
  • T: the number of unique event types
  • Dtotal: the total number of events detected
  • Q: the number of event types (out of T) for which
    queries are issued
  • Dq: the number of events detected for the queried
    event types

62
Data-Centric Storage Applicability
  • Costs
  • External Storage
  • Total: Dtotal × sqrt(n)
  • Hotspot: Dtotal
  • Local Storage
  • Total: Q × n + Dq × sqrt(n)
  • Hotspot: Q + Dq
  • Data-Centric Storage
  • Total (list): Q × sqrt(n) + Dtotal × sqrt(n) +
    Dq × sqrt(n)
  • Total (summary): Q × sqrt(n) + Dtotal × sqrt(n) +
    Q × sqrt(n)
  • Hotspot (list): Q + Dq
  • Hotspot (summary): 2 × Q
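Plugging the total-cost formulas above into a short script gives a feel for how the approaches compare (the parameter values are arbitrary; this is only an illustration of the formulas):

```python
import math

def storage_costs(n, D_total, Q, D_q):
    """Total message counts for each approach, using the formulas above."""
    root_n = math.sqrt(n)
    return {
        "external": D_total * root_n,
        "local": Q * n + D_q * root_n,
        "dcs_list": Q * root_n + D_total * root_n + D_q * root_n,
        "dcs_summary": Q * root_n + D_total * root_n + Q * root_n,
    }

# Usage: a 10,000-node network, 100 detected events, queries for a few event types.
print(storage_costs(n=10_000, D_total=100, Q=5, D_q=25))
# Local storage's Q*n term (50,000) dominates as n grows, matching the next slide.
```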

63
Data-Centric Storage Applicability
  • Observations
  • As n gets large local storage costs the most
  • External storage always incurs a lower total
    message count
  • With summarized events data-centric storage has
    the smallest load
  • With listed events local and data-centric storage
    have significantly lower access loads compared to
    external storage

64
Data-Centric Storage Mechanisms
  • Distributed hash-table
  • Put( key, value )
  • Get( key )
  • Implementation details are left to one of several
    P2P computing schemes

65
Data-Centric Storage Mechanisms
  • Greedy Perimeter Stateless Routing (almost)
  • In GPSR a message is dropped if no node exists at
    the specified location
  • Data-Centric Storage routes a message to the node
    closest to the specified location
  • To find an event, the tuples that describe it are
    used as inputs to the hash function.
  • The query is then routed to the node
    corresponding to the hash function's output
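A toy sketch of the mechanism (not GPSR or the GHT code: routing is replaced by a direct "closest node to the hashed point" lookup, and the node names, field size, and event key are made up):

```python
import hashlib
import math

def hash_to_location(event_name, width, height):
    """Hash an event name to an (x, y) coordinate inside the deployment area."""
    digest = hashlib.sha1(event_name.encode()).digest()
    x = int.from_bytes(digest[:4], "big") / 2**32 * width
    y = int.from_bytes(digest[4:8], "big") / 2**32 * height
    return x, y

def closest_node(nodes, point):
    """Stand-in for routing: the node geographically closest to the hashed
    point acts as the home node for that key."""
    return min(nodes, key=lambda name: math.dist(nodes[name], point))

# Usage sketch: nodes with known positions in a 100 x 100 field.
nodes = {"a": (10, 10), "b": (80, 20), "c": (50, 90)}
store = {}

def put(key, value):                      # store event data at the home node
    home = closest_node(nodes, hash_to_location(key, 100, 100))
    store.setdefault(home, {})[key] = value

def get(key):                             # a query hashes the same key and
    home = closest_node(nodes, hash_to_location(key, 100, 100))  # reaches the same node
    return store.get(home, {}).get(key)

put("elephant-sighting", {"location": (42, 17), "time": 1234})
print(get("elephant-sighting"))
```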

66
Data-Centric Storage Mechanisms
  • Robustness
  • Refresh: periodically the data-cache sends a
    refresh to the event source
  • If a node closer to the key receives a refresh
    then it becomes the new data-cache
  • Local Replication
  • Any node hearing a refresh caches the associated
    data

67
Data-Centric Storage Mechanisms
  • Scalability
  • Structured Replication
  • Events are stored at the closest mirror
  • Reduces storage cost by a factor of 2^d
  • d is dependent upon the number of mirrors
  • Queries must be routed to all mirrors

68
The Future of Sensor Networks?
  • Amorphous Computing
  • Draws heavily from biological and physical
    metaphors
  • The Setup
  • Vast number of unreliable components
  • Asynchronous
  • Irregularly placed, but very dense
  • Interconnects are unknown and/or unreliable
  • The Goal
  • How can we engineer coherent behavior?

69
Amorphous Computing
  • Programming Paradigms
  • Nodes are all identical
  • Same program
  • Can store local state
  • Can generate random numbers
  • No knowledge of position or orientation
  • Can communicate with all nodes within a radius R

70
Amorphous Computing
  • Wave Propagation
  • Simulates chemical diffusion amongst cells
  • Chemicals alter the state of nodes
  • Growing-Point Language

71
Amorphous Computing Example
  • A growing point diffuses pheromone
  • Pheromone is specified to diffuse for H hops

72
Amorphous Computing Example
  • A growing point diffuses pheromone
  • Pheromone is specified to diffuse for H hops
  • Because of dense deployment, a circle of radius
    R × H is created
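A minimal sketch of hop-limited diffusion over a communication graph (an illustration of the idea, not the Growing-Point Language): a breadth-first flood that stops after H hops, so with dense, uniform deployment the marked region approximates a disc of radius R × H around the growing point.

```python
from collections import deque

def diffuse_pheromone(neighbors, source, max_hops):
    """Flood a pheromone from `source` for at most `max_hops` hops.
    `neighbors` maps each node to the nodes within radio radius R.
    Returns each reached node's hop count (a crude pheromone level)."""
    level = {source: 0}
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        if level[node] == max_hops:
            continue                      # pheromone has run out
        for nbr in neighbors[node]:
            if nbr not in level:          # first arrival defines the hop count
                level[nbr] = level[node] + 1
                frontier.append(nbr)
    return level

# Usage: a small line graph 0-1-2-3-4; diffusing 2 hops from node 0 reaches nodes 0..2.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(diffuse_pheromone(neighbors, source=0, max_hops=2))  # {0: 0, 1: 1, 2: 2}
```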

73
Wave Propagation Example
  • Growing a line
  • One GP diffuses a pheromone (blue)

74
Wave Propagation Example
  • Growing a line
  • One GP diffuses a pheromone (blue)
  • Green GP diffuses a pheromone that is accepted
    only by nodes that have a higher red pheromone
    concentration than the previous hop.

75
Amorphous Computing
  • Proven to be able to produce any planar graph.
  • Global behavior emerges from local interaction
  • Proposes models for using biological components
    as computational elements

76
Amorphous Computing
  • Fault Tolerance
  • Redundancy?
  • Abstractly structuring systems to produce the
    right answer with high probability

77
A Little Bit About My Work
  • I was born in Santa Monica, CA
  • The ultimate goal is Complex Tasking of Sensor
    Networks
  • Currently
  • Efficient, in-network algorithms for identifying
    contours, gradients, and regions of interest

78
Contour/Gradient/Region Finding
  • A first stab at in-network processing
  • Useful to many applications
  • Topography
  • Marine biology
  • Geology
  • Chemical Concentrations
  • and much much more!

79
In the Future
  • Sensor Networks should be Autonomous
  • Questions
  • What sort of infrastructure makes pattern finding
    (and in-network processing) more efficient?
  • Goal
  • To Program or Task the system efficiently

80
My Class Project
  • Using Mica Testbench to collect real sensornet
    data
  • With this data I plan to perform simulations with
    the goal of algorithmic development

81
Acknowledgements
  • DARPA SensIT Program
  • http://www.darpa.mil/ito/research/sensit/
  • Many thanks to Steve Beck, Richard Brooks, Jason
    Hill, Bill Kaiser, Donald Kossman, Sri Kumar,
    Tobias Mayr, Kris Pister, Joe Paradiso

82
Acknowledgements
  • Scalable Fault-Tolerant Aggregation in Large
    Process Groups - Indranil Gupta, Robbert van
    Renesse, Kenneth Birman
  • Fjording the Stream: An Architecture for Queries
    over Streaming Sensor Data - Samuel Madden,
    Michael J. Franklin
  • Supporting Aggregate Queries Over Ad-Hoc
    Wireless Sensor Networks - Samuel Madden, Robert
    Szewczyk, Michael J. Franklin, David Culler
  • Amorphous Computing - Harold Abelson, Don
    Allen, Daniel Coore, Chris Hanson, George Homsy,
    Thomas F. Knight, Jr., Radhika Nagpal, Erik
    Rauch, Gerald Jay Sussman, Ron Weiss