Scalable data handling in sensor networks

Transcript and Presenter's Notes
1
Scalable data handling in sensor networks
  • Deepak Ganesan
  • Collaborators: Ben Greenstein, Denis
    Perelyubskiy, Deborah Estrin (UCLA), John
    Heidemann, Ramesh Govindan (USC/ISI)

2
Outline
  • Data challenges in high-bandwidth sensor networks
  • Instance: Wireless structural monitoring
  • Transitioning from data acquisition systems to
    distributed storage and search
  • Generation I: Economical wireless data
    acquisition systems using motes [under
    preparation]
  • Performance analysis over structural vibration
    data
  • Generation II: Long-lived, distributed storage
    and search systems [SenSys'03]
  • Performance analysis over geo-spatial data
  • Other research directions
  • Optimal node placement and transmission structure
    under distortion bounds [IPSN'04]

3
Scaling high-bandwidth wireless sensor network
deployments
  • We have made a good start at building scalable,
    long-term sensor network deployments that deal
    with low data rate applications.
  • Notable examples:
  • Micro-climate monitoring at James Reserve
    (CENS-UCLA), bird monitoring at Great Duck Island
    (Intel/U.C. Berkeley)
  • Characteristics:
  • Low data rate (a few samples/minute), medium-scale
    (100s of nodes) deployments.
  • Scaling techniques:
  • Duty cycling, low-power listen/transmit, simple
    aggregation schemes (TinyDiffusion/TinyDB).
  • We have very little understanding of how to scale
    high-bandwidth sensor network applications
    (involving vibration/acoustic/image sensors)
    where significant data rates can be expected.
  • How do we deal with applications that have
    predominantly relied on data collection?

4
Challenges in Wireless Structural Monitoring
  • High data rates:
  • 100Hz, 16-bit samples, 15-minute shaking events.
  • Resource-constrained motes:
  • 6MHz processor, 4KB RAM, 4MB flash memory (40
    minutes of vibration data)
  • Diverse user requirements:
  • Collection of interesting vibration event
    signatures.
  • Analysis of data over different time-scales
    (long-term and short-term patterns)
  • State of the art: expensive wireless data
    acquisition systems using 802.11

5
Transitioning from centralized to distributed
storage and search.
Method: Wireless/wired data acquisition systems
Advantage: Centralized, persistent storage and
unconstrained search.
Disadvantage: Expensive, cumbersome, highly
power-inefficient.
(Diagram: current data acquisition systems ->
multi-hop wireless data acquisition using motes ->
distributed in-network storage and search)
6
Transitioning from centralized to distributed
storage and search.
Method: Sensor node-based multi-hop data
acquisition systems
Advantage: Cheap, easy to use, centralized
storage, more scalable
Disadvantage: Power-inefficient.
7
Transitioning from centralized to distributed
storage and search.
Method: Distributed storage and search
Advantage: Power-efficient, flexible use
Disadvantage: Non-persistent, restricted storage
and search
8
Building Multi-hop Wireless Data Acquisition
Systems Using Motes
  • Goals:
  • Near real-time monitoring
  • Reliable, synchronized data transfer
  • Challenge:
  • Limited network bandwidth, hence high latency.
  • How can we build a low-latency data-acquisition
    system?

15 minutes of vibration event data (100KB each
after Huffman coding) from a 20 node multi-hop
wireless network takes 4-8 hours to collect
centrally!
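A rough back-of-the-envelope check of that figure (the effective
multi-hop goodput value below is an assumption for illustration,
not a number from the talk):

```latex
20 \text{ nodes} \times 100\,\mathrm{KB} \approx 2\,\mathrm{MB} \approx 16\,\mathrm{Mbit},
\qquad
\frac{16\,\mathrm{Mbit}}{0.5\text{--}1\,\mathrm{kbit/s}\ \text{effective goodput}}
\approx 4.4\text{--}8.9\ \mathrm{hours}.
```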
9
Progressive, On-demand Data Collection
  • Progressive data acquisition:
  • Each node stores its data in local storage
    and transmits low-resolution summaries to the
    base-station immediately after an event.
  • The user can analyze the low-resolution data to
    determine the nodes from which higher-resolution
    data is required.
  • Lossless data is collected from a subset of nodes
    on-demand within a window of time (before being
    phased out of the nodes' local storage).
  • What did we achieve?
  • Low-latency lossy data acquisition
  • Lossless data acquisition on-demand.

Low-resolution data for 15 minutes of vibration
event data can be collected within 15-30 minutes
of event occurrence
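As a minimal sketch of the node-side logic described above (class
and method names are illustrative, not the deployed mote code;
zlib and plain decimation stand in for the wavelet summarizer):

```python
import zlib  # stand-in for the on-mote coder; illustrative only

class EventStore:
    """Sketch of a node's progressive, on-demand data handling (names are hypothetical)."""

    def __init__(self, capacity_bytes=4_000_000):          # e.g. the 4MB on-board flash
        self.capacity = capacity_bytes
        self.events = {}                                    # event_id -> raw sample bytes

    def on_event(self, event_id, samples):
        """Store raw samples locally; return a low-resolution summary for the base-station."""
        raw = bytes(samples)
        while self.events and sum(map(len, self.events.values())) + len(raw) > self.capacity:
            self.events.pop(next(iter(self.events)))        # phase out the oldest event
        self.events[event_id] = raw
        low_res = raw[::16]                                 # crude decimation in place of the wavelet summary
        return zlib.compress(low_res)

    def on_request(self, event_id):
        """Serve lossless data on demand, if it has not yet been aged out of local storage."""
        return self.events.get(event_id)
```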
10
Performance Evaluation
  • Choice of compression scheme:
  • Appropriateness for structural vibration data
  • Performance metrics: compression ratio, error
    (RMS, PSNR)
  • Efficient implementation on resource-constrained
    devices (motes):
  • Power, memory and processing time
  • Study performed on structural vibration data from
    shaker-table tests
  • CUREE-Kajima Joint Research Program, UCLA -
    Thomas Kang, John Wallace

11
Why wavelets?
  • Most of the signal energy is concentrated in the
    lower-frequency subbands.
  • The subband decomposition suggests that the data
    is well suited to wavelet compression.

12
Mote implementation of Wavelet Codec
13
Compression Ratio and Error for Mote
Implementation
17-fold reduction in data size with an RMS error
of 3.1 (PSNR 30dB)
  • Good compression ratios can be achieved with low
    error
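The mote codec itself is not reproduced here; as a rough
illustration of the computation and of how compression ratio, RMS
error and PSNR can be measured, a sketch using PyWavelets (the
wavelet, level and keep-fraction are arbitrary choices, not the
mote implementation's parameters):

```python
import numpy as np
import pywt  # PyWavelets

def compress_and_score(signal, wavelet="db4", level=4, keep=0.06):
    """Keep only the largest `keep` fraction of wavelet coefficients, then measure the error."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    flat = np.concatenate(coeffs)
    thresh = np.quantile(np.abs(flat), 1.0 - keep)             # zero all but the largest coefficients
    kept = [np.where(np.abs(c) >= thresh, c, 0.0) for c in coeffs]
    recon = pywt.waverec(kept, wavelet)[: len(signal)]
    rms = np.sqrt(np.mean((signal - recon) ** 2))
    psnr = 20 * np.log10(np.ptp(signal) / rms)                 # PSNR in dB
    ratio = flat.size / max(1, np.count_nonzero(np.abs(flat) >= thresh))  # ignores entropy coding
    return ratio, rms, psnr

# e.g. a synthetic 100Hz, 15-minute vibration-like trace
t = np.arange(0, 15 * 60, 0.01)
sig = np.sin(2 * np.pi * 2.5 * t) * np.exp(-t / 300) + 0.05 * np.random.randn(t.size)
print(compress_and_score(sig))
```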

14
Transitioning towards long-term deployments
  • We achieved low-latency wireless
    data-acquisition, but our deployment lifetimes
    were still short.
  • Data acquisition systems with motes can last for
    a few weeks.
  • How do our system objectives change for a
    long-term deployment?
  • Need a smooth transition for researchers who have
    depended on data collection systems:
  • The system should retain the ability to collect
    new event signatures on demand.
  • Need to achieve very low energy usage for long
    lifetime:
  • The system focus has to shift from data collection
    to in-network data storage and search.
  • Goal: Build a networked storage and search system

15
Can existing storage and search systems satisfy
design goals?
16
Approach: Provide gracefully degrading storage
  • A distributed sensor network is a collection of
    nodes sensing spatio-temporally correlated data
    and, taken together, possessing a comparatively
    large distributed storage facility.
  • A gracefully degrading storage model provides two
    benefits:
  • Retains the ability to gather data on-demand.
  • Offers a tradeoff between resolution and query
    accuracy: lower-resolution data offers lower
    query quality but incurs less storage overhead,
    and vice-versa.
  • Questions:
  • How do we build a gracefully degrading networked
    store?
  • Can we efficiently query the distributed data
    store?

17
Related Work
  • Data storage in sensor networks
  • Event storage: DCS (Ratnasamy, HotNets 2002)
  • Indexing schemes: DIMS (Li, SenSys 2003), DIFS
    (Greenstein, SNPA 2003)
  • Multi-resolution computation
  • Beyond Average (Hellerstein, IPSN 2003)
  • Edge detection (Nowak, IPSN 2003)
  • Wavelet-based compression
  • Structural-health monitoring (Lynch, 2003)
  • Sensor network databases
  • Directed Diffusion (Heidemann, Estrin), TinyDB
    (Madden), Cougar (Bonnet)

18
Key Design Ideas
  • Construct a distributed, load-balanced quad-tree
    hierarchy of lossy wavelet-compressed summaries
    corresponding to different resolutions and
    spatio-temporal scales.
  • Queries drill down from the root of the hierarchy
    to focus the search on small portions of the
    network.
  • Progressively age summaries for long-term storage
    and graceful degradation of query quality over
    time.

(Figure: a three-level hierarchy - Level 0, Level 1, Level 2 -
that is progressively lossy toward the root and progressively
aged over time)
19
Constructing the hierarchy
Initially, nodes fill up their own storage with
raw sampled data.
20
Constructing the hierarchy
  • Tessellate the network space into grids, and hash
    within each grid to determine the location of its
    clusterhead (cf. DCS).
  • Send the wavelet-compressed local time-series to
    the clusterhead.
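A minimal sketch of the DCS-style geographic hashing step
(function names, the epoch argument and the rectangular region
are assumptions for illustration, not from the talk):

```python
import hashlib

def clusterhead_location(cell, level, epoch, region=(0.0, 0.0, 100.0, 100.0)):
    """Hash a (grid cell, level, epoch) tuple to a point inside the deployment region."""
    digest = hashlib.sha1(f"{cell}:{level}:{epoch}".encode()).digest()
    x0, y0, x1, y1 = region
    u = int.from_bytes(digest[:4], "big") / 2**32
    v = int.from_bytes(digest[4:8], "big") / 2**32
    return (x0 + u * (x1 - x0), y0 + v * (y1 - y0))

def pick_clusterhead(node_positions, cell, level, epoch):
    """The node closest to the hashed location acts as the cell's clusterhead."""
    tx, ty = clusterhead_location(cell, level, epoch)
    return min(node_positions, key=lambda p: (p[0] - tx) ** 2 + (p[1] - ty) ** 2)
```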

21
Processing at each level
  • Get compressed summaries from children.
  • Store the incoming summaries locally for future
    search.
  • Decode them, then re-encode at a lower resolution
    and forward to the parent.
(Figure: wavelet encoder/decoder operating on an
x-y-time data cube)
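A sketch of the per-clusterhead processing at one level (the
decode/re-encode stand-ins below simply pass arrays through and
average 2x2x2 blocks; the real system runs a 3D wavelet codec
over the x-y-time cube):

```python
import numpy as np

def decode(summary):
    """Stand-in for the wavelet decoder: here a summary is already an (x, y, time) array."""
    return summary

def reencode_coarser(cube):
    """Stand-in for re-encoding at lower resolution: average 2x2x2 blocks of the data cube."""
    x, y, t = (d - d % 2 for d in cube.shape)
    c = cube[:x, :y, :t]
    return c.reshape(x // 2, 2, y // 2, 2, t // 2, 2).mean(axis=(1, 3, 5))

def process_level(child_summaries, local_store, send_to_parent):
    """Store the children's summaries for future drill-downs, then forward a coarser merged summary."""
    local_store.extend(child_summaries)                   # kept locally for later query processing
    tiles = [decode(s) for s in child_summaries]
    merged = np.concatenate(tiles, axis=0)                # naive spatial merge of the child tiles
    send_to_parent(reencode_coarser(merged))
```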
22
Constructing the hierarchy
Recursively send data to higher levels of the
hierarchy.
23
Distributing storage load
Hash to different locations over time to
distribute load among nodes in the network.
24
Drill-down query processing
The user hashes to the location of the root. The
drill-down query is then routed down the hierarchy
until it reaches the base.
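A toy sketch of drill-down query processing over such a hierarchy
(the dict-based tree and the max-style query are illustrative;
the real system evaluates richer queries over wavelet summaries):

```python
def drilldown(query, node):
    """Answer `query` by descending from coarse summaries toward the finest relevant data.

    Each node is assumed to be a dict {'summary': array, 'children': [child dicts]};
    `query` is a function over an array, e.g. a max over some spatio-temporal box.
    """
    if not node['children']:                  # base of the hierarchy: finest stored data
        return query(node['summary'])
    # descend only into the child whose coarse summary looks most promising,
    # so the query touches a small portion of the network
    best_child = max(node['children'], key=lambda c: query(c['summary']))
    return drilldown(query, best_child)
```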
25
Designing an aging policy for summaries
  • Eventually, all available storage gets filled,
    and we have to decide when and how to drop
    summaries.

(Figure: the local storage capacity at a node is divided among
summaries at resolutions 1, 2 and 3)
How do we allocate storage at each node to
summaries at different resolutions to provide
gracefully degrading storage and search
capability?
26
Match system performance to user requirements
(Figure: query accuracy (roughly 50-95) versus time, from past
to present; the quality difference is the gap between the
user-desired curve and the step function the system provides)
  • Objective: Minimize the worst-case difference
    between the user-desired query quality (blue
    curve) and the query quality that the system can
    provide (red step function).

27
How do we determine the step function?
  • Height: What is the drop in query accuracy when
    resolution i becomes unavailable?
  • What types of queries are being posed (T)?
  • For each query q, what is the expected query
    error when drill-down queries terminate at level
    i+1, Error(q,i)?
  • Width: How long is resolution i stored within the
    network before being aged out?
  • Storage allocated to resolution i at each node
    (Si)
  • Total number of nodes in the network (N)
  • What rate is assigned to resolution i (Ri)?
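One plausible way to read the width in terms of the parameters
listed above (an inference from those parameters, not a formula
shown on the slide): with N nodes each devoting Si bytes to
resolution i, and resolution-i summaries generated network-wide
at rate Ri, resolution i survives for roughly

```latex
\mathrm{width}_i \;\approx\; \frac{N \, S_i}{R_i}
```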

28
Storage Allocation Constraint-Optimization
problem
  • Objective: Find the allocations si, i = 1..log4(N),
    that minimize the worst-case quality difference
    defined on the previous slide.
  • Given constraints:
  • Storage constraint: no node can store more than
    its storage limit.
  • Drill-down constraint: it is not useful to store
    finer-resolution data if coarser resolutions of
    the same data are not present.
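A brute-force sketch of this optimization (all numbers and the
user-quality curve are invented for illustration; the actual
system solves a more careful formulation):

```python
import numpy as np
from itertools import product

N = 100                                   # nodes in the network (assumed)
S_TOTAL = 4e6                             # per-node storage budget in bytes (assumed)
R = np.array([2e3, 1e4, 5e4])             # network-wide bytes/day per resolution, coarsest first (assumed)
Q = np.array([0.5, 0.8, 0.95])            # query accuracy when resolution i is the finest available (assumed)

def user_quality(age_days):
    """Assumed user-desired quality curve: high for recent data, degrading gracefully with age."""
    return np.clip(0.95 - 0.0015 * age_days, 0.5, 0.95)

def worst_case_gap(s):
    """Worst-case difference between the desired curve and the step function allocation s provides."""
    t_keep = N * s / R                                    # days each resolution survives before aging
    ages = np.linspace(0, 365, 2000)
    provided = np.zeros_like(ages)
    for q_i, t_i in zip(Q, t_keep):
        provided = np.where(ages <= t_i, np.maximum(provided, q_i), provided)
    return float(np.max(user_quality(ages) - provided))

best_gap, best_s = np.inf, None
fractions = np.linspace(0, 1, 21)
for f0, f1 in product(fractions, fractions):              # coarse grid over the storage split
    if f0 + f1 > 1:
        continue
    s = S_TOTAL * np.array([f0, f1, 1 - f0 - f1])         # storage constraint holds by construction
    t = N * s / R
    if not (t[0] >= t[1] >= t[2]):                        # drill-down constraint: coarser outlives finer
        continue
    gap = worst_case_gap(s)
    if gap < best_gap:
        best_gap, best_s = gap, s
print(best_s, best_gap)
```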

29
Determining Rate and Drilldown query error
How do we determine communication rates?
  • Assume rates are fixed a priori by
    communication/network-lifetime constraints.

How do we determine the drill-down query error
when prior information about deployment and data
is limited?
30
Prior information about sampled data
(Spectrum: from full a priori information to no a priori
information)
  • Omniscient strategy (infeasible): use all the data
    to decide the optimal allocation by solving the
    constraint optimization.
  • Training strategy: solve the constraint
    optimization over a small training dataset from an
    initial deployment.
  • Greedy strategy: when no data is available, use a
    simple weighted allocation (the slide suggests
    weights of 1, 2 and 4) across the coarse, finer
    and finest summaries, as in the sketch below.
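A sketch of the greedy strategy (which weight maps to which
resolution is an assumption here; the slide only shows the
numbers 1, 2 and 4):

```python
S_TOTAL = 4e6                                  # per-node storage budget in bytes (assumed)
weights = {"coarse": 1, "finer": 2, "finest": 4}
total = sum(weights.values())
allocation = {name: S_TOTAL * w / total for name, w in weights.items()}
print(allocation)                              # fixed weighted split, no training data needed
```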
31
Distributed trace-driven implementation
  • Linux implementation for iPAQ-class nodes
  • Uses EmStar, a Linux-based emulator/simulator for
    sensor networks.
  • 3D wavelet codec based on freeware by Geoff Davis,
    available at http://www.geoffdavis.net.
  • Query processing in Matlab.
  • Geo-spatial precipitation dataset:
  • 15x12 grid (50km edge) of precipitation data from
    1949-1994, from the Pacific Northwest. (Caveat:
    not real sensor data.)
  • System parameters:
  • Compression ratios: 6, 12, 24, 48.
  • Training set: 6% of the total dataset.

M. Widmann and C. Bretherton. 50 km resolution
daily precipitation for the Pacific Northwest,
1949-94.
32
Queries posed over precipitation data
  • Use queries at different spatio-temporal scales
    to evaluate the performance of the schemes.
  • Choosing a query set:
  • GlobalYearlyEdge: look for a spatio-temporal
    feature (an edge between high- and
    low-precipitation areas).
  • LocalYearlyMean: fine spatial and coarse temporal
    granularity
  • GlobalDailyMax: coarse spatial and fine temporal
    granularity
  • GlobalYearlyMax: coarse spatio-temporal
    granularity

33
How efficient is search?
Search is very efficient (less than 5% of the
network queried) and accurate for the different
queries studied.
34
Comparing Aging Schemes
Training performs within 1% of optimal. Results
with the greedy algorithm are sensitive to the
choice of weights.
35
Summary
  • Provide a smooth transition from current data
    acquisition systems to fully distributed storage
    and search systems.
  • Use progressive-transmission wireless
    data-acquisition systems as an intermediate step.
  • Support long-term storage and querying in
    resource-constrained sensor network deployments.
  • Summarization and in-network storage of data.
  • Training-based optimization to determine system
    parameters.

36
Power-Efficient Sensor Placement and Transmission
Structure for Data Gathering under Distortion
Constraints
  • Collaborators: Razvan Cristescu, Baltasar
    Beferull-Lozano (EPFL, Switzerland)
  • To appear at IPSN 2004

37
Problem Motivation and Description
  • Motivation:
  • The vision of thousands of $10 nodes is
    unrealistic in the near (10-year) term due to
    economies of scale and the cost of sensors.
  • Need to add the constraint of a limited number of
    nodes to the optimization.
  • A user has a bag of N nodes and needs to place
    them in a region A such that the sensed field can
    be reconstructed with:
  • maximum distortion at any point in A less than
    Dmax
  • average distortion over the entire region less
    than Davg
  • How does the user place the nodes and construct
    their communication structure for data gathering
    to a sink such that the total multi-hop
    communication power is minimized?

38
Complexity of the problem
  • Interplay of two difficult problems:
  • Find feasible placements that satisfy the
    distortion bounds.
  • Find the most energy-efficient transmission
    structure for each placement (NP-complete).
  • Simple example: given configurations I and II,
    which would you choose?
  • Node B is closer to the base-station, and hence
    transmits its data over a shorter distance.
  • Node B is close to A and therefore better
    correlated with it; A and B can jointly compress
    their data, which results in lower energy
    overhead.
  • The optimal solution involves finding the most
    power-efficient transmission structure among all
    feasible placements and possible transmission
    structures.

(Figure: candidate configurations I and II)
39
Model and Assumptions
  • Sensing model:
  • Jointly Gaussian model for spatial data with an
    exponentially decaying covariance function.
  • Data aggregation model:
  • Each node on the tree jointly compresses the data
    from its entire sub-tree (e.g., Huffman/arithmetic
    coding).
  • Sink data reconstruction model:
  • Nearest-neighbor reconstruction is used to
    reconstruct the field given the set of sampled
    points.
  • Communication model:
  • Power-per-bit varies super-linearly with the
    separation between transmitter and receiver.
40
Model and Assumptions
  • Data correlation model:
  • Jointly Gaussian model for the spatial data X
    measured at the nodes, i.e., an N-dimensional
    multivariate normal distribution N(mu, K), with a
    covariance matrix K that decays exponentially
    with inter-node distance.
  • Sink data reconstruction model:
  • Nearest-neighbor reconstruction is used to
    reconstruct the field given the set of sampled
    points.
  • Data aggregation model:
  • Each node on the tree jointly compresses the data
    from its entire sub-tree.
  • Communication model:
  • Path-loss model: power-per-bit grows as d^alpha
    with transmitter-receiver separation d, where
    alpha is about 2 in free space and typically
    between 2 and 4.
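A small sketch of this correlation and reconstruction model (node
positions, sigma^2 and the decay length theta are arbitrary
illustration values, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
positions = rng.uniform(0, 100, size=(20, 2))              # 20 nodes in a 100x100 region
d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)

sigma2, theta = 1.0, 25.0
K = sigma2 * np.exp(-d / theta)                             # exponentially decaying covariance matrix
field = rng.multivariate_normal(mean=np.zeros(len(positions)), cov=K)

# Nearest-neighbor reconstruction of the field at an arbitrary query point
query = np.array([40.0, 60.0])
nearest = np.argmin(np.linalg.norm(positions - query, axis=1))
estimate = field[nearest]
```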
44
Optimization Problem: 1-D Case
  • Minimize the total power
  • Subject to:
  • Maximum distortion constraint
  • Average distortion constraint
  • Total area coverage constraint
  • Solve using Lagrangian relaxation and a numerical
    constrained-optimization solver
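An illustrative, heavily simplified version of the 1-D problem
(not the paper's formulation): spacings are the decision
variables, a bounded gap stands in for the maximum-distortion
constraint, and a linear "conditional rate grows with spacing"
model stands in for the joint-compression term.

```python
import numpy as np
from scipy.optimize import minimize

N, L, alpha = 10, 100.0, 2.0     # nodes, segment length, path-loss exponent (assumed values)
d_max = 15.0                      # proxy for the max-distortion bound: no gap may exceed d_max
a, b = 1.0, 0.5                   # assumed rate model: conditional bits(d) = a + b * d

def total_power(d):
    """Node i relays the jointly compressed data of nodes i..N toward the sink at position 0."""
    rates = a + b * d                          # per-node conditional rate given its upstream neighbor
    bits_on_hop = np.cumsum(rates[::-1])[::-1] # hop i carries the rates of nodes i..N
    return np.sum(bits_on_hop * d ** alpha)    # power-per-bit scales as d^alpha on each hop

constraints = [
    {"type": "eq",  "fun": lambda d: d.sum() - L},   # coverage: the spacings span the whole segment
    {"type": "ineq", "fun": lambda d: d_max - d},    # max-distortion proxy: every gap is bounded
]
res = minimize(total_power, x0=np.full(N, L / N),
               bounds=[(1e-3, d_max)] * N, constraints=constraints, method="SLSQP")
print(np.round(res.x, 2))  # optimized spacings; hops near the sink tend to come out shorter
```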

45
Extend results to 2-D instance
  • Construct a wheel, with the nodes on each radial
    spoke placed optimally using our 1-D placement
    solution.
  • Additional constraints:
  • Given N nodes, how do we decide the number of
    nodes per spoke and the number of spokes?
  • How do we ensure that the Voronoi cells satisfy
    the average and maximum distortion bounds?

46
Performance gains over uniformly random placement
and Shortest-path trees
  • One-dimensional placement:
  • 1-3x reduction in power consumption for 10-20
    node linear placements
  • Two-dimensional placement of 100-200 nodes:
  • Typically an order of magnitude reduction in
    total power consumption.
  • Two orders of magnitude reduction in bottleneck
    energy consumption (i.e., for the node nearest
    the sink)!
  • Other interesting observations:
  • The network "implodes": with this placement, the
    nodes farthest from the base-station are the
    first to die and the nodes nearest the sink are
    the last to die.
  • This is the behavior we want, since the nodes
    near the sink are the most important for routing.

47
The End