Declarative Support for Sensor Data Cleaning

Learn more at: http://web.cs.wpi.edu
1
Declarative Support for Sensor Data Cleaning
  • Shawn Jeffery (UC Berkeley)
  • Gustavo Alonso (ETH Zurich)
  • Michael Franklin (UC Berkeley)
  • Wei Hong (Arch Rock Corporation)
  • Jennifer Widom (Stanford University)

  • (Intel Research
    Berkeley)

Presented by: Venkatesh (Venky) Raghavan and Abhishek Mukherji
Disclaimer: Slides adapted / taken from the talk
given by S. Jeffery at Pervasive '06
2
Current Approach
  • Each application implements its own data cleaning
  • Multiple accesses to a shared resource

[Diagram: two applications, each with its own Data Cleaning layer, read raw, dirty data directly from the sensor devices]
3
Data Cleaning - Infrastructure Approach
  • Data cleaning built, tested, and deployed once
  • One point of access to sensor devices

[Diagram: applications receive cleaned data from a shared Cleaning Infrastructure, which takes in the raw, dirty data]
The Cleaning Infrastructure translates raw sensor
data into cleaned data, so applications are
unaffected by the unreliable devices over which
they are deployed.
4
Challenges
  • How to build an infrastructure that supports:
      • Many types of sensors
      • Multiple applications
      • Different environments
  • Two facets to our solution:
      • Pipeline of sensor cleaning tasks
      • Declarative query processing

5
Temporal and Spatial Granules
  • ESP (Extensible Sensor stream Processing) uses
    high-level abstractions:
      • Temporal granules
      • Spatial granules
  • Granules define units of time and space inside
    which the data are expected to be homogeneous
  • This exploits the fact that many applications are
    interested not in individual readings or devices,
    but in higher-level data in time and space

6
Temporal Granules
  • Sensor devices produce data at a high rate
  • Applications are concerned with data from a
    larger time period
  • Example: an environment-monitoring application
    that models the micro-climate of a redwood tree
      • One reading is required every 5 minutes
  • Solution: windowed processing to group readings
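
The windowed grouping described above can be sketched in a few lines of Python. This is only an illustration, not ESP's implementation: the function name `temporal_granules`, the `(timestamp_seconds, value)` tuple layout, and the choice of averaging within each window are all assumptions made for the example. The 300-second default mirrors the 5-minute requirement on the slide.

```python
from collections import defaultdict
from statistics import mean

def temporal_granules(readings, granule_seconds=300):
    """Group (timestamp_seconds, value) readings into fixed-size,
    non-overlapping time windows and return one averaged value per
    window, keyed by the window's start time."""
    buckets = defaultdict(list)
    for ts, value in readings:
        # Integer division maps every timestamp to the start of its window.
        window_start = (ts // granule_seconds) * granule_seconds
        buckets[window_start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}

# Hypothetical readings: three in the first 5 minutes, two in the next.
readings = [(0, 20.0), (60, 21.0), (120, 22.0), (300, 30.0), (420, 32.0)]
print(temporal_granules(readings))
```

Many raw readings collapse into one value per granule, which is the unit the application actually asks for.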

7
Spatial Granules
  • Readings from devices physically close to each
    other are expected to be homogeneous
  • A spatial granule defines the unit of space in
    which this homogeneity is expected to hold

8
Sensor Cleaning Pipeline
  • Cleaning data involves:
      • A set of logically distinct operations
      • Each operation targets a different aspect of
        the data, from finest (single readings) to
        coarsest (multiple sensors and various sources)
      • Uses temporal and spatial characteristics of
        sensor data

[Diagram: pipeline stages, from finest to coarsest: Point, Smooth, Merge, Arbitrate, Virtualize]
9
Declarative Query Processing
  • Program stages with declarative queries
  • CQL: a continuous-query extension to SQL
  • Data stream system as processing engine
  • Real-time cleaning

SELECT S.city, AVG(temp)
FROM SOME_STREAM S [RANGE 5 seconds]
WHERE S.state = 'California'
GROUP BY S.city

Window clause: [RANGE 5 seconds]
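
To make the window clause concrete, here is a rough Python emulation of what the example query computes at a single instant. It is a sketch under assumptions, not how a data stream engine evaluates CQL: the function name `avg_temp_by_city`, the dictionary field names, and the half-open window `(now - 5, now]` are all choices made for illustration.

```python
from collections import defaultdict

def avg_temp_by_city(stream, now, range_seconds=5, state="California"):
    """Evaluate the query at time `now`: keep tuples from the last
    `range_seconds`, filter on state, then average temp per city."""
    totals = defaultdict(lambda: [0.0, 0])  # city -> [sum, count]
    for t in stream:
        in_window = now - range_seconds < t["ts"] <= now
        if in_window and t["state"] == state:
            acc = totals[t["city"]]
            acc[0] += t["temp"]
            acc[1] += 1
    return {city: total / n for city, (total, n) in totals.items()}

# Hypothetical stream tuples; the first falls outside the 5-second window.
stream = [
    {"ts": 1, "city": "Berkeley", "state": "California", "temp": 10.0},
    {"ts": 7, "city": "Berkeley", "state": "California", "temp": 18.0},
    {"ts": 9, "city": "Berkeley", "state": "California", "temp": 20.0},
    {"ts": 8, "city": "Palo Alto", "state": "California", "temp": 22.0},
    {"ts": 9, "city": "Boston", "state": "Massachusetts", "temp": 5.0},
]
print(avg_temp_by_city(stream, now=10))
```

A continuous query re-evaluates this as time advances, so the result is itself a stream rather than a single answer.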
10
Step 1: Point
  • Operates on: a single value of the sensor stream
  • Purpose: Filter individual values
      • Errant (dirty / faulty) RFID tags
      • Obvious outliers
  • Conversion of raw data into tuples
      • Heat sensors output data as voltages. The raw
        data must be converted into temperature using
        the calibration of that sensor.
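
A Point-stage operator of this kind might look like the following sketch. Everything specific here is an assumption for illustration: the name `point_stage`, the `(sensor_id, timestamp, volts)` tuple layout, the linear calibration (`gain`, `offset`), and the plausible temperature range used to reject obvious outliers. A real sensor's calibration would come from its datasheet.

```python
def point_stage(raw_tuples, gain=100.0, offset=-50.0, valid_range=(-40.0, 60.0)):
    """Convert raw voltage readings into temperature tuples and drop
    values that fall outside a plausible range.

    gain, offset, and valid_range are made-up calibration values."""
    low, high = valid_range
    cleaned = []
    for sensor_id, ts, volts in raw_tuples:
        temp_c = gain * volts + offset  # assumed linear calibration
        if low <= temp_c <= high:       # each value is judged on its own
            cleaned.append((sensor_id, ts, temp_c))
    return cleaned

# Hypothetical raw readings; the 3.25 V reading maps to 275 degrees and is dropped.
raw = [("mote1", 0, 0.75), ("mote1", 1, 3.25), ("mote2", 1, 0.625)]
print(point_stage(raw))
```

The operator looks at one value at a time and keeps no state, which is what distinguishes Point from the later, window-based stages.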

11
Step 1: Point
[Diagram: a row of Point (P) operators forms the Point stage]
12
Step 2: Smoothing
  • Purpose: Interpolates (inserts) lost readings
      • Temporal interpolation
      • Outlier detection
  • Method: Window-based queries
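
As a sketch of window-based temporal interpolation, the function below fills each time step with the mean of whatever readings are present in a trailing window. The name `smooth`, the use of `None` for a lost reading, and the choice of a trailing mean are assumptions for the example. The slides do not specify the aggregate.

```python
from statistics import mean

def smooth(series, window=3):
    """series holds one value per time step, with None where a reading
    was lost. Each output is the mean of the readings present in the
    trailing window, or None when the whole window is empty."""
    out = []
    for i in range(len(series)):
        recent = [v for v in series[max(0, i - window + 1): i + 1]
                  if v is not None]
        out.append(mean(recent) if recent else None)
    return out

# Hypothetical series: gaps shorter than the window are filled,
# but once three consecutive readings are lost the output is None.
print(smooth([20.0, None, 22.0, None, None, None]))
```

Averaging over the window also damps a single spurious value, which is how the same mechanism helps with outliers.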

[Diagram: Smooth (S) operators, working over temporal granules, sit above the Point (P) operators]
13
Step 3: Merge
  • Purpose: Spatial interpolation
  • Example: Within a spatial granule, compute the
    average of the readings from different motes,
    omitting individual readings that are more than
    two standard deviations from the mean.
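
The Merge example translates fairly directly into code. The sketch below assumes the readings from one spatial granule arrive as a plain list of numbers. The name `merge_granule` and the use of the population standard deviation (`statistics.pstdev`) rather than the sample one are choices made here, not something the slides specify.

```python
from statistics import mean, pstdev

def merge_granule(readings, k=2.0):
    """Average the readings from the motes in one spatial granule,
    omitting any reading more than k standard deviations from the
    granule mean."""
    mu = mean(readings)
    sigma = pstdev(readings)
    kept = [r for r in readings if abs(r - mu) <= k * sigma]
    return mean(kept)

# Hypothetical granule of seven motes; the 55.0 reading is the outlier.
print(merge_granule([20.0, 21.0, 19.0, 20.0, 21.0, 19.0, 55.0]))
```

With only a handful of motes no single reading can lie more than two population standard deviations from the mean, so this test is only meaningful when the granule contains enough devices.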

[Diagram: Merge (M) operators, working over spatial granules, sit above the Smooth (S) and Point (P) operators]
14
Step 3: Merge
15
Step 4: Arbitrate
  • Purpose: Remove conflicting readings and
    duplicates (de-duplication)

[Diagram: an Arbitrate (A) operator sits above the Merge (M), Smooth (S), and Point (P) operators]
16
Step 5: Virtualize
  • Purpose: Multi-source integration

[Diagram: a Virtualize (V) operator tops the full pipeline, above the Arbitrate (A), Merge (M), Smooth (S), and Point (P) operators]
17
RFID Scenario
Pipeline for the RFID scenario (diagram labels, top to bottom):
  • Application: Query 2 over rfid_data
  • Virtualize
  • Arbitrate: Query 4 over arbitrate_input
  • Merge
  • Smooth: Query 3 over smooth_input
  • Point: on sensor
Note: Each domain needs to be modeled.
18
RFID Scenario
Fig.: Expected output
Fig.: Query 2 result using raw RFID data
19
Smoothing
The difference between Shelf 0 and Shelf 1 is likely
due to issues with the antenna ports on these
particular RFID readers.
20
Arbitration
21
Arbitration
RFID r1, time-stamps t, t1, t2
Moving average with window w = 3 time-stamps. At t2:
  • Shelf 0: count(r1) = 2
  • Shelf 1: count(r1) = 3
NOTE: The window size must be larger than the longest
period of dropped readings, but not too large.
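
One plausible reading of this example is that arbitration assigns the tag to the shelf that saw it most often within the window. The sketch below encodes that rule. The function name `arbitrate`, the `(shelf, tag)` tuple layout, and the tie-breaking behavior (the shelf seen first wins) are assumptions for illustration, not details taken from the slides.

```python
from collections import Counter

def arbitrate(window_reads, tag):
    """window_reads lists the (shelf, tag) reads inside the current
    window. The tag is assigned to the shelf that read it most often.
    Returns None if no shelf read the tag in this window."""
    counts = Counter(shelf for shelf, t in window_reads if t == tag)
    if not counts:
        return None
    # On a tie, max() returns the shelf that appeared first in the input.
    return max(counts, key=counts.get)

# The counts from the slide: Shelf 0 read r1 twice, Shelf 1 three times.
reads = [("Shelf 0", "r1")] * 2 + [("Shelf 1", "r1")] * 3
print(arbitrate(reads, "r1"))
```

With these counts the tag is attributed to Shelf 1, so the application sees one location for r1 instead of two conflicting ones.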
22
Conclusion
  • An infrastructural approach to sensor data
    cleaning is necessary
  • ESP: a pipelined, declarative framework for
    building such an infrastructure