Capturing Sensor-Generated Time Series with Quality Guarantees

Slides: 24
Provided by: Informatio367
Learn more at: https://ics.uci.edu
Transcript and Presenter's Notes

1
Capturing Sensor-Generated Time Series with
Quality Guarantees
  • Iosif Lazaridis
  • Sharad Mehrotra
  • University of California, Irvine
  • ICDE 2003, Bangalore, India

2
Talk Outline
  • Sensors and QUASAR
  • Archival Vs. On-line applications
  • Poor Man's Compression
  • Using Prediction
  • Experiments
  • Conclusions

3
Sensors
  • Sensors are becoming smaller, cheaper, and more
    configurable
  • Systems incorporating large numbers of them will
    be feasible
  • Main problems: limited energy, limited bandwidth
  • Goal: limit communication
  • Benefits data recipient as well

4
Quality Aware Sensing Architecture (QUASAR) @ UC Irvine
  • Tradeoff between data/application accuracy, and
    system performance
  • This paper: how to capture streaming time series
    with a given error tolerance

5
Archival Vs. Online Applications (I)
  • Online applications discard the time series
    history (e.g., intrusion detection)
  • Archival aspect is important because:
      • Data may be precious (e.g., once-in-a-lifetime)
      • Unforeseen uses of the data (e.g., new mining
        applications)
      • Roll-back feature (e.g., what caused an
        accident?)

6
Archival Vs. Online Applications (II)
  • Online applications have:
      • timing requirements
      • time-variable needs (e.g., sensor S1 may become
        more or less popular with time)
  • Archival:
      • timing less important
      • constant needs (archived time series must meet
        some overall quality criteria)

7
Preliminaries
  • Let $P = \langle p_1, p_2, \ldots, p_n \rangle$ be a sequence of
    environmental measurements (time series)
    generated by the producer, where $n$ = now
  • Let $S = \langle s_1, s_2, \ldots, s_n \rangle$ be the server-side
    representation of the sequence
  • A within-$\epsilon$ quality data collection protocol
    guarantees that
      • for all $i$: $\mathrm{error}(p_i, s_i) \le \epsilon$
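
The guarantee above can be checked mechanically. The following is a minimal sketch, assuming that the error function is the absolute difference (the slides do not fix a particular error measure); the function name `within_tolerance` is hypothetical.

```python
def within_tolerance(produced, stored, eps):
    """Return True if every server-side value is within eps of the
    producer's value at the same position.

    Assumption: error(p_i, s_i) is taken to be |p_i - s_i|.
    """
    if len(produced) != len(stored):
        raise ValueError("sequences must have the same length")
    return all(abs(p - s) <= eps for p, s in zip(produced, stored))
```

For example, `within_tolerance([1.0, 2.0, 3.0], [1.5, 2.0, 2.5], 0.5)` holds, while the same call with a tolerance of 0.4 does not.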

8
Piecewise Constant Approximation (PCA)
  • Given a time series $S_n = s[1:n]$, a piecewise
    constant approximation of it is a sequence
  • $PCA(S_n) = \langle (c_i, e_i) \rangle$
  • that allows us to estimate $s_j$ as
  • $\hat{s}_j = c_i$ if $j \in [e_{i-1}+1, e_i]$
  • $\hat{s}_j = c_1$ if $j \le e_1$
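
To make the lookup rule concrete, here is a small sketch of how a sample estimate could be read back from a PCA. It assumes the PCA is stored as a Python list of `(c_i, e_i)` pairs with 1-based, increasing end indices; the name `pca_estimate` is hypothetical.

```python
from bisect import bisect_left


def pca_estimate(segments, j):
    """Estimate sample j (1-based) from a PCA given as
    [(c_1, e_1), (c_2, e_2), ...], where e_i is the index of the
    last sample covered by segment i."""
    ends = [e for _, e in segments]
    if j < 1 or j > ends[-1]:
        raise IndexError("sample index outside the approximated range")
    # First segment whose end index is >= j, i.e. e_{i-1} < j <= e_i.
    i = bisect_left(ends, j)
    return segments[i][0]
```

With `segments = [(2.0, 3), (5.5, 4), (1.0, 7)]`, samples 1 to 3 are estimated as 2.0, sample 4 as 5.5, and samples 5 to 7 as 1.0.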

9
Poor Man's Compression
  • Goal: given a time series of sensor values,
    generate a within-$\epsilon_{capt}$ PCA representation of it
  • Poor Man's Compression - Midrange (PMC-MR)
      • Maintain $m$, $M$ as the minimum/maximum values of
        observed samples since the last segment
      • On processing $p_n$, update $m$ and $M$ if needed
      • If $M - m > 2\epsilon_{capt}$, output a segment
        $((m+M)/2, n-1)$, with $m$, $M$ taken over the samples
        up to $n-1$, and start a new segment at $p_n$

Variant: PMC-Mean uses the mean of the observed
points
[Figure: PMC-MR example, Value vs. Time (samples 1 to 5), $\epsilon_{capt} = 1.6$]
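
The PMC-MR steps on this slide can be sketched as follows. This is one reading of the slide, not the authors' code: the midrange of a closed segment is computed from the minimum and maximum before the offending sample, the offending sample starts the next segment, and a final open segment is flushed at the end of the input (the last two points are assumptions). The name `pmc_mr` is hypothetical.

```python
def pmc_mr(samples, eps):
    """Return a list of (c_i, e_i): the constant value of each segment
    and the 1-based index of its last sample."""
    segments = []
    lo = hi = None  # min / max of the samples in the open segment
    for n, p in enumerate(samples, start=1):
        if lo is None:
            lo = hi = p
            continue
        new_lo, new_hi = min(lo, p), max(hi, p)
        if new_hi - new_lo > 2 * eps:
            # Adding p would break the tolerance: close the segment at n-1.
            segments.append(((lo + hi) / 2, n - 1))
            lo = hi = p  # assumption: p starts the next segment
        else:
            lo, hi = new_lo, new_hi
    if lo is not None:
        # Assumption: flush the segment that is still open at the end.
        segments.append(((lo + hi) / 2, len(samples)))
    return segments
```

For the input `[1.0, 2.0, 3.0, 6.0, 7.0]` with a tolerance of 1.0, the first three samples fit in one segment with midrange 2.0, and the last two form a second segment with midrange 6.5.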
10
Optimality of PMC-Midrange
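CLAIM: A PCA representation BETTER with $K' < K$
segments violates $\epsilon_{capt}$
Since $K' < K$, some segment of BETTER must start at or before
$e_{k-1}+1$ and end after $e_k$, for some segment $k$ of PMC-MR
Hence, the segment of BETTER must contain $e_k + 1$
Since PMC-MR output a new segment after $e_k$
$\Rightarrow$ the range of values in $[e_{k-1}+1, e_k+1]$ is $> 2\epsilon_{capt}$
$\Rightarrow$ the segment of BETTER violates the $\epsilon_{capt}$ tolerance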
11
Why Predict?
[Figure: timeline. History (compressed data in archive) | Recent Past (precise data in sensor) | Future (no data); time axis marked at $n - n_{lag}$ and $n$ (now)]
12
Who Should Fit the Predictive Model?
  • Server
      • Long-haul models possible
      • No extra communication
      • - No prediction accuracy guarantee
      • - Server only sees approximate time series
  • Sensor (Data Producer)
      • Short-term adaptation (but ...)
      • Sensor sees precise time series
      • Prediction accuracy $\epsilon_{pred}$ guarantee

13
Producer-Side Prediction
  • $M$ is the predictive model, and $\theta$ its parameters
  • Producer sends $(M, \theta)$ to server
  • Server estimates time series value $s^{pred}_j$ using
    $(M, \theta)$
  • Producer generates new $(M, \theta)$ when
  • $\mathrm{error}(p_j, s^{pred}_j) > \epsilon_{pred}$
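
The protocol above can be illustrated with a toy model. The sketch below assumes a linear extrapolation model whose parameters are a start index, a value, and a slope, refit from the last two samples whenever the prediction error exceeds the tolerance. The model choice, the 0-based indices, and the names `producer_updates` and `server_estimate` are all assumptions made for illustration; the slides leave the model domain-specific.

```python
def producer_updates(samples, eps_pred):
    """Return the list of parameter tuples (start, value, slope) that the
    producer would send. A new tuple is sent only when the current model
    mispredicts a sample by more than eps_pred."""
    updates = []
    theta = None
    for j, p in enumerate(samples):
        if theta is not None:
            start, value, slope = theta
            predicted = value + slope * (j - start)
            if abs(p - predicted) <= eps_pred:
                continue  # server's estimate is good enough, stay silent
        # Hypothetical refit: slope from the last two precise samples.
        slope = p - samples[j - 1] if j > 0 else 0.0
        theta = (j, p, slope)
        updates.append(theta)
    return updates


def server_estimate(updates, j):
    """Server-side estimate of sample j from the latest parameters
    received at or before index j."""
    current = None
    for theta in updates:
        if theta[0] <= j:
            current = theta
    start, value, slope = current
    return value + slope * (j - start)
```

On `[0.0, 1.0, 2.0, 3.0, 10.0, 11.0]` with a tolerance of 0.5, the producer sends four parameter updates instead of six samples, and every server estimate stays within the tolerance.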

14
Issues in Prediction
  • Choice of $M$ is domain-specific
  • Goal is to minimize communication
      • Prediction accuracy: $\mathrm{error}(p_j, s^{pred}_j) < \epsilon_{pred}$
      • Parameter size: $\theta$
  • Adaptive Model Selection
      • E.g., constant velocity or constant acceleration,
        for predicting a moving object's location

15
Combining Prediction with Compression
  • Imagine that $\epsilon_{pred} < \epsilon_{capt}$
      • No need to do extra work for archival
      • the sequence of $(M, \theta)$ suffices
  • In general, compression should exploit the
    information given in the sequence $(M, \theta)$
  • Solution
      • $d_j = p_j - s^{pred}_j$ (error time series)
      • Compress $d[1:n]$ within-$\epsilon_{capt}$
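
The solution on this slide (compress the error time series instead of the raw samples) can be sketched as below. The block repeats a compact midrange compressor so that it is self-contained; as before, flushing the final segment is an assumption, and the names `compress_residuals` and `reconstruct` are hypothetical.

```python
def pmc_mr(samples, eps):
    """Midrange compression: list of (constant, 1-based end index)."""
    segments, lo, hi = [], None, None
    for n, p in enumerate(samples, start=1):
        if lo is not None and max(hi, p) - min(lo, p) > 2 * eps:
            segments.append(((lo + hi) / 2, n - 1))
            lo = hi = p
        elif lo is None:
            lo = hi = p
        else:
            lo, hi = min(lo, p), max(hi, p)
    if lo is not None:
        segments.append(((lo + hi) / 2, len(samples)))  # assumption: flush
    return segments


def compress_residuals(samples, predictions, eps_capt):
    """Compress d_j = p_j - spred_j within eps_capt."""
    residuals = [p - s for p, s in zip(samples, predictions)]
    return pmc_mr(residuals, eps_capt)


def reconstruct(predictions, segments):
    """Archived value = prediction + constant of the covering segment."""
    values, start = [], 0
    for c, end in segments:
        values.extend(pred + c for pred in predictions[start:end])
        start = end
    return values
```

With samples `[0.1, 1.2, 1.9, 3.1, 4.0]`, predictions `[0.0, 1.0, 2.0, 3.0, 4.0]`, and a capture tolerance of 0.2, the residuals fit in a single segment, whereas compressing the raw samples at the same tolerance needs one segment per sample.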

16
Combining Prediction with Compression (Example)
17
Experiments
  • Data sets
  • Synthetic Random-Walk
  • $x_1 = 0$ and $x_i = x_{i-1} + s_i$, where $s_i$ is drawn
    uniformly from $[-1, 1]$
  • Oceanographic Buoy Data
  • Environmental attributes (temperature, salinity,
    wind-speed, etc.) sampled at 10-minute intervals from a
    buoy in the Pacific Ocean (Tropical Atmosphere
    Ocean Project, Pacific Marine Environmental
    Laboratory)
  • GPS data collected using iPAQs (not in paper)
  • Experiments to test
  • Compression Performance of PMC
  • Benefits of Model Selection
  • Query Accuracy over Compressed Data
  • Benefits of Prediction/Compression Combination

18
Compression Performance
K/n ratio = number of segments / number of samples
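
As an illustration of this metric, the sketch below generates the synthetic random walk described in the experiments and computes K/n for a midrange-style compressor. The seed, the series length, and the function names are assumptions for illustration; the numbers it produces are not the results reported in the talk.

```python
import random


def random_walk(n, seed=0):
    """x_1 = 0 and x_i = x_{i-1} + s_i, with s_i uniform in [-1, 1]."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    xs = [0.0]
    for _ in range(n - 1):
        xs.append(xs[-1] + rng.uniform(-1.0, 1.0))
    return xs


def count_segments(samples, eps):
    """Number of segments K: a new segment opens whenever the running
    max - min of the open segment would exceed 2 * eps."""
    k = 0
    lo = hi = None
    for p in samples:
        if lo is not None and max(hi, p) - min(lo, p) <= 2 * eps:
            lo, hi = min(lo, p), max(hi, p)
        else:
            k += 1  # p opens a new segment
            lo = hi = p
    return k


def kn_ratio(samples, eps):
    """K/n: segments per sample (lower means better compression)."""
    return count_segments(samples, eps) / len(samples)
```

A call such as `kn_ratio(random_walk(1000, seed=1), 0.5)` returns a value between 0 and 1, and a larger tolerance never yields a larger ratio.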
19
Query Performance Over Compressed Data
How many sensors have values $> v$? (Mean
selectivity 50%)
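
One way to reason about such a query over approximate data is sketched below. It is an assumption-laden illustration, not the evaluation method of the talk: each archived value is assumed to lie within a known tolerance of the true value, so the count can be bracketed from below and above. The name `count_above` is hypothetical.

```python
def count_above(approx_values, v, eps):
    """Answer "how many sensors have values > v?" from approximate values.

    Returns (estimate, lower, upper):
      estimate: count using the approximate values directly
      lower:    sensors whose true value must exceed v
      upper:    sensors whose true value could exceed v
    Assumption: each true value lies in [s - eps, s + eps].
    """
    estimate = sum(1 for s in approx_values if s > v)
    lower = sum(1 for s in approx_values if s - eps > v)
    upper = sum(1 for s in approx_values if s + eps > v)
    return estimate, lower, upper
```

For approximate values `[1.0, 4.8, 5.1, 9.0]`, threshold 5.0, and tolerance 0.5, the direct estimate is 2, with a guaranteed range of 1 to 3.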
20
Impact of Model Selection
  • Objects moved at approximately constant speed (+
    measurement noise)
  • Three models used
      • $loc_n = c$
      • $loc_n = c + vt$
      • $loc_n = c + vt + 0.5at^2$
  • Parameters $v$, $a$ were estimated at the sensor over
    a moving window of 5 samples

K/n ratio = number of segments / number of samples.
$\epsilon_{pred}$ is the localization tolerance in meters
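
The three location models can be sketched as follows. The slides do not say how v and a were estimated over the window, so this sketch assumes unit time steps and uses the mean first difference for v and the mean second difference for a; the name `fit_models` is hypothetical.

```python
def fit_models(window):
    """Fit the three models on a window of recent locations (the talk
    used 5 samples) and return predictors of the location t steps
    after the last sample.

    Assumptions: unit time step; v and a are mean first and second
    differences over the window (needs at least 3 samples).
    """
    c = window[-1]
    first = [b - a for a, b in zip(window, window[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    v = sum(first) / len(first)
    a = sum(second) / len(second)
    return {
        "constant": lambda t: c,
        "velocity": lambda t: c + v * t,
        "acceleration": lambda t: c + v * t + 0.5 * a * t * t,
    }
```

For the window `[0.0, 2.0, 4.0, 6.0, 8.0]`, three steps ahead the constant model predicts 8.0, while the velocity and acceleration models both predict 14.0, since the estimated acceleration is zero.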
21
Combining Prediction with Compression
K/n ratio = number of segments / number of samples
22
GPS Mobility Data from Mobile Clients (iPAQs)
QUASAR Client Time Series
Latitude Time Series: 1800 samples
Compressed Time Series (PMC-MR): accuracy of 100 m, 130 segments
23
Conclusions and Future Work
  • We motivated the need for real-time
    applications to co-exist with data archival
  • We showed how compression can be used to reduce
    the size of the archived time series optimally
  • We investigated how prediction can be used to
    limit communication by allowing the database to
    estimate values ahead of time
  • We noted the interplay between prediction and
    compression and showed how they can be combined
  • In the future
      • Adaptive algorithms for model selection
      • Exploiting inter-sensor correlation to further
        reduce communication