Title: Model-Driven Data Acquisition in Sensor Networks
1Model-Driven Data Acquisition in Sensor Networks
- Amol Deshpande, Carlos Guestrin, Samuel R. Madden, Joseph M. Hellerstein, Wei Hong
Accepted by VLDB 2004
Presented by Chih-Chieh Hung 2004.9.29
2Outline
- Introduction
- Overview of Approach
- Model-based Querying
- Choosing an observation plan
- Conclusion
3Introduction
- The metaphor that the sensornet is a database is problematic, because sensors don't exhaustively represent the data in the real world.
- In this paper, a probabilistic model of that reality is used to complement the readings.
5Architecture of Model-based Querying
- A probability density function (pdf) $p(X_1, \ldots, X_n)$ is used to answer queries, where each $X_i$ represents an attribute at a sensor.
- The model is used to estimate sensor readings in the current time period.
- Newly acquired readings refine the estimates for which uncertainty is high.
- In BBQ, the specific model is based on time-varying multivariate Gaussians.
6Architecture of Model-based Querying
(Figure: architecture of model-based querying; surviving labels: $1 - \delta$, $\varepsilon$)
7Reasons for choosing different attributes in an observation plan
- Correlation in Value
- Cost Differential
- e.g., cost(read_volt) < cost(read_temp)
8Support Queries
- Answering queries probabilistically based on a distribution is conceptually straightforward.
- Use the pdf to compute the probability that $X_i$ is within $\varepsilon$ of the mean: $P(X_i \in [\mu_i - \varepsilon, \mu_i + \varepsilon])$
- Mean: $\mu_i = E[X_i]$
- Choosing which readings to observe is an optimization problem (NP-hard).
10Model-based querying
- The central element is the use of a probabilistic model to answer queries about the attributes in a sensornet.
- This section focuses on specific queries:
- Range Predicates
- Attribute-value Estimates
- Standard Aggregates
- A probability density function (pdf) assigns a probability to each joint value $x_1, x_2, \ldots, x_n$ of the attributes $X_1, X_2, \ldots, X_n$.
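11Probabilistic Queries: Range Queries
- Range Queries
- ask whether an attribute $X_i$ is in the range $[a_i, b_i]$
- Probabilistic Model
- compute $P(X_i \in [a_i, b_i])$
- Decision for a given confidence parameter $\delta$:
- YES if $P(X_i \in [a_i, b_i]) > 1 - \delta$
- NO if $P(X_i \in [a_i, b_i]) < \delta$
- NEED MORE DATA otherwise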
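12Probabilistic Queries: Range Queries
- Two steps to compute $P(X_i \in [a_i, b_i])$
- Step 1
- Marginalize the pdf to a density over only $X_i$: $p(x_i) = \int p(x_1, \ldots, x_n)\, dx_1 \cdots dx_{i-1}\, dx_{i+1} \cdots dx_n$
- Step 2
- Compute $P(X_i \in [a_i, b_i]) = \int_{a_i}^{b_i} p(x_i)\, dx_i$
- Test if $P(X_i \in [a_i, b_i]) > 1 - \delta$ or $P(X_i \in [a_i, b_i]) < \delta$ for a given confidence parameter $\delta$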
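The two steps can be sketched for the Gaussian case. This is a minimal illustration, assuming the model is a multivariate Gaussian with mean vector `mu` and covariance matrix `sigma` (hypothetical names and toy values, not taken from the paper). For a Gaussian, marginalizing to $X_i$ just reads off entry `i` of the mean and the diagonal of the covariance.

```python
import math
import numpy as np

def gaussian_cdf(x, mean, std):
    """CDF of a one-dimensional Gaussian, written with math.erf."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def range_query(mu, sigma, i, a, b, delta):
    """Answer 'is X_i in [a, b]?' with confidence parameter delta."""
    # Step 1: the marginal of X_i is N(mu[i], sigma[i, i]).
    mean_i = float(mu[i])
    std_i = math.sqrt(float(sigma[i, i]))
    # Step 2: integrate the marginal density over [a, b].
    prob = gaussian_cdf(b, mean_i, std_i) - gaussian_cdf(a, mean_i, std_i)
    if prob > 1.0 - delta:
        return "YES", prob
    if prob < delta:
        return "NO", prob
    return "NEED MORE DATA", prob

# Toy model (assumed values): X_0 = temperature, X_1 = voltage.
mu = np.array([20.0, 2.7])
sigma = np.array([[4.0, 0.3],
                  [0.3, 0.05]])

wide = range_query(mu, sigma, 0, 10.0, 30.0, 0.05)
far = range_query(mu, sigma, 0, 30.0, 40.0, 0.05)
half = range_query(mu, sigma, 0, 20.0, 30.0, 0.05)
```

With these toy numbers the wide range covers five standard deviations on each side of the mean, the far range lies entirely in the upper tail, and the half range starts exactly at the mean, so the three calls exercise the three possible answers.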
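13Probabilistic Queries: Range Queries
- Suppose that we observe the value of attribute $X_j$ to be $x_j$; we can then obtain the conditional pdf $p(x_i \mid x_j) = p(x_i, x_j) / p(x_j)$.
- We can then compute $P(X_i \in [a_i, b_i] \mid x_j) = \int_{a_i}^{b_i} p(x_i \mid x_j)\, dx_i$.
- In general, we will make a set of observations $o$ and obtain $p(X_1, \ldots, X_n \mid o)$, the posterior probability of our set of attributes $X$ given $o$.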
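For a multivariate Gaussian the posterior given a set of observations has a closed form. The sketch below uses the standard Gaussian conditioning formulas; the function name, the variable names, and the toy numbers are assumptions made for illustration.

```python
import numpy as np

def condition_gaussian(mu, sigma, obs_idx, obs_val):
    """Posterior over the unobserved attributes given X[obs_idx] = obs_val."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    obs_idx = list(obs_idx)
    rest = [k for k in range(len(mu)) if k not in obs_idx]
    s_rr = sigma[np.ix_(rest, rest)]        # covariance of unobserved block
    s_ro = sigma[np.ix_(rest, obs_idx)]     # cross-covariance
    s_oo = sigma[np.ix_(obs_idx, obs_idx)]  # covariance of observed block
    gain = s_ro @ np.linalg.inv(s_oo)
    mu_post = mu[rest] + gain @ (np.asarray(obs_val, dtype=float) - mu[obs_idx])
    sigma_post = s_rr - gain @ s_ro.T
    return rest, mu_post, sigma_post

# Toy model (assumed values): X_0 = temperature, X_1 = voltage.
mu = np.array([20.0, 2.7])
sigma = np.array([[4.0, 0.3],
                  [0.3, 0.05]])

# Observe the voltage reading 2.8 and update the belief about temperature.
rest, mu_post, sigma_post = condition_gaussian(mu, sigma, [1], [2.8])
```

Here the temperature variance drops from 4.0 to about 2.2 after the voltage is observed, which is the sense in which a correlated observation sharpens the estimate of an attribute that was not read.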
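14Probabilistic Queries: Value Queries
- When a user is interested in the value of $X_i$, we can answer the query with the mean value of $X_i$ given the observations $o$: $\bar{x}_i = \int x_i\, p(x_i \mid o)\, dx_i$
- Confidence: $P(X_i \in [\bar{x}_i - \varepsilon, \bar{x}_i + \varepsilon] \mid o)$
- for a given error bound $\varepsilon > 0$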
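When the posterior over $X_i$ is Gaussian, the confidence of a value query reduces to a single call to the error function. The snippet is a sketch under that Gaussian assumption; `value_query` and its arguments are hypothetical names.

```python
import math

def value_query(post_mean, post_var, eps):
    """Return the estimate and P(|X_i - mean| <= eps) for a Gaussian posterior."""
    confidence = math.erf(eps / math.sqrt(2.0 * post_var))
    return post_mean, confidence

# Unit-variance check: about 95% of the mass lies within 1.96 standard deviations.
estimate, confidence = value_query(0.0, 1.0, 1.96)

# A tighter posterior gives higher confidence for the same error bound.
_, loose = value_query(20.0, 4.0, 1.0)
_, tight = value_query(20.6, 2.2, 1.0)
```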
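15Probabilistic Queries: AVERAGE Aggregates
- We are interested in the average value of a set of attributes $A$.
- Define a random variable $Y$ to represent this average: $Y = \left(\sum_{i \in A} X_i\right) / |A|$
- The pdf of $Y$: $p(Y = y) = \int p(x_1, \ldots, x_n)\, \mathbf{1}\!\left[\left(\sum_{i \in A} x_i / |A|\right) = y\right] dx_1 \cdots dx_n$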
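In the Gaussian case the integral does not need to be evaluated numerically: a linear combination of jointly Gaussian variables is itself Gaussian, so $Y$ has mean $w^\top \mu$ and variance $w^\top \Sigma w$ with weights $1/|A|$ on the members of $A$. A minimal sketch, with assumed names and toy values:

```python
import numpy as np

def average_distribution(mu, sigma, members):
    """Mean and variance of Y = average of X_i over i in members (Gaussian model)."""
    members = list(members)
    w = np.zeros(len(mu))
    w[members] = 1.0 / len(members)  # weight 1/|A| on each member, 0 elsewhere
    return float(w @ mu), float(w @ sigma @ w)

# Toy three-attribute model (assumed values).
mu = np.array([20.0, 22.0, 19.0])
sigma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 4.0, 0.0],
                  [0.0, 0.0, 1.0]])

avg_mean, avg_var = average_distribution(mu, sigma, [0, 1])
```

The resulting mean and variance can be fed to the same confidence computation used for a single-attribute value query.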
16Dynamic Models
- A single static probability density function represents spatial correlation in a sensornet deployment.
- However, many real-world systems include attributes that evolve over time, with both temporal and spatial correlations.
- A dynamic probabilistic model can represent temporal correlations.
17Dynamic Models (contd)
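- Goal: compute $p(X_1^t, \ldots, X_n^t \mid o^{1 \ldots t})$
- For simplicity, the model is restricted to a Markovian model.
- Markovian model: given the attributes at time $t-1$, the attributes at time $t$ are independent of those at times $1, \ldots, t-2$.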
18Dynamic Models (contd)
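- By this assumption, the dynamics are summarized by a pdf called the transition model, $p(X_1^{t+1}, \ldots, X_n^{t+1} \mid X_1^t, \ldots, X_n^t)$.
- Using this transition model, we can compute $p(X_1^{t+1}, \ldots, X_n^{t+1} \mid o^{1 \ldots t}) = \int p(X_1^{t+1}, \ldots, X_n^{t+1} \mid x_1^t, \ldots, x_n^t)\, p(x_1^t, \ldots, x_n^t \mid o^{1 \ldots t})\, dx_1^t \cdots dx_n^t$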
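The prediction step can be illustrated with a linear-Gaussian transition model, $X^{t+1} = A X^t + \text{noise}$ with noise covariance $Q$. That specific form is an assumption made here for the example (the slides only say the model is Markovian and Gaussian), and `A`, `Q`, and the numbers are hypothetical. Under it, the integral over the previous state has the closed form used below.

```python
import numpy as np

def predict_next(mu_t, sigma_t, A, Q):
    """Push a Gaussian belief N(mu_t, sigma_t) through x' = A x + N(0, Q)."""
    mu_next = A @ mu_t
    sigma_next = A @ sigma_t @ A.T + Q
    return mu_next, sigma_next

# Belief at time t given observations up to t (assumed toy values).
mu_t = np.array([20.6, 2.8])
sigma_t = np.array([[2.2, 0.0],
                    [0.0, 0.01]])

# Assumed dynamics: attributes persist, with some added uncertainty per step.
A = np.eye(2)
Q = np.diag([0.5, 0.01])

mu_next, sigma_next = predict_next(mu_t, sigma_t, A, Q)
```

Uncertainty grows at each prediction step until a new observation at time $t+1$ is conditioned on, which shrinks it again.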
20Choosing an Observation Plan
- Cost of Observations
- Improvement in Confidence
- Optimization
21Cost of observations
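- A set of observations $O \subseteq \{1, \ldots, n\}$
- Expected cost: $C(O) = C_a(O) + C_t(O)$
- $C_a(O) = \sum_{i \in O} C_a(i)$: acquisition cost, where $C_a(i)$ is the cost of observing attribute $X_i$
- $C_t(O)$: data transmission cost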
22Data Transmission Cost
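- Transmission cost depends on the data collection mechanism and the network topology.
- For simplicity, each link between nodes $i$ and $j$ is described by two link success probabilities, $p_{ij}$ (from $i$ to $j$) and $p_{ji}$ (from $j$ to $i$).
- The whole problem can then be seen as a TSP with edge weights $1 / (p_{ij} p_{ji})$.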
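The TSP view can be made concrete for a handful of nodes. The sketch below weights each link by $1/(p_{ij} p_{ji})$ and finds the cheapest closed tour by trying every visiting order, which is only feasible for very small node sets. The probability table `p`, the node numbering, and the assumption that every pair of nodes is directly linked are all illustrative.

```python
from itertools import permutations

def link_weight(p, i, j):
    """Edge weight 1 / (p_ij * p_ji) for the link between nodes i and j."""
    return 1.0 / (p[i][j] * p[j][i])

def tour_cost(p, tour):
    """Total weight of a closed tour that returns to its first node."""
    n = len(tour)
    return sum(link_weight(p, tour[k], tour[(k + 1) % n]) for k in range(n))

def cheapest_tour(p, base, nodes):
    """Exhaustive search over visiting orders; exponential in len(nodes)."""
    best = min(permutations(nodes), key=lambda order: tour_cost(p, (base,) + order))
    return (base,) + best, tour_cost(p, (base,) + best)

# p[i][j]: assumed probability that a message from i reaches j (node 0 = base).
p = [[1.0, 0.9, 0.5],
     [0.8, 1.0, 0.9],
     [0.5, 0.9, 1.0]]

tour_one, cost_one = cheapest_tour(p, 0, [1])
tour_two, cost_two = cheapest_tour(p, 0, [1, 2])
```

Visiting node 2 is expensive here because the link between nodes 0 and 2 is unreliable in both directions, so its weight dominates the tour.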
23Improvement in Confidence
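- The observed attributes $O$ should improve the confidence of our posterior density.
- If we observe the specific value $o$, the benefit is:
- Range Query: $R_i(o) = \max\left[P(X_i \in [a_i, b_i] \mid o),\; 1 - P(X_i \in [a_i, b_i] \mid o)\right]$
- Value and Average Query: $R_i(o) = P(X_i \in [\bar{x}_i - \varepsilon, \bar{x}_i + \varepsilon] \mid o)$
- Expected Benefit: $R_i(O) = \int p(o)\, R_i(o)\, do$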
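For value queries under a Gaussian model, the expected benefit does not require integrating over possible observations: the posterior variance of $X_i$ depends only on which attributes are observed, not on the values read, so $R_i(o)$ is the same for every $o$. The sketch below relies on that property; it also assumes noiseless observations, so an attribute that is itself observed gets benefit 1. Names and numbers are hypothetical.

```python
import math
import numpy as np

def value_benefit(sigma, i, observed, eps):
    """Expected confidence of a value query on X_i after observing `observed`."""
    observed = list(observed)
    if i in observed:
        return 1.0  # assumption: a direct observation is exact
    var = float(sigma[i, i])
    if observed:
        s_io = sigma[np.ix_([i], observed)]
        s_oo = sigma[np.ix_(observed, observed)]
        # Posterior variance: sigma_ii - S_io S_oo^{-1} S_oi
        var -= (s_io @ np.linalg.solve(s_oo, s_io.T)).item()
    return math.erf(eps / math.sqrt(2.0 * var))

# Toy model (assumed values): X_0 = temperature, X_1 = voltage.
sigma = np.array([[4.0, 0.3],
                  [0.3, 0.05]])

prior_benefit = value_benefit(sigma, 0, [], 1.0)
after_volt = value_benefit(sigma, 0, [1], 1.0)
after_temp = value_benefit(sigma, 0, [0], 1.0)
```

Range queries do not share this shortcut, because their benefit depends on where the posterior mean lands relative to $[a_i, b_i]$.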
24Optimization
- We would like to pick the set of attributes $O$ that meets the confidence $1 - \delta$ at minimum cost: minimize $C(O)$ subject to $R(O) \geq 1 - \delta$.
- However, this is an NP-hard problem.
25Solving the Optimization Problem
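- Brute Force
- Requires exponential time
- Greedy Incremental Heuristic
- $O \leftarrow \emptyset$
- At each iteration, for each $X_i \notin O$,
- compute the benefit $R(O \cup \{i\})$ and the cost $C(O \cup \{i\})$
- If some attributes reach $R(O \cup \{i\}) \geq 1 - \delta$
- then pick the one with the lowest cost $C(O \cup \{i\})$ and return $O \cup \{i\}$
- else add the attribute with the highest ratio $R(O \cup \{i\}) / C(O \cup \{i\})$ to $O$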
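The greedy heuristic can be sketched as a short loop. The `benefit` and `cost` callables stand in for $R(\cdot)$ and $C(\cdot)$; the toy versions defined here (additive gains capped at 1, additive costs) are assumptions chosen only to exercise both branches, not the paper's actual benefit or cost functions.

```python
def greedy_plan(attributes, benefit, cost, delta):
    """Greedy incremental choice of an observation set O."""
    chosen = frozenset()
    remaining = set(attributes)
    while remaining:
        # Evaluate O ∪ {i} for every attribute not yet chosen.
        candidates = {i: chosen | {i} for i in remaining}
        good = [i for i, s in candidates.items() if benefit(s) >= 1.0 - delta]
        if good:
            # Some candidates reach the target: take the cheapest and stop.
            best = min(good, key=lambda i: cost(candidates[i]))
            return candidates[best]
        # Otherwise grow O by the best benefit-to-cost ratio and iterate.
        best = max(remaining, key=lambda i: benefit(candidates[i]) / cost(candidates[i]))
        chosen = candidates[best]
        remaining.remove(best)
    return chosen  # target not reachable even when observing everything

# Hypothetical toy benefit and cost tables.
gains = {"temp": 0.4, "volt": 0.3, "light": 0.2}
costs = {"temp": 10.0, "volt": 1.0, "light": 1.0}

def toy_benefit(s):
    return min(1.0, 0.5 + sum(gains[i] for i in s))

def toy_cost(s):
    return sum(costs[i] for i in s)

plan = greedy_plan(gains, toy_benefit, toy_cost, delta=0.05)
```

In this toy run no single attribute reaches confidence 0.95, so the first iteration adds the cheap voltage reading by ratio; in the second iteration both remaining attributes reach the target and the cheaper one is selected, so the expensive temperature reading is never taken.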