Title: Presented by: Mingzhu Wei and Abhishek Mukherji
2 Motivation
3 Categories of Location-aware Queries
- Query Stationary
- Object Moving
4 Snapshot vs. Continuous Query Processing
- Traditional Spatio-Temporal (Snapshot) Queries
Data
5 Challenge I: Massive size of incoming data streams
- Spatio-temporal Databases
- secondary storage
- DSMS
- Load Shedding
- Immediately drop insignificant tuples
- Possibly reduce the query area
- Objects that satisfy fewer than k queries are insignificant
- Lazily drop insignificant tuples
- PLACE
- Predicate-based window
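
The load-shedding rule on this slide can be sketched in Python. This is only an illustration under stated assumptions, not the PLACE implementation: queries are assumed to be axis-aligned rectangles, and `RangeQuery`, `is_significant`, and `shed` are hypothetical names. A tuple is kept only if it satisfies at least k queries; everything else is dropped immediately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeQuery:
    """Hypothetical rectangular range query (assumption for this sketch)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def contains(self, x: float, y: float) -> bool:
        return self.x1 <= x <= self.x2 and self.y1 <= y <= self.y2


def is_significant(x, y, queries, k):
    """True if the location satisfies at least k of the given queries."""
    hits = sum(1 for q in queries if q.contains(x, y))
    return hits >= k


def shed(stream, queries, k):
    """Immediately drop tuples (oid, x, y) that satisfy fewer than k queries."""
    return [(oid, x, y) for (oid, x, y) in stream
            if is_significant(x, y, queries, k)]
```

A lazy variant would keep insignificant tuples until memory runs short and drop them only then; that policy is not shown here.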
6 Challenge II: Continuous evaluation of CQ
- Spatio-temporal Databases
- Associate a validation condition with each query answer
- Valid time (t)
- The query answer is valid for the next t time units
- Valid region (R)
- The query answer is valid as long as you are within a region R
- It is challenging to maintain the computation of valid time/region for querying moving objects
- DSMS
- Sliding-window
- PLACE
- Progressive evaluation paradigm
7 Challenge III: Wide variety of query types
- Spatio-temporal Databases
- Have solutions for stationary range queries on moving objects such as aggregation and k-NN queries
- DSMS
- Wide range of query operators
- No operators for spatio-temporal queries
- PLACE
- Extend PREDATOR and NILE SQL language
- Use of INSIDE and kNN operators
8 Challenge IV: Large number of concurrent queries
- Spatio-temporal Databases
- Solution
- Centralized environment: Q-index (R-tree-like)
- Distributed environment: ship part of the query processing down to the moving objects to keep the server from becoming a bottleneck
- DSMS
- Multi-query Optimization techniques
- Sharing query plan
- Sharing at operator level
- PLACE
- Shared execution paradigm: spatial join
9 PLACE Architecture
10 Data Models
- Three-level Storage Hierarchy
- In-memory
- Subset of incoming data stored in memory
- Associate with outstanding queries
- Cache readings
- Cache readings and flush them to secondary storage
- Secondary storage
- Sample data and choose kth reading to disk
- Keep one reading of objects and queries
- Index data using grid structure
- Repository storage
- Take a snapshot of the in-disk database every T_archive time units
- Multi-version structure of moving objects
11 Extended SQL Syntax
- inside_clause
- Stationary query (x1,y1,x2,y2)
- Moving query (M,OID, width, length)
- knn_clause
- Stationary query (k,x,y)
- Moving query (M, OID, k)
Q1: What is the query and the object here?
Q2: Query result for K = 2?
12 More Questions!
- The first k objects are considered an initial answer
- Q3: Is there any relation between the k-NN query and the range query?
- A k-NN query is reduced to a circular range query. However, the query area may shrink or grow
K = 3
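
The reduction of a k-NN query to a circular range query can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the PLACE algorithm: `KnnQuery` and its methods are hypothetical names, the query point is stationary, and objects that were evicted from the answer are not remembered. The first k reported objects form the initial answer, and the radius of the circle is the distance to the farthest object in the answer.

```python
import math


class KnnQuery:
    """Hypothetical k-NN query kept as a circle around a fixed center."""

    def __init__(self, center, k):
        self.center = center
        self.k = k
        self.answer = {}  # object id -> last reported location

    def radius(self):
        """Radius of the circular range; unbounded until k objects are known."""
        if len(self.answer) < self.k:
            return math.inf
        return max(math.dist(self.center, p) for p in self.answer.values())

    def update(self, oid, point):
        """Process one location report and keep only the k closest objects."""
        self.answer[oid] = point
        if len(self.answer) > self.k:
            farthest = max(
                self.answer,
                key=lambda o: math.dist(self.center, self.answer[o]),
            )
            del self.answer[farthest]
```

When a closer object arrives, the farthest one is evicted and the circle shrinks. When an object in the answer moves away, the circle grows. A full solution would also have to reconsider objects outside the answer in that second situation, which this sketch omits.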
13 Predicate-based Sliding Windows
- Temporal expiration
- Same as sliding window
- Spatial expiration
- Predicate-based expiration
- Other form of predicates
14 Predicate-based Sliding Window (continued)
- Only significant objects are stored in-memory
- An object is considered significant if it is
either in the query area or the cache area
- Due to the query and object movements, a stored object may become insignificant at any time
- A larger cache area means more storage overhead and a more accurate answer
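
The significance test on this slide can be sketched in Python as follows. The sketch assumes a rectangular query area and models the cache area as the query rectangle enlarged by a fixed margin on every side; both choices, and the names `classify` and `refresh_window`, are assumptions made for illustration only.

```python
def in_rect(rect, x, y):
    """rect is (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2."""
    x1, y1, x2, y2 = rect
    return x1 <= x <= x2 and y1 <= y <= y2


def expand(rect, margin):
    """Query rectangle grown by `margin` on every side (assumed cache area)."""
    x1, y1, x2, y2 = rect
    return (x1 - margin, y1 - margin, x2 + margin, y2 + margin)


def classify(rect, margin, x, y):
    """Label a location as part of the answer, cached, or insignificant."""
    if in_rect(rect, x, y):
        return "answer"
    if in_rect(expand(rect, margin), x, y):
        return "cached"
    return "insignificant"


def refresh_window(window, rect, margin, reports):
    """Apply location reports; objects outside query and cache area expire."""
    for oid, (x, y) in reports.items():
        if classify(rect, margin, x, y) == "insignificant":
            window.pop(oid, None)  # predicate-based expiration
        else:
            window[oid] = (x, y)
    return window
```

Expiration here depends on the predicate rather than on arrival time, so an object can leave the window at any moment, regardless of when it entered.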
15 Predicate-based Sliding Window (continued): Caching the Result
- Observation: consecutive evaluations of a continuous query yield very similar results
- Idea: upon evaluation of a continuous query, retrieve more data that can be used later
- K-NN query
- Initially, retrieve more than k
- Range query
- Evaluate the query with a larger range
- How much do we need to pre-compute?
- How do we do re-caching?
16 Incremental Evaluation
- The query is evaluated only once. Then, only the updates of the query answer are evaluated
- There are two types of updates: positive and negative updates
Query Result
- Only the objects that cross the query boundary
are taken into account
- Need to continuously listen for notifications that an object has crossed the query boundary
17 Spatio-temporal Incremental Pipelined Operators
- Pipelined Query Operators
- Combination of spatio-temporal operators with regular CQ operators
- ST operator pushdown: reduces the number of tuples
- Flexible query optimization: multiple candidate execution plans
- Algorithm
- Keep track of the recently reported answer Q.Answer of each query Q
- For each newly arriving tuple P, test:
- Is P part of previously reported Q.Answer?
- Does P qualify to be part of the current answer?
- Four cases
- CASE I: P is part of Q.Answer and P still qualifies; P is not processed
- CASE II: P is part of Q.Answer and P does not qualify; a negative P is propagated
- CASE III: P is not part of Q.Answer and P qualifies; a positive P is propagated
- CASE IV: P is not part of Q.Answer and P does not qualify; P has no effect on Q
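
The four cases above map directly onto a small function. The Python sketch below is illustrative only: `process_tuple` is a hypothetical name, Q.Answer is modelled as a set of object identifiers, and the qualification test is assumed to have been computed by the caller.

```python
def process_tuple(answer, oid, qualifies):
    """Handle one incoming tuple for a single query.

    answer    -- set of object ids in the previously reported Q.Answer
    oid       -- id of the incoming tuple P
    qualifies -- whether P satisfies the query predicate now
    Returns ("+", oid), ("-", oid), or None, and updates `answer` in place.
    """
    was_in_answer = oid in answer
    if was_in_answer and qualifies:
        return None                # Case I: nothing to report
    if was_in_answer and not qualifies:
        answer.discard(oid)
        return ("-", oid)          # Case II: negative update
    if not was_in_answer and qualifies:
        answer.add(oid)
        return ("+", oid)          # Case III: positive update
    return None                    # Case IV: no effect on Q
```

Only Cases II and III emit output, so the operators further down the pipeline see only the changes to the answer rather than the full answer.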
18 Scalability
- Continuous queries last for a long time at the server side
- While a query is active in the server, other queries will be submitted
- Shared execution among multiple queries
- Should we index data OR queries?
- Data and queries may be stationary or moving
- Data and queries are of large size
- Data and queries arrive at the system at very high rates
- Treat data and queries similarly
- Are queries coming to data OR are data coming to queries?
- Both data and queries are subjected to each other
- Join data with queries
19 Scalability: Spatial Join
- To accommodate the continuous movement of both data and queries
- Concurrent continuous queries share a grid structure
- Moving objects are hashed to the same grid structure as queries
- The spatio-temporal join is done by overlaying the two grid structures
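
A minimal Python sketch of the grid-based join described above follows. It assumes rectangular range queries, a uniform grid with a fixed cell size (`CELL`), and non-negative or negative coordinates handled by floor division; all names are hypothetical and the sketch evaluates one snapshot rather than a continuous stream.

```python
from collections import defaultdict

CELL = 10.0  # assumed grid cell size


def cell_of(x, y):
    """Grid cell that contains the point (x, y)."""
    return (int(x // CELL), int(y // CELL))


def cells_of_rect(rect):
    """All grid cells overlapped by the rectangle (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = rect
    cx1, cy1 = cell_of(x1, y1)
    cx2, cy2 = cell_of(x2, y2)
    return [(cx, cy)
            for cx in range(cx1, cx2 + 1)
            for cy in range(cy1, cy2 + 1)]


def grid_join(objects, queries):
    """Join objects {oid: (x, y)} with range queries {qid: rect}."""
    object_grid = defaultdict(list)
    for oid, (x, y) in objects.items():
        object_grid[cell_of(x, y)].append(oid)

    query_grid = defaultdict(list)
    for qid, rect in queries.items():
        for cell in cells_of_rect(rect):
            query_grid[cell].append(qid)

    # Overlay the two grids: only objects and queries in the same cell meet.
    results = defaultdict(set)
    for cell, qids in query_grid.items():
        for oid in object_grid.get(cell, []):
            x, y = objects[oid]
            for qid in qids:
                x1, y1, x2, y2 = queries[qid]
                if x1 <= x <= x2 and y1 <= y <= y2:
                    results[qid].add(oid)
    return {qid: sorted(oids) for qid, oids in results.items()}
```

Because objects and queries are hashed to the same cells, each object is compared only against the queries registered in its own cell instead of against every outstanding query.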
20 Scalability
- Evaluating a large number of concurrent
continuous spatio-temporal queries is abstracted
as a spatio-temporal join between moving objects
and moving queries
21 Performance Evaluation
- Size of incremental answer
- Pipelined spatio-temporal operators
- Pipeline with a select operator
- Pipeline with a join operator
22 Conclusion
- PLACE: Pervasive Location-Aware Computing Environments
- Scalable execution of continuous queries over spatio-temporal data streams
- Shared execution among concurrent continuous queries
- Built inside a database engine
- Incremental evaluation of continuous queries
- Spatio-temporal query operators
- Treats Query and Data Symmetrically
- Out-of-order expiry of tuples