Title: Indexing and Range Queries in Spatio-Temporal Databases
1Indexing and Range Queries in Spatio-Temporal Databases
Danzhou Liu, Wei Cui, Yun Fan
School of Computer Science, University of Central Florida
2Outline
- Introduction
- The R-tree
- The TPR-tree
- The TPR*-tree
- Experiments
- Conclusions
3Introduction
- Spatio-temporal databases
- record the geographical locations of moving objects (and sometimes also their shapes) at various timestamps.
- support queries that explore their historical and future (predictive) behaviors.
- Applications: flight control systems, weather forecasting, and mobile computing.
- The database stores the motion functions of moving objects.
- For each object o, its motion function gives its location o(t) at any future time t.
- A predictive window query
- specifies a query region qR and a future time interval qT
- retrieves the set of all objects that will fall in qR during qT.
- Our goal: index moving objects so that a predictive window query can be answered with as few disk I/Os as possible.
- Examples
- Find all airplanes that will be over Florida in the next 10 minutes.
- Report all vessels that will enter the United States in the next hour.
4Motion Function
- We consider linear motion.
- For each object, the database stores
- its minimum bounding rectangle (MBR) at the reference time 0
- its current velocity bounding rectangle (VBR)
- Examples: MBR(a) = {2, 4, 3, 4}, VBR(a) = {1, 1, 1, 1}; MBR(c) = {8, 9, 3, 4}, VBR(c) = {-2, 0, 0, 2}
- An update is necessary only when an object's VBR changes.
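As a rough illustration of linear motion, the sketch below extrapolates an MBR to a future time using its VBR. It assumes the example tuples are ordered as (x-low, x-high, y-low, y-high) for the MBR and in the same order for the velocities; the function name `mbr_at` is made up for this example.

```python
def mbr_at(mbr, vbr, t):
    """Extrapolate a rectangle stored at reference time 0 to time t >= 0.

    Assumed layout: mbr = (x_lo, x_hi, y_lo, y_hi) and
    vbr = (vx_lo, vx_hi, vy_lo, vy_hi); each bound moves at its own velocity.
    """
    return tuple(p + v * t for p, v in zip(mbr, vbr))


# Values taken from the examples on this slide.
mbr_a, vbr_a = (2, 4, 3, 4), (1, 1, 1, 1)
mbr_c, vbr_c = (8, 9, 3, 4), (-2, 0, 0, 2)

print(mbr_at(mbr_a, vbr_a, 1))  # a keeps its shape and shifts by one unit per axis
print(mbr_at(mbr_c, vbr_c, 1))  # c grows because its bounds move at different speeds
```

Under this reading, an object whose lower and upper velocities coincide (such as a) moves rigidly, while one with a non-zero velocity extent (such as c) expands over time.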
5R-tree
- The R-tree aims at minimizing
- the area of each MBR
- the perimeter of each MBR
- the overlap between two MBRs (e.g., N1, N2) in the same node
- the distance between the centroid of an MBR and that of the node containing it
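The four quantities listed above can be computed directly from rectangle coordinates. The following is a minimal sketch, assuming rectangles are stored as (x_lo, x_hi, y_lo, y_hi); the function names and the two sample rectangles are illustrative and do not come from the slides.

```python
import math


def area(r):
    return (r[1] - r[0]) * (r[3] - r[2])


def perimeter(r):
    return 2 * ((r[1] - r[0]) + (r[3] - r[2]))


def overlap(r, s):
    """Area shared by two rectangles (0 if they are disjoint)."""
    w = min(r[1], s[1]) - max(r[0], s[0])
    h = min(r[3], s[3]) - max(r[2], s[2])
    return max(w, 0) * max(h, 0)


def centroid_distance(r, s):
    """Euclidean distance between the centers of two rectangles."""
    dx = (r[0] + r[1]) / 2 - (s[0] + s[1]) / 2
    dy = (r[2] + r[3]) / 2 - (s[2] + s[3]) / 2
    return math.hypot(dx, dy)


# Hypothetical sibling MBRs.
n1 = (0, 4, 0, 4)
n2 = (2, 6, 2, 6)
print(area(n1), perimeter(n1), overlap(n1, n2), centroid_distance(n1, n2))
```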
6R-tree Insertion
7The Time Parameterized R-Tree (TPR-Tree)
- Extends the R-tree by introducing the velocity bounding rectangle (VBR) in all entries.
- Queries are compared with conservative MBRs of non-leaf entries.
- Example VBRs: N1v = {-2, 1, -2, 1} and N2v = {-2, 0, -1, 2}
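To make the comparison of a query with a conservative MBR more concrete, the sketch below checks whether a static query rectangle meets a time-parameterized rectangle at some instant of the query interval. The (x_lo, x_hi, y_lo, y_hi) layout, the function names, and the MBR assigned to N1 are assumptions for illustration; only N1's VBR is taken from the slide.

```python
def _clip(a, b, t_lo, t_hi):
    """Restrict [t_lo, t_hi] to the times t that satisfy a * t <= b."""
    if a == 0:
        return (t_lo, t_hi) if b >= 0 else None
    bound = b / a
    if a > 0:
        t_hi = min(t_hi, bound)   # t <= b / a
    else:
        t_lo = max(t_lo, bound)   # dividing by a negative flips the inequality
    return (t_lo, t_hi) if t_lo <= t_hi else None


def intersects_during(mbr, vbr, q_region, t_start, t_end):
    """True if the moving rectangle overlaps q_region at some t in [t_start, t_end]."""
    window = (t_start, t_end)
    for d in range(2):
        lo, hi = mbr[2 * d], mbr[2 * d + 1]
        v_lo, v_hi = vbr[2 * d], vbr[2 * d + 1]
        q_lo, q_hi = q_region[2 * d], q_region[2 * d + 1]
        # Moving lower bound must stay at or below the query's upper bound.
        window = _clip(v_lo, q_hi - lo, *window)
        if window is None:
            return False
        # Moving upper bound must reach the query's lower bound.
        window = _clip(-v_hi, hi - q_lo, *window)
        if window is None:
            return False
    return True


n1_mbr = (2, 5, 2, 5)        # hypothetical MBR at reference time 0
n1_vbr = (-2, 1, -2, 1)      # VBR of N1 from the slide
q_region = (8, 10, 8, 10)    # hypothetical query region qR

print(intersects_during(n1_mbr, n1_vbr, q_region, 0, 2))
print(intersects_during(n1_mbr, n1_vbr, q_region, 0, 4))
```

In this example the upper bounds of N1 need three time units to reach the query region, so a node can be pruned for a short query interval and must be visited for a longer one.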
8TPR*-Tree
- Our goal
- index moving objects so that a predictive window query can be answered with as few disk I/Os as possible.
- A mathematical model that estimates the cost of answering a predictive window query using TPR-like structures.
- Cost is measured as the number of node accesses.
- Application of the model to derive the optimal performance.
- The TPR-tree is much worse than the optimal structure.
- Examine the algorithms of the TPR-tree, identify their deficiencies, and propose new ones.
- The TPR*-tree.
9TPR deficiency 1: Choosing the sub-tree to insert
- To insert an entry, the TPR-tree picks the sub-tree incurring the minimum penalty (smallest MBR/VBR enlargement).
- This may insert the entry into a bad sub-tree; the problem becomes increasingly serious as time evolves.
10TPR* solution: Choose path
- Aims at finding the best insertion path globally, namely, among all possible paths.
- Observation: We can find this path by accessing only a few more nodes (than the TPR-tree algorithm).
- Maintain a heap: [(g), 0], [(h), 0], [(i), 20]
- Each heap entry holds the path expanded so far and the accumulated penalty so far.
11TPR* solution: Choose path
- Visit node g: [(h), 0], [(a,g), 3], [(i), 20], [(b,g), 32]
- (a,g) and (b,g) are already complete paths, although nodes a and b have not been visited.
12TPR* solution: Choose path
- Visit node h: [(a,g), 3], [(d,h), 9], [(c,h), 17], [(i), 20], [(b,g), 32]
- The algorithm stops now, because the entry at the top of the heap, (a,g), is a complete path.
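The heap trace above can be reproduced with a best-first search over partial paths. The sketch below is an illustration, not the authors' implementation: penalties are hard-coded to match the slide values instead of being derived from MBR/VBR enlargement, the children of node i are invented (the search never reaches them), and paths are written root-first, i.e. (g, a) where the slides write (a,g).

```python
import heapq

# Penalty of extending a path into each child, mirroring the slide example.
CHILDREN = {
    "root": {"g": 0, "h": 0, "i": 20},
    "g": {"a": 3, "b": 32},
    "h": {"d": 9, "c": 17},
    "i": {"e": 1, "f": 2},  # assumed values; node i is never expanded here
}
LEAVES = {"a", "b", "c", "d", "e", "f"}  # reaching one of these completes a path


def choose_path(children, leaves, root="root"):
    """Return (best path, its penalty, list of internal nodes that were visited)."""
    visited = []
    heap = [(penalty, (child,)) for child, penalty in children[root].items()]
    heapq.heapify(heap)
    while heap:
        penalty, path = heapq.heappop(heap)
        node = path[-1]
        if node in leaves:
            # Penalties are non-negative, so the first complete path popped is optimal.
            return path, penalty, visited
        visited.append(node)
        for child, extra in children[node].items():
            heapq.heappush(heap, (penalty + extra, path + (child,)))
    raise ValueError("no complete path found")


path, penalty, visited = choose_path(CHILDREN, LEAVES)
print(path, penalty, visited)
```

Only g and h are expanded before the search stops, which matches the observation that the globally best path is found with few extra node accesses; node i is never read because its accumulated penalty already exceeds that of a complete path.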
13TPR deficiency 2: Which entries to re-insert
- When a node overflows, some of its entries (the ones that diverge most from the node centroid) are re-inserted to defer a node split.
- The entries chosen by the TPR-tree are very likely to be re-inserted back into the same node, so a node split is still necessary.
14TPR* solution: Pick worst
- Aims at selecting, for re-insertion, the entries whose removal most effectively shrinks the MBR or VBR of the node.
- The first step picks an appropriate dimension (either spatial or velocity) based purely on estimation using our cost model (see the paper for details).
- The second step sorts the entries on this dimension and decides which entries to remove.
- Example: If the axis chosen in the first step is the x-axis, then the sorted list is b, d, a, c. Either b or c is removed.
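The second step can be pictured with a heavily simplified sketch. It assumes the dimension has already been chosen, sorts the entries by their lower bound on that dimension, and removes whichever end entry shrinks the node extent more. The actual selection relies on the cost model described in the paper, which is not reproduced here; the coordinates and the name `pick_worst` are invented for illustration.

```python
def pick_worst(entries):
    """entries maps an entry name to its (lo, hi) interval on the chosen dimension.

    Returns the sorted order and the end entry whose removal shrinks the extent most.
    """
    order = sorted(entries, key=lambda name: entries[name][0])

    def extent(names):
        return max(entries[n][1] for n in names) - min(entries[n][0] for n in names)

    full = extent(order)
    shrink_if_first_removed = full - extent(order[1:])
    shrink_if_last_removed = full - extent(order[:-1])
    victim = order[0] if shrink_if_first_removed > shrink_if_last_removed else order[-1]
    return order, victim


# Hypothetical x-intervals chosen so that the sorted order is b, d, a, c.
x_intervals = {"a": (4, 6), "b": (0, 2), "c": (7, 10), "d": (3, 5)}
order, victim = pick_worst(x_intervals)
print(order, victim)
```

With these made-up coordinates, removing c shrinks the node from 10 units to 6, while removing b only shrinks it to 7, so c is selected.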
15TPR deficiency 3: Tightening MBR in deletion
- Entry deletion requires first finding the entry, which accesses many nodes of the tree. The TPR-tree uses this fact to tighten the MBR of non-leaf entries.
- Assume nodes h and i are accessed before e is found; then the TPR-tree will tighten the MBR of i only (enclosing g and f).
17TPR* solution: Active tightening
- Tightening more entries for free.
- Assume nodes h and i are accessed before e is found; then the TPR*-tree will tighten the MBR of both h and i.
19TPR* solution: Active tightening (Cont.)
- Another example: Assume the shaded nodes are accessed to find e.
- Active tightening can tighten the MBRs of n5, n6, n3, and n4.
- But not n1 and n2.
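The idea of tightening every node that the deletion search has already loaded can be sketched as follows. This is a simplified illustration that handles only spatial extents at a single instant and leaves velocities out; the dictionaries, coordinates, and the extra node j are hypothetical.

```python
def union(rects):
    """Smallest rectangle (x_lo, x_hi, y_lo, y_hi) enclosing all given rectangles."""
    return (
        min(r[0] for r in rects),
        max(r[1] for r in rects),
        min(r[2] for r in rects),
        max(r[3] for r in rects),
    )


def tighten_visited(stored_mbr, child_mbrs, visited):
    """Recompute the stored MBR of each node that was read during the search.

    Nodes that were not read keep their old (possibly loose) MBR, because
    tightening them would require additional node accesses.
    """
    tightened = dict(stored_mbr)
    for node in visited:
        tightened[node] = union(child_mbrs[node])
    return tightened


# Hypothetical state after the entry has been deleted.
stored = {"h": (0, 10, 0, 10), "i": (5, 20, 5, 20), "j": (0, 30, 0, 30)}
children = {
    "h": [(1, 3, 1, 3), (4, 6, 2, 5)],
    "i": [(6, 8, 6, 9), (10, 12, 7, 8)],
    "j": [(1, 2, 1, 2)],
}
result = tighten_visited(stored, children, visited=["h", "i"])
print(result)
```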
20Challenge of Migration
- 3 Operating Systems
- Microsoft Windows
- Sun Solaris
- Red Hat Fedora Core 1
- 2 Compilers: CL, GCC (2.9.5, 3.3.2)
- Difference of Code Conversion
- How close are the compilers to the standard?
- Compatibility of Library
21Experiments: Settings (query and tree)
- Dataset
- 50,000 sampled objects' MBRs are taken from a real spatial dataset (NJ Tiger)
- each object is associated with a VBR such that, on each dimension,
- the velocity extent is zero (i.e., the object does not change its spatial extents during its movement)
- the velocity value is randomly distributed in the range [0, 8]
- the velocity can be positive or negative with equal probability.
- We compare TPR*-trees with TPR-trees.
- Disk page size = 1 KB (node capacity = 27 for both trees).
- For each object update, perform a deletion followed by an insertion on each tree.
- Each predictive query is a moving rectangle with these parameters:
- qRlen: the length of the query's MBR
- qVlen: the length of the query's VBR
- qTlen: the number of timestamps covered.
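The velocity assignment described above might be generated along the following lines. This is a sketch under assumptions: the slides only say the speed is random in [0, 8], so a uniform distribution is assumed here, and the function name, seed, and sample size are arbitrary.

```python
import random


def random_vbr(rng, max_speed=8.0, dims=2):
    """Build a VBR (v_lo, v_hi per dimension) with zero velocity extent."""
    vbr = []
    for _ in range(dims):
        speed = rng.uniform(0.0, max_speed)   # assumed uniform in [0, max_speed]
        sign = rng.choice((-1.0, 1.0))        # positive or negative with equal probability
        velocity = sign * speed
        vbr.extend((velocity, velocity))      # lower == upper, so the object keeps its shape
    return tuple(vbr)


rng = random.Random(42)  # fixed seed so the sample is reproducible
vbrs = [random_vbr(rng) for _ in range(1000)]
print(vbrs[0])
```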
22TPR-tree
23TPR-tree
24Conclusions
- The TPR-tree combines the idea of conservative MBR directly with the tree construction algorithms of R-trees.
- The TPR*-tree improves it by designing algorithms that take into account the special features of moving objects.
- Cost model for performance analysis
- The optimal performance of a hypothetically best structure
- Reduce disk I/Os for predictive queries
25Q&A
26Thanks!