Title: Cutting Spatial Data Into Ribbons
1 Cutting Spatial Data Into Ribbons
- Sampling Proximity Relations to Permit Some
Spatial Analysis While Preventing Other Spatial
Analysis
- Alan Saalfeld
- The Ohio State University
2 NOT CALLED: Cutting Spatial Data into Scraps
- Throwing Away All Adjacency Relations, Thereby
Permitting No Inter-Regional Spatial Analysis
Whatsoever!
3 Preview
- Some discrete math
- Some affine geometry
- Some spatial relations
- Some existence results
- Some non-existence results
- Some transformations
- Some solutions in search of problems
4 If only we lived in a 1-D world!
- Everything would be comparable.
- There would be a metric-compatible total order on
point locations (no such order exists in 2-D or
higher-D).
- Neighborhoods would always be intervals (a,b).
- After sorting the data, many of the processing
tasks would have linear-time complexity.
- E.g., finding all intervals of a fixed size
would be a linear-time (O(n)) operation.
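As a rough illustration of the last bullet, the sketch below takes an already sorted 1-D dataset and counts, for each point p, how many points fall in the window [p, p + width]. The function name `count_in_windows` and the sample values are made up for this example. Both indices only move forward, so the scan after sorting takes O(n) steps.

```python
def count_in_windows(sorted_pts, width):
    """For each point p, count the points q with p <= q <= p + width.

    Assumes sorted_pts is in ascending order. The right index never
    moves backward, so the whole scan is linear in len(sorted_pts).
    """
    counts = []
    hi = 0
    n = len(sorted_pts)
    for lo in range(n):
        # Advance the right end while the next point is still inside the window.
        while hi < n and sorted_pts[hi] <= sorted_pts[lo] + width:
            hi += 1
        counts.append(hi - lo)
    return counts


# Hypothetical sample data; sorting is the only super-linear step.
pts = sorted([0.5, 3.0, 1.0, 1.2, 7.5, 3.4])
window_counts = count_in_windows(pts, 1.0)
```

No comparable single scan is available for 2-D points, because no total order keeps all near neighbors close together.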
5 "All right, youse guys, now I want you to line
up alphabetically, by height..."
Yogi Berra finds a way to sort 2-D data
6 Data Ribbons
- Decompose the dataset into square regions that
form a fixed-width ribbon.
- Each region has a predecessor and a successor
region (of different shades).
- Each region shares a single square edge with its
predecessor and shares a different single edge
with its successor.
- Exactly 1/2 of the 4-connected adjacencies in 2-D
space are preserved.
7 Hilbert-order Ribbons
- Every square data set may be broken into square
regions and fitted with a fixed-width,
non-overlapping, covering ribbon.
- The ribbon order is quadrant-recursive: the
ribbon covers a (sub)quadrant before moving on to
the next (sub)quadrant.
- Each ribbon has two possible subquadrant
refinements.
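A minimal sketch of a Hilbert-order ribbon on an n x n grid of square cells follows. It uses the standard iterative index-to-coordinate conversion for the Hilbert curve. The function name `hilbert_cell` and the grid size are chosen only for illustration and do not come from the slides. The checks confirm that consecutive cells share an edge and that the ribbon finishes one quadrant before moving on.

```python
def hilbert_cell(n, d):
    """Return (x, y) of the d-th cell along a Hilbert curve on an n x n grid.

    n is assumed to be a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/reflect the s x s block built so far.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


n = 8
ribbon = [hilbert_cell(n, d) for d in range(n * n)]

# Each cell shares an edge with its predecessor and its successor.
steps_ok = all(abs(ax - bx) + abs(ay - by) == 1
               for (ax, ay), (bx, by) in zip(ribbon, ribbon[1:]))

# Quadrant-recursive: the first quarter of the ribbon stays in one quadrant.
first_quadrants = {(x // (n // 2), y // (n // 2)) for x, y in ribbon[: n * n // 4]}

kept = n * n - 1            # adjacencies kept along the ribbon
total = 2 * n * (n - 1)     # all 4-connected adjacencies in the grid
fraction_kept = kept / total
```

On a finite grid `fraction_kept` equals (n + 1) / (2n), slightly above one half because boundary cells have fewer than four neighbors. It approaches one half as n grows, which matches the per-cell count of two kept links out of four.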
8 Statistical Behavior
- (Quad)tree hierarchical data structures have
well-behaved variance/covariance structure.
- ...covariance of values associated with any two
subquadrants is equal to the variance of the
first common ancestor (Cressie, 1998).
- The Hilbert ribbon structure organizes data to
facilitate covariance estimation of contiguous
cells.
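One simple setting in which the common-ancestor property holds can be written out as follows. The additive root-to-leaf model and the symbols Y, W, pa(v), and P(a,u) are assumptions made for this illustration. They are not taken from the slides and are not necessarily the model used in the cited reference.

```latex
% Assumed model: each node v adds a zero-mean innovation W_v to its parent's value.
% Innovations are mutually uncorrelated and uncorrelated with the root value.
Y_v = Y_{\mathrm{pa}(v)} + W_v, \qquad
\operatorname{Cov}(W_v, W_{v'}) = 0 \ (v \neq v'), \qquad
\operatorname{Cov}(W_v, Y_{\mathrm{root}}) = 0 .

% For two nodes u and v with first common ancestor a, let P(a,u) denote the
% nodes on the path from a (excluded) down to u (included). Then
Y_u = Y_a + \sum_{w \in P(a,u)} W_w, \qquad
Y_v = Y_a + \sum_{w \in P(a,v)} W_w .

% P(a,u) and P(a,v) are disjoint, and Y_a involves only the root value and
% innovations at or above a, so every cross term vanishes:
\operatorname{Cov}(Y_u, Y_v) = \operatorname{Var}(Y_a) .
```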
9 Advantages of Hilbert Ribbons
- They are area-based, not point-based.
- They are regular, composed of same-size squares.
- They preserve exactly half the adjacencies.
- They fill area of entire subquadrants at a time.
- They follow the hierarchy of quad-trees.
- Opportunities for multi-resolution statistical
analysis.
- Opportunities to increase or decrease resolution.
10 Disadvantages of Hilbert Ribbons
- Their squares are regular in size and
orientation.
- They cannot accommodate traditional geography.
- They may allow recovery of the underlying grid.
- Recovering that grid would disclose all
locations!
- There are few degrees of freedom in Hilbert
ribbon placement.
- Spatial statistical theory of ribbon structure
may prove accessible, but it still needs
development.
11 Suppose that we begin with our space decomposed
into convex regions. We want to order data
elements in those regions to try to keep all data
from each region together in our file and to try
to keep data from adjacent regions in adjacent
data blocks. We'd also like to be able to analyze
some spatial properties from our ordered data
sets alone, based on proximity properties.
12 Regions and their planar dual graph, with dual
edges crossing region edges at midpoints
13 Drawing of m regions and their planar dual graph
on m vertices. Each edge has a two-piece
bisecting dual edge. Together the edge halves and
dual edge halves decompose each n-sided convex
polygon into n quadrilaterals.
14 Finally, we have enough structure in place to
describe how to cut the data into ribbons.
When we finish our cutting, each quadrilateral
will be connected to exactly two of its four
neighboring quadrilaterals. The connections will
form a cyclic ribbon of quadrilaterals.
1. Cut along any (m-1) dual edges that form
a spanning tree of the dual graph.
2. Cut along every edge in the original graph
whose dual edge has not been cut.
How many different decomposing ribbons exist?
One for each spanning tree of the planar dual
graph!
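The two cutting steps can be sketched on a deliberately simple case. The sketch assumes the regions are the unit squares of a rows x cols grid, so each region splits into four square "quadrilaterals" indexed on a doubled grid. The randomized depth-first search used to pick the spanning tree is an arbitrary choice. The names `random_spanning_tree`, `ribbon_links`, and `walk_ribbon` are hypothetical and not from the slides.

```python
import random
from collections import defaultdict

DIRS = ((0, 1), (1, 0), (0, -1), (-1, 0))

# For each side of a cell: the two quarter-squares (row, col offsets) touching it.
SIDE_QUADS = {
    (0, 1): ((0, 1), (1, 1)),
    (0, -1): ((0, 0), (1, 0)),
    (1, 0): ((1, 0), (1, 1)),
    (-1, 0): ((0, 0), (0, 1)),
}


def random_spanning_tree(rows, cols, rng):
    """Randomized depth-first spanning tree of the dual (cell-adjacency) graph."""
    seen = {(0, 0)}
    stack = [(0, 0)]
    tree = set()
    while stack:
        r, c = stack[-1]
        fresh = [(r + dr, c + dc) for dr, dc in DIRS
                 if 0 <= r + dr < rows and 0 <= c + dc < cols
                 and (r + dr, c + dc) not in seen]
        if not fresh:
            stack.pop()
            continue
        nxt = rng.choice(fresh)
        seen.add(nxt)
        tree.add(frozenset(((r, c), nxt)))
        stack.append(nxt)
    return tree


def ribbon_links(rows, cols, tree):
    """Connections between quadrilaterals that survive the two cutting steps."""
    links = defaultdict(set)
    for r in range(rows):
        for c in range(cols):
            for (dr, dc), (qa, qb) in SIDE_QUADS.items():
                a = (2 * r + qa[0], 2 * c + qa[1])
                b = (2 * r + qb[0], 2 * c + qb[1])
                if frozenset(((r, c), (r + dr, c + dc))) in tree:
                    # Step 1: the dual edge is cut, separating a from b; each
                    # stays joined to the quadrilateral across the region edge.
                    for q in (a, b):
                        across = (q[0] + dr, q[1] + dc)
                        links[q].add(across)
                        links[across].add(q)
                else:
                    # Step 2: the region edge is cut (or lies on the outer
                    # boundary); a and b stay joined across the dual edge.
                    links[a].add(b)
                    links[b].add(a)
    return links


def walk_ribbon(links):
    """Follow the links from quadrilateral (0, 0) until the walk closes up."""
    start = (0, 0)
    order, prev, cur = [], None, start
    while True:
        order.append(cur)
        nxt = next(q for q in links[cur] if q != prev)
        prev, cur = cur, nxt
        if cur == start:
            return order


rows, cols = 3, 4
tree = random_spanning_tree(rows, cols, random.Random(0))
links = ribbon_links(rows, cols, tree)
ribbon = walk_ribbon(links)
```

In this grid setting every quadrilateral ends up with exactly two links, and the walk visits all 4 x rows x cols quadrilaterals before returning to its start. A different seed yields a different spanning tree and hence a different ribbon.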
15 The Ribbon of Quadrilaterals
- On average, regions will have their set of
interior quadrilaterals distributed over slightly
less than 2 unbroken ribbon segments.
- The edges chosen for the spanning tree of the
dual graph may be constrained to respect
hierarchies of regions to a large extent.
- For example, a county-to-county link may be given
a very low priority of being in the spanning tree
if the neighboring counties are in different
states.
16 [Figure: dual graph with edge costs labeled 2 and 10]
To limit the number of crossings of major
boundaries by the spanning tree of the dual
graph, dual edges that cross the boundary may be
assigned a much higher cost.
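A small sketch of this cost-based approach follows, using Kruskal's minimum spanning tree algorithm as one possible way to pick the tree. The 2 x 4 grid of counties, the two-state split, and the reading of cost 10 for boundary-crossing dual edges and cost 2 for all others are assumptions made for this example.

```python
def kruskal(nodes, weighted_edges):
    """Minimum spanning tree by Kruskal's algorithm with a small union-find."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    tree = []
    for w, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v, w))
    return tree


# Hypothetical layout: a 2 x 4 grid of counties; columns 0-1 lie in
# state "A" and columns 2-3 in state "B".
counties = [(r, c) for r in range(2) for c in range(4)]
state = {rc: ("A" if rc[1] < 2 else "B") for rc in counties}

edges = []
for r, c in counties:
    for nbr in ((r, c + 1), (r + 1, c)):
        if nbr in state:
            # Assumed costs: 10 for a dual edge crossing the state line, else 2.
            cost = 2 if state[(r, c)] == state[nbr] else 10
            edges.append((cost, (r, c), nbr))

tree = kruskal(counties, edges)
crossings = [(u, v) for u, v, w in tree if state[u] != state[v]]
```

Because each state's counties are connected among themselves, the cheap edges are exhausted first and the tree crosses the state line only once. Randomizing the order of equal-cost edges would vary the tree without adding crossings.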
17 Piecewise Linear Homeomorphisms
- PLH maps are described fully by their action on
triangles.
- PLH maps are described fully by their action on
triangle vertices.
- PLH maps agree on shared edges that are straight
line segments.
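To make the three bullets concrete, here is a hedged sketch of one piece of a PLH map: the affine map fixed by sending the three vertices of a source triangle to the three vertices of a target triangle, evaluated with barycentric coordinates. The function names and the sample coordinates are illustrative only.

```python
def barycentric(p, tri):
    """Barycentric coordinates of point p with respect to triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    l1 = ((y2 - y3) * (p[0] - x3) + (x3 - x2) * (p[1] - y3)) / det
    l2 = ((y3 - y1) * (p[0] - x3) + (x1 - x3) * (p[1] - y3)) / det
    return l1, l2, 1.0 - l1 - l2


def map_point(p, src_tri, dst_tri):
    """Image of p under the affine map sending src_tri vertices to dst_tri vertices."""
    weights = barycentric(p, src_tri)
    x = sum(w * v[0] for w, v in zip(weights, dst_tri))
    y = sum(w * v[1] for w, v in zip(weights, dst_tri))
    return x, y


# Two source triangles sharing edge B-C, and their hypothetical images.
A, B, C, D = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)
A2, B2, C2, D2 = (0.0, 0.0), (2.0, 0.0), (0.0, 1.0), (3.0, 2.0)

mid_bc = (0.5, 0.5)
via_first = map_point(mid_bc, (A, B, C), (A2, B2, C2))
via_second = map_point(mid_bc, (B, D, C), (B2, D2, C2))
```

Both pieces send the midpoint of the shared edge to the midpoint of its image edge. The two affine pieces therefore agree along that edge and glue into a continuous map.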
18 Any polygon can be mapped to any other by a PLH
map
- Any n-sided polygon can be triangulated.
- Any triangulation of an n-sided polygon is also a
triangulation of a similarly labeled regular
n-gon.
- The composition of two PLH maps is a PLH map.
20 Area-preserving transformations
- Lemma: A PLH map, realized as a bijection of
triangles, is area-preserving everywhere if and
only if it sends triangles to triangles of the
same area.
- Proof: A necessary and sufficient condition for
an affine function $(x, y) \mapsto (ax + by + c,\ dx + ey + f)$ to
preserve area is for the determinant of its
Jacobian, $ae - bd$, to be equal to 1 or -1.
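The determinant condition can be checked numerically on a single triangle. In the sketch below the helper names and the two sample maps are invented for illustration. One map is a shear with ae - bd = 1 and the other is a stretch with ae - bd = 2.

```python
def affine(a, b, c, d, e, f):
    """Return the map (x, y) -> (a*x + b*y + c, d*x + e*y + f)."""
    return lambda p: (a * p[0] + b * p[1] + c, d * p[0] + e * p[1] + f)


def triangle_area(p, q, r):
    """Unsigned area of triangle pqr from the cross product of two edge vectors."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])) / 2


tri = [(0, 0), (4, 0), (1, 3)]

shear = affine(1, 2, 5, 0, 1, -1)    # ae - bd = 1*1 - 2*0 = 1
stretch = affine(2, 0, 0, 0, 1, 0)   # ae - bd = 2*1 - 0*0 = 2

area_before = triangle_area(*tri)
area_sheared = triangle_area(*[shear(p) for p in tri])
area_stretched = triangle_area(*[stretch(p) for p in tri])
```

The shear leaves the area unchanged, while the stretch multiplies it by |ae - bd| = 2. The translation terms c and f play no role.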
21 Area-preserving transformations
- PLH maps may be found that are area-proportional
everywhere.
- Proof by induction:
- Trivial for n = 3.
- Construction for n = 4.
- First show for convex sets for n > 4.
22 Area-preserving transformations
- The following are equivalent:
- 1. Any two convex n-sided polygons of equal area
are homeomorphic under an area-preserving PLH map
that is linear on each corresponding edge pair.
- 2. Any convex n-sided polygon is homeomorphic
under an area-preserving PLH map to a regular
n-gon of the same area. The homeomorphism may be
taken to be linear on each corresponding edge
pair.
23 Node Splitting and Area Splitting
- Every convex polygon possesses a splitter that
divides the area in equal parts and also splits
the vertices into equal groups.
24 Area-preserving transformations
- Proof for convex sets:
- Suppose statements 1 and 2 are true for all
convex sets of size k < n, where n > 4.
- Find a splitter for a convex n-gon that splits it
into two equal-area $(\lfloor n/2 \rfloor + 2)$-sided convex
polygons. Note $\lfloor n/2 \rfloor + 2 < n$.
- Map each half into half of a regular n-gon.
25 Area-preserving transformations
- A technical detail:
- On 1 or 2 of the boundary edges, the ratio of the
two pieces of the divided edge may differ for the
convex $(\lfloor n/2 \rfloor + 2)$-gon and for the half of the
regular n-gon. We can always adjust the edge
pieces with yet another area-preserving PLH map
so that they recover the original ratio.
27 Area-preserving transformations
- Proof for non-convex sets:
- Suppose any non-convex k-gon for k < n has a PLH
area-preserving, boundary-extending map to any
same-area convex k-gon, where n > 4.
- For the non-convex n-gon, triangulate and remove
an ear. Ears always exist.
- Map the (n-1)-gon to a "slab" convex set with the
edge from the missing ear going to the long edge
of the slab.
- Map the ear to an ear-sized triangle attached
to the slab along the long edge. (Construction
makes it possible.)
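The ear-removal step can be illustrated with a small search for an ear in a simple polygon. The sketch assumes vertices are listed counterclockwise, and the L-shaped test polygon and the function names are hypothetical. A vertex counts as an ear tip when it is convex and the triangle it forms with its two neighbors contains no other polygon vertex, boundary included.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o, a, b (positive if counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_closed_triangle(p, a, b, c):
    """True if p lies inside or on the boundary of triangle abc."""
    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def find_ear(poly):
    """Index of an ear tip of a simple polygon given in counterclockwise order."""
    n = len(poly)
    for i in range(n):
        prev_i, next_i = (i - 1) % n, (i + 1) % n
        a, b, c = poly[prev_i], poly[i], poly[next_i]
        if cross(a, b, c) <= 0:
            continue  # reflex or degenerate corner, cannot be an ear tip
        others = (poly[j] for j in range(n) if j not in (prev_i, i, next_i))
        if any(in_closed_triangle(p, a, b, c) for p in others):
            continue  # the candidate diagonal is blocked by another vertex
        return i
    return None


def polygon_area(poly):
    """Unsigned area by the shoelace formula."""
    n = len(poly)
    twice = sum(poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
                for i in range(n))
    return abs(twice) / 2


# Hypothetical non-convex hexagon (an L shape), counterclockwise.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
ear = find_ear(l_shape)
remainder = l_shape[:ear] + l_shape[ear + 1:]
ear_area = polygon_area([l_shape[ear - 1], l_shape[ear], l_shape[(ear + 1) % 6]])
```

Removing the ear leaves an (n-1)-gon whose area is the original area minus the ear's area. This is the bookkeeping the inductive step relies on.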
28 In Summary
- We saw how to cut data into ribbons to remove
exactly half of the neighbor relations. We
discussed a randomization procedure for relation
sampling.
- We proved the existence of conforming meshes with
prescribed PLH behavior on the boundary that
scale all triangles uniformly. We illustrated
transformations that preserve some spatial
analysis capabilities.
32 More Things to Do
- We will analyze how well randomization of the
spanning tree generation will protect sensitive
location information.
- We will develop theory for spatial analysis of
ribboned data sets, including measures and bounds
on the uncertainty that is due to our ribboning
procedure.
33 An Earlier Strategy
- 1. Transform the 2-D dataset into a 1-D dataset
(using a proximity-preserving transformation).
- 2. Create contextual variables for the 1-D
dataset.
- 3. Interpret the 1-D contextual variables in
terms of their corresponding sampled 2-D
contexts.
- 4. Assess information loss and uncertainty, and
interpret them as data protection measures.
34 Gamut of Spatial Detail
- Smoothed local summary statistics
- Aggregated measures: averages, densities
- Measures of local interaction
- Regression, Spatial autocorrelation
-
- Raw data to permit any spatial analysis
- Exact coordinates of every data point