Title: An Overview of Geospatial Data Structure, Algorithms, Mining, and Fusion
1An Overview of Geospatial Data Structure,
Algorithms, Mining, and Fusion
- Presented by
- GDF Learning Group
2Geospatial Data Structure
- Representation Modes
- Representing the Geometry of a Collection of
Objects
3Representation Modes of Geospatial Data Structure
- Tessellation
- Vector Mode
- Half-Plane Representation
4Tessellation
- Fixed Tessellation (Regular)
- This model usually contains a grid or pattern.
- Ex. Raster data
- Variable Tessellation (Irregular)
- This model is less organized.
- Ex. An area partitioned into zones using polygons.
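As a minimal sketch of a fixed tessellation, the function below maps a coordinate to the row and column of a regular grid cell, the way a raster is usually addressed. The function name, the upper-left origin convention, and the square cell size are assumptions made for illustration only.

```python
def cell_index(x, y, origin_x, origin_y, cell_size, n_rows, n_cols):
    """Map (x, y) to the (row, col) of a regular grid.

    Assumes (origin_x, origin_y) is the upper-left corner of the grid,
    rows grow downward, and every cell is a square of side cell_size.
    Returns None when the coordinate falls outside the grid.
    """
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)
    if 0 <= row < n_rows and 0 <= col < n_cols:
        return row, col
    return None
```

An irregular tessellation has no such closed-form lookup; finding the zone that holds a point requires a point-in-polygon test against the zone boundaries.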
5Vector Mode
- In this mode, objects are constructed from points
and edges, which together describe geometry in a
2-dimensional plane.
- Structure Notation
- point
- polyline
- polygon
- region
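The structure notation above can be sketched as a handful of small types. This is one illustrative encoding, not a standard one: the class and field names are assumptions, and a region is modeled simply as a set of polygons.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # a single (x, y) coordinate pair


@dataclass
class Polyline:
    points: List[Point]  # ordered vertices; consecutive pairs form edges

    def edges(self):
        return list(zip(self.points, self.points[1:]))


@dataclass
class Polygon:
    ring: List[Point]  # boundary vertices; the last vertex connects back to the first

    def edges(self):
        return list(zip(self.ring, self.ring[1:] + self.ring[:1]))


@dataclass
class Region:
    polygons: List[Polygon]  # a region is described by one or more polygons


square = Polygon([(0, 0), (4, 0), (4, 4), (0, 4)])
path = Polyline([(0, 0), (1, 1), (2, 0)])
```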
6Half-Plane Representation
- This type of modeling is used more in instances
of 3D modeling.
- It combines the first two techniques and uses
them to represent something more solid, such as
a building structure, mountains, valleys, etc.
7Representing the Geometry of a Collection of
Objects
- Spaghetti Model
- Network Model
- Topological Model
8Spaghetti Model
- The spaghetti model is a type of vector model.
- Here, the geometry of any spatial object within
the collection is described independently of all others.
- The boundary between two adjacent regions is
represented twice.
- The key to this model is simplicity.
9Network Model
- Another type of vector model.
- Designed to represent networks in network-based
applications.
- In this type, nodes are the central points,
with arcs, lines, and polylines branching out.
- Polygons do not necessarily close.
10Topological Model
- Another vector model, much like the network model.
- The only difference is that some nodes may not be
connected by a line, arc, or polyline.
- Lines here are stored only once.
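The storage difference between the spaghetti and topological models can be illustrated with two adjacent squares, A and B. In this sketch the dictionary layout, the arc numbering, and the helper name are all assumptions chosen for illustration: the spaghetti version repeats the shared boundary in both coordinate lists, while the topological version stores each arc once and lets polygons refer to arcs by id.

```python
# Spaghetti: every polygon carries its own full coordinate list.
spaghetti = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}


def spaghetti_edges(polys):
    """List every boundary edge as an undirected pair of endpoints."""
    edges = []
    for ring in polys.values():
        for p, q in zip(ring, ring[1:] + ring[:1]):
            edges.append(frozenset((p, q)))
    return edges


# Topological: arcs are stored once; polygons reference them by id.
arcs = {
    1: ((0, 0), (2, 0)), 2: ((2, 0), (2, 2)), 3: ((2, 2), (0, 2)),
    4: ((0, 2), (0, 0)), 5: ((2, 0), (4, 0)), 6: ((4, 0), (4, 2)),
    7: ((4, 2), (2, 2)),
}
topological = {"A": [1, 2, 3, 4], "B": [5, 6, 7, 2]}

shared_edge = frozenset(((2, 0), (2, 2)))
stored_twice = spaghetti_edges(spaghetti).count(shared_edge)  # edge appears in A and in B
shared_arcs = set(topological["A"]) & set(topological["B"])   # adjacency is explicit
```

With arc ids, finding the neighbors of a polygon is a set intersection, whereas the spaghetti form requires comparing coordinates.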
11Algorithms
12What is an algorithm?
- A procedure or formula for solving a problem.
(www.cctvconsult.com/glossary.htm)
- A finite set of step-by-step instructions for a
problem-solving or computation procedure,
especially one that can be implemented by a
computer. (www.garlic.com/lynn/secgloss.htm)
13Types of algorithms in GIS
- Point in polygon
- Polyline intersections
- Polygon intersections
- Windowing
- Clipping
14Point in Polygon
- Used to determine if a point lies in the area of
a polygon
- Tests the edges of a polygon to see if the point
lies on an edge of the polygon
- Tests the inside of the polygon to see if the
point lies inside the polygon
- Refer to the text for the actual algorithm on
page 178
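The two tests described above can be sketched with the common ray-casting approach. This is not necessarily the algorithm given in the referenced text; it is a minimal illustration that assumes a simple polygon given as a list of (x, y) vertices, and the function names and tolerance `eps` are illustrative choices.

```python
def on_segment(p, a, b, eps=1e-12):
    """True if point p lies on the segment a-b."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if abs(cross) > eps:
        return False  # p is not on the line through a and b
    return (min(a[0], b[0]) - eps <= p[0] <= max(a[0], b[0]) + eps and
            min(a[1], b[1]) - eps <= p[1] <= max(a[1], b[1]) + eps)


def point_in_polygon(p, ring):
    """Return 'edge', 'inside', or 'outside' for point p and a simple polygon."""
    n = len(ring)
    inside = False
    for i in range(n):
        a, b = ring[i], ring[(i + 1) % n]
        # Edge test: the point lies on the boundary.
        if on_segment(p, a, b):
            return "edge"
        # Interior test: count crossings of a horizontal ray going right from p.
        if (a[1] > p[1]) != (b[1] > p[1]):
            x_cross = a[0] + (p[1] - a[1]) * (b[0] - a[0]) / (b[1] - a[1])
            if x_cross > p[0]:
                inside = not inside
    return "inside" if inside else "outside"
```

An odd number of crossings means the point is inside; an even number means it is outside.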
16Polyline intersections
- It is possible to detect intersections between
polylines with algorithms
- Refer to the text for the actual algorithm on
page 180
- It is also possible to compute new geometric
objects as the result of an intersection between
polylines
- Refer to the text for the actual algorithm on
page 182
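Both tasks can be sketched with a brute-force pairwise comparison of segments. This is an illustrative approach, not necessarily the one in the referenced text: detection uses orientation tests, and the intersection points are computed from the parametric form of each segment. All function names are assumptions.

```python
def orient(a, b, c):
    """Cross product sign for the turn a -> b -> c (0 means collinear)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_box(a, b, c):
    """True if c lies within the bounding box of segment a-b."""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0]) and
            min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def segments_intersect(p1, p2, q1, q2):
    """Detect whether segments p1-p2 and q1-q2 share at least one point."""
    d1, d2 = orient(q1, q2, p1), orient(q1, q2, p2)
    d3, d4 = orient(p1, p2, q1), orient(p1, p2, q2)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True  # proper crossing
    # Touching or collinear overlap: an endpoint lies on the other segment.
    return ((d1 == 0 and in_box(q1, q2, p1)) or (d2 == 0 and in_box(q1, q2, p2)) or
            (d3 == 0 and in_box(p1, p2, q1)) or (d4 == 0 and in_box(p1, p2, q2)))


def intersection_point(p1, p2, q1, q2):
    """Crossing point of two non-parallel segments, or None."""
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    sx, sy = q2[0] - q1[0], q2[1] - q1[1]
    denom = rx * sy - ry * sx
    if denom == 0:
        return None  # parallel or collinear: no single crossing point
    t = ((q1[0] - p1[0]) * sy - (q1[1] - p1[1]) * sx) / denom
    u = ((q1[0] - p1[0]) * ry - (q1[1] - p1[1]) * rx) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (p1[0] + t * rx, p1[1] + t * ry)
    return None


def polyline_intersections(a, b):
    """All crossing points between the segments of polylines a and b."""
    points = []
    for p1, p2 in zip(a, a[1:]):
        for q1, q2 in zip(b, b[1:]):
            pt = intersection_point(p1, p2, q1, q2)
            if pt is not None:
                points.append(pt)
    return points
```

The pairwise loop costs time proportional to the product of the two segment counts; sweep-line methods reduce this for large inputs.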
18Polygon intersections
- Used to determine if two polygons overlap
- Checks if one polygon and another share an edge
- Also checks if one polygon is inside (or
partially inside) another polygon
- Refer to the text for the actual algorithm on
page 186
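A minimal sketch of these checks for simple polygons follows. It is an illustration rather than the algorithm from the referenced text: first every pair of boundary edges is tested for contact, and if none touch, one vertex of each polygon is tested for containment in the other. Function names are assumptions.

```python
def orient(a, b, c):
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_box(a, b, c):
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0]) and
            min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def segments_touch(p1, p2, q1, q2):
    """True if the two segments cross, touch, or overlap."""
    d1, d2 = orient(q1, q2, p1), orient(q1, q2, p2)
    d3, d4 = orient(p1, p2, q1), orient(p1, p2, q2)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True
    return ((d1 == 0 and in_box(q1, q2, p1)) or (d2 == 0 and in_box(q1, q2, p2)) or
            (d3 == 0 and in_box(p1, p2, q1)) or (d4 == 0 and in_box(p1, p2, q2)))


def contains(ring, p):
    """Ray-casting test; points exactly on the boundary are not handled specially."""
    inside = False
    for a, b in zip(ring, ring[1:] + ring[:1]):
        if (a[1] > p[1]) != (b[1] > p[1]):
            x_cross = a[0] + (p[1] - a[1]) * (b[0] - a[0]) / (b[1] - a[1])
            if x_cross > p[0]:
                inside = not inside
    return inside


def polygons_intersect(r1, r2):
    """True if two simple polygons share an edge, overlap, or nest."""
    edges1 = list(zip(r1, r1[1:] + r1[:1]))
    edges2 = list(zip(r2, r2[1:] + r2[:1]))
    # 1. Boundary contact: shared edges, crossings, or touching vertices.
    for p1, p2 in edges1:
        for q1, q2 in edges2:
            if segments_touch(p1, p2, q1, q2):
                return True
    # 2. No boundary contact: one polygon may lie entirely inside the other,
    #    in which case any single vertex decides the question.
    return contains(r1, r2[0]) or contains(r2, r1[0])
```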
19(Figure with labels A, B, and C)
20Windowing
- Tests whether a geometric object intersects a
rectangle
- Also tests to see if the object is entirely
contained by the rectangle
- Refer to the text for the actual algorithm on
page 193
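The two windowing tests can be sketched as follows for an object given by its vertices. This is an illustration, not the algorithm from the referenced text. It assumes an axis-aligned rectangle written as `(xmin, ymin, xmax, ymax)` and treats the object as its boundary lines, so a polygon that completely surrounds the window would need an extra point-in-polygon check.

```python
def segment_hits_rect(p, q, rect):
    """True if segment p-q shares at least one point with the rectangle."""
    xmin, ymin, xmax, ymax = rect
    t0, t1 = 0.0, 1.0  # parameter interval of p + t*(q - p) still inside
    dx, dy = q[0] - p[0], q[1] - p[1]
    for d, lo, hi, start in ((dx, xmin, xmax, p[0]), (dy, ymin, ymax, p[1])):
        if d == 0:
            # Segment is parallel to this slab: it must start inside it.
            if start < lo or start > hi:
                return False
        else:
            ta, tb = (lo - start) / d, (hi - start) / d
            if ta > tb:
                ta, tb = tb, ta
            t0, t1 = max(t0, ta), min(t1, tb)
            if t0 > t1:
                return False
    return True


def window_query(points, rect, closed=False):
    """Classify a polyline (or polygon ring if closed=True) against rect.

    Returns 'contained', 'intersects', or 'disjoint'.
    """
    xmin, ymin, xmax, ymax = rect
    # A rectangle is convex, so all vertices inside means the object is inside.
    if all(xmin <= x <= xmax and ymin <= y <= ymax for x, y in points):
        return "contained"
    segments = list(zip(points, points[1:]))
    if closed:
        segments.append((points[-1], points[0]))
    if any(segment_hits_rect(a, b, rect) for a, b in segments):
        return "intersects"
    return "disjoint"
```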
21(Figure with labels r and A)
22Clipping
- Computes the part of a geometric object that lies
inside of a rectangle
- Refer to the text for the actual algorithm on
page 196
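For polygons, one well-known way to compute the part inside a rectangle is Sutherland-Hodgman clipping, sketched below. It is offered as an illustration and may differ from the algorithm in the referenced text. The sketch assumes an axis-aligned rectangle written as `(xmin, ymin, xmax, ymax)`; for concave polygons the result can contain overlapping edges along the window boundary.

```python
def clip_polygon(ring, rect):
    """Clip a polygon (list of (x, y) vertices) to an axis-aligned rectangle."""
    xmin, ymin, xmax, ymax = rect

    def clip_against(pts, inside, intersect):
        """Keep the part of pts on the inner side of one window boundary."""
        out = []
        for i, cur in enumerate(pts):
            prev = pts[i - 1]  # pts[-1] closes the ring when i == 0
            if inside(cur):
                if not inside(prev):
                    out.append(intersect(prev, cur))  # entering the window
                out.append(cur)
            elif inside(prev):
                out.append(intersect(prev, cur))      # leaving the window
        return out

    def cut_x(x):
        # Only called when a and b lie on opposite sides, so a[0] != b[0].
        return lambda a, b: (x, a[1] + (b[1] - a[1]) * (x - a[0]) / (b[0] - a[0]))

    def cut_y(y):
        # Only called when a and b lie on opposite sides, so a[1] != b[1].
        return lambda a, b: (a[0] + (b[0] - a[0]) * (y - a[1]) / (b[1] - a[1]), y)

    pts = list(ring)
    for inside, intersect in (
        (lambda p: p[0] >= xmin, cut_x(xmin)),
        (lambda p: p[0] <= xmax, cut_x(xmax)),
        (lambda p: p[1] >= ymin, cut_y(ymin)),
        (lambda p: p[1] <= ymax, cut_y(ymax)),
    ):
        if not pts:
            break  # nothing left inside the window
        pts = clip_against(pts, inside, intersect)
    return pts
```

Clipping a square with corners (2, 2) and (6, 6) to the window (0, 0, 4, 4) leaves the square with corners (2, 2) and (4, 4).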
24Introduction to Geospatial Data Mining and
Knowledge Discovery
- Remote sensing technologies have greatly enhanced
our capabilities to collect terabytes of
geographic data
- This data is difficult to understand and needs to
be transformed into useful and understandable
information
25Knowledge Discovery (KD) Technology
- Knowledge discovery technology empowers
development of the next generation database
management and info systems through its abilities
to extract new, insightful info embedded within
large heterogeneous databases and to formulate
knowledge
- A KD process includes data warehousing, target
data selection, cleaning, preprocessing,
transformation and reduction, data mining, model
selection (or combination), evaluation and
interpretation, and consolidation and use of the
extracted knowledge (Fayyad 1997, P5)
26Knowledge Discovery
- Ultimately, KD aims to enable an information
system to transform information to knowledge
through hypothesis testing and theory formation
27Data Mining (DM)
- Data mining is the non-trivial process of
identifying valid, novel, potentially useful and
ultimately understandable patterns in data
(Fayyad et al. 1996)
- It aims to develop algorithms for extracting new
patterns from the facts recorded in a database
- Data mining does not apply in cases where the
outcome is already known
28Data Mining
- The info data mining reveals should be useful and
relevant
- Successful applications of DM are not common
- Establishing its relevance and explaining its
cause are very difficult
- There are many analysis techniques that can be
used to try to determine if info is useful
29Data Mining
- Millions or even billions of hypotheses must be
made
- It must be determined how false positives can be
differentiated from truly significant findings
30Visualization
- Discovered knowledge must be expressed in some
way
- This data is best displayed visually so as to be
better understood
31Geospatial Data
- Three characteristics of geospatial data create
challenges to the development of a robust data
foundation
32Characteristics of Geospatial Data
- First, geospatial data repositories tend to be very
large
- The second characteristic relates to phase
characteristics of data collected cyclically
- Data discovery must accommodate collection cycles
that may be unknown or that may shift from cycle
to cycle in both time and space
33Characteristics of Geospatial Data
- The third characteristic concerns the data
foundation rather than the data themselves
- The internet has supported development of data
clearinghouses, digital libraries, and online
repositories wherein one does not access data,
but pointers to data
- As digital data become more available on the
internet, they become increasingly difficult to
locate, retrieve, and analyze
34Unique Properties of Geographic Data
- While KD applications involve highly dimensioned
information spaces, geographic data is unique
since up to four dimensions of the information
space are interrelated and provide the
measurement framework for the remaining dimensions
35Unique Properties of Geographic Data
- Measured geographic attributes often exhibit the
properties of spatial dependency and spatial
heterogeneity
- Spatio-temporal objects and patterns are very
complex
36Unique Properties of Geographic Data
- The development of data mining and knowledge
discovery tools must be supported by a solid
geographic foundation that accommodates the
unique characteristics and challenges presented
by geospatial data
37Geographic Knowledge Discovery Uses in Geographic
Research
- Map interpretation and info extraction
- Info extraction from remotely sensed imagery
- Mapping environmental features
- Extracting spatio-temporal patterns
- Spatial interaction, flow and movement in
geographic space and human geographic systems
38Critical Challenges in Geographic Knowledge
Discovery and Data Mining
- Developing and supporting geographic data
warehouses
- Better spatio-temporal representations in
geographic knowledge discovery
- Geographic KD using diverse data types
- User interfaces for geographic KD
- Proofs of concept and benchmarking
- Building discovered geographic knowledge into GIS
and spatial analysis
39Objectives
- Apply DM and KD techniques to the new generations
of geospatial data models and identify analytical
and visualization needs for geospatial DM and
KD
- Develop a taxonomy of geographic knowledge and
categorize models for geographic information
computing
- Enable a full implementation of geographic KD
across distributed databases that allows the
general public to inspect climate patterns and
regional demographic dynamics, for example, on
the internet
40Geospatial Data Structure, Data Fusion
- Fusion: the blending of two or more
things.
41Geospatial Data Fusion
- Fusion of gridded collections of measurements (ex.
Image fusion)
- Fusion of remote sensing image data and semantic
data residing in a GIS
42Image Fusion
- Data Level Fusion
- Feature Level Fusion
- Decision Level Fusion
43Fusion of remote sensing image data and semantic
data residing in a GIS
- Feature Level Fusion
- Decision Level Fusion
- Modeling of Processes (biogeochemical)
44Data Level Fusion
- Spatial Domain Fusion
- Spectral Domain Fusion
- Scale-Space Fusion
45Decision Level Fusion
- Classifier Fusion
- Classifier Selection
46Conclusion
- Geospatial data integration requires a good
understanding of data structure, algorithms, data
mining, and knowledge discovery techniques.
- In our next discussion we will explore these
concepts further.