Title: Image Information Mining in Remote Sensing
1Image Information Mining in Remote Sensing
2Outline
- Introduction
- Remote sensing image information mining (ReSim)
- Supervised image classification
- Texture feature extraction
- Category-based clustering
- Query-by-example and evaluation
- Summary and future work
3Data Collected and Stored
- Technology is available to help us collect data
- Bar codes, scanners, satellites, cameras, ...
- Technology is available to help us store data
- Databases, data warehouses, ...
4We are drowning in data, but starving for
information and knowledge.
Knowledge Discovery and Data Mining
Knowledge Discovery
5Data Mining: An Iterative Process
- In theory, data (information) mining is a step in the knowledge discovery process. In practice, the two terms are becoming synonyms.
Figure: the knowledge discovery process (labels: Data, Data Preprocessing, Data Cleaning, Databases, Selection, Task-relevant Data, Data Mining, Patterns, Pattern Evaluation, Knowledge)
- Learning the application domain for relevant prior knowledge
- Gathering, integrating, cleaning, and preprocessing data
- Selecting data (find useful features, dimensionality/variable reduction, ...)
- Choosing data mining functions (classification, clustering, association rules, ...)
- Interpreting and evaluating results (visualization, removing redundant patterns, ...)
6Data Mining Confluence of Multiple Disciplines
7Remote Sensing
- Remote Sensing is the science of acquiring
information about the Earth's surface without
actually being in contact with it.
- Recording reflected energy
- Images collected in multiple bands of the electromagnetic spectrum
8Information Mining in Remote Sensing Images
- Motivation
- Enormous growth in the volume of remotely sensed imagery
- Existing systems allow simple queries on sensor, date, location, ...
- Need for efficient retrieval of useful information
- Objective
- Develop integrated software tools for
professionals in remote sensing to mine
interesting information in remotely sensed image
databases
- Commercial data mining packages
- IBM Intelligent Miner for Data, SGI MineSet, ...
- Image information mining research prototypes
- ITSC Algorithm Development and Mining (ADaM) System
- NASA JPL Diamond Eye System
- DLR Intelligent Satellite Information Mining System
- Insightful VisiMine System
9Information Mining in Remote Sensing Images
(contd)
- Image information mining is an interdisciplinary endeavor
- Computer vision (image processing)
- Pattern recognition (classification and clustering)
- Databases (images and ancillary data)
- Information retrieval (indexing and queries)
- Challenges of mining information in remote sensing images
- Multi-/hyperspectral data (huge size, different formats)
- Time-consuming preprocessing (correction and registration)
- Complex spatial/temporal associations
- Feature extraction and semantic definition (application specific)
- Ancillary data (climate variables, digital elevation model)
- Interpretation (a priori and domain knowledge)
11ReSim System Architecture
Spectral information: land cover classes
Spatial information: texture
- Image Processing
- principal component analysis
- texture feature extraction
- classification and clustering
- Databases
- object-oriented database
- image database
- Graphical User Interface
- query
- browsing
- visualization
12Multi-band Landsat Thematic Mapper (TM) images
- Radiometrically and geometrically rectified scenes of Nebraska
- Central 4096 × 4096 pixels in each Full Scene (6/7 bands)
- Two-level split to facilitate the implementation
- Each 1024 × 1024 Image = 64 Regions of 128 × 128 pixels
- 4 Full Scenes, 64 Images, 4096 Regions
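The two-level split can be checked with a little arithmetic. The sketch below is only an illustration: the helper name `split_tiles` is hypothetical, and it assumes non-overlapping square tiles that divide the scene evenly, which is consistent with the counts on the slide.

```python
def split_tiles(height, width, tile):
    """Return (row, col, row_end, col_end) bounds of non-overlapping square tiles."""
    if height % tile or width % tile:
        raise ValueError("tile size must divide the image size evenly")
    return [(r, c, r + tile, c + tile)
            for r in range(0, height, tile)
            for c in range(0, width, tile)]

# Level 1: each 4096 x 4096 Full Scene is cut into 1024 x 1024 Images.
images_per_scene = split_tiles(4096, 4096, 1024)
# Level 2: each 1024 x 1024 Image is cut into 128 x 128 Regions.
regions_per_image = split_tiles(1024, 1024, 128)

n_scenes = 4
total_images = n_scenes * len(images_per_scene)
total_regions = total_images * len(regions_per_image)
print(len(images_per_scene), len(regions_per_image), total_images, total_regions)
```

With 4 Full Scenes this gives 16 Images per scene, 64 Images in total, and 4096 Regions.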
14Remote Sensing Image Classification
- Supervised classification
- User first specifies the land cover types and training pixels in the image
- The classifier is trained and used to classify the remaining pixels
- Unsupervised classification
- The clustering algorithm aggregates all pixels into clusters
- User then assigns these spectral groups to land cover types
supervised image classification
unsupervised image classification
15Support vector machines classification
16Support vector machines classification (contd)
- Maximal margin hyperplane (w, b)? Solve the optimization problem
- Dual form representation with the primal Lagrangian
- Not linearly separable? Kernel method
- Transform input vectors into a higher-dimensional feature space by a mapping function φ and then do a linear separation there
- The expensive computation of inner products can be reduced significantly by using a suitable kernel function K
- We do not need to have an explicit representation of φ, but only K
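The kernel idea can be checked numerically. The sketch below uses a degree-2 polynomial kernel rather than the RBF kernel used later in the deck, only because its mapping function is small enough to write out explicitly. The names `phi` and `poly2_kernel` are illustrative.

```python
import math

def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit feature map for 2-D input whose inner product equals (x . z)^2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly2_kernel(x, z):
    """Degree-2 polynomial kernel evaluated directly in the input space."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(z))   # inner product in the 3-D feature space
implicit = poly2_kernel(x, z)    # same value without ever building phi
print(explicit, implicit)
```

Both routes give the same inner product, so a learner that only needs inner products can work with K and never construct the higher-dimensional vectors.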
17Support vector machines classification (contd)
- Radial Basis Function (RBF) kernel
- Leave-one-out model selection
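A minimal sketch of leave-one-out selection of the RBF width, assuming the common form K(x, z) = exp(-gamma * ||x - z||^2). To stay short it does not train an SVM: `kernel_vote` is a simple stand-in classifier that sums kernel similarities per class. The toy two-band pixels, the class labels, and the candidate gamma values are all made up for illustration.

```python
import math

def rbf_kernel(x, z, gamma):
    """K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def kernel_vote(train_x, train_y, x, gamma):
    """Stand-in classifier (NOT an SVM): class with the largest summed similarity."""
    scores = {}
    for xi, yi in zip(train_x, train_y):
        scores[yi] = scores.get(yi, 0.0) + rbf_kernel(xi, x, gamma)
    return max(scores, key=scores.get)

def leave_one_out_accuracy(xs, ys, gamma):
    """Hold out each sample once, train on the rest, and count correct predictions."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        if kernel_vote(rest_x, rest_y, xs[i], gamma) == ys[i]:
            hits += 1
    return hits / len(xs)

def select_gamma(xs, ys, candidates):
    """Return the candidate gamma with the best leave-one-out accuracy."""
    return max(candidates, key=lambda g: leave_one_out_accuracy(xs, ys, g))

# Toy training pixels (two bands) for two hypothetical land cover classes.
xs = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.8, 0.9), (0.9, 0.8), (0.85, 0.85)]
ys = ["water"] * 3 + ["crops"] * 3
best_gamma = select_gamma(xs, ys, [0.001, 1.0, 10.0])
print(best_gamma)
```

With a very small gamma every kernel value is close to 1, so the held-out class is always outvoted. Leave-one-out scoring exposes that and steers the choice to a wider-spread value.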
18USGS Land Use/Land Cover scheme
19Support vector machines classification (contd)
- Radial Basis Function (RBF) kernel
- Accuracy better than Maximum Likelihood classifier
original image
classified image
land cover types
21Principal Component Analysis
- A linear transformation of a multivariate dataset (multispectral image) into a new coordinate system
- Reduces the dimensionality (decorrelation) of the data set while retaining most of the variance
- Eigenvalues of the covariance matrix
TM image (Omaha)
1st component
eigenvalues of principal components
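The first principal component can be sketched in plain Python: build the band covariance matrix, find its leading eigenvector, and project each pixel onto it. Power iteration is used here only to keep the example free of dependencies, and the three-band toy pixels are invented. A real TM image would have one row per pixel and one column per band.

```python
import math

def covariance_matrix(pixels):
    """Return mean-centered pixels and the sample covariance between bands."""
    n, d = len(pixels), len(pixels[0])
    means = [sum(p[j] for p in pixels) / n for j in range(d)]
    centered = [[p[j] - means[j] for j in range(d)] for p in pixels]
    cov = [[sum(row[a] * row[b] for row in centered) / (n - 1)
            for b in range(d)] for a in range(d)]
    return centered, cov

def leading_eigenpair(matrix, iterations=200):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric matrix."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[a][b] * v[b] for b in range(d)) for a in range(d)]
    eigenvalue = sum(v[a] * mv[a] for a in range(d))  # Rayleigh quotient
    return eigenvalue, v

def first_principal_component(pixels):
    """Project pixels onto the leading eigenvector; also report explained variance."""
    centered, cov = covariance_matrix(pixels)
    eigenvalue, axis = leading_eigenpair(cov)
    scores = [sum(c * a for c, a in zip(row, axis)) for row in centered]
    total_variance = sum(cov[i][i] for i in range(len(cov)))
    return scores, eigenvalue / total_variance

# Toy pixels whose three bands are perfectly correlated.
pixels = [(1.0, 2.0, 3.0), (2.0, 4.0, 6.0), (3.0, 6.0, 9.0), (4.0, 8.0, 12.0)]
scores, explained = first_principal_component(pixels)
print(scores, explained)
```

Because the toy bands are perfectly correlated, the first component carries essentially all of the variance. Real multispectral bands are only partly correlated, so the ratio would be below 1.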
22Texture Feature Extraction
- Texture feature representation
- statistics model
- co-occurrence matrices
- probability model
- Markov random fields parameters
- transform-based model
- Gabor wavelets
- A two-dimensional Gabor function and its Fourier
transform
23Texture Feature Extraction (contd)
- Gabor wavelets
- Self-similar filter dictionary obtained by appropriate dilations and rotations of the mother Gabor function through the generating function
- Parameters: scale factor, K = number of orientations, S = number of scales
- Feature representation
- Gabor wavelet transform of an image (PCA 1 region)
- Mean and standard deviation of the magnitude of the coefficients
- Feature vector
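The feature representation can be sketched as follows: filter the region with each of the S × K Gabor filters, then keep the mean and standard deviation of the response magnitudes, giving a vector of length 2 · S · K. The filter form and every parameter value here (`base_freq`, `sigma`, the dilation factor `a`, the 7 × 7 window) are assumptions chosen for a small runnable example, not the values used in ReSim.

```python
import cmath
import math

def gabor_kernel(size, scale, theta, base_freq=0.4, sigma=2.0, a=2.0):
    """Complex Gabor filter: a rotated, dilated copy of one mother function."""
    half = size // 2
    shrink = a ** (-scale)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = shrink * (x * math.cos(theta) + y * math.sin(theta))
            yr = shrink * (-x * math.sin(theta) + y * math.cos(theta))
            envelope = math.exp(-(xr * xr + yr * yr) / (2 * sigma * sigma))
            carrier = cmath.exp(2j * math.pi * base_freq * xr)
            row.append(shrink * envelope * carrier)
        kernel.append(row)
    return kernel

def filter_magnitudes(image, kernel):
    """Magnitude of the filter response at every fully covered window position."""
    size, h, w = len(kernel), len(image), len(image[0])
    mags = []
    for r in range(h - size + 1):
        for c in range(w - size + 1):
            acc = 0j
            for i in range(size):
                for j in range(size):
                    acc += image[r + i][c + j] * kernel[i][j]
            mags.append(abs(acc))
    return mags

def gabor_features(image, n_scales, n_orients, size=7):
    """Feature vector [mean, std] per (scale, orientation), length 2 * S * K."""
    features = []
    for m in range(n_scales):
        for n in range(n_orients):
            theta = n * math.pi / n_orients
            mags = filter_magnitudes(image, gabor_kernel(size, m, theta))
            mean = sum(mags) / len(mags)
            std = math.sqrt(sum((v - mean) ** 2 for v in mags) / len(mags))
            features.extend([mean, std])
    return features

# Toy 16 x 16 region with vertical stripes (intensity varies along x only).
stripes = [[float(c % 2) for c in range(16)] for _ in range(16)]
features = gabor_features(stripes, n_scales=2, n_orients=4)
print(len(features))
```

On the striped toy region, the filter whose carrier runs along x responds much more strongly than the one rotated by 90 degrees. That orientation sensitivity is what makes these statistics useful as texture descriptors.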
25Category-based Clustering
- Partition the texture feature space into
subspaces in terms of the combined land cover
classes
water/wetlands river/grassland forest/pasture
crops/pasture urban/grasslands
26Category-based Clustering (contd)
- k-means clustering within each subspace
1) Randomly choose k patterns as the initial cluster centers
2) Assign each pattern to the closest cluster
3) Compute the cluster centers (mean)
4) Go to 2) if not converged
- Optimization: How many clusters? Better starting centers?
1) Randomly choose J small sub-samples of the data
2) Minimum validity value gives the optimal number of clusters Kopt
3) Input Kopt to the starting centers refinement algorithm
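The four k-means steps can be written out as a short sketch. It covers only the basic loop, not the validity-based choice of Kopt or the starting-center refinement. The `seed` and `max_iter` parameters and the toy patterns are assumptions added for a reproducible example.

```python
import random

def squared_distance(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(patterns, k, seed=0, max_iter=100):
    """Basic k-means; returns (centers, labels)."""
    rng = random.Random(seed)
    # Step 1: randomly choose k patterns as the initial cluster centers.
    centers = [list(p) for p in rng.sample(patterns, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each pattern to the closest cluster center.
        new_labels = [min(range(k), key=lambda j: squared_distance(p, centers[j]))
                      for p in patterns]
        # Step 4: stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each center as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(patterns, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

# Toy texture feature vectors forming two well-separated groups.
patterns = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
            (100.0, 0.0), (101.0, 0.0), (102.0, 0.0)]
centers, labels = kmeans(patterns, k=2)
print(sorted(centers), labels)
```

In the setting of this slide, the function would be run separately inside each land cover subspace, with k taken from the optimization step.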
27The Image Database
28The Object-oriented Database
30Pattern Retrieval using Query-by-Example
31Pattern Retrieval using Query-by-Example (contd)
- Query-by-Example: search for similar patterns using a selected example
- Select a region referring to the land cover classes
- Example: similar patterns with mixed crops/grasslands and water
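Query-by-example amounts to a nearest-neighbor search in feature space. The sketch below ranks stored regions by Euclidean distance to the query's feature vector and returns the closest ones. It scans every region, whereas the clustered database described earlier would narrow the search first. The region identifiers and two-element feature vectors are invented for illustration.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def query_by_example(query_vector, database, top_k=3):
    """Return the top_k (region_id, distance) pairs, most similar first."""
    ranked = sorted(((region_id, euclidean(query_vector, vector))
                     for region_id, vector in database.items()),
                    key=lambda item: item[1])
    return ranked[:top_k]

# Hypothetical regions with made-up feature vectors.
database = {
    "river_a": [0.9, 0.1],
    "river_b": [0.8, 0.2],
    "crop_a": [0.1, 0.9],
    "urban_a": [0.5, 0.5],
}
results = query_by_example([0.88, 0.12], database, top_k=3)
for region_id, distance in results:
    print(region_id, round(distance, 3))
```

The ranked list corresponds to the "similar patterns shown in a ranked order" that the user sees after submitting an example.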
32Pattern Retrieval using Query-by-Example (contd)
Online Mode: query example NOT in database
Batch Mode: query example in database
33Pattern Retrieval using Query-by-Example (contd)
Query example generation of a river scene
Similar patterns shown in a ranked order
34Pattern Retrieval using Query-by-Example (contd)
Similar patterns shown in a ranked order
Query example generation of a crop scene
35Evaluation
A high coverage value shows that the system can
retrieve most of the relevant images the user
expects to see.
A high novelty value indicates that the system
can discover many new relevant images previously
unknown to the user.
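Both measures can be computed from three sets. The sketch assumes the usual information-retrieval definitions: coverage is the share of the relevant images already known to the user that the system retrieved, and novelty is the share of the retrieved relevant images that the user did not know before. The image identifiers are hypothetical.

```python
def coverage_and_novelty(retrieved, relevant, known):
    """Return (coverage, novelty) for one query.

    retrieved: images returned by the system
    relevant:  images that are actually relevant to the query
    known:     images the user already knew about before the query
    """
    retrieved, relevant, known = set(retrieved), set(relevant), set(known)
    relevant_retrieved = retrieved & relevant
    known_relevant = relevant & known
    coverage = (len(relevant_retrieved & known) / len(known_relevant)
                if known_relevant else 0.0)
    novelty = (len(relevant_retrieved - known) / len(relevant_retrieved)
               if relevant_retrieved else 0.0)
    return coverage, novelty

coverage, novelty = coverage_and_novelty(
    retrieved={"r1", "r2", "r3", "r4", "r5"},
    relevant={"r1", "r2", "r3", "r6", "r7"},
    known={"r1", "r2", "r6"},
)
print(coverage, novelty)
```

Here the system returns two of the three relevant images the user expected (coverage 2/3), and one of the three relevant images it returned is new to the user (novelty 1/3).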
36Evaluation (contd)
Relevant patterns (agricultural lands around the
Missouri River)
37Summary
- Introduced data mining and information mining in remote sensing images
- Presented a remote sensing image information mining framework
- Explored state-of-the-art data mining and database technologies to retrieve spectral and spatial information from remote sensing imagery
- LCLU corresponding to spectral reflection: SVM classification
- Texture features characterizing spatial information: Gabor wavelets
- Optimized k-means clustering to acquire an efficient search space
- Object-oriented databases and image databases
- K-nearest neighbor search via query-by-example (QBE)
- Graphical user interface (GUI)
- Validated the system effectiveness by coverage and novelty measures
38Future Work
- Shape features, spatial-temporal relationships, trend analysis, ...
- More data mining functions such as association rules, decision trees, ...
- Data warehouse techniques and a uniform object-oriented data model for hierarchical multi-resolution storage and retrieval
- Geographic Information System (GIS) connectivity
- To access ancillary data in vector format such as soil/hydrologic maps, ...
- To update GIS databases with the retrieved information
- Evolve the prototype system into a practical software tool and apply it to specific applications (agricultural and environmental monitoring)
39References
- U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy (Eds.), Advances in Knowledge Discovery and Data Mining, AAAI/MIT Press, 1996.
- J. Zhang, W. Hsu, and M. L. Lee, "Image mining: issues, frameworks, and techniques," in Proceedings of the 2nd International Workshop on Multimedia Data Mining, San Francisco, Aug. 2001, pp. 13-20.
- J. R. Jensen, Introductory Digital Image Processing: A Remote Sensing Perspective, Prentice-Hall, New Jersey, 1996.
- V. N. Vapnik, Statistical Learning Theory, John Wiley and Sons, Inc., 1998.
- N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines, Cambridge University Press, Cambridge, UK, 2000.
- J. Li and R. M. Narayanan, "Integrated spectral and spatial information mining in remote sensing," IEEE Transactions on Geoscience and Remote Sensing, vol. 42, no. 3, pp. 673-685, March 2004.
- J. Li and R. M. Narayanan, "A shape-based approach to change detection of lakes using time series remote sensing images," IEEE Transactions on Geoscience and Remote Sensing, vol. 41, no. 11, pp. 2466-2477, November 2003.
- J. Li, R. M. Narayanan, W. J. Waltman, and A. J. Peters, "Fuzzy feature-based image mining in remote sensing," in SPIE AeroSense Conference on Data Mining and Knowledge Discovery, Orlando, Florida, April 2001, vol. 4384, pp. 46-55.