Image Classification: Accuracy Assessment

1
Image Classification Accuracy Assessment
Reorganized By Jwan M Aldoski
Department of Civil Engineering, Faculty of
Engineering, Universiti Putra Malaysia, 43400
UPM Serdang, Selangor Darul Ehsan, Malaysia.
2
Where in the World?
3
Learning objectives

Remote sensing science concepts
- Rationale and technique for post-classification smoothing
- Errors of omission vs. commission
- Accuracy assessment: sampling methods, measures
- Fuzzy accuracy assessment
Math concepts
- Calculating accuracy measures: overall accuracy, producer's accuracy, user's accuracy and kappa coefficient
Skills
- Interpreting the contingency matrix and accuracy assessment measures

4
Post-classification smoothing

Most classifications have a problem with "salt
and pepper", i.e., single or small groups of
misclassified pixels, because classifiers are point
operations that operate on each pixel independently
of its neighbors.
Salt and pepper may be real. The decision on
whether to filter/eliminate depends on the choice
of the minimum mapping unit: does it equal a
single pixel or an aggregation of pixels?
Majority filtering: replaces the central pixel with
the majority class in a specified neighborhood (e.g., a
3 x 3 window). Con: alters edges.
Eliminate: clumps like pixels and replaces
clumps under a size threshold with the majority class
in the local neighborhood. Pro: doesn't alter edges.

5
Example: Majority filtering
Classified image (5 x 5):
6 6 6 6 6
2 6 6 2 6
2 6 2 6 6
2 8 2 6 6
2 2 2 2 2
3x3 window around the central pixel:
6 6 2
6 2 6
8 2 6
Class 6 is the majority in the window (5 of 9 pixels)
Example from ERDAS IMAGINE Field Guide, 5th ed.
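
A minimal sketch of the 3x3 majority filter on the example grid, in plain Python. The function name `majority_filter` is hypothetical, and two choices here are assumptions rather than the ERDAS IMAGINE behavior: border pixels are left unchanged, and a tie for the most common class keeps the original pixel value.

```python
from collections import Counter

def majority_filter(grid, size=3):
    """Replace each interior pixel with the most common class in its window.

    Assumptions of this sketch: border pixels are copied unchanged, and a
    tie for the most common class leaves the pixel as it was.
    """
    half = size // 2
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            window = [grid[i][j]
                      for i in range(r - half, r + half + 1)
                      for j in range(c - half, c + half + 1)]
            counts = Counter(window).most_common()
            top_class, top_count = counts[0]
            tied = len(counts) > 1 and counts[1][1] == top_count
            if not tied:
                out[r][c] = top_class
    return out

grid = [
    [6, 6, 6, 6, 6],
    [2, 6, 6, 2, 6],
    [2, 6, 2, 6, 6],
    [2, 8, 2, 6, 6],
    [2, 2, 2, 2, 2],
]
smoothed = majority_filter(grid)
print(smoothed[2][2])  # central pixel: class 6 holds 5 of the 9 window pixels
```

Every window is read from the input grid, not from the partly filtered output, so the result does not depend on scan order.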
6
Example: reduce single-pixel salt and pepper
Input:
6 6 6 6 6
2 6 6 2 6
2 6 2 6 6
2 8 2 6 6
2 2 2 2 2
Output:
6 6 6 6 6
2 6 6 6 6
2 2 6 6 6
2 2 2 6 6
2 2 2 2 2
Callout on figure: Edge
7
Example: altered edge
Input:
6 6 6 6 6
2 6 6 2 6
2 6 2 6 6
2 8 2 6 6
2 2 2 2 2
Output:
6 6 6 6 6
2 6 6 6 6
2 2 6 6 6
2 2 2 6 6
2 2 2 2 2
Callout on figure: Edge
9
Example: ERDAS Eliminate, no altered edge
Input:
6 6 6 6 6
2 6 6 2 6
2 6 2 6 6
2 8 2 6 6
2 2 2 2 2
Output:
6 6 6 6 6
2 6 6 2 6
2 6 2 6 6
2 2 2 6 6
2 2 2 2 2
Callouts on figure: Edge; Small clump eliminated
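
The clump-and-eliminate idea can be sketched as follows. This is an illustration only, not the ERDAS Eliminate algorithm: the function name `eliminate_small_clumps`, the use of 8-connectivity, and the rule of filling a small clump with the most common class among its bordering pixels are all assumptions.

```python
from collections import Counter, deque

# 8-connected neighborhood (an assumption of this sketch)
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def eliminate_small_clumps(grid, min_size=2):
    """Fill clumps smaller than min_size with the most common bordering class."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    seen = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0]:
                continue
            # Flood fill to collect one clump of like pixels
            clump, border = [], Counter()
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                clump.append((r, c))
                for dr, dc in NEIGHBORS:
                    i, j = r + dr, c + dc
                    if not (0 <= i < rows and 0 <= j < cols):
                        continue
                    if grid[i][j] == grid[r0][c0]:
                        if not seen[i][j]:
                            seen[i][j] = True
                            queue.append((i, j))
                    else:
                        border[grid[i][j]] += 1  # class touching the clump
            if len(clump) < min_size and border:
                fill = border.most_common(1)[0][0]
                for r, c in clump:
                    out[r][c] = fill
    return out

grid = [
    [6, 6, 6, 6, 6],
    [2, 6, 6, 2, 6],
    [2, 6, 2, 6, 6],
    [2, 8, 2, 6, 6],
    [2, 2, 2, 2, 2],
]
cleaned = eliminate_small_clumps(grid, min_size=2)
for row in cleaned:
    print(*row)
```

With these assumptions only the single class-8 pixel is replaced (by class 2), and the boundary between classes 6 and 2 is left as it was, which is the behavior the slide contrasts with majority filtering.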
10
Accuracy Assessment

Always assess the accuracy of the final
thematic map! How good is it?
Various techniques assess the accuracy of
the classified output by comparing the true
identity of land cover derived from reference
data (observed) vs. the classified (predicted)
for a random sample of pixels.
The accuracy assessment is the means of
communicating map quality to the user of the map
and should be included in the metadata documentation.

11
Accuracy Assessment

R.S. classification accuracy is usually assessed and
communicated through a contingency table,
sometimes referred to as a confusion matrix.
Contingency table: m x m matrix, where m = # of
land cover classes.
Columns usually represent the reference data.
Rows usually represent the remotely sensed
classification results (i.e., thematic or
information classes)
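
A minimal sketch of tallying a contingency table with the layout described above (rows = classified, columns = reference). The function name `contingency_matrix`, the class names and the six sample points are made up for illustration.

```python
def contingency_matrix(classified, reference, classes):
    """Build an m x m table: rows = classified map, columns = reference data."""
    index = {name: k for k, name in enumerate(classes)}
    m = len(classes)
    matrix = [[0] * m for _ in range(m)]
    for mapped, truth in zip(classified, reference):
        matrix[index[mapped]][index[truth]] += 1
    return matrix

# Hypothetical sample of six points
classes = ["forest", "water", "urban"]
classified = ["forest", "forest", "water", "urban", "urban", "forest"]
reference = ["forest", "water", "water", "urban", "forest", "forest"]

matrix = contingency_matrix(classified, reference, classes)
for name, row in zip(classes, matrix):
    print(f"{name:8s}", row)
```

Correctly classified points fall on the major diagonal; everything off the diagonal is an error of omission for one class and of commission for another.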

12
Accuracy Assessment Contingency Matrix
13
Accuracy Assessment

Sampling approaches to reduce analyst bias:
- simple random sampling: every pixel has an equal
chance
- stratified random sampling: # of points will be
stratified to the distribution of thematic layer
classes (larger classes, more points)
- equalized random sampling: each class will have
an equal number of random points

Sample size: at least 30 samples per land cover
class
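
The three sampling schemes can be sketched with the standard library alone. All function names and the toy 10 x 10 class map are hypothetical; proportional allocation by rounding is one simple way to implement "larger classes, more points" and is an assumption of this sketch.

```python
import random
from collections import defaultdict

def pixels_by_class(class_map):
    """Group (row, col) positions by their thematic class."""
    groups = defaultdict(list)
    for r, row in enumerate(class_map):
        for c, label in enumerate(row):
            groups[label].append((r, c))
    return groups

def simple_random(class_map, n, rng):
    """Every pixel has an equal chance of selection."""
    pixels = [(r, c) for r, row in enumerate(class_map) for c in range(len(row))]
    return rng.sample(pixels, n)

def stratified_random(class_map, n, rng):
    """Points per class proportional to class area (at least one per class)."""
    groups = pixels_by_class(class_map)
    total = sum(len(p) for p in groups.values())
    picks = {}
    for label, pixels in groups.items():
        k = min(len(pixels), max(1, round(n * len(pixels) / total)))
        picks[label] = rng.sample(pixels, k)
    return picks

def equalized_random(class_map, per_class, rng):
    """Same number of points in every class (capped by class size)."""
    groups = pixels_by_class(class_map)
    return {label: rng.sample(pixels, min(per_class, len(pixels)))
            for label, pixels in groups.items()}

# Toy map: class "A" covers 80 pixels, class "B" covers 20
class_map = [["A"] * 10 for _ in range(8)] + [["B"] * 10 for _ in range(2)]
rng = random.Random(0)
simple = simple_random(class_map, 10, rng)
stratified = stratified_random(class_map, 10, rng)
equalized = equalized_random(class_map, 5, rng)
print({k: len(v) for k, v in stratified.items()})
print({k: len(v) for k, v in equalized.items()})
```

In practice `per_class` would be set to 30 or more to meet the sample-size guideline above; the small numbers here only keep the toy example readable.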

14
How good is good?

How accurate should the classified map be?
General rule of thumb is 85% accuracy.
Really depends on how much risk you are willing
to accept if the map is wrong.
Are you more interested in the overall
accuracy of the final map or in quantifying the
ability to accurately identify and map individual
classes?
Which is more acceptable: overestimation or
underestimation?

15
How good is good? Example

USGS-NPS National Vegetation classification
standard:
- Horizontal positional locations meet National Map
Accuracy standards
- Thematic accuracy > 80% per class
- Minimum Mapping Unit of 0.5 ha
http://biology.usgs.gov/npsveg/aa/indexdoc.html

16
A whole set of field reference points can be
developed using some sort of random allocation,
but due to travel/access constraints, only a
subset of points is actually visited, resulting
in a distribution that is not truly random.
17
Accuracy Assessment Issues

What constitutes reference data?
- higher spatial resolution imagery (with visual
interpretation)
- "ground truth": GPSed field plots
- existing GIS maps
Reference data can be polygons or points

18
Accuracy Assessment Issues

Problem with mixed pixels: possibility of
sampling only homogeneous regions (e.g., 3x3
window), but this introduces a subtle bias.
If smoothing was undertaken, then accuracy should
be assessed on that basis, i.e., at the scale of the
mmu.
If a filter is used, it should be stated in the metadata.
Ideally, the % of the overall map that so qualifies
should be quantified, e.g., 75% of the map is
composed of homogeneous regions greater than 3x3
in size, thus 75% of the map is assessed, 25% not
assessed.

19
Errors of Omission vs. Commission

Error of Omission: pixels in class 1 erroneously
assigned to class 2. From the class 1 perspective,
these pixels should have been classified as
class 1 but were omitted.
Error of Commission: pixels in class 2
erroneously assigned to class 1. From the class 1
perspective, these pixels should not have been
classified as class 1 but were included.

20
Errors of Omission vs. Commission: from a Class 2
perspective
Omission error: pixels in Class 2 erroneously
assigned to Class 1
Commission error: pixels in Class 1 erroneously
assigned to Class 2
[Figure: distributions of Class 1 and Class 2; x-axis: Digital Number (0 to 255); y-axis: # of pixels]
21
Accuracy Assessment Measures

Overall accuracy: divide total correct (sum of
the major diagonal) by the total number of
sampled pixels. Can be misleading; should judge
individual categories also.
Producer's accuracy: measure of omission error.
Total number correct in a category divided by
the total in that category as derived from the
reference data. Measure of underestimation.
User's accuracy: measure of commission error.
Total number correct in a category divided by
the total that were classified in that category.
Measure of overestimation.
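
A short sketch of the three measures, applied to the 2 x 2 seagrass matrix from the case study later in this deck (rows = GIS map, columns = reference; 245 field reference points). The function name `accuracy_measures` is hypothetical.

```python
def accuracy_measures(matrix):
    """Return (overall, producer's, user's) accuracy.

    matrix[i][j] = # of samples mapped as class i whose reference class is j.
    Assumes every row and column total is non-zero.
    """
    m = len(matrix)
    n = sum(sum(row) for row in matrix)
    diagonal = [matrix[k][k] for k in range(m)]
    row_totals = [sum(row) for row in matrix]                              # classified totals
    col_totals = [sum(matrix[r][k] for r in range(m)) for k in range(m)]  # reference totals
    overall = sum(diagonal) / n
    producers = [diagonal[k] / col_totals[k] for k in range(m)]  # omission side
    users = [diagonal[k] / row_totals[k] for k in range(m)]      # commission side
    return overall, producers, users

# Seagrass absent / present
seagrass = [[67, 32],
            [10, 136]]
overall, producers, users = accuracy_measures(seagrass)
print(f"overall    {overall:.1%}")
print("producer's", [f"{p:.0%}" for p in producers])
print("user's    ", [f"{u:.0%}" for u in users])
```

The rounded values agree with the percentages reported in the case-study table (overall 83%; producer's 87% and 81%; user's 68% and 93%).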

22
Accuracy Assessment Contingency Matrix
Reference Data
23
Accuracy Assessment Measures
24
Accuracy Assessment Measures
25
Accuracy Assessment Measures
26
Accuracy Assessment Measures

Kappa coefficient: provides a difference
measurement between the observed agreement of two
maps and the agreement that is contributed by chance
alone.
A Kappa coefficient of 90% may be interpreted as a
90% better classification than would be expected
by random assignment of classes.
What's a good Kappa? General range:
K < 0.4: poor
0.4 < K < 0.75: good
K > 0.75: excellent
Allows for statistical comparisons between
matrices (Z statistic); useful in comparing
different classification approaches to
objectively decide which gives the best results.
Alternative statistic: Tau coefficient

27
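Kappa coefficient
$$\hat{K} = \frac{n \sum_{i} x_{ii} - \sum_{i} (x_{i+} \, x_{+i})}{n^{2} - \sum_{i} (x_{i+} \, x_{+i})}$$
where
$\sum_{i}$ = sum across all rows in matrix
$x_{ii}$ = diagonal
$x_{i+}$ = marginal row total (row i)
$x_{+i}$ = marginal column total (column i)
$n$ = # of observations
Takes into account the off-diagonal
elements of the contingency matrix (errors of
omission and commission)
28
Kappa coefficient: Example
$$\sum_{i} x_{ii} = 308 + 279 + 372 + 26 + 10 + 93 + 176 + 48 = 1312$$
$$\sum_{i} (x_{i+} \, x_{+i}) = (348 \times 315) + (295 \times 305) + (379 \times 408) + (27 \times 29) + (18 \times 13) + (99 \times 97) + (194 \times 189) + (51 \times 55) = 404{,}318$$
$$\hat{K} = \frac{1411 \times 1312 - 404{,}318}{1411^{2} - 404{,}318} = \frac{1{,}851{,}232 - 404{,}318}{1{,}990{,}921 - 404{,}318} = \frac{1{,}446{,}914}{1{,}586{,}603} = 0.912$$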
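
The worked example can be checked with a few lines of Python. The function name `kappa_from_totals` is hypothetical; the inputs are the diagonal and marginal totals read off the example, taking the first number in each product as the row total (an assumption, which does not affect the result since the products are symmetric).

```python
def kappa_from_totals(diagonal, row_totals, col_totals):
    """Khat from the diagonal and the marginal totals of a contingency matrix."""
    n = sum(row_totals)                      # # of observations
    observed = n * sum(diagonal)             # n * SUM(Xii)
    chance = sum(r * c for r, c in zip(row_totals, col_totals))  # SUM(Xi+ * X+i)
    return (observed - chance) / (n * n - chance)

diagonal = [308, 279, 372, 26, 10, 93, 176, 48]
row_totals = [348, 295, 379, 27, 18, 99, 194, 51]
col_totals = [315, 305, 408, 29, 13, 97, 189, 55]

khat = kappa_from_totals(diagonal, row_totals, col_totals)
print(round(khat, 3))
```

Both sets of marginal totals sum to 1411, which is a quick sanity check that the matrix was transcribed consistently.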
29
Accuracy Assessment Measures
30
Case Study: Multi-scale segmentation approach to
mapping seagrass habitats using airborne digital
camera imaging

Richard G. Lathrop¹, Scott Haag¹², and Paul
Montesano¹
¹Center for Remote Sensing & Spatial Analysis
Rutgers University
New Brunswick, NJ 08901-8551
²Jacques Cousteau National Estuarine Research
Reserve
130 Great Bay Blvd
Tuckerton, NJ 08087

31
Method > Field Surveys

All transect endpoints and individual check
points were first mapped onscreen in the GIS.
Endpoints were then loaded into a GPS (+/-
3 meters) for navigation on the water.
A total of 245 points were collected.

32
Method > Field Surveys

For each field reference point, the following
data were collected:
- GPS location (UTM)
- Time
- Date
- SAV species presence/dominance: Zostera marina or
Ruppia maritima or macroalgae
- Depth (meters)
- % cover (10% intervals) determined by visual
estimation
- Blade height of 5 tallest seagrass blades
- Shoot density (# of shoots per 1/9 m2 quadrat
that was extracted and counted on the boat)
- Distribution (patchy/uniform)
- Substrate (mud/sand)
- Additional comments

33
Results > Accuracy Assessment
                      Reference
GIS Map               Seagrass Absent   Seagrass Present   User's Accuracy
Seagrass Absent       67                32                 68%
Seagrass Present      10                136                93%
Producer's Accuracy   87%               81%                83% (overall)

The resulting maps were compared with the 245
field reference points.
All 245 reference points were used to support the
interpretation in some fashion and so cannot be
considered a completely independent
validation.
The overall accuracy was 83% and the Kappa statistic
was 56.5%, which can be considered a moderate
degree of agreement between the two data sets.

34
Results > Accuracy Assessment
                      Reference
GIS Map               Seagrass Absent   Seagrass Present   User's Accuracy
Seagrass Absent       14                3                  82%
Seagrass Present      9                 15                 62%
Producer's Accuracy   61%               83%                71% (overall)

The resulting maps were also compared with an
independent set of 41 bottom sampling points
collected as part of a seagrass-sediment study
conducted during the summer of 2003 (Smith and
Friedman, 2004).
The overall accuracy was 70.7% and the Kappa
statistic was 43%, which can be considered a
moderate degree of agreement between the two data
sets.

35
SAV Accuracy Assessment Issues

Matching spatial scale of field reference data
with scale of mapping
Ensuring comparison of apples to apples
Spatial accuracy of ground truth point
locations
Temporal coincidence of ground truth and image
acquisition

36
Fuzzy Accuracy Assessment

Real world is messy: natural vegetation
communities are a continuum of states, often with
one grading into the next.
R.S. classified maps generally break up land
cover/vegetation into discrete either/or classes.
How to quantify this messy world? R.S. classified
maps still have some error while still
having great utility.
Fuzzy Accuracy Assessment: doesn't quantify
errors as binary correct or incorrect but
attempts to evaluate the severity of the error.

37
Fuzzy Accuracy Assessment

Fuzzy rating: severity of error, or conversely the
similarity between map classes, is defined from a
user standpoint.
Fuzzy rating can be developed quantitatively
based on the deviation from a defined class based
on a % difference (i.e., within +/- so many %).
Fuzzy set matrix: the fuzzy rating between each map
class and every other class is developed into a
fuzzy set matrix.

For more info, see Gopal & Woodcock, 1994.
PE&RS: 181-188
38
Fuzzy Accuracy Assessment
Level  Description
5      Absolutely right: exact match
4      Good: minor differences; species dominance or composition is very similar
3      Acceptable error: mapped class does not match; types have structural or ecological similarity or similar species
2      Understandable but wrong: general similarity in structure but species/ecological conditions are not similar
1      Absolutely wrong: no conditions or structural similarity
http://biology.usgs.gov/npsveg/fiis/aa_results.pdf
http://www.fs.fed.us/emc/rig/includes/appendix3j.pdf
39
Fuzzy Accuracy Assessment

Each user could redefine the fuzzy set matrix on
an application-by-application basis to determine
what percentage of each map class is acceptable
and the magnitude of the errors within each map
class.
Traditional map accuracy measures can be
calculated at different levels of error:
- Exact: only level 5 (MAX)
- Acceptable: level 5, 4, 3 (RIGHT)
Example from USFS:

Label   Sites   MAX (5 only)   RIGHT (3,4,5)
CON     88      71 (81%)       82 (93%)
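
A small sketch of how MAX and RIGHT follow from per-site fuzzy ratings. The function name `fuzzy_accuracy` is hypothetical. The counts reproduce the CON row (88 sites, 71 at level 5, 82 at level 3 or better), but the split of the remaining ratings between levels 4, 3, 2 and 1 is not given in the table and is invented here.

```python
def fuzzy_accuracy(ratings):
    """ratings: fuzzy level (1 to 5) of the map label at each reference site."""
    sites = len(ratings)
    max_hits = sum(1 for level in ratings if level == 5)    # exact matches only
    right_hits = sum(1 for level in ratings if level >= 3)  # levels 3, 4, 5 accepted
    return max_hits / sites, right_hits / sites

# CON class: 71 exact, 11 acceptable (assumed all level 3), 6 wrong (assumed all level 1)
con_ratings = [5] * 71 + [3] * 11 + [1] * 6
max_acc, right_acc = fuzzy_accuracy(con_ratings)
print(f"MAX {max_acc:.0%}, RIGHT {right_acc:.0%}")
```

Moving the acceptance threshold from level 5 to level 3 is the only difference between the two measures, which is why RIGHT can never be lower than MAX.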
40
Fuzzy Accuracy Assessment: example from USFS
Confusion Matrix based on Level 3,4,5 as Correct

Label   Sites   CON   MIX   HDW   SHB   HEB   NFO   Total
CON     88      X     0     1     5     0     0     6
MIX     14      2     X     1     1     0     0     4
HDW     6       1     1     X     0     0     0     2
SHB     8       1     0     0     X     0     0     1
HEB     1       0     0     0     1     X     0     1
NFO     4       3     0     0     3     0     X     6
Total   121     7     1     2     10    0     0     20

http://www.fs.fed.us/emc/rig/includes/appendix3j.pdf
41
Fuzzy Accuracy Assessment

Ability to evaluate the magnitude or seriousness
of errors.
Difference Table: error within each map class
based on its magnitude, with error magnitude
calculated by measuring the difference between
the fuzzy rating of each ground reference point
and the highest rank assigned to all other
possible map classes.
All points that are exact matches have
Difference values >= 0; all mismatches are
negative. Values -1 to 4 generally correspond to
correct map labels. Values of -2 to -4 correspond
to map errors, with -4 representing a more serious
error than -1
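
One way to read the definition above is: Difference = fuzzy rating of the mapped class at the site, minus the highest rating given to any other class at that site. The sketch below follows that reading; the function name `difference_value` and the site ratings are hypothetical.

```python
def difference_value(site_ratings, map_label):
    """Rating of the mapped class minus the best rating among all other classes.

    site_ratings: dict of class label -> fuzzy level (1 to 5) at one reference site.
    The result ranges from -4 (serious error) to 4 (clear exact match).
    """
    own = site_ratings[map_label]
    best_other = max(level for label, level in site_ratings.items()
                     if label != map_label)
    return own - best_other

# Hypothetical reference site that is clearly conifer
site = {"CON": 5, "MIX": 3, "HDW": 1, "SHB": 1}
print(difference_value(site, "CON"))  # mapped correctly
print(difference_value(site, "MIX"))  # acceptable but not the best label
print(difference_value(site, "SHB"))  # serious error
```

Tallying these values per map class gives one row of the Difference Table.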

42
Fuzzy Accuracy Assessment: Difference Table
example from USFS

                Mismatches          Matches
Label   Sites   -4   -3   -2   -1   0    1    2    3    4
CON     88      4    2    0    11   3    0    12   23   33

Higher positive values indicate that pure conditions are
well mapped, while lower negative values show pure
conditions to be poorly mapped. Mixed or
transitional conditions, where a greater number
of class types are likely to be considered
acceptable, will fall more in the middle.
http://www.fs.fed.us/emc/rig/includes/appendix3j.pdf
43
Fuzzy Accuracy Assessment

Ambiguity Table: tallies map classes that
characterize a reference site as well as the
actual map label.
Useful in identifying subtle confusion between
map classes and may be useful in identifying
additional map classes to be considered.
Example from USFS:

Label   Sites   CON   MIX   HDW   SHB   HEB   NFO   Total
CON     88      X     11    6     15    0     0     32

15 out of 88 reference sites mapped
as conifer could have been equally well labeled
as shrub.
http://www.fs.fed.us/emc/rig/includes/appendix3j.pdf
44
Alternative Ways of Quantifying Accuracy: Ratio
Estimators

Method of statistically adjusting for over- or
underestimation.
Randomly allocate test areas; determine area
from map and reference data.
Ratio estimation uses the ratio of Reference/Map
area to adjust the mapped area estimate.
Uses the estimate of the variance to develop
confidence levels for land cover type area.

Shiver & Border, 1996. Sampling Techniques for
Forest Resource Inventory, Wiley, NY, NY. Pp.
166-169
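
A minimal sketch of the ratio adjustment described above. The function name `ratio_estimate` and the test-area values are hypothetical, and the standard error uses the common large-sample approximation for a ratio-of-totals estimator with no finite population correction, which is an assumption here and not necessarily the formula used in the cited text.

```python
import math

def ratio_estimate(map_areas, ref_areas, mapped_total):
    """Adjust a mapped area total by the Reference/Map ratio from test areas.

    Returns (ratio, adjusted_total, approximate standard error of adjusted_total).
    """
    n = len(map_areas)
    ratio = sum(ref_areas) / sum(map_areas)
    adjusted = ratio * mapped_total
    mean_map = sum(map_areas) / n
    # Residual variance of reference area around ratio * map area
    residual_ss = sum((y - ratio * x) ** 2 for x, y in zip(map_areas, ref_areas))
    se_ratio = math.sqrt(residual_ss / (n - 1) / n) / mean_map
    return ratio, adjusted, se_ratio * mapped_total

# Hypothetical test areas (same units for map and reference)
map_areas = [10.0, 20.0, 30.0, 40.0]
ref_areas = [12.0, 22.0, 33.0, 43.0]
ratio, adjusted, se = ratio_estimate(map_areas, ref_areas, mapped_total=1000.0)
print(f"ratio {ratio:.2f}, adjusted {adjusted:.0f} +/- {se:.1f} (1 SE)")
```

A ratio above 1 means the map underestimates the reference area, so the mapped total is scaled up; a ratio below 1 scales it down.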
45
Example: NJ 2000 Land Use Update. Comparison of
urban/transitional land use as determined by
photo-interpretation of 1 m B&W photography vs.
10 m SPOT PAN
[Figure panels: 1 m B&W; 10 m SPOT PAN]
46
Above 1-to-1 line: underestimate
Below 1-to-1 line: overestimate
47
Example: NJ Land Use Change
Land Use Change Category   Mapped Estimate (acres)   Statistically Adjusted Estimate with 95% CI (acres)
Urban                      73,191                    77,941 +/- 17,922
Transitional/Barren        20,861                    16,082 +/- 7,053
Total Urban + Barren       94,052                    89,876 +/- 16,528
48
Case Study: Sub-pixel Un-mixing
Urban/Suburban mixed pixels: varying proportions
of developed surface, lawn and trees
[Figure: 30 m TM pixel grid on IKONOS image]
49
Objective: Sub-pixel Unmixing
False Color Composite Image: R = Forest, G = Lawn, B =
IS
- Impervious Surface Estimation
- Woody Estimation
- Grass Estimation
50
Validation Data

For homogeneous 90 m x 90 m test areas:
- interpreted DOQ
- DOQ pixels scaled to match TM
For selected sub-areas:
- IKONOS multi-spectral image
- 3 key indicator land use classified map:
impervious surface, lawn, and forest
- IKONOS pixels scaled to match TM

51
[Figure: Egg Harbor City. Panels: IKONOS, Landsat SOM-LVQ, Landsat LMM; cover types: Impervious, Grass, Woody]
52
[Figure: Hammonton. Panels: IKONOS, Landsat LMM, Landsat SOM-LVQ; cover types: Impervious, Grass, Woody]
53
Root Mean Square Error: 90 m x 90 m test plots

Hammonton
          Impervious   Grass   Tree
IKONOS    7.4          8.2     7.1
LMM       10.8         13.6    20.7
SOM_LVQ   12.0         10.3    11.0

Egg Harbor City
          Impervious   Lawn    Urban Tree
IKONOS    5.6          5.8     6.1
LMM       7.7          12.5    19.6
SOM_LVQ   6.8          6.0     5.0
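
For reference, the RMSE in these tables is the usual root mean square difference between estimated and reference cover in each test plot. A minimal sketch with made-up plot values (the function name `rmse` and the numbers are hypothetical, not data from the study):

```python
import math

def rmse(estimated, reference):
    """Root mean square error between paired estimates and reference values."""
    squared = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical % impervious cover in four test plots
estimated = [10.0, 20.0, 30.0, 40.0]
reference = [13.0, 16.0, 30.0, 40.0]
error = rmse(estimated, reference)
print(error)
```

Because the differences are squared before averaging, a few plots with large errors raise the RMSE more than many plots with small errors.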
54
SOM-LVQ vs. IKONOS: study sub-area comparison, 3x3
TM pixel zonal
[Figure: scatterplots for Hammonton and Egg Harbor City; cover types: Impervious, Grass, Trees. RMSE values shown on the panels: 13.5, 17.6, 15.0, 14.4, 21.6, 17.6]
55
Comparison of Landsat TM vs. NJDEP IS estimates
56
Summary of Results

Impervious surface estimation compares favorably
to DOQ and IKONOS:
- 10 to 15% for impervious surface
- 12 to 22% for grass and tree cover

Shows strong linear relationship with IKONOS in
impervious surface and grass estimation.

Greater variability in forest fraction due to
variability in canopy shadowing and understory
background.

57
Summary

1 Majority filter: remove salt & pepper and/or
eliminate clump-like pixels.
2 Sampling methods of reference points.
3 Contingency matrix and accuracy assessment
measures: overall accuracy, producer's accuracy,
user's accuracy, and kappa coefficient.
4 Fuzzy accuracy assessment: fuzzy rating, fuzzy set
matrix, and ratio estimators.

58
Thank you
