Maximum Entropy - PowerPoint PPT Presentation

About This Presentation

Title:

Maximum Entropy

Slides: 52

Provided by: nrac

Transcript and Presenter's Notes

1
Maximum Entropy

RESM 575
Spring 2009
Lecture 13

2
Maximum entropy
(Phillips et al. 2008)

History
E. T. Jaynes 1957
Thermodynamics
Inference and information theory

3
The Maximum Entropy Method

Origins: Jaynes 1957, statistical mechanics
Recent use: machine learning, e.g. automatic
language translation
To estimate an unknown distribution:
1. Determine what you know (constraints)
2. Among distributions satisfying constraints,
output the one with maximum entropy
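
The two steps above can be sketched on a toy problem that is not from the slides: a six-sided die whose only known property is an average roll of 4.5. This minimal sketch assumes a single mean constraint, for which the maximum-entropy solution is proportional to exp(lam * value); the function name and the bisection search are illustrative choices, not part of the Maxent software.

```python
import numpy as np

def maxent_given_mean(values, target_mean, lo=-50.0, hi=50.0, tol=1e-12):
    """Maximum-entropy distribution over `values` with a prescribed mean.

    The solution has the form p_i proportional to exp(lam * values[i]).
    The mean grows monotonically with lam, so lam is found by bisection.
    """
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()  # shift for stability; p is unchanged

    def dist(lam):
        weights = np.exp(lam * centered)
        return weights / weights.sum()

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dist(mid) @ values < target_mean:
            lo = mid
        else:
            hi = mid
    return dist(0.5 * (lo + hi))

faces = np.arange(1, 7)                   # step 1: what we know (mean = 4.5)
p = maxent_given_mean(faces, 4.5)         # step 2: most spread-out p that fits
entropy = -np.sum(p * np.log(p))
print(np.round(p, 4), round(float(entropy), 4))
```

With a target mean of 3.5 the constraint carries no information beyond symmetry, and the same function returns the uniform distribution.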

4
(No Transcript)
5
What is it?

Maxent is a general-purpose method for making
predictions or inferences from incomplete
information.
Its origins lie in statistical mechanics (Jaynes,
1957), and it remains an active area of research
with an Annual Conference, Maximum Entropy and
Bayesian Methods, that explores applications in
diverse areas such as
astronomy, portfolio optimization, image
reconstruction, statistical physics and signal
processing.

6
Like other Bayesian models

Uses prior information
Maxent is an alternative to methods of inference
of classical statistics

7
Maximum Entropy Principle
The fact that a certain probability distribution
maximizes entropy subject to certain constraints
representing our incomplete information, is the
fundamental property which justifies the use of
that distribution for inference; it agrees with
everything that is known but carefully avoids
assuming anything that is not known (Jaynes,
1990).
8
Why?

Introduced as a general approach for presence-only
modeling of species distributions, suitable
for all existing applications involving
presence-only datasets.

9
Modeling species distributions
(Figure: Yellow-throated Vireo occurrence points and environmental variables)
10
Estimating a probability distribution

Given:
Map divided into cells
Environmental variables, with values in each cell
Occurrence points: samples from an unknown
distribution
Our task is to estimate the unknown
probability distribution
Note:
The distribution sums to 1 over the whole map
Most probability values will be very small
Different from estimating probability of presence

11
Entropy

More entropy = more spread out, closer to
uniform distribution
2nd law of thermodynamics:
Without external influences, a system moves to
increase entropy
Maximum entropy method:
Apply constraints to remove external influences
Species spreads out to fill areas with suitable
conditions
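
To make "more entropy = more spread out" concrete, here is a small sketch comparing two hypothetical distributions over a 100-cell map: one uniform, one concentrated in four cells. The cell counts are made up for illustration.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    positive = p[p > 0]          # 0 * log(0) is taken as 0
    return float(-np.sum(positive * np.log(positive)))

uniform = np.full(100, 1.0 / 100)   # spread over every cell
clumped = np.zeros(100)
clumped[:4] = 0.25                  # all probability in four cells

print(entropy(uniform), entropy(clumped))
```

The uniform map has entropy ln(100) and the clumped map ln(4), so the spread-out distribution scores higher.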

12
Using Maxent for Species Distributions

Features
Constraints
Regularization

13
Features impose constraints
Feature: environmental variable, or function
thereof
Find distribution p of maximum entropy such
that, for all features f, mean(f) = sample average
of f
14
Features

Environmental variables or functions thereof.
Maxent has these classes of features (others are
possible):
Linear: variable itself
Quadratic: square of variable
Product: product of two variables
Binary (indicator): membership in a
category
Threshold
Hinge
(Figures: threshold and hinge features, each plotted from 0 to 1 against the environmental variable)
15
Constraints
Each feature type imposes constraints on output
distribution:
Linear features: mean
Quadratic features: variance
Product features: covariance
Threshold features: proportion above threshold
Hinge features: mean above threshold
Binary features (categorical): proportion in each category
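
As a rough illustration of how each feature class turns into a constraint target, the sketch below evaluates one feature of each type at four hypothetical occurrence points and takes the sample average. The variable values, the knot at 11, and the exact hinge definition (0 below the knot, rising linearly to 1 at the variable's maximum) are assumptions made for this example, not values from the slides.

```python
import numpy as np

# Hypothetical environmental values at four occurrence points.
temperature = np.array([10.0, 12.0, 15.0, 20.0])
precipitation = np.array([100.0, 80.0, 60.0, 40.0])
landcover = np.array([1, 1, 2, 3])

def threshold(x, t):
    """1 where the variable exceeds t, else 0."""
    return (x > t).astype(float)

def hinge(x, knot, x_max):
    """0 below the knot, then linear up to 1 at x_max (assumed form)."""
    return np.clip((x - knot) / (x_max - knot), 0.0, 1.0)

features = {
    "linear(temp)": temperature,
    "quadratic(temp)": temperature ** 2,
    "product(temp, precip)": temperature * precipitation,
    "threshold(temp > 11)": threshold(temperature, 11.0),
    "hinge(temp, knot=11)": hinge(temperature, 11.0, 20.0),
    "binary(landcover == 1)": (landcover == 1).astype(float),
}

# Each sample average is the value the fitted distribution must reproduce.
targets = {name: float(values.mean()) for name, values in features.items()}
for name, value in targets.items():
    print(f"{name}: {value:.4f}")
```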
16
Regularization
(Figure: sample average and true mean, plotted against temperature and precipitation)
Find distribution p of maximum entropy such
that mean(f) lies in a confidence region of the
sample average of f
17
The Maxent distribution
is always a Gibbs distribution
$q_\lambda(x) = \exp\left(\sum_j \lambda_j f_j(x)\right) / Z$
Z is a scaling factor so the distribution sums to 1
$f_j$ is the jth feature
$\lambda_j$ is a coefficient, calculated by the program
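
A minimal sketch of evaluating a Gibbs distribution over map cells, assuming the feature values are already arranged as a cells-by-features array. The function name, the three-cell feature matrix, and the coefficients are hypothetical; in practice the coefficients come from the program.

```python
import numpy as np

def gibbs_distribution(features, lambdas):
    """q(x) = exp(sum_j lambda_j * f_j(x)) / Z over all cells.

    features: array of shape (n_cells, n_features)
    lambdas:  array of shape (n_features,)
    """
    scores = np.asarray(features, dtype=float) @ np.asarray(lambdas, dtype=float)
    weights = np.exp(scores - scores.max())  # constant shift cancels in the ratio
    return weights / weights.sum()           # dividing by Z makes the sum 1

features = np.array([[0.0, 1.0],
                     [1.0, 0.0],
                     [1.0, 1.0]])
lambdas = np.array([0.5, -1.0])
q = gibbs_distribution(features, lambdas)
print(q, q.sum())
```

Setting every coefficient to zero gives the uniform distribution, which is the unconstrained maximum-entropy answer.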
18
Maxent is penalized maximum likelihood
Log likelihood: $\mathrm{LogLikelihood}(q_\lambda) = \frac{1}{m} \sum_i \ln(q_\lambda(x_i))$,
where $x_1, \dots, x_m$ are the occurrence
points.
Maxent maximizes regularized likelihood:
$\mathrm{LogLikelihood}(q_\lambda) - \sum_j \beta_j |\lambda_j|$, where $\beta_j$ is the width of
the confidence interval for $f_j$.
Similar to Akaike Information Criterion (AIC).
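
The regularized objective can be written out directly. This is a sketch of evaluating it for given coefficients, not of the optimizer Maxent uses; the four-cell map, the single feature, and the value of beta are hypothetical.

```python
import numpy as np

def gibbs(features, lambdas):
    """Gibbs distribution over cells for the given coefficients."""
    scores = features @ lambdas
    weights = np.exp(scores - scores.max())
    return weights / weights.sum()

def regularized_log_likelihood(features, lambdas, presence_idx, betas):
    """Mean log-probability of occurrence cells minus sum_j beta_j * |lambda_j|."""
    q = gibbs(features, lambdas)
    log_likelihood = np.mean(np.log(q[presence_idx]))
    penalty = np.sum(betas * np.abs(lambdas))
    return float(log_likelihood - penalty)

features = np.array([[0.0], [0.5], [1.0], [1.0]])  # one feature, four cells
presence_idx = np.array([2, 3])                    # cells with occurrences
betas = np.array([0.1])

flat = regularized_log_likelihood(features, np.array([0.0]), presence_idx, betas)
tilted = regularized_log_likelihood(features, np.array([1.0]), presence_idx, betas)
print(flat, tilted)
```

Here tilting the distribution toward cells with high feature values improves the fit by more than the penalty costs, so the objective rises; a larger beta would shrink that gain.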
19
Output

When Maxent is applied to presence-only species
distribution modeling, the pixels of the study
area make up the space on which the Maxent
probability distribution is defined,
Pixels with known species occurrence records
constitute the sample points, and the features
are
climatic variables,
elevation,
soil category,
vegetation type or other environmental variables,
and functions thereof.

20
To note

Sometimes both presence and absence occurrence
data are available for the development of models,
in which case general-purpose statistical methods
can be used
(for an overview of the variety of techniques
currently in use, see Corsi et al., 2000; Elith,
2002; Guisan and Zimmermann, 2000; Scott et al.,
2002).

21
Opportunity

However, while vast stores of presence-only data
exist (records etc.), absence data are rarely
available.
Poorly sampled areas (remote, difficult)
Absence data may be of questionable value in many
situations

22
(No Transcript)
23
Background

16 modeling methods
226 well surveyed species in 6 regions of the
world

24
The authors used three statistics, the area under
the Receiver Operating Characteristic curve
(AUC), correlation (COR) and Kappa, to assess the
agreement between the presence-absence records
and the predictions.
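
For readers unfamiliar with the three statistics, the sketch below computes each from predicted scores and presence-absence labels. It is a plain NumPy illustration, not the evaluation code used by the authors; the scores, labels, and the 0.5 threshold used for Kappa are hypothetical.

```python
import numpy as np

def auc(scores, labels):
    """Probability that a random presence outscores a random absence (ties count half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    diff = scores[labels][:, None] - scores[~labels][None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

def cor(scores, labels):
    """Pearson correlation between predictions and 0/1 observations."""
    return float(np.corrcoef(scores, np.asarray(labels, dtype=float))[0, 1])

def kappa(scores, labels, threshold):
    """Cohen's kappa after converting scores to presence/absence at a threshold."""
    pred = np.asarray(scores, dtype=float) >= threshold
    labels = np.asarray(labels, dtype=bool)
    observed = np.mean(pred == labels)
    expected = pred.mean() * labels.mean() + (1 - pred.mean()) * (1 - labels.mean())
    return float((observed - expected) / (1 - expected))  # undefined if expected == 1

scores = [0.9, 0.3, 0.8, 0.1]
labels = [1, 1, 0, 0]
print(auc(scores, labels), cor(scores, labels), kappa(scores, labels, 0.5))
```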
25
(No Transcript)
26
(No Transcript)
27
Maximum Entropy

Only useful when applied to testable information
(information for which one can determine whether
a given distribution is consistent with it).
Given testable information, the maximum entropy
procedure consists of seeking the probability
distribution which maximizes information entropy,
subject to the constraints of the information.
This constrained optimization problem is
typically solved using the method of Lagrange
multipliers.
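
As a sketch of that Lagrange-multiplier step, assume exact equality constraints: each feature $f_j$ must have expectation equal to its sample average, written here as $\bar f_j$ (notation introduced for this derivation), with multipliers $\lambda_j$ and a further multiplier $\mu$ for normalization.

```latex
\begin{aligned}
\mathcal{L}(p,\lambda,\mu) &= -\sum_x p(x)\ln p(x)
  + \sum_j \lambda_j\Big(\sum_x p(x) f_j(x) - \bar f_j\Big)
  + \mu\Big(\sum_x p(x) - 1\Big) \\
\frac{\partial \mathcal{L}}{\partial p(x)} &= -\ln p(x) - 1 + \sum_j \lambda_j f_j(x) + \mu = 0 \\
p(x) &= \frac{\exp\big(\sum_j \lambda_j f_j(x)\big)}{Z},
  \qquad Z = e^{1-\mu} = \sum_x \exp\Big(\sum_j \lambda_j f_j(x)\Big)
\end{aligned}
```

Setting the derivative to zero gives an exponential in the features, which is the Gibbs form stated for the Maxent distribution.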

28
(No Transcript)
29
Output format
(Figure panels: raw output, cumulative output)
30
Cumulative output format

Gives estimate of omission rate
A pixel p has cumulative value c:
Total probability of pixels with lower
probability than p is c
Set a threshold of c:
Binary model with presence if cumulative value
is above c
Omission rate is c if test data are drawn from
the Maxent distribution
Predict omission rate of c for real test data
Example thresholds:
5 (light red)
20 (dark red)
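
A minimal sketch of the raw-to-cumulative conversion as the slide defines it: for each pixel, add up the probability of all pixels with strictly lower probability. Expressing the result on a 0 to 100 scale is an assumption inferred from the example thresholds of 5 and 20, and the four raw values are hypothetical.

```python
import numpy as np

def cumulative_output(raw):
    """Percent of total probability held by pixels with lower raw value."""
    q = np.asarray(raw, dtype=float)
    q = q / q.sum()                                   # raw output sums to 1
    sorted_q = np.sort(q)
    prefix = np.concatenate(([0.0], np.cumsum(sorted_q)))
    n_lower = np.searchsorted(sorted_q, q, side="left")  # count of strictly lower pixels
    return 100.0 * prefix[n_lower]

raw = np.array([0.1, 0.4, 0.2, 0.3])
cumulative = cumulative_output(raw)
presence = cumulative > 20     # binary model at a threshold of 20
print(cumulative, presence)
```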

31
Logistic output format

Estimates probability of presence
Between 0 and 1
Scaled so that a typical presence has value 0.5
Defined as
$c\,q_\lambda(x) / (1 + c\,q_\lambda(x))$
where $c = \exp(H(q_\lambda))$ and $H$ is the entropy of $q_\lambda$
Probability of presence depends on sampling
details:
Site size
Observation time
These details should correspond to collection
effort for occurrence points
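
The logistic transform can be sketched in a few lines, assuming H is the Shannon entropy (natural log) of the raw distribution over all pixels. The function name and the test inputs are hypothetical.

```python
import numpy as np

def logistic_output(raw):
    """c*q / (1 + c*q) with c = exp(entropy of q)."""
    q = np.asarray(raw, dtype=float)
    q = q / q.sum()
    positive = q[q > 0]
    entropy = -np.sum(positive * np.log(positive))
    c = np.exp(entropy)
    return c * q / (1.0 + c * q)

flat_map = logistic_output(np.ones(50))           # every pixel equally likely
varied_map = logistic_output([0.1, 0.4, 0.2, 0.3])
print(flat_map[:3], varied_map)
```

On a uniform map every pixel has q = 1/N and c = N, so every value is exactly 0.5, which matches the scaling described above.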

32
Response curves

Show how predicted probability of presence
depends on each variable
Simple features → simpler model
Easier interpretation
Complex features → complex model
Better fit to data
Linear + quadratic features (top)
Threshold features (middle)
All feature types (bottom)

33
Effect of regularization: multiplier = 0.2
Smaller confidence intervals
Lower entropy
Less spread-out
34
Effect of regularization: multiplier = 5
Larger confidence intervals
Higher entropy
More spread-out
35
Effect of regularization: over-fitting
Regularization multiplier = 1.0 (not over-fit)
Regularization multiplier = 0.2 (clearly over-fit)
36

Sage grouse distribution model
MAXENT software package
Consistently superior to alternative methods
Robust to collinearity between explanatory
variables
Accepts continuous and categorical variables
Stable distribution with limited training data
Evaluates relative variable importance

37
West Virginia Conservation Prioritization using
Species Distribution Modeling

Michael Dougherty
West Virginia Division of Natural Resources

The Conservation Fund
38

Project Goals
Develop statewide conservation prioritization map
based on the
distribution of
Species of Greatest Conservation Need (SGCN)
Habitats of Concern
Existing public land
The Challenge
Develop distribution models for 500 state-tracked
species
Species include plants, herps, birds, bats,
mammals, aquatics
Modeling process must be defensible, transparent,
and repeatable

Occurrence data
1. State Natural Heritage Program Biotics
database
Biologists collect Source Features
Source Features are grouped into Element
Occurrences (EOs)
EOs represent known populations
Species identification is accurate and spatial
accuracy documented
Use of EOs seems to greatly reduce spatial
autocorrelation
2. Community Ecologists Vegetation Plots
Database

Predictor Variables
Developed a broad range of predictor variables
Climate
Landcover
Terrain
Ecoregions
Geology
Soils
Disturbances

Workflow Overview
Build an array of workstations to run models
Develop R scripts to automate running the
maxent models by iterating through all 500
species
Develop web-based map viewer to assist
biologists in reviewing maxent model results
Perform patch and connectivity analysis using
FunConn
(TBD) Assign weights to patches and connectors

Scripting Steps
Developed R script to perform variable
pre-selection using boosted regression trees to
reduce the number of variables to an appropriate
number (30)
Developed R script to produce the maxent batch
files and perform file management
Developed R script to harvest maxent results, a
Python script to store grids in an ArcSDE
database, and publish results to a website
(TBD) Develop R scripts to perform functional
connectivity analysis
(TBD) Perform layer weighting to produce
conservation prioritization index

43
(No Transcript)
44
(No Transcript)
45
(No Transcript)
46
(No Transcript)
47
(No Transcript)
48
Occurrence localities

CSV file format. Each line has:
Species name
X coordinate
Y coordinate
Multiple species can be in 1 file.

Example:
species,longitude,latitude
bradypus_variegatus,-65.4,-10.3833
bradypus_variegatus,-65.3833,-10.3833
bradypus_variegatus,-65.1333,-16.8
bradypus_variegatus,-63.6667,-17.45
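
A minimal sketch of reading an occurrence file in this layout with Python's standard csv module, grouping points by species since one file may hold several. The function name is hypothetical, and the data are the four example rows, embedded as a string so the snippet runs on its own.

```python
import csv
import io
from collections import defaultdict

occurrences_csv = """species,longitude,latitude
bradypus_variegatus,-65.4,-10.3833
bradypus_variegatus,-65.3833,-10.3833
bradypus_variegatus,-65.1333,-16.8
bradypus_variegatus,-63.6667,-17.45
"""

def read_occurrences(handle):
    """Return {species: [(x, y), ...]} from an open occurrence CSV."""
    by_species = defaultdict(list)
    for row in csv.DictReader(handle):
        point = (float(row["longitude"]), float(row["latitude"]))
        by_species[row["species"]].append(point)
    return dict(by_species)

points = read_occurrences(io.StringIO(occurrences_csv))
for species, coords in points.items():
    print(species, len(coords), coords[0])
```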
49
Environmental variables

ESRI ASCII raster grid file format.
One file per environmental variable
All files must have exactly the same bounds and cell
size
Coordinate system must be the same as for occurrence
localities
Alternative: Diva .grd format.
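
The "same bounds, cell size" requirement can be checked before a run by comparing grid headers. This sketch assumes the usual ESRI ASCII header keys (ncols, nrows, xllcorner, yllcorner, cellsize, NODATA_value); the function names and the two tiny grids are hypothetical and held in strings so the example is self-contained.

```python
GEOMETRY_KEYS = ("ncols", "nrows", "xllcorner", "yllcorner", "cellsize")

def read_header(text):
    """Parse the leading 'key value' lines of an ESRI ASCII grid into a dict."""
    header = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[0][0].isalpha():
            break  # reached the first row of cell values
        header[parts[0].lower()] = float(parts[1])
    return header

def same_geometry(header_a, header_b):
    """True when two grids share dimensions, origin, and cell size."""
    return all(header_a.get(k) == header_b.get(k) for k in GEOMETRY_KEYS)

temperature_asc = """ncols 3
nrows 2
xllcorner -70.0
yllcorner -20.0
cellsize 0.5
NODATA_value -9999
21.5 22.0 -9999
20.1 19.8 18.9
"""
# A second grid whose cell size differs, to show a failed check.
precipitation_asc = temperature_asc.replace("cellsize 0.5", "cellsize 0.25")

t_header = read_header(temperature_asc)
p_header = read_header(precipitation_asc)
print(same_geometry(t_header, t_header), same_geometry(t_header, p_header))
```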

50
Samples with data (SWD) format

Environmental data given with samples in a .csv
format file
Example:
species,longitude,latitude,cld,dtr,ecoreg,frs,h_dem,pre,pre_l10,pre_l1,pre_l4,pre_l7,tmn,tmp,tmx,vap
bradypus_variegatus,-65.4,-10.3833,76.0,104.0,10.0,2.0,121.0,46.0,41.0,84.0,54.0,3.0,192.0,266.0,337.0,279.0
bradypus_variegatus,-65.3833,-10.3833,76.0,104.0,10.0,2.0,121.0,46.0,40.0,84.0,54.0,3.0,192.0,266.0,337.0,279.0
bradypus_variegatus,-65.1333,-16.8,57.0,114.0,10.0,1.0,211.0,65.0,56.0,129.0,58.0,34.0,140.0,244.0,321.0,221.0
bradypus_variegatus,-63.6667,-17.45,57.0,112.0,10.0,3.0,363.0,36.0,33.0,71.0,27.0,13.0,135.0,229.0,307.0,202.0

51
Background data in SWD format

Environmental data at (typically) random points
in study area
Useful:
when environmental grids are huge
(Maxent needs only a small random sample, 10,000)
when doing non-uniform sampling
Example:
species,longitude,latitude,cld,dtr,ecoreg,frs,h_dem,pre,pre_l10,pre_l1,pre_l4,pre_l7,tmn,tmp,tmx,vap
background,-61.775,6.175,60.0,100.0,10.0,0.0,747.0,55.0,24.0,57.0,45.0,81.0,182.0,239.0,300.0,232.0
background,-66.075,5.325,67.0,116.0,10.0,3.0,1038.0,75.0,16.0,68.0,64.0,145.0,181.0,246.0,331.0,234.0
background,-59.875,-26.325,47.0,129.0,9.0,1.0,73.0,31.0,43.0,32.0,43.0,10.0,97.0,218.0,339.0,189.0
background,-68.375,-15.375,58.0,112.0,10.0,44.0,2039.0,33.0,67.0,31.0,30.0,6.0,101.0,181.0,251.0,133.0
background,-68.525,4.775,72.0,95.0,10.0,0.0,65.0,72.0,16.0,65.0,69.0,133.0,218.0,271.0,346.0,289.0
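
One way to draw such a background sample is to pick random cells that hold data, without replacement. This sketch assumes the grid is already loaded as a NumPy array with a known NODATA value; the function name, the synthetic 200 by 300 grid, and the fixed seed are hypothetical.

```python
import numpy as np

def sample_background(grid, nodata, n_points, seed=0):
    """Pick up to n_points distinct (row, col) cells that contain data."""
    rng = np.random.default_rng(seed)
    rows, cols = np.nonzero(grid != nodata)      # cells with valid values
    k = min(n_points, rows.size)
    chosen = rng.choice(rows.size, size=k, replace=False)
    return rows[chosen], cols[chosen]

# Synthetic grid: 60,000 cells, the top 50 rows marked as NODATA.
grid = np.arange(200 * 300, dtype=float).reshape(200, 300)
grid[:50, :] = -9999.0

rows, cols = sample_background(grid, nodata=-9999.0, n_points=10_000)
print(rows.size, grid[rows[0], cols[0]])
```

The environmental values at the sampled cells would then be written out as "background" rows in the SWD layout shown above.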