Multivariate Statistics, Part 2 - PowerPoint PPT Presentation

Slides: 13
Provided by: brucek64
Transcript and Presenter's Notes


1
Multivariate Statistics, Part 2
  • ESM 206, 5/19/05

2
Cluster analysis
  • Ordination organizes the samples along a set of
    latent variables that are continuous
  • Cluster analysis organizes the samples along a
    set of latent variables that are categorical
  • Find a set of discrete clusters such that the
    observations within a cluster are much more
    similar to each other than they are to
    observations in other clusters
  • Requires some notion of distance in the
    P-dimensional space
  • Euclidean distance
  • Mahalanobis distance: like Euclidean, but takes
    statistical correlations into account
  • Dissimilarity coefficient: often used for
    species count data
  • Non-hierarchical clustering
  • Get just the clusters
  • Hierarchical clustering
  • Also get similarities between clusters
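
The three kinds of distance listed above can be sketched in a few lines of plain Python. This is an illustration only: the Bray-Curtis coefficient stands in for the unnamed "dissimilarity coefficient" (an assumption, it is one common choice for species counts), the Mahalanobis version is restricted to two variables so the covariance matrix can be inverted by hand, and all function names and numbers are hypothetical.

```python
import math

def euclidean(x, y):
    """Straight-line distance between two points in P-dimensional space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def bray_curtis(x, y):
    """Bray-Curtis dissimilarity for two vectors of species counts.

    0 means identical counts; 1 means no species in common.
    (Assumed here as an example of a dissimilarity coefficient.)
    """
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return num / den

def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance for two variables, given a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Inverse of the 2x2 covariance matrix, written out explicitly
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx, dy = x[0] - y[0], x[1] - y[1]
    q = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(q)

if __name__ == "__main__":
    # Hypothetical covariance with strong positive correlation
    cov = ((1.0, 0.8), (0.8, 1.0))
    origin = (0.0, 0.0)
    # Both points are the same Euclidean distance from the origin...
    print(euclidean(origin, (1, 1)), euclidean(origin, (1, -1)))
    # ...but the Mahalanobis distances differ because of the correlation
    print(mahalanobis_2d(origin, (1, 1), cov), mahalanobis_2d(origin, (1, -1), cov))
    # Hypothetical species counts at two sites
    print(bray_curtis([10, 0, 5], [5, 5, 5]))
```

With a positive correlation, a point that lies along the direction of the correlation is "closer" in Mahalanobis terms than one that lies across it, even though the Euclidean distances are equal.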

3
Community composition at 6 sites
4
(No Transcript)
5
Cluster Analysis
  • Makes very few assumptions about data
  • Lots of options and they matter
  • Distance measure
  • Agglomerate or divide
  • Rules for growing or dividing clusters
  • Other challenges
  • Only finds spherical clusters
  • Finds clusters even when the data actually vary
    along continuous variables
  • If so, different approaches will build very
    different clusters
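
To see how much the options matter, here is a minimal sketch of agglomerative clustering on one variable, with the rule for growing clusters as a parameter. The function name, the two linkage rules shown (single and complete), and the data are assumptions chosen for illustration; the sample points lie along a continuous gradient with no real groups.

```python
from itertools import combinations

def agglomerate(points, k, linkage="single"):
    """Merge the two closest clusters repeatedly until k clusters remain.

    linkage="single":   cluster distance = closest pair of members
    linkage="complete": cluster distance = farthest pair of members
    """
    clusters = [[p] for p in points]
    combine = min if linkage == "single" else max

    def dist(a, b):
        return combine(abs(x - y) for x in a for y in b)

    while len(clusters) > k:
        # Find the pair of clusters (i < j) with the smallest distance
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

if __name__ == "__main__":
    # Hypothetical samples spread along a gradient, gaps slowly widening
    gradient = [0.0, 1.0, 2.1, 3.3, 4.6, 6.0]
    print("single:  ", agglomerate(gradient, 2, "single"))
    print("complete:", agglomerate(gradient, 2, "complete"))
```

On these data single linkage chains the first five points together and leaves the last one alone, while complete linkage splits the gradient into a group of four and a group of two. Both return two "clusters" even though the data have no real group structure.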

6
(No Transcript)
7
(No Transcript)
8
Discriminant Analysis
  • Have predefined groups: each observation has a
    single group value
  • Presence or absence of a species
  • Political party affiliation
  • Vegetation type
  • Also have set of continuous variables measured at
    each observation
  • Want to find linear combinations of continuous
    variables that best discriminate among the groups
  • Sort of like logistic regression
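
A rough sketch of the idea for the simplest case, two groups and two continuous variables, using Fisher's linear discriminant. This is one standard way to find the discriminating linear combination, assumed here for illustration and not necessarily the procedure used in the lecture; the function names and data are hypothetical.

```python
def mean(rows):
    """Column means of a list of (x, y) observations."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)

def scatter(rows, m):
    """Within-group sums of squares and cross-products (sxx, sxy, syy)."""
    sxx = sum((r[0] - m[0]) ** 2 for r in rows)
    syy = sum((r[1] - m[1]) ** 2 for r in rows)
    sxy = sum((r[0] - m[0]) * (r[1] - m[1]) for r in rows)
    return sxx, sxy, syy

def fisher_discriminant(group0, group1):
    """Return weights w and a cutoff so that w.x > cutoff predicts group 1."""
    m0, m1 = mean(group0), mean(group1)
    a0, b0, d0 = scatter(group0, m0)
    a1, b1, d1 = scatter(group1, m1)
    # Pooled within-group scatter matrix [[a, b], [b, d]]
    a, b, d = a0 + a1, b0 + b1, d0 + d1
    det = a * d - b * b
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # w = inverse(pooled scatter) * (difference in group means)
    w = ((d * dx - b * dy) / det, (-b * dx + a * dy) / det)
    # Cutoff halfway between the projected group means
    cutoff = 0.5 * (w[0] * (m0[0] + m1[0]) + w[1] * (m0[1] + m1[1]))
    return w, cutoff

def classify(obs, w, cutoff):
    score = w[0] * obs[0] + w[1] * obs[1]
    return 1 if score > cutoff else 0

if __name__ == "__main__":
    # Hypothetical measurements of two variables where a species is absent / present
    absent = [(1, 2), (2, 1), (2, 3), (3, 2)]
    present = [(5, 6), (6, 5), (6, 7), (7, 6)]
    w, cutoff = fisher_discriminant(absent, present)
    print("weights:", w, "cutoff:", cutoff)
    print("new site (4.5, 5.0) ->", classify((4.5, 5.0), w, cutoff))
```

The weights play a role similar to the coefficients in logistic regression: they define a single score, and the group is predicted from which side of the cutoff the score falls on.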

9
(No Transcript)
10
Example: presence/absence of a marine invertebrate
  • Data from EPA EMAP surveys in estuaries on US
    east coast
  • Presence/absence of Mediomastus ambiseta
  • Data on 16 water quality variables
  • Salinity, temperature, etc. at surface and bottom

11
Stepwise procedure suggests most discrimination
from surface salinity and pH PAR at bottom
12
Classification accuracy
  • Confusion matrix
  • Kappa statistic
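
Both measures can be computed directly from observed and predicted presence/absence. The sketch below assumes the kappa statistic meant here is Cohen's kappa, which compares the observed agreement with the agreement expected by chance; the function names and the ten observations are made up for illustration and are not the EMAP results.

```python
def confusion_matrix(observed, predicted):
    """Count the four outcomes for 0/1 (absent/present) data."""
    counts = {"tp": 0, "fn": 0, "fp": 0, "tn": 0}
    for o, p in zip(observed, predicted):
        if o == 1 and p == 1:
            counts["tp"] += 1   # present, predicted present
        elif o == 1 and p == 0:
            counts["fn"] += 1   # present, predicted absent
        elif o == 0 and p == 1:
            counts["fp"] += 1   # absent, predicted present
        else:
            counts["tn"] += 1   # absent, predicted absent
    return counts

def cohen_kappa(observed, predicted):
    """Kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    cm = confusion_matrix(observed, predicted)
    tp, fn, fp, tn = cm["tp"], cm["fn"], cm["fp"], cm["tn"]
    n = tp + fn + fp + tn
    p_observed = (tp + tn) / n
    # Chance agreement from the row and column totals
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)

if __name__ == "__main__":
    # Hypothetical observed and predicted presence (1) / absence (0)
    observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
    print(confusion_matrix(observed, predicted))
    print("kappa:", cohen_kappa(observed, predicted))
```

Here 7 of 10 sites are classified correctly, but half of that agreement would be expected by chance, so kappa is about 0.4. A kappa of 1 means perfect agreement and 0 means no better than chance.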