Columbia University - PowerPoint PPT Presentation

About This Presentation
Title:

Columbia University

Description:

Nonlinear Dimensionality Reduction and K-Nearest Neighbor Classification ... Positive semidefiniteness. Inner products centered on the origin ...

Slides: 12
Provided by: Micro307

Transcript and Presenter's Notes

1
Columbia University
Advanced Machine Learning & Perception, Fall 2006, Term Project
Nonlinear Dimensionality Reduction and K-Nearest Neighbor Classification Applied to Global Climate Data
Carlos Henrique Ribeiro Lima, New York, Dec/2006
2
Outline
  • Goals
  • Motivation and Dataset
  • Methodology
  • Results
  • Low-Dimensional Manifold
  • KNN on Low-Dimensional Manifold
  • Conclusion

3
1. Goals
  • Use kernel PCA based on Semidefinite
    Embedding to identify the low-dimensional,
    nonlinear manifold of climate data sets →
    identification of the main modes of spatial
    variability
  • Classification in the feature space →
    predictions in the original space (KNN method)

4
2. Motivation
Dataset of monthly Sea Surface Temperature (SST)
Huge economic and social impacts of extreme El
Niño events (e.g. 1997) → need for forecasting
models!
5
2. Dataset
  • Monthly Sea Surface Temperature (SST) data
    from Jan/1856 to Dec/2005
  • Latitudinal band: 25°S-25°N
  • Grid with 599 cells
  • Training data: Jan/1856 to Dec/1975 (120 years)
  • Testing set: Jan/1976 to Dec/2005 (30 years)
  • Input matrix: n = 1440 points, m = 599 dimensions
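The split above can be sketched as follows, assuming the monthly fields are stored as a NumPy array with one row per month and one column per grid cell. The array `sst` holds random placeholder values, not the actual SST record, and all variable names are hypothetical.

```python
import numpy as np

N_CELLS = 599                       # grid cells in the 25°S-25°N band
FIRST_YEAR, LAST_YEAR = 1856, 2005
LAST_TRAIN_YEAR = 1975

n_months = (LAST_YEAR - FIRST_YEAR + 1) * 12   # 150 years of monthly data

# ASSUMPTION: random placeholder values standing in for the real SST fields.
rng = np.random.default_rng(0)
sst = rng.normal(size=(n_months, N_CELLS))

# Calendar year of each row, with Jan/1856 as the first row.
years = FIRST_YEAR + np.arange(n_months) // 12

train = sst[years <= LAST_TRAIN_YEAR]   # Jan/1856 to Dec/1975
test = sst[years > LAST_TRAIN_YEAR]     # Jan/1976 to Dec/2005

print(train.shape, test.shape)
```

The training block has 120 × 12 = 1440 rows, which matches the n = 1440 points of the input matrix.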
6
3. Methodology
1) Semidefinite Embedding (code from K. Q. Weinberger)
  • Positive semidefiniteness
  • Inner products centered on the origin
  • Isometry: local distances of the input space are
    preserved in the feature space
2) KNN → Euclidean distance
3) Probabilistic forecasting → skill score (RPS)
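For reference, the three properties listed under 1) correspond to the constraints of the semidefinite program in Weinberger and Saul's published formulation of Semidefinite Embedding. The program below is stated from that formulation, not recovered from the slides. The symbols $K$ (Gram matrix of the embedded points), $x_i$ (input points), and $\eta_{ij}$ (equal to 1 when $x_i$ and $x_j$ are neighbors or share a neighbor) are notation introduced for this sketch.

```latex
\begin{aligned}
\max_{K}\quad & \operatorname{tr}(K) \\
\text{s.t.}\quad & K \succeq 0 && \text{(positive semidefiniteness)} \\
& \sum_{i,j} K_{ij} = 0 && \text{(inner products centered on the origin)} \\
& K_{ii} - 2K_{ij} + K_{jj} = \lVert x_i - x_j \rVert^2 \ \text{ whenever } \eta_{ij} = 1 && \text{(local isometry)}
\end{aligned}
```

Maximizing the trace spreads the embedded points apart as far as the local distance constraints allow, which is what tends to concentrate the variance in a few leading dimensions.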
7
4. Results: Low-Dimensional Manifold
8
4. Results: Labeling on the feature space
9
4. Results: Forecasts, Testing Set (KNN method and skill score)
E.g. March 1997:
1) Want to predict the class of NINO3 in Dec/1997 → lead time of 9 months.
2) KNN on the feature space (March 1856 to 1975)
3) Take the classes and weights of the k neighbors
4) Skill score.
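Steps 2) to 4) can be sketched roughly as below. This is a minimal illustration under several assumptions, not the code used for the slides: the inverse-distance weighting, the three-class labeling, the toy coordinates, and the function names are all hypothetical. Each training label is taken to be the class observed at the lead time after that training point. The RPS is written in its unnormalized form (some definitions divide by the number of classes minus one).

```python
import numpy as np

def knn_class_probabilities(train_feats, train_labels, query, k, n_classes):
    """Weighted class frequencies among the k nearest training points."""
    dists = np.linalg.norm(train_feats - query, axis=1)   # Euclidean distance
    nearest = np.argsort(dists)[:k]
    # ASSUMPTION: inverse-distance weights; the slides do not give the weighting.
    weights = 1.0 / (dists[nearest] + 1e-12)
    probs = np.bincount(train_labels[nearest], weights=weights,
                        minlength=n_classes)
    return probs / probs.sum()

def ranked_probability_score(probs, observed_class):
    """Sum of squared differences between cumulative forecast and observation."""
    observed = np.zeros_like(probs)
    observed[observed_class] = 1.0
    return float(np.sum((np.cumsum(probs) - np.cumsum(observed)) ** 2))

# Toy data: 3-D feature-space points with hypothetical ordered classes 0, 1, 2.
train_feats = np.array([[0.0, 0.0, 0.0],
                        [1.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0],
                        [5.0, 5.0, 5.0]])
train_labels = np.array([2, 2, 1, 0])
query = np.array([0.1, 0.0, 0.0])

probs = knn_class_probabilities(train_feats, train_labels, query,
                                k=3, n_classes=3)
rps = ranked_probability_score(probs, observed_class=2)
print(probs, rps)
```

A lower RPS is better: a forecast that puts all probability on the observed class scores 0, and probability placed in classes far from the observed one is penalized more than probability placed in adjacent classes.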
10
4. Results: Forecasts, Testing Set (KNN method and skill score)
El Niño of 1982 and 1997
11
5. Conclusions
  • Semidefinite Embedding performs well on the SST
    data (high dimensional → just 3 dimensions, 90% of
    explained variance)
  • KNN method provides very good classification and
    forecasts
  • Need to check sensitivity to changes in some
    parameters (local neighbors, KNN)
  • Plan to extend to other climate datasets
  • Try other metrics, multivariate data, etc.
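The explained-variance figure in the first conclusion comes from the eigenvalue spectrum of the learned Gram matrix. The sketch below shows that kernel-PCA step in isolation. The Gram matrix here is a plain linear kernel on random toy points, standing in for the matrix learned by Semidefinite Embedding, so the numbers are not the SST result, and the function name is hypothetical.

```python
import numpy as np

def embed_from_gram(K, n_components):
    """Coordinates and explained-variance share from a centered Gram matrix."""
    eigvals, eigvecs = np.linalg.eigh(K)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
    eigvals = np.clip(eigvals[order], 0.0, None)   # drop tiny negative round-off
    eigvecs = eigvecs[:, order]
    coords = eigvecs[:, :n_components] * np.sqrt(eigvals[:n_components])
    explained = eigvals[:n_components].sum() / eigvals.sum()
    return coords, explained

# ASSUMPTION: toy points that are exactly 3-dimensional, centered on the origin.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X -= X.mean(axis=0)
K = X @ X.T                                        # linear-kernel Gram matrix

coords, explained = embed_from_gram(K, n_components=3)
print(coords.shape, explained)
```

Because the toy points are exactly three-dimensional, three components recover essentially all of the variance; on real data the share kept by the leading components is the quantity of interest.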