Principal Components Analysis - PowerPoint PPT Presentation

Transcript and Presenter's Notes


1
Principal Components Analysis
  • Eric Vaagen, FCAS
  • Assistant Actuary
  • September 5, 2008

2
Agenda
  • Motivation
  • What is PCA?
  • Background
  • Simple example
  • Is PCA right for you?

3
Motivation
  • Forecast average premium by coverage
  • Explanatory variables
    • Vehicle use, territory, driving record
  • Breakdown of change in average premium
  • Multicollinearity exists
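
One way multicollinearity can arise with variables like these: if the categories of a variable are recorded as shares that sum to 100 each year, any one share is determined by the others. The following is a minimal Python sketch of that situation, assuming NumPy is available. The share values are hypothetical and are not the presentation's data.

```python
import numpy as np

# Hypothetical vehicle-use shares (percent) for five years; illustrative only.
pleasure = np.array([30.0, 31.0, 33.0, 34.0, 36.0])
commute = np.array([50.0, 49.0, 48.0, 46.0, 45.0])
business = np.array([20.0, 20.0, 19.0, 20.0, 19.0])

# Pairwise correlation between two of the explanatory variables.
corr_pleasure_commute = np.corrcoef(pleasure, commute)[0, 1]

# Design matrix with an intercept column. Because the three shares sum to 100
# in every year, the intercept is a linear combination of the other columns,
# so the matrix cannot have full column rank.
design = np.column_stack([np.ones(5), pleasure, commute, business])
rank = np.linalg.matrix_rank(design)

print(f"corr(pleasure, commute) = {corr_pleasure_commute:.2f}")
print(f"columns = {design.shape[1]}, rank = {rank}")
```

A rank lower than the number of columns means ordinary least squares has no unique solution on the raw variables, which is the kind of problem a variable selection or dimension reduction step is meant to address.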

4
Average Premium 2002-2006
5
Modeling Procedure
[Diagram labels: Explanatory Variables; Variable Selection; Response Variable; Model; Chosen Variables]
6
Modeling Procedure
[Diagram labels: Vehicle Use, Territory, Driving Record; Variable Selection; Average Premium; Multiple Regression; Chosen Variables]
7
Variable Selection Methods
Variable Selection
  • Stepwise regression
    • Forward, backward
  • PCA
    • Unsupervised
  • Partial least squares
    • Supervised
  • GLM

8
Background
  • First described in 1901 by Karl Pearson
    • Find the best lines and planes to fit a set of
      points
  • What else did he discover?
    • Pearson's χ²
    • Linear regression
    • Classification of distributions (exponential
      family)

9
PCA Example
Explanatory Variables
  • Vehicle use
    • Pleasure
    • Commute
    • Business
  • Territory
    • Rural
    • Suburban
    • Urban

10
Vehicle Use 2002-2006
11
Territory 2002-2006
12
Example Average Premium
Response Variable
13
Modeling Procedure
[Diagram labels: Vehicle Use, Territory; PCA; Average Premium; Multiple Regression; Chosen PCs]
14
PCA Procedure
  • PCs
    • No multicollinearity
    • The 1st PC has the most variance
  • Output
    • Weights to create the PCs
    • Variability of each PC
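
The two outputs listed above (weights and the variability of each PC) can be sketched with an eigendecomposition of the covariance matrix. This is a minimal sketch, assuming NumPy and a covariance-based PCA; the presentation does not say which software or scaling was used. The first row of the data reuses the 2002 values from the PC Calculation slide, and the remaining rows are hypothetical. The function name `pca` is illustrative.

```python
import numpy as np

# 5 years x 6 variables (Pleasure, Commute, Business, Rural, Suburban, Urban).
# Row 1 uses the 2002 values from the slides; rows 2-5 are hypothetical.
X = np.array([
    [30.0, 50.0, 20.0, 20.0, 30.0, 50.0],
    [31.0, 49.0, 20.0, 19.0, 31.0, 50.0],
    [33.0, 48.0, 19.0, 19.0, 31.0, 50.0],
    [34.0, 46.0, 20.0, 18.0, 33.0, 49.0],
    [36.0, 45.0, 19.0, 17.0, 33.0, 50.0],
])


def pca(data):
    """Return (weights, variances, scores) from a covariance-based PCA."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigh is meant for symmetric matrices; it returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]      # largest variance first
    variances = eigvals[order]
    weights = eigvecs[:, order]            # one column of weights per PC
    scores = centered @ weights            # PC values for each year
    return weights, variances, scores


weights, variances, scores = pca(X)
score_cov = np.cov(scores, rowvar=False)

print("variance of each PC:", np.round(variances, 3))
print("largest off-diagonal covariance between PCs:",
      np.abs(score_cov - np.diag(np.diag(score_cov))).max())
```

The off-diagonal covariances between the PC scores are zero up to rounding error, which is the "no multicollinearity" property, and the variances come out sorted so that the first PC has the most.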

15
Modeling Procedure
[Diagram labels: Vehicle Use, Territory; 5 years x 6 variables; Weights; PCA; 5 years x 6 variables; Variability; Chosen PCs]
16
Example Scree Plot
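
A scree plot charts the variance of each PC in descending order. The quantities behind such a plot can be sketched in plain Python as follows. The variances here are hypothetical, and the 90% cumulative threshold in `components_needed` is an assumed rule of thumb, not a selection rule stated in the presentation.

```python
def explained_variance(variances):
    """Return per-component and cumulative proportions of total variance."""
    total = sum(variances)
    proportions = [v / total for v in variances]
    cumulative = []
    running = 0.0
    for p in proportions:
        running += p
        cumulative.append(running)
    return proportions, cumulative


def components_needed(variances, threshold=0.9):
    """Smallest number of leading PCs whose cumulative share meets threshold."""
    _, cumulative = explained_variance(variances)
    for k, share in enumerate(cumulative, start=1):
        if share >= threshold:
            return k
    return len(variances)


# Hypothetical PC variances, already sorted from largest to smallest.
hypothetical_variances = [6.0, 3.5, 0.4, 0.1]
proportions, cumulative = explained_variance(hypothetical_variances)
n_chosen = components_needed(hypothetical_variances)

for i, (p, c) in enumerate(zip(proportions, cumulative), start=1):
    print(f"PC {i}: {p:.0%} of variance, {c:.0%} cumulative")
print("PCs chosen:", n_chosen)
```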
17
PC Calculation
Chosen Variables    PC 1    PC 2    PC 3
Pleasure           -0.19   -0.54   -0.55
Commute             0.54    0.14    0.36
Business           -0.40    0.48    0.23
Rural               0.56   -0.20   -0.02
Suburban           -0.45   -0.31    0.47
Urban              -0.03    0.58   -0.55

18
PC Calculation
  • PC1 = -0.19P + 0.54C - 0.40B
    + 0.56R - 0.45S - 0.03U
  • PC1(2002) = -0.19(30) + 0.54(50) - 0.40(20)
    + 0.56(20) - 0.45(30) - 0.03(50)
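
The 2002 calculation can be checked with a short Python sketch. The weights and the 2002 values are the ones shown on the slides; the function name `pc_score` is illustrative, and the sketch applies the weights to the raw values exactly as the slide does, without centering.

```python
# Weights for PC 1 as listed on the PC Calculation slide.
PC1_WEIGHTS = {
    "Pleasure": -0.19,
    "Commute": 0.54,
    "Business": -0.40,
    "Rural": 0.56,
    "Suburban": -0.45,
    "Urban": -0.03,
}

# 2002 values plugged into the slide's formula.
VALUES_2002 = {
    "Pleasure": 30,
    "Commute": 50,
    "Business": 20,
    "Rural": 20,
    "Suburban": 30,
    "Urban": 50,
}


def pc_score(weights, values):
    """Weighted sum of the variables: one PC value for one year."""
    return sum(weights[name] * values[name] for name in weights)


pc1_2002 = pc_score(PC1_WEIGHTS, VALUES_2002)
print(round(pc1_2002, 2))  # 9.5
```

Term by term this is -5.7 + 27.0 - 8.0 + 11.2 - 13.5 - 1.5 = 9.5.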

19
Example - Modeling Procedure
[Diagram labels: Vehicle Use, Territory; PCA; Average Premium; Multiple Regression; Chosen PCs]
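
The procedure on this slide (PCA on the explanatory variables, then a multiple regression of average premium on the chosen PCs) can be sketched as follows. This is a minimal sketch assuming NumPy, a covariance-based PCA, and ordinary least squares. All numbers other than the first data row (the 2002 values from the slides) are hypothetical, including the premiums, and `pc_regression` is an illustrative name rather than the presenter's implementation.

```python
import numpy as np


def pc_regression(X, y, n_components):
    """Regress y on the first n_components principal components of X."""
    centered = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]   # keep the largest PCs
    weights = eigvecs[:, order]
    scores = centered @ weights
    # Intercept column plus the chosen PC scores.
    design = np.column_stack([np.ones(len(y)), scores])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coefs
    return weights, coefs, fitted


# 5 years x 6 variables; row 1 is from the slides, rows 2-5 are hypothetical.
X = np.array([
    [30.0, 50.0, 20.0, 20.0, 30.0, 50.0],
    [31.0, 49.0, 20.0, 19.0, 31.0, 50.0],
    [33.0, 48.0, 19.0, 19.0, 31.0, 50.0],
    [34.0, 46.0, 20.0, 18.0, 33.0, 49.0],
    [36.0, 45.0, 19.0, 17.0, 33.0, 50.0],
])
# Hypothetical average premiums for the same five years.
y = np.array([1000.0, 1020.0, 1045.0, 1060.0, 1090.0])

weights, coefs, fitted = pc_regression(X, y, n_components=2)
print("intercept and PC coefficients:", np.round(coefs, 3))
print("residuals:", np.round(y - fitted, 3))
```

Because the PC scores are uncorrelated with one another, the coefficient on each PC does not change when another PC is added to or dropped from the regression.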
20
Example Results
Multiple Regression
21
ICBC Personal TPB
22
Advantages
  • Eliminates multicollinearity
  • Most of the original variance is captured in a
    few principal components
  • More refined selection method

23
Disadvantages
  • Can be hard to interpret the PCs
  • PC weights may not be stable from year to year
  • Difficult to explain

24
Is PCA Right For You?
  • Concerned about multicollinearity?
  • Confident in the set of explanatory variables?
  • Want to reduce dimensionality, without throwing
    away variables?

25
For More Information
  • 2008 Discussion Paper
    • PCA and Partial Least Squares: Two Dimension
      Reduction Techniques for Regression
    • http://www.casact.org/pubs/dpp/dpp08/08dpp76.pdf
  • Predictive modeling seminar
    • Oct 6-7, 2008 in San Diego, CA
    • PCA and Partial Least Squares