Haas MFE SAS Workshop, Lecture 3

Title: Introduction to SAS
Author: Peng Liu
Created: 3/22/2006 11:23:10 AM
Presentation format: On-screen Show

1
Haas MFE SAS Workshop Lecture 3
  • Peng Liu, http://faculty.haas.berkeley.edu/peliu/computing

Haas School of Business, Berkeley, MFE 2006
2
Commonly Used PROCedures in Financial Economics

3
Basic Statistical Analysis
  • Univariate statistics
      • PROC MEANS
      • PROC UNIVARIATE
      • PROC FREQ
  • Bivariate and multivariate statistics
      • PROC CORR
      • PROC NPAR1WAY
      • PROC TTEST

4
Comparison of PROC MEANS and PROC UNIVARIATE
  • PROC MEANS
      • DESCRIPTIVE STATISTICS: CLM CSS CV KURTOSIS LCLM MAX MEAN MIN N
        NMISS RANGE SKEWNESS STD STDERR SUM SUMWGT UCLM USS VAR
      • QUANTILE STATISTICS: MEDIAN|P50 Q1|P25 Q3|P75 P1 P5 P10 P90 P95
        P99 QRANGE
      • HYPOTHESIS TESTING: PROBT T
  • PROC UNIVARIATE
      • DESCRIPTIVE STATISTICS: CSS CV KURTOSIS MAX MEAN MIN MODE N NMISS
        RANGE SKEWNESS STD STDMEAN SUM SUMWGT USS VAR
      • QUANTILE STATISTICS: MEDIAN P1 P5 P10 P90 P95 P99 Q1 Q3 QRANGE
      • HYPOTHESIS TESTING: NORMAL PROBN MSIGN PROBM SIGNRANK PROBS T
        PROBT
      • ROBUST STATISTICS: GINI MAD QN SN STD_GINI STD_MAD STD_QN
        STD_QRANGE STD_SN

5
PROC MEANS
  • PROC MEANS DATA=mfe.loan;
  • VAR appraisal ltv;
  • CLASS state;
  • RUN;

PROC MEANS DATA=mfe.loan max min;
  VAR appraisal ltv;
  OUTPUT OUT=m max=maxvalue maxltv min=minvalue minltv;
RUN;
  • By default, PROC MEANS prints the variable label, N, Mean, Std Dev,
    Minimum, and Maximum.
  • MEDIAN, MIN, MAX, CLM, and ALPHA=0.05 are examples of options you can
    specify on the PROC MEANS statement.
  • You can get summary statistics for many variables at once by listing
    them in the VAR statement.
  • A CLASS statement produces summary statistics for each level of the
    grouping variable.
  • You can suppress printed output with the NOPRINT option.
  • You can save the results in a user-defined SAS dataset with the
    OUTPUT statement (OUT=).
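
The NOPRINT, CLASS, and OUTPUT points can be combined in one step. The following is a minimal sketch, assuming the same mfe.loan dataset; the output dataset name work.ltv_by_state and the variable names mean_ltv and median_ltv are hypothetical.

```sas
/* Sketch: per-state summary of ltv, saved to a dataset instead of printed. */
/* work.ltv_by_state, mean_ltv and median_ltv are illustrative names only.  */
PROC MEANS DATA=mfe.loan NOPRINT;
  CLASS state;                      /* one group per state                  */
  VAR ltv;
  OUTPUT OUT=work.ltv_by_state      /* results go here, nothing is printed  */
         MEAN=mean_ltv
         MEDIAN=median_ltv;
RUN;
```

With a CLASS statement the output dataset also holds an overall row (_TYPE_ = 0) in addition to one row per state.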

6
PROC UNIVARIATE
  • PROC UNIVARIATE DATA=mfe.loan;
  • VAR ltv;
  • ID id;
  • RUN;

PROC UNIVARIATE DATA=mfe.loan;
  VAR ltv;
  HISTOGRAM;
  QQPLOT / normal;
RUN;
  • Use VAR to specify which variables you want to analyze; otherwise the
    procedure analyzes all numeric variables.
  • Use ID to label the extreme observations; without an ID statement
    they are identified by observation number.
  • Can plot histograms, quantile-quantile plots, etc.
  • Reports a two-sided t test for the mean, among other tests.
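
As a hedged illustration of the t test mentioned above: the MU0= option sets the null value of the mean, and NORMAL requests tests for normality. The value 0.8 is a hypothetical null value and assumes ltv is stored as a fraction.

```sas
/* Sketch: test H0: mean(ltv) = 0.8 (hypothetical value) and check normality. */
PROC UNIVARIATE DATA=mfe.loan MU0=0.8 NORMAL;
  VAR ltv;
RUN;
```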

7
PROC FREQ
  • PROC FREQ DATA=mfe.loan;
  • TABLE term;
  • RUN;

PROC FREQ DATA=mfe.loan;
  TABLE state state*term / nocol norow;
RUN;
  • One-way vs. two-way frequency tables: TABLE term is one-way,
    state*term is two-way.
  • The CHISQ or BINOMIAL option (after the slash) can be used to test
    for equal proportions.
  • One TABLE statement can produce more than one frequency table.
  • You can suppress column and/or row percentages with the options
    / NOCOL NOROW.
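
A short sketch of the CHISQ option, using the same dataset and variables as the slide. On a one-way table it tests for equal proportions across levels; on a two-way table it tests for association between the two variables.

```sas
/* Sketch: chi-square tests on a one-way and a two-way table. */
PROC FREQ DATA=mfe.loan;
  TABLE term / CHISQ;                     /* equal proportions across terms */
  TABLE state*term / CHISQ NOCOL NOROW;   /* association of state and term  */
RUN;
```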

8
PROC CORR
  • PROC CORR DATA=mfe.loan;
  • VAR rate ltv fico_orig;
  • RUN;

PROC CORR DATA=mfe.loan COV SPEARMAN;
  VAR rate ltv fico_orig;
RUN;
  • The CORR procedure computes Pearson correlation coefficients, three
    nonparametric measures of association (Spearman rank-order
    correlation, Kendall's tau-b, and Hoeffding's measure of dependence,
    D), and the probabilities associated with these statistics for
    numeric variables.
  • The default is Pearson correlation.
  • The COV option invokes the computation of covariances.
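
To request all four measures of association in one run, the corresponding options can be listed on the PROC CORR statement. A minimal sketch with the slide's variables:

```sas
/* Sketch: Pearson plus the three nonparametric measures in one step. */
PROC CORR DATA=mfe.loan PEARSON SPEARMAN KENDALL HOEFFDING;
  VAR rate ltv fico_orig;
RUN;
```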

9
PROC TTEST
  • DATA;
  • INPUT a b @@;
  • DATALINES;
  • 51 55 64 61 75 74 86 90
  • 95 93 68 71 73 72 90 95
  • ;
  • RUN;

PROC TTEST;
  PAIRED a*b;
RUN;
  • A DATA step creates an automatically named dataset (DATA1, DATA2,
    ...) if the user does not specify a name.
  • The trailing @@ in INPUT holds the current line, so SAS keeps reading
    observations from the same data line.
  • DATALINES is a SAS statement followed by lines of raw data.
  • Values are typed continuously, separated by blanks; you can break
    them across lines wherever you like.
  • The semicolon (;) that ends the data should stand on a line by
    itself.
  • A PROC step runs on the most recently created dataset if the user
    does not specify a dataset name with DATA=.
  • PAIRED a*b requests a paired t test.

10
PROC NPAR1WAY
  • PROC NPAR1WAY DATA=mfe.loan;
  • CLASS state;
  • VAR ltv;
  • RUN;
  • Nonparametric tests for differences across a one-way classification.
  • If the normality assumption does not hold, we may use nonparametric
    tests instead.
  • PROC NPAR1WAY performs nonparametric tests for location and scale
    differences across a one-way classification, based on the following
    scores: Wilcoxon, median, Van der Waerden, Savage, Siegel-Tukey,
    Ansari-Bradley, Klotz, and Mood.
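
By default the procedure runs every available analysis; listing score options on the PROC statement restricts the output. A minimal sketch requesting only the Wilcoxon and median analyses (with more than two states, the Wilcoxon analysis reports the Kruskal-Wallis test):

```sas
/* Sketch: restrict NPAR1WAY to the Wilcoxon and median score analyses. */
PROC NPAR1WAY DATA=mfe.loan WILCOXON MEDIAN;
  CLASS state;
  VAR ltv;
RUN;
```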

11
Financial Econometrics using SAS
  • Linear models (OLS, GLS, and their variants)
      • PROC REG
      • PROC GLM (skip)
  • Logistic regression
      • PROC LOGISTIC
      • PROC GENMOD
  • Hazard regression (Cox proportional hazards)
      • PROC PHREG

12
Linear Model Theory
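  • Data: $(y_i, x_i)$ with $x_i = (x_{i1}, x_{i2}, \ldots, x_{ik})$ for
    $i = 1, \ldots, n$, and $y_i \in \mathbb{R}$
  • Model: $y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \varepsilon_i$
    for $i = 1, \ldots, n$
  • In matrix form: $y = X\beta + \varepsilon$
  • where $y = (y_1, \ldots, y_n)^T$, $X$ is the $n \times (k+1)$ matrix
    whose $i$-th row is $(1, x_{i1}, \ldots, x_{ik})$,
    $\beta = (\beta_0, \ldots, \beta_k)^T$, and
    $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_n)^T$
  • Assumption: the $\varepsilon_i$ are i.i.d. normal, $N(0, \sigma^2)$
  • Ordinary least squares estimate:
  • $\hat{\beta} = (X^T X)^{-1} X^T y$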
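
For reference, the standard derivation of the OLS estimate is sketched below in the notation of the model above. It assumes $X^T X$ is invertible.

```latex
\begin{aligned}
S(\beta) &= (y - X\beta)^T (y - X\beta) \\
\frac{\partial S}{\partial \beta} &= -2\, X^T (y - X\beta) = 0 \\
X^T X \,\hat{\beta} &= X^T y \\
\hat{\beta} &= (X^T X)^{-1} X^T y
\end{aligned}
```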

13
PROC REG
  • PROC REG is a SAS procedure for simple or
    multivariate linear regression models with
    continuous dependent variables.
  • Part of SAS/STAT
  • Model fitting (parameters, residuals, confidence limits, influence
    statistics, etc.)
  • Model selection (forward, backward, stepwise, etc.)
  • Hypothesis testing
  • Model diagnostics
  • Plotting
  • Outputting estimates and statistics

14
PROC REG Examples
  • PROC REG DATA=mfe.loan;
  • MODEL ltv = rate;
  • PLOT ltv*rate;
  • QUIT;

MODEL ltv = rate fico_orig;
OLS: MODEL ltv = term rate fico_orig;
MODEL ltv = rate fico_orig term / SELECTION=F;
  • Begin with PROC REG and end with QUIT.
  • Multiple independent (or dependent) variables are separated by
    spaces.
  • The label (OLS:) is optional; it is useful when one PROC REG contains
    several MODEL statements.
  • By default, a constant (intercept) is included.
  • Use options after the slash to request additional statistics or to
    specify a model selection method.
  • PLOT creates a scatter plot of your regression data and
    automatically adds the regression line.
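
The options and output features listed for PROC REG can be sketched as follows. CLB and VIF are MODEL options for confidence limits and variance inflation factors; the output dataset work.reg_out and the variable names ltv_hat and ltv_resid are hypothetical.

```sas
/* Sketch: labeled model with extra statistics, saving fitted values.  */
/* work.reg_out, ltv_hat and ltv_resid are illustrative names only.    */
PROC REG DATA=mfe.loan;
  OLS: MODEL ltv = rate fico_orig / CLB VIF;
  OUTPUT OUT=work.reg_out P=ltv_hat R=ltv_resid;  /* predicted, residual */
RUN;
QUIT;
```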

15
Logistic Regression Theory
  • Data: $(y_i, x_i)$ with $x_i = (x_{i1}, x_{i2}, \ldots, x_{ik})$ for
    $i = 1, \ldots, n$, where $y_i$ is a binary or ordinal response
    variable, e.g. $y_i \in \{0, 1\}$
  • Model
  • Maximum likelihood estimate of $\beta$
  • Assumption: binomial variation
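
The model formula is not preserved in this transcript. For the binary case, the standard logit model and its log-likelihood are given below as a reference; this is the textbook form and an assumption about what the slide showed.

```latex
\begin{aligned}
p_i &= \Pr(y_i = 1 \mid x_i), \qquad
\log\frac{p_i}{1 - p_i} = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} \\
\ell(\beta) &= \sum_{i=1}^{n} \bigl[\, y_i \log p_i + (1 - y_i) \log(1 - p_i) \,\bigr]
\end{aligned}
```

The maximum likelihood estimate $\hat{\beta}$ maximizes $\ell(\beta)$; there is no closed form, so it is computed iteratively.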

16
Logistic Regression SAS procedure
  • SAS has several procedures that perform logistic regression, e.g.
    GENMOD, CATMOD, and LOGISTIC
  • PROC LOGISTIC
      • Works for binary or ordinal response variables
      • Performs MLE using different optimization algorithms
      • 4 model selection methods: forward, backward, stepwise, score
      • Outputs statistics to a dataset
      • Tests linear hypotheses about the parameters

17
PROC LOGISTIC Examples
  • PROC LOGISTIC DATA=mfe.loan;
  • CLASS state edu;
  • MODEL default = ltv age edu term rate state / LINK=LOGIT;
  • RUN;
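  • Begin with PROC LOGISTIC and end with RUN (as above) or QUIT.
  • The / LINK=LOGIT option can be omitted because it is the default;
    other LINK= values are PROBIT, CLOGIT, and CLOGLOG.
  • Use the CLASS statement to avoid creating dummy variables in a DATA
    step.
  • Options after the slash can be used to request additional statistics
    or to specify a selection method.
  • The TEST statement tests linear hypotheses about the parameters.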
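
A minimal sketch of an alternative link and a TEST statement. It assumes default is coded 0/1 with 1 meaning the loan defaulted, so DESCENDING is used to model the probability of default = 1; the label joint is hypothetical.

```sas
/* Sketch: probit link plus a joint test that two coefficients are zero. */
/* Assumes default is 0/1 with 1 = defaulted; "joint" is just a label.   */
PROC LOGISTIC DATA=mfe.loan DESCENDING;
  MODEL default = ltv rate fico_orig / LINK=PROBIT;
  joint: TEST ltv = 0, rate = 0;
RUN;
```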

18
Survival Analysis Background 1
19
Survival Analysis Background 2
20
Cox Proportional Hazard Regression
21
PROC PHREG - Example
  • PROC PHREG DATA=mfe.loan;
  • MODEL loanage*prepay(0) = age edu race rate ltv fico_orig state;
  • RUN;
  • Use a WHERE statement to subset the sample you want to regress on.
  • You can define or group variables inside PHREG, after the MODEL
    statement, using IF-THEN/ELSE statements.
  • Handle tied event times with / TIES=EXACT; another option is
    TIES=DISCRETE.
  • To run PHREG separately for different groups, use a BY statement;
    the data must be sorted by the BY variable first.
  • Use the CLASS statement to create dummy variables.
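
The WHERE, BY, and TIES points can be put together as below. This is a sketch under stated assumptions: the sorted dataset work.loan_sorted is a hypothetical name, and the WHERE condition is only an illustrative filter.

```sas
/* Sketch: one Cox regression per state, with exact handling of ties.  */
/* work.loan_sorted and the WHERE filter are illustrative assumptions. */
PROC SORT DATA=mfe.loan OUT=work.loan_sorted;
  BY state;                          /* BY processing needs sorted data */
RUN;

PROC PHREG DATA=work.loan_sorted;
  WHERE fico_orig > 0;               /* hypothetical subset condition   */
  MODEL loanage*prepay(0) = rate ltv fico_orig / TIES=EXACT;
  BY state;
RUN;
```

In the MODEL statement, prepay(0) marks observations with prepay = 0 as censored.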