Applying Statistics to Litigation Consulting - PowerPoint PPT Presentation

Slides: 39
Provided by: shawnr5
1
Applying Statistics to Litigation Consulting
  • Shawn Roeske, Chin Yu

2
WARNING / PELIGRO
When Using Statistical Analysis
  • Pros
    • Provides a basis or sanity check
    • Widely accepted methodology
    • Can develop conclusions on large amounts of data
  • Cons
    • Complex equations
    • Filled with hidden assumptions
    • Many areas for opposing side to attack

3
Points of Discussion
  • Simple Regression Analysis
    • When to use
    • Interpreting regression output
  • Time Series Analysis
  • Problems Using Regression Analysis

4
Why Use Regression?
  • When we are using one variable to draw a
    conclusion about another variable
  • Make predictions of one variable using another
  • Test assumptions about the relation between
    variables
  • Quantify the strength of the relationship between
    variables

5
Linear Regression Equation
$Y_i = b_0 + b_1 X_i + e_i, \quad i = 1, \ldots, n$
Where:
  • Yi = Dependent Variable
  • b0 = Intercept
  • b1 = Slope Coefficient
  • Xi = Independent Variable
  • ei = Error term

6
Linear Regression Equation (cont.)
  • Linear regression assumes a linear relationship
    between the dependent and independent variables
  • Linear regression computes a line that best fits
    the observations: it chooses values for the
    intercept, b0, and slope, b1, that minimize the
    sum of the squared vertical distances between the
    observations and the regression line
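
The least-squares idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' method: the function name `fit_line` and the data are hypothetical, and the closed-form formulas are the standard ones for a single independent variable.

```python
def fit_line(x, y):
    """Return (b0, b1) that minimize the sum of squared vertical distances."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data that lie exactly on y = 2 + 3x.
quarters = [1, 2, 3, 4, 5]
revenue = [5.0, 8.0, 11.0, 14.0, 17.0]
b0, b1 = fit_line(quarters, revenue)
print(b0, b1)
```

Because the made-up points sit exactly on a line, the fit recovers the intercept 2 and slope 3; with real data the line only approximates the observations.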

7
Sample Regression Output
8
Assumptions of The Linear Regression
  • A linear relation exists between the dependent
    variable and the independent variable
  • The independent variable is not random
  • The expected value of the error term is 0
  • The variance of the error term is the same for
    all observations
  • The error term is uncorrelated across
    observations
  • The error term is normally distributed

9
What did you just say those assumptions were?
  • Assumption 1
    • If the independent and dependent variables DO NOT
      have a linear relation, then estimating that
      relation with a linear regression model will be INVALID
  • Assumptions 2 & 3
    • Needed to ensure that the linear regression
      produces the correct estimates of b0 and b1
  • Assumption 4
    • Also known as the HOMOSKEDASTICITY assumption,
      i.e. the errors have equal variances
  • Assumption 5
    • Necessary for correctly estimating the variances
      of the estimated parameters
  • Assumption 6
    • Allows us to easily test a particular hypothesis
      about a linear regression model

10
Sample Regression Output
11
Coefficients
  • Coefficients correspond to the b's in a standard
    linear regression model
  • Y = b0 + b1X1 + ei
  • In our example, b1 = 422.09
  • This means that for every time period that passes, in
    this case quarters, predicted revenues increase by
    422.09
  • The standard error measures the precision of each
    coefficient as an estimate of the model
    parameter

12
Sample Regression Output
y = 13950.75 + 422.09x
13
Sample Regression Output
14
R-Squared (R2) or Coefficient of Determination
  • Goodness of fit for the regression model
  • Measures the proportion, or percentage, of the total
    variation in the dependent variable explained by
    the model

15
R-Squared (R2) or Coefficient of Determination
$R^2 = \frac{RSS}{TSS}$
Where:
  • RSS = Regression Sum of Squares
  • TSS = Total Sum of Squares
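
As a rough illustration of the ratio of the regression sum of squares to the total sum of squares, the sketch below computes R2 in Python. The data and the function name `r_squared` are hypothetical; the fitted values are assumed to come from the least-squares line y = 2.2 + 0.6x for x = 1, ..., 5.

```python
def r_squared(y, y_hat):
    """R2 = regression sum of squares / total sum of squares."""
    mean_y = sum(y) / len(y)
    # TSS: total variation of the observed values around their mean.
    tss = sum((yi - mean_y) ** 2 for yi in y)
    # RSS: variation of the fitted values around the same mean (explained part).
    rss = sum((fi - mean_y) ** 2 for fi in y_hat)
    return rss / tss


# Hypothetical observations and their least-squares fitted values.
actual = [2.0, 4.0, 5.0, 4.0, 5.0]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
r2 = r_squared(actual, fitted)
print(round(r2, 4))
```

Here the model explains about 60% of the variation in the made-up data. The ratio is only a valid proportion when the fitted values come from a least-squares fit that includes an intercept.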
16
Sample Regression Output
17
Caveats for Using R2
  • A high R2 does not imply causality
  • Just because your R2 is high does not mean your
    regression is reliable
  • What does it mean? It means you must look at
    other factors along with your R2

18
Standard Error Estimate
  • The Standard Error of Estimate (SEE), or Standard
    Error of the Regression, measures the standard
    deviation of the error term
  • SEE is computed from the differences between the
    actual and predicted values of the dependent
    variable for each observation

19
Standard Error Equation
20
Standard Error Equation (cont.)
$SEE = \sqrt{\frac{SSR}{n - 2}}$
Where:
  • n = number of observations
  • SSR = Sum of Squared Residuals
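
A small Python sketch of the standard error of estimate follows. It assumes the simple-regression form with n - 2 degrees of freedom (one slope plus one intercept); the function name and data are hypothetical.

```python
import math


def standard_error_of_estimate(y, y_hat):
    """SEE = sqrt(SSR / (n - 2)) for a regression with one independent variable."""
    n = len(y)
    # SSR: sum of squared residuals (actual minus predicted).
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return math.sqrt(ssr / (n - 2))


# Hypothetical observations and fitted values from the line y = 2.2 + 0.6x.
actual = [2.0, 4.0, 5.0, 4.0, 5.0]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
see = standard_error_of_estimate(actual, fitted)
print(round(see, 4))
```

The result is in the same units as the dependent variable, which makes it easier to explain than R2 when describing how far predictions typically miss.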
21
Sample Regression Output
22
Testing t-Statistic
  • A t-test is used to test the significance of
    individual estimated coefficients
  • i.e. it tests whether a single regression
    coefficient is significantly different from zero

23
Testing t-Statistic (cont)
  • The critical t-value is obtained from a
    t-distribution table, using the chosen significance
    level (error rate) and the correct df (df = n - 2
    in simple regression)
  • The t-stat in our example is 8.81, which is
    greater than the critical value of 2.145, so the
    coefficient is significantly different from zero

[Figure: t-distribution centered at 0, with a = 1 - .95 = .05; critical regions of a/2 = .025 in each tail, beyond critical t = -2.145 and critical t = 2.145]
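
The comparison of the t-stat with the critical value can be sketched as below. The slope 422.09 and critical value 2.145 come from the slides; the standard error 47.91 is NOT given in the transcript and is back-computed here from 422.09 / 8.81 purely for illustration. The function names are hypothetical.

```python
def t_statistic(coefficient, standard_error, hypothesized=0.0):
    """t-stat for testing whether a coefficient differs from a hypothesized value."""
    return (coefficient - hypothesized) / standard_error


def is_significant(t_stat, critical_t):
    """Two-tailed test: reject the null when |t| exceeds the critical value."""
    return abs(t_stat) > critical_t


b1 = 422.09          # slope from the sample output
se_b1 = 47.91        # assumed standard error, back-computed (not in the source)
critical_t = 2.145   # two-tailed 5% critical value for 14 degrees of freedom

t_stat = t_statistic(b1, se_b1)
print(round(t_stat, 2), is_significant(t_stat, critical_t))
```

With these inputs the t-stat is roughly 8.81, well beyond the critical value, so the slope would be judged significantly different from zero.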
24
Sample Regression Output
25
Confidence Intervals
  • A confidence interval is an interval that we
    believe includes the true parameter value, b1,
    with a given degree of confidence
  • To compute a confidence interval, we must:
    • Select the significance level for the test
    • Know the standard error of the estimated
      coefficient

26
Confidence Interval Equation
$CI = b_1 \pm t_c \, s_{b_1}$
Where:
  • CI = Confidence Interval
  • b1 = Coefficient of the Independent Variable
  • tc = Critical t-value
  • sb1 = Standard Error of the Coefficient
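
A minimal Python sketch of the interval calculation, using the slope (422.09) and critical t-value (2.145) quoted in the slides. The standard error of 47.91 is an assumption back-computed from the reported t-stat of 8.81; it does not appear in the transcript. The function name is hypothetical.

```python
def confidence_interval(coefficient, critical_t, standard_error):
    """Return (lower, upper) bounds: coefficient +/- critical_t * standard_error."""
    margin = critical_t * standard_error
    return coefficient - margin, coefficient + margin


b1 = 422.09          # slope from the sample output
critical_t = 2.145   # from the t table, as quoted in the slides
se_b1 = 47.91        # assumed standard error (not given in the source)

low, high = confidence_interval(b1, critical_t, se_b1)
print(round(low, 2), round(high, 2))
```

Under these assumptions the interval runs from roughly 319 to 525 per quarter and does not include zero, which agrees with the t-test conclusion.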
27
Sample Regression Output
Look up the critical value in Student's t table
28
Types of Data Used
  • Cross-sectional Data
    • Observations of the independent and dependent
      variables for the same time period
  • Time-series Data
    • Observations of the independent and dependent
      variables over time

29
Time-Series Analysis
  • A time series is a set of values of a particular
    variable in different time periods (in this case
    time is the independent variable)
  • Can be used in forecasting by estimating a linear
    trend in a time-series and using that trend to
    predict future values

30
Linear Trend
  • The simplest type of trend is a linear trend,
    which is expressed in the following equation:

$y_t = b_0 + b_1 t + e_t$

We're regressing the desired variable (dependent) on
time (independent)
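
To show how a fitted linear trend turns into a forecast, here is a short Python sketch. It uses the coefficients from the sample output (13950.75 and 422.09) and assumes the fitted line is y = 13950.75 + 422.09x. Numbering the quarters 1 to 16 and forecasting quarter 17 is a hypothetical choice for illustration, as is the function name.

```python
def trend_forecast(b0, b1, t):
    """Predicted value of the series at time period t under a linear trend."""
    return b0 + b1 * t


b0 = 13950.75   # intercept from the sample output
b1 = 422.09     # slope: increase per quarter
next_quarter = 17  # assumed: the sample covers quarters 1..16

forecast = trend_forecast(b0, b1, next_quarter)
print(round(forecast, 2))
```

The forecast simply extends the line one period ahead; it says nothing about whether the trend will continue outside the sample period.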
31
Seasonality
  • Seasonality is a regular pattern of movements
    that repeats within a given time period
  • If significant seasonality exists, we can correct
    for it by adding a seasonal lag to the model

32
Potential Problems
  • Heteroskedasticity
    • Variance of the errors differs across
      observations
    • Can make relationships appear significant when
      they really are not
    • Causes incorrect standard errors

33
Sample Regression Output
[Figure panels: Linear Pattern, Linear Pattern, No Pattern, Exponential Pattern, Parabolic Pattern]
34
Potential Problems (cont.)
  • Serial Correlation (a.k.a. Autocorrelation)
    • Regression errors are correlated across
      observations
    • Causes incorrect standard errors
    • Parameter estimates are accurate as long as none
      of the independent variables is a lagged value of
      the dependent variable

35
Detecting Problems
  • Heteroskedasticity
    • Breusch-Pagan Test
  • Serial Correlation
    • Durbin-Watson Test Statistic
  • EXCEL DOES NOT PERFORM EITHER TEST!
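
Since a spreadsheet may not offer these tests, the statistics can be computed by hand from the regression residuals. The sketch below is illustrative only: the residuals and function names are hypothetical, and the Breusch-Pagan function uses one common form of the statistic (n times the R2 from regressing the squared residuals on the independent variable), compared with a chi-square critical value with 1 degree of freedom.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest no first-order serial correlation."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def breusch_pagan_nr2(x, residuals):
    """n * R2 from regressing squared residuals on x (one independent variable)."""
    n = len(x)
    sq = [e ** 2 for e in residuals]
    mean_x = sum(x) / n
    mean_sq = sum(sq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((si - mean_sq) ** 2 for si in sq)
    sxy = sum((xi - mean_x) * (si - mean_sq) for xi, si in zip(x, sq))
    if syy == 0:
        return 0.0  # squared residuals are constant: no sign of heteroskedasticity
    # With one independent variable, R2 equals the squared correlation.
    return n * sxy ** 2 / (sxx * syy)


# Hypothetical residuals that keep their sign for two periods at a time.
persistent = [1.0, 1.0, -1.0, -1.0]
dw = durbin_watson(persistent)

# Hypothetical residuals whose size grows with x.
x = [1, 2, 3, 4, 5, 6]
growing = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6]
bp = breusch_pagan_nr2(x, growing)

CHI2_CRIT_1DF_5PCT = 3.841  # chi-square critical value, 1 df, 5% level
print(dw, bp > CHI2_CRIT_1DF_5PCT)
```

A Durbin-Watson value well below 2 points toward positive serial correlation and a value well above 2 toward negative serial correlation. Formal conclusions require the Durbin-Watson critical-value tables, which this sketch does not include.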

36
Caveats Using Regression Analysis
  • Your estimate is only relevant for the time
    period you are looking at, but it may be the best
    estimator you have for the future
  • Excel works great if you want a quick and dirty
    regression, but it has many limitations; for example,
    Excel does not contain tools to test for
    heteroskedasticity and autocorrelation

37
Questions
38
Sources
  • http://www.ats.ucla.edu/stat/
  • http://www.surveysystem.com
  • Basic Econometrics, Gujarati, Damodar N.,
    McGraw-Hill, 1995.
  • The Basic Practice of Statistics, Moore, David
    S., W.H. Freeman and Company, 1995.
  • Quantitative Methods for Investment Analysis,
    DeFusco, McLeavey, Pinto, Runkle, Association for
    Investment Management and Research, 2001.