Probability Distribution of Random Error

Provided by: dto9
Learn more at: https://www.msu.edu
1
Probability Distribution of Random Error
2
Regression Modeling Steps
  • 1. Hypothesize Deterministic Component
  • 2. Estimate Unknown Model Parameters
  • 3. Specify Probability Distribution of Random Error Term
    • Estimate Standard Deviation of Error
  • 4. Evaluate Model
  • 5. Use Model for Prediction & Estimation

3
Linear Regression Assumptions
  • Assumptions of errors ε1, ..., εn
  • - Gauss-Markov condition
  • Independent errors
  • Mean of probability distribution of errors is 0
  • Errors have constant variance σ², for which an estimator is S²
  • Probability distribution of error is normal
  • Potential violation of G-M condition.

4
Error Probability Distribution
5
Random Error Variation
6
Random Error Variation
  • 1. Variation of Actual Y from Predicted Y

7
Random Error Variation
  • 1. Variation of Actual Y from Predicted Y
  • 2. Measured by Standard Error of Regression Model
    • Sample Standard Deviation of ε, s


8
Random Error Variation
  • 1. Variation of Actual Y from Predicted Y
  • 2. Measured by Standard Error of Regression Model
    • Sample Standard Deviation of ε, s
  • 3. Affects Several Factors
    • Parameter Significance
    • Prediction Accuracy


9
Evaluating the Model
  • Testing for Significance

10
Regression Modeling Steps
  • 1. Hypothesize Deterministic Component
  • 2. Estimate Unknown Model Parameters
  • 3. Specify Probability Distribution of Random Error Term
    • Estimate Standard Deviation of Error
  • 4. Evaluate Model
  • 5. Use Model for Prediction & Estimation

11
Test of Slope Coefficient
  • 1. Shows If There Is a Linear Relationship Between X & Y
  • 2. Involves Population Slope β1
  • 3. Hypotheses
    • H0: β1 = 0 (No Linear Relationship)
    • Ha: β1 ≠ 0 (Linear Relationship)
  • 4. Theoretical basis of the test statistic is the sampling distribution of the slope

12
Sampling Distribution of Sample Slopes
13
Sampling Distribution of Sample Slopes
14
Sampling Distribution of Sample Slopes
  • All Possible Sample Slopes
    • Sample 1: 2.5
    • Sample 2: 1.6
    • Sample 3: 1.8
    • Sample 4: 2.1
    • ... (very large number of sample slopes)

15
Sampling Distribution of Sample Slopes
  • All Possible Sample Slopes
    • Sample 1: 2.5
    • Sample 2: 1.6
    • Sample 3: 1.8
    • Sample 4: 2.1
    • ... (large number of sample slopes)

[Figure: Sampling Distribution of the sample slope, centered at β1 with standard error S_β̂1]
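The idea of "all possible sample slopes" can be imitated by simulation. The sketch below is only an illustration: the population values BETA0, BETA1, and SIGMA are hypothetical numbers chosen for the demo, not values given in the presentation, and the errors are drawn as independent normal values with mean 0, as the regression assumptions require.

```python
import math
import random
import statistics


def sample_slope(x, beta0, beta1, sigma, rng):
    """Simulate one sample of Y at the fixed x values and return its least-squares slope."""
    y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return ss_xy / ss_xx


# Hypothetical population values, chosen only for illustration.
BETA0, BETA1, SIGMA = -0.1, 0.7, 0.6
x = [1, 2, 3, 4, 5]
rng = random.Random(0)  # fixed seed so the run is repeatable

# Draw many samples and collect the slope estimated from each one.
slopes = [sample_slope(x, BETA0, BETA1, SIGMA, rng) for _ in range(2000)]
mean_slope = statistics.mean(slopes)
sd_slope = statistics.stdev(slopes)

# Theoretical standard deviation of the sample slope: sigma / sqrt(SSxx).
x_bar = sum(x) / len(x)
theoretical_sd = SIGMA / math.sqrt(sum((xi - x_bar) ** 2 for xi in x))

print(f"mean of slopes = {mean_slope:.3f} (population slope {BETA1})")
print(f"sd of slopes   = {sd_slope:.3f} (theory {theoretical_sd:.3f})")
```

The simulated slopes should cluster around the population slope, with a spread close to the theoretical value.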
16
Slope Coefficient Test Statistic
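The formula on this slide did not survive in the transcript. A standard form of the slope test statistic, consistent with the computer output later in the presentation (t Value = Estimate / Standard Error), is sketched below. The symbol SS_xx for the sum of squared deviations of X is assumed notation, chosen to parallel the SSyy used elsewhere in the slides.

```latex
t = \frac{\hat{\beta}_1 - 0}{S_{\hat{\beta}_1}},
\qquad
S_{\hat{\beta}_1} = \frac{S}{\sqrt{SS_{xx}}},
\qquad
SS_{xx} = \sum_{i=1}^{n} (X_i - \bar{X})^2,
\qquad
df = n - 2
```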
17
Test of Slope Coefficient Rejection Rule
  • Reject H0 in favor of Ha if t falls in the colored area
  • Reject H0 for Ha if P-value = P(|T| > |t|) < α

[Figure: density of T ~ t(n-2) centered at 0, with "Reject H0" regions of area α/2 in each tail, below -t(1-α/2, n-2) and above t(1-α/2, n-2)]
18
Test of Slope Coefficient Example
  • Reconsider the Obstetrics example with the following data:

    Estriol (mg/24h)   B.w. (g/1000)
    1                  1
    2                  1
    3                  2
    4                  2
    5                  4

  • Is the Linear Relationship between Estriol & Birthweight significant at the .05 level?

19
Solution Table for βs
20
Solution Table for SSE
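The solution tables themselves are missing from the transcript. As a rough stand-in, the sketch below recomputes the least-squares estimates, SSE, and the standard error s from the five Obstetrics data pairs. The variable names are illustrative; the results should agree with the SAS output shown elsewhere in the presentation (slope 0.70000, intercept -0.10000, Root MSE 0.60553).

```python
import math

# Obstetrics data from the worked example: estriol (mg/24h), birthweight (g/1000).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates of the slope and intercept.
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Residuals, sum of squared errors, and standard error of the regression model.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)
s = math.sqrt(sse / (n - 2))

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, SSE = {sse:.4f}, s = {s:.5f}")
```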




21
Test of Slope Parameter Solution
  • H0: β1 = 0
  • Ha: β1 ≠ 0
  • α = .05
  • df = 5 - 2 = 3
  • Critical Value(s):

Test Statistic
22
Test Statistic Solution
From Table
23
Test of Slope Parameter
  • H0: β1 = 0
  • Ha: β1 ≠ 0
  • α = .05
  • df = 5 - 2 = 3
  • Critical Value(s):

Test Statistic:
Decision: Reject at α = .05
Conclusion: There is evidence of a linear relationship
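The numbers behind this decision are not in the transcript, so here is a minimal sketch of the test on the Obstetrics data. The function name slope_t_test is illustrative, and the critical value 3.182 is an assumption read from a standard t table (two-sided, α = .05, 3 df) rather than a figure taken from the slides.

```python
import math


def slope_t_test(x, y):
    """Return (b1, standard error of b1, t statistic, df) for simple linear regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))      # standard error of the regression model
    se_b1 = s / math.sqrt(ss_xx)      # standard error of the slope estimate
    return b1, se_b1, b1 / se_b1, n - 2


estriol = [1, 2, 3, 4, 5]
birthweight = [1, 1, 2, 2, 4]
b1, se_b1, t_stat, df = slope_t_test(estriol, birthweight)

# Assumption: two-sided critical value for alpha = .05 and 3 df, from a t table.
T_CRIT = 3.182
reject_h0 = abs(t_stat) > T_CRIT

print(f"b1 = {b1:.4f}, SE = {se_b1:.5f}, t = {t_stat:.2f} on {df} df")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The statistic should match the t Value of 3.66 that SAS reports for Estriol.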
24
Test of Slope Parameter: Computer Output
  • Parameter Estimates

    Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
    Intercept   1    -0.10000             0.63509          -0.16     0.8849
    Estriol     1     0.70000             0.19149           3.66     0.0354

Callouts: Parameter Estimate is β̂k, Standard Error is S_β̂k, t Value is t = β̂k / S_β̂k, and Pr > |t| is the P-Value.
25
Measures of Variation in Regression
  • 1. Total Sum of Squares (SSyy)
    • Measures Variation of Observed Yi Around the Mean Ȳ
  • 2. Explained Variation (SSR)
    • Variation Due to Relationship Between X & Y
  • 3. Unexplained Variation (SSE)
    • Variation Due to Other Factors

26
Variation Measures
Unexplained sum of squares: (Yi - Ŷi)²

Yi
Total sum of squares: (Yi - Ȳ)²
Explained sum of squares: (Ŷi - Ȳ)²

27
Coefficient of Determination
  • 1. Proportion of Variation Explained by Relationship Between X & Y

0 ≤ r² ≤ 1
28
Coefficient of Determination Examples
r² = 1
r² = 1
r² = .8
r² = 0
29
Coefficient of Determination Example
  • Reconsider the Obstetrics example. Interpret a coefficient of determination of 0.8167.
  • Answer: About 82% of the total variation in birthweight is explained by the mother's Estriol level.
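To see where 0.8167 comes from, the sketch below splits the total variation of birthweight into explained and unexplained parts for the five data pairs and takes r² as explained over total. Variable names are illustrative only.

```python
# Obstetrics data: estriol (mg/24h) and birthweight (g/1000).
x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares fit.
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

# Decomposition of variation: total = explained + unexplained.
ss_yy = sum((yi - y_bar) ** 2 for yi in y)                  # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)                # explained
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))       # unexplained

r_squared = ssr / ss_yy

print(f"SSyy = {ss_yy:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"r^2 = {r_squared:.4f}")
```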

30
r²: Computer Output

    Root MSE          0.60553    R-Square   0.8167
    Dependent Mean    2.00000    Adj R-Sq   0.7556
    Coeff Var        30.27650

Callouts: R-Square is r²; Adj R-Sq is r² adjusted for number of explanatory variables & sample size; Root MSE is S.
31
Using the Model for Prediction & Estimation
32
Regression Modeling Steps
  • 1. Hypothesize Deterministic Component
  • 2. Estimate Unknown Model Parameters
  • 3. Specify Probability Distribution of Random Error Term
    • Estimate Standard Deviation of Error
  • 4. Evaluate Model
  • 5. Use Model for Prediction & Estimation

33
Prediction With Regression Models
  • What Is Predicted?
  • Population Mean Response E(Y) for Given X
  • Point on Population Regression Line
  • Individual Response (Yi) for Given X

34
What Is Predicted?
35
Confidence Interval Estimate of Mean Y
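The interval formula for this slide is missing from the transcript. A standard form, which reproduces the "95% CL Mean" column of the SAS output later in the presentation, is sketched below. The symbols X_p (the X value of interest) and SS_xx (sum of squared deviations of X) are assumed notation.

```latex
\hat{Y} \pm t_{1-\alpha/2,\,n-2} \cdot S_{\hat{Y}},
\qquad
S_{\hat{Y}} = S \sqrt{\frac{1}{n} + \frac{(X_p - \bar{X})^2}{SS_{xx}}},
\qquad
\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_p
```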
36
Factors Affecting Interval Width
  • 1. Level of Confidence (1 - α)
    • Width Increases as Confidence Increases
  • 2. Data Dispersion (s)
    • Width Increases as Variation Increases
  • 3. Sample Size
    • Width Decreases as Sample Size Increases
  • 4. Distance of Xp from Mean X̄
    • Width Increases as Distance Increases

37
Why Distance from Mean?
Greater dispersion than X1
X̄
38
Confidence Interval Estimate Example
  • Reconsider the Obstetrics example with the following data:

    Estriol (mg/24h)   B.w. (g/1000)
    1                  1
    2                  1
    3                  2
    4                  2
    5                  4

  • Estimate the mean BW and a subject's BW response when the Estriol level is 4, at the .05 level.

39
Solution Table
40
Confidence Interval Estimate Solution - Mean BW
X to be predicted
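The worked solution is missing from the transcript, so the following is a minimal sketch of the 95% confidence interval for mean birthweight at an Estriol level of 4. The critical value 3.182 is an assumption taken from a standard t table (3 df), and all names are illustrative. The result should be close to the "95% CL Mean" limits that SAS reports for observation 4 (1.6445 to 3.7555).

```python
import math

x = [1, 2, 3, 4, 5]          # estriol (mg/24h)
y = [1, 1, 2, 2, 4]          # birthweight (g/1000)
X_P = 4                      # X value at which the mean response is estimated
T_CRIT = 3.182               # assumption: t(.975, 3 df) from a t table

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

# Point estimate and standard error of the estimated mean response at X_P.
y_hat = b0 + b1 * X_P
se_mean = s * math.sqrt(1 / n + (X_P - x_bar) ** 2 / ss_xx)

ci_lower = y_hat - T_CRIT * se_mean
ci_upper = y_hat + T_CRIT * se_mean

print(f"Predicted mean BW at X = {X_P}: {y_hat:.4f}")
print(f"95% CI for mean BW: ({ci_lower:.4f}, {ci_upper:.4f})")
```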
41
Prediction Interval of Individual Response
Note!
42
Why the Extra S?
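One way to read the "extra S": an individual response varies around the regression line with variance S², on top of the uncertainty in the estimated line itself, so a 1 is added under the square root. The sketch below computes the 95% prediction interval for one subject at an Estriol level of 4. The critical value 3.182 is an assumption from a standard t table (3 df), and names are illustrative. The limits should be close to the SAS "95% CL Predict" values for observation 4 (0.5028 to 4.8972).

```python
import math

x = [1, 2, 3, 4, 5]          # estriol (mg/24h)
y = [1, 1, 2, 2, 4]          # birthweight (g/1000)
X_P = 4                      # X value for the individual prediction
T_CRIT = 3.182               # assumption: t(.975, 3 df) from a t table

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

y_hat = b0 + b1 * X_P
leverage = 1 / n + (X_P - x_bar) ** 2 / ss_xx

# Mean response: uncertainty in the fitted line only.
se_mean = s * math.sqrt(leverage)
# Individual response: the extra 1 adds the variance of a single error term.
se_pred = s * math.sqrt(1 + leverage)

pi_lower = y_hat - T_CRIT * se_pred
pi_upper = y_hat + T_CRIT * se_pred

print(f"SE (mean) = {se_mean:.4f}, SE (individual) = {se_pred:.4f}")
print(f"95% prediction interval: ({pi_lower:.4f}, {pi_upper:.4f})")
```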
43
SAS code for computing mean and prediction intervals

    data BW;  /* Reading data in SAS */
      input estriol birthw;
      cards;
    1 1
    2 1
    3 2
    4 2
    5 4
    ;
    run;

    proc reg data=BW;  /* Fitting a linear regression model */
      model birthw = estriol / CLI CLM alpha=.05;
    run;

44
Interval Estimate from SAS- Output
    The REG Procedure
    Dependent Variable: y
    Output Statistics

    Obs   Dep Var y   Predicted Value   Std Error Mean Predict   95% CL Mean          95% CL Predict       Residual
    1     1.0000      0.6000            0.4690                   -0.8927   2.0927     -1.8376   3.0376      0.4000
    2     1.0000      1.3000            0.3317                    0.2445   2.3555     -0.8972   3.4972     -0.3000
    3     2.0000      2.0000            0.2708                    1.1382   2.8618     -0.1110   4.1110      0
    4     2.0000      2.7000            0.3317                    1.6445   3.7555      0.5028   4.8972     -0.7000
    5     4.0000      3.4000            0.4690                    1.9073   4.8927      0.9624   5.8376      0.6000

Callouts: Predicted Y when X = 3 (Predicted Value); S_Ŷ (Std Error Mean Predict); Confidence Interval (95% CL Mean); Prediction Interval (95% CL Predict).

45
Hyperbolic Interval Bands
46
Correlation Models
47
Types of Probabilistic Models
48
Correlation vs. regression
  • Both variables are treated the same in correlation; in regression there is a predictor and a response
  • In regression the x variable is assumed
    non-random or measured without error
  • Correlation is used in looking for relationships,
    regression for prediction

49
Correlation Models
  • 1. Answer "How Strong Is the Linear Relationship Between 2 Variables?"
  • 2. Coefficient of Correlation Used
    • Population Correlation Coefficient Denoted ρ (Rho)
    • Values Range from -1 to 1
    • Measures Degree of Association
  • 3. Used Mainly for Understanding

50
Sample Coefficient of Correlation
  • 1. Pearson Product Moment Coefficient of
    Correlation between x and y
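The formula for r is missing from the transcript. As a sketch, the standard Pearson product moment coefficient is the cross-product sum divided by the square root of the product of the two sums of squares. The function name pearson_r is illustrative; it is applied here to the Obstetrics data used throughout the presentation.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation: SSxy / sqrt(SSxx * SSyy)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return ss_xy / math.sqrt(ss_xx * ss_yy)


estriol = [1, 2, 3, 4, 5]
birthweight = [1, 1, 2, 2, 4]

r = pearson_r(estriol, birthweight)
print(f"r = {r:.4f}, r^2 = {r ** 2:.4f}")
```

Squaring r for these data should give back the coefficient of determination, 0.8167, reported for the regression.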

51
Coefficient of Correlation Values
[Figure: scale of r with ticks at -1.0, -.5, 0, .5, 1.0]
52
Coefficient of Correlation Values
[Figure: same scale; 0 labeled "No Correlation"]
53
Coefficient of Correlation Values
[Figure: same scale; 0 labeled "No Correlation"; arrow labeled "Increasing degree of negative correlation"]
54
Coefficient of Correlation Values
[Figure: same scale; -1.0 labeled "Perfect Negative Correlation"; 0 labeled "No Correlation"]
55
Coefficient of Correlation Values
[Figure: same scale; -1.0 labeled "Perfect Negative Correlation"; 0 labeled "No Correlation"; arrow labeled "Increasing degree of positive correlation"]
56
Coefficient of Correlation Values
[Figure: same scale; -1.0 labeled "Perfect Negative Correlation"; 0 labeled "No Correlation"; 1.0 labeled "Perfect Positive Correlation"]
57
Coefficient of Correlation Examples
r = 1
r = -1
r = .89
r = 0
58
Test of Coefficient of Correlation
  • 1. Shows If There Is a Linear Relationship Between 2 Numerical Variables
  • 2. Same Conclusion as Testing Population Slope β1
  • 3. Hypotheses
    • H0: ρ = 0 (No Correlation)
    • Ha: ρ ≠ 0 (Correlation)

59
1 Sample t-Test on Correlation Coefficient
  • Hypotheses
    • H0: ρ = 0 (No Correlation)
    • Ha: ρ ≠ 0 (Correlation)
  • Test statistic under H0
    • $t = r\sqrt{n-2} \,/\, \sqrt{1-r^2} \sim t(n-2)$
  • Reject H0 if |t| > t(α/2, n-2)
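As a quick check of the statistic above, the sketch below applies it to the Obstetrics example, taking r as the positive square root of the reported r² = 0.8167 (positive because the fitted slope is positive) with n = 5. The critical value 3.182 is an assumption from a standard t table (3 df), and the function name is illustrative.

```python
import math


def correlation_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


N = 5
R = math.sqrt(0.8167)   # r from the reported coefficient of determination
T_CRIT = 3.182          # assumption: t(.975, 3 df) from a t table

t_stat = correlation_t(R, N)
reject_h0 = abs(t_stat) > T_CRIT

print(f"r = {R:.4f}, t = {t_stat:.2f} on {N - 2} df")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The value should agree with the slope test (t Value 3.66 in the SAS output), which is why the two tests lead to the same conclusion.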

60
1 Sample Z-Test on Correlation Coefficient
  • Hypotheses (Fisher)
    • H0: ρ = ρ0
    • Ha: ρ ≠ ρ0
  • Test statistic under H0
  • Reject H0 if |z| > z(1-α/2)
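The test statistic itself is missing from the transcript. The sketch below uses the usual Fisher transformation, z = (atanh(r) - atanh(ρ0)) · sqrt(n - 3), as an assumption about what the slide showed. It is applied to the Obstetrics values (r from r² = 0.8167, n = 5) with ρ0 = 0 purely for illustration; the normal approximation is rough for a sample this small. The critical value 1.96 is the standard normal 0.975 quantile.

```python
import math


def fisher_z_statistic(r, rho0, n):
    """Approximate z statistic for H0: rho = rho0 using Fisher's transformation."""
    return (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)


N = 5
R = math.sqrt(0.8167)   # sample correlation from the reported r^2
RHO0 = 0.0              # hypothesized value, chosen for illustration
Z_CRIT = 1.96           # standard normal 0.975 quantile

z_stat = fisher_z_statistic(R, RHO0, N)
reject_h0 = abs(z_stat) > Z_CRIT

print(f"z = {z_stat:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```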

61
Conclusion
  • Describe the Linear Regression Model
  • State the Regression Modeling Steps
  • Explain Ordinary Least Squares
  • Compute Regression Coefficients
  • Understand and check model assumptions
  • Predict Response Variable
  • Comments of SAS Output

62
Conclusion
  • Correlation Models
  • Test of coefficient of Correlation