Regression Analysis - PowerPoint PPT Presentation

Slides: 15
Provided by: basi51
Transcript and Presenter's Notes

Title: Regression Analysis


1
Regression Analysis
Modeling Relationships
2
Regression Analysis
  • Regression Analysis is a study of the
    relationship between a set of independent
    variables and the dependent variable.

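The linear equation representing the true, or population, relationship has the standard form
$y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$
where y is the dependent variable, the Xs are the independent variables, and $\varepsilon$ is the random error term.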
3
Variables
  • Dependent Variable: Also called the predicted
    variable. Its value depends on, or can be
    predicted by, the independent variables.
  • Independent Variables: Also called the predictor
    variables. These can be measured directly and
    are used to predict the dependent variable (or
    simply to understand it better).

4
Modeling Process
Define Goal: To study the impact of various factors on individual health.
Choose y: Lung capacity, measured in cc.
List possible Xs: Minutes of exercise per day, number of days/week of exercise, ethnicity, gender, age, height, altitude of residence.
Collect Data: Primary and secondary sources.
Preliminary Analyses: Univariate and bivariate.
Build Regression Model: How is y related to all the Xs?
Evaluate Model: How good is the model at predicting y?
Implement/Monitor: Create DSS, monitor, update.
5
The Data
A portion of the data is shown below. See
Spreadsheet for all data.
Y                   X1      X2      X3      X4        X5
Lung Capacity (cc)  Gender  Height  Smoker  Exercise  Age
5673                1       69.5    0       25        47
5632                1       70.1    0       24        67
5712                1       68.2    0       26        36
5723                1       70.9    0       26        68
5484                1       71.9    1       20        58
5308                1       69.2    1       15        19
5133                1       71.9    1       0         40
6
Preliminary Analyses
The table below shows some descriptive statistics
for each variable. What basic statements about
our data can we make from this?
Statistic  Lung Capacity (cc)  Gender  Height  Smoker  Exercise  Age
Mean       5325.60             0.50    68.23   0.39    21.35     46.42
Stdev      410.48              0.50    3.45    0.49    8.91      13.98
Min        4233.71             0.00    58.93   0.00    0.00      19.00
Max        6261.00             1.00    76.61   1.00    40.29     82.14
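As a minimal sketch of how such a summary table can be produced, the snippet below computes the same four statistics for the seven sample rows shown on the data slide. It assumes the sample (n - 1) standard deviation, which the slide does not specify, and the helper name `describe` is hypothetical. Because only 7 of the 100 observations are available here, the numbers will not match the table above.

```python
import statistics

# The seven sample rows from the data slide (not the full 100-row spreadsheet).
COLUMNS = ["Lung Capacity (cc)", "Gender", "Height", "Smoker", "Exercise", "Age"]
ROWS = [
    (5673, 1, 69.5, 0, 25, 47),
    (5632, 1, 70.1, 0, 24, 67),
    (5712, 1, 68.2, 0, 26, 36),
    (5723, 1, 70.9, 0, 26, 68),
    (5484, 1, 71.9, 1, 20, 58),
    (5308, 1, 69.2, 1, 15, 19),
    (5133, 1, 71.9, 1, 0, 40),
]


def describe(rows, columns):
    """Return {column: {"mean", "stdev", "min", "max"}} for every column."""
    summary = {}
    for index, name in enumerate(columns):
        values = [row[index] for row in rows]
        summary[name] = {
            "mean": statistics.mean(values),
            # Assumption: sample standard deviation (divides by n - 1).
            "stdev": statistics.stdev(values),
            "min": min(values),
            "max": max(values),
        }
    return summary


SUMMARY = describe(ROWS, COLUMNS)

if __name__ == "__main__":
    for name, stats in SUMMARY.items():
        print(f"{name:20s} mean={stats['mean']:9.2f} stdev={stats['stdev']:8.2f} "
              f"min={stats['min']} max={stats['max']}")
```

Reading the output column by column is the univariate step of the preliminary analysis: for a 0/1 variable such as Smoker, the mean is simply the share of observations coded 1.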
7
Capacity by Gender, Smoking
Columns Female and Male are the two levels of Gender.
Smoker      Statistic                      Female   Male     Grand Total
Non-Smoker  Average of Lung Capacity (cc)  5427.67  5662.22  5546.87
Non-Smoker  StdDev of Lung Capacity (cc)   256.41   284.71   293.75
Non-Smoker  Count                          30       31       61
Smoker      Average of Lung Capacity (cc)  4837.45  5129.05  4979.51
Smoker      StdDev of Lung Capacity (cc)   273.74   297.51   318.12
Smoker      Count                          20       19       39
Total       Average of Lung Capacity (cc)  5191.58  5459.61  5325.60
Total       StdDev of Lung Capacity (cc)   391.51   387.93   410.48
Total       Count                          50       50       100
Does there appear to be a relationship between
smoking, gender, and lung capacity?
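One way to read the table is to check that each total is the count-weighted average of its cells. The sketch below does that arithmetic with the four cell averages and counts copied from the table; the function name `pooled_mean` is hypothetical. Small differences in the second decimal place come from the cell averages already being rounded.

```python
# (average lung capacity in cc, count) for each Smoker x Gender cell of the table.
CELLS = {
    ("Non-Smoker", "Female"): (5427.67, 30),
    ("Non-Smoker", "Male"): (5662.22, 31),
    ("Smoker", "Female"): (4837.45, 20),
    ("Smoker", "Male"): (5129.05, 19),
}


def pooled_mean(cells, smoker=None, gender=None):
    """Count-weighted mean over the cells that match the given filters."""
    total = 0.0
    count = 0
    for (cell_smoker, cell_gender), (mean, n) in cells.items():
        if smoker is not None and cell_smoker != smoker:
            continue
        if gender is not None and cell_gender != gender:
            continue
        total += mean * n
        count += n
    return total / count


if __name__ == "__main__":
    non_smokers = pooled_mean(CELLS, smoker="Non-Smoker")
    smokers = pooled_mean(CELLS, smoker="Smoker")
    females = pooled_mean(CELLS, gender="Female")
    males = pooled_mean(CELLS, gender="Male")
    print(f"Non-smokers: {non_smokers:.2f}  Smokers: {smokers:.2f}")
    print(f"Females:     {females:.2f}  Males:   {males:.2f}")
    print(f"Overall:     {pooled_mean(CELLS):.2f}")
    # Unadjusted gaps between group averages, before any regression.
    print(f"Smoking gap: {non_smokers - smokers:.2f} cc")
    print(f"Gender gap:  {males - females:.2f} cc")
```

These gaps are raw differences between group averages; they do not hold height or exercise constant, which is what the regression model later does.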
8
Distributions
9
Bivariate Analysis: Matrix Plot
10
Capacity distribution by Gender, Smoking
Men have a larger lung capacity than women on
average (5459.61 cc vs. 5191.58 cc).
Non-smokers have a larger lung capacity than
smokers on average (5546.87 cc vs. 4979.51 cc). What about the variance?
11
Simple Regression
  • How well can exercise time alone predict
    lung capacity?
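To make the question concrete, here is a minimal sketch of a one-variable least-squares fit of lung capacity on exercise. It uses only the seven sample rows from the data slide, so the slope, intercept, and R Square it produces are illustrative and are not the results for the full 100-observation data set. The function name `simple_regression` is hypothetical.

```python
# Exercise and lung capacity (cc) from the seven sample rows on the data slide.
EXERCISE = [25, 24, 26, 26, 20, 15, 0]
CAPACITY = [5673, 5632, 5712, 5723, 5484, 5308, 5133]


def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # In simple regression, R Square is the squared correlation of x and y.
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


INTERCEPT, SLOPE, R_SQUARED = simple_regression(EXERCISE, CAPACITY)

if __name__ == "__main__":
    print(f"capacity = {INTERCEPT:.2f} + {SLOPE:.4f} * exercise")
    print(f"R Square = {R_SQUARED:.4f}")
```

R Square answers the "how well" part of the question: it is the share of the variation in lung capacity that exercise alone accounts for in the rows used for the fit.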

12
Multiple Regression
  • How do all the Xs together help predict y?

SUMMARY OUTPUT
Regression Statistics
Multiple R         0.8798341
R Square           0.7741081
Adjusted R Square  0.7620926
Standard Error     200.21
Observations       100

Variable   Coefficients  Standard Error  t Stat        P-value
Intercept  1662.3965     475.1456634     3.498709192   0.000716253
Gender     202.3282      41.86861042     4.832456809   5.23607E-06
Height     50.3468       7.08207335      7.109058989   2.24959E-10
Smoker     -278.9711     52.71395448     -5.292169492  7.88193E-07
Exercise   11.2949       2.991170972     3.776112614   0.000279023
Age        -0.1174       1.462303258     -0.080303367  0.936166702
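The t Stat column is each coefficient divided by its standard error, and the P-value column shows which predictors are distinguishable from zero. The sketch below recomputes the t statistics from the output and lists the predictors that fail a significance cutoff. The 0.05 cutoff is an assumption (a common convention, not stated on the slide), and the function names are hypothetical.

```python
# (coefficient, standard error) pairs copied from the regression output.
FULL_MODEL = {
    "Intercept": (1662.3965, 475.1456634),
    "Gender": (202.3282, 41.86861042),
    "Height": (50.3468, 7.08207335),
    "Smoker": (-278.9711, 52.71395448),
    "Exercise": (11.2949, 2.991170972),
    "Age": (-0.1174, 1.462303258),
}

# P-values copied from the same output.
P_VALUES = {
    "Intercept": 0.000716253,
    "Gender": 5.23607e-06,
    "Height": 2.24959e-10,
    "Smoker": 7.88193e-07,
    "Exercise": 0.000279023,
    "Age": 0.936166702,
}


def t_statistics(model):
    """t Stat = coefficient / standard error for every term."""
    return {name: coef / se for name, (coef, se) in model.items()}


def not_significant(p_values, alpha=0.05):
    """Terms whose P-value is at or above alpha (assumed cutoff: 0.05)."""
    return [name for name, p in p_values.items() if p >= alpha]


T_STATS = t_statistics(FULL_MODEL)

if __name__ == "__main__":
    for name, t in T_STATS.items():
        print(f"{name:10s} t = {t:8.4f}  p = {P_VALUES[name]:.6g}")
    print("Not significant at 0.05:", not_significant(P_VALUES))
```

The recomputed values agree with the t Stat column up to the rounding of the printed coefficients.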
13
Final Model
The final model omits Age, whose coefficient in the full model is not statistically significant (P-value 0.936).

SUMMARY OUTPUT
Regression Statistics
Multiple R         0.879825
R Square           0.774093
Adjusted R Square  0.764581
Standard Error     199.164
Observations       100

Estimated equation:
$\hat{y} = 1656.937 + 202.104\,\text{Gender} + 50.359\,\text{Height} - 279.025\,\text{Smoker} + 11.259\,\text{Exercise}$

Variable   Coefficients  Standard Error  t Stat    P-value
Intercept  1656.937      467.7903        3.54205   0.000617
Gender     202.104       41.55695        4.86332   4.57E-06
Height     50.359        7.043082        7.150271  1.78E-10
Smoker     -279.025      52.43341        -5.3215   6.85E-07
Exercise   11.259        2.943494        3.825342  0.000234
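Adjusted R Square is the usual yardstick for comparing models with different numbers of predictors, because it penalizes each added X. As a sketch, the standard formula below reproduces the Adjusted R Square reported for both the five-predictor model and this four-predictor model from their R Square values and the 100 observations. The function name is hypothetical.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# R Square values copied from the two SUMMARY OUTPUT blocks.
FULL = adjusted_r_squared(0.7741081, n_obs=100, n_predictors=5)   # with Age
FINAL = adjusted_r_squared(0.774093, n_obs=100, n_predictors=4)   # without Age

if __name__ == "__main__":
    print(f"Full model  (5 Xs): {FULL:.6f}")
    print(f"Final model (4 Xs): {FINAL:.6f}")
    print("Dropping Age raises Adjusted R Square:", FINAL > FULL)
```

R Square itself barely changes between the two models, while Adjusted R Square rises slightly and the Standard Error falls from 200.21 to 199.164.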
14
Prediction Exercise
  1. Predict the lung capacity for a non-smoking
    female who does not exercise and is 66 inches
    tall, based on the model above.
  2. What would be the predicted value if she smoked?
  3. What would the predictions be for a male in both
    of the above cases?
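A minimal sketch of how these predictions can be computed from the final model's coefficients follows. It assumes Gender is coded 1 for male and 0 for female and Smoker is coded 1 for smokers; the slides do not state the coding, but it is consistent with the positive Gender coefficient and the higher average capacity for men. Height is in inches, as in question 1. The function name `predict_capacity` is hypothetical.

```python
# Coefficients copied from the final model output.
FINAL_MODEL = {
    "Intercept": 1656.937,
    "Gender": 202.104,
    "Height": 50.359,
    "Smoker": -279.025,
    "Exercise": 11.259,
}


def predict_capacity(gender, height, smoker, exercise, model=FINAL_MODEL):
    """Predicted lung capacity in cc.

    Assumed coding: gender 1 = male, 0 = female; smoker 1 = yes, 0 = no.
    """
    return (model["Intercept"]
            + model["Gender"] * gender
            + model["Height"] * height
            + model["Smoker"] * smoker
            + model["Exercise"] * exercise)


if __name__ == "__main__":
    cases = [
        ("Female, non-smoker", 0, 0),
        ("Female, smoker", 0, 1),
        ("Male, non-smoker", 1, 0),
        ("Male, smoker", 1, 1),
    ]
    for label, gender, smoker in cases:
        value = predict_capacity(gender, height=66, smoker=smoker, exercise=0)
        print(f"{label:20s} {value:.1f} cc")
```

Under that coding, the four predictions are about 4980.6, 4701.6, 5182.7, and 4903.7 cc. Switching Smoker from 0 to 1 always lowers the prediction by 279.025 cc, and switching Gender from 0 to 1 always raises it by 202.104 cc, because the model has no interaction terms.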