Statistics and Quantitative Analysis U4320 - PowerPoint PPT Presentation

Slides: 46
Provided by: CCN4
Learn more at: http://www.columbia.edu

1
Statistics and Quantitative Analysis U4320
  • Lecture 11: Path Diagrams
  • Prof. Sharyn O'Halloran

2
Key Points
  • Slope Coefficient as a Multiplication Factor
  • Path Diagram and Causal Models
  • Direct and Indirect Effects

3
Regression Coefficients as Multiplication Factors
  • I. Regression Coefficients as Multiplication
    Factors
  • A. Simple Regression
  • 1. Basic Equation
  • Remember our basic one-variable regression
    equation is Y = a + bX
  • b is the slope of the regression line. It
    represents the change in Y corresponding to a
    unit change in X.

4
Regression Coefficients as Multiplication Factors
(cont.)
  • 2. Multiplication Factor
  • We can also think of b as a multiplication
    factor.
  • 3. Example
  • Take the first fertilizer equation
  • Say we add 5 more pounds of fertilizer. Then the
    change in yield according to this equation will
    be

5
Regression Coefficients as Multiplication Factors
(cont.)
  • B. Multiple Regression: "Other Things Being
    Equal"
  • Now consider the multiple regression equation
  • We can still think of the slopes as
    multiplication factors.
  • But now they are multiplication factors if we
    change only one variable and keep all others
    constant.

6
Regression Coefficients as Multiplication Factors
(cont.)
  • Say we change X1 to (X1 + ΔX1)
  • Then we can write
  • If X1 changes while all others remain constant,
    then change in Y = b1 (change in X1)

7
Regression Coefficients as Multiplication Factors
(cont.)
  • C. Examples
  • Let's try an example.
  • Say we have the following single and multiple
    regression equations

8
Regression Coefficients as Multiplication Factors
(cont.)
  • 1. What will be the change in yield if a farmer
    adds another 100 pounds of fertilizer?
  • Answer: Only the fertilizer will change, not the
    rain. So use the multiple regression equation
  • ΔY = b1 ΔX1
  • ΔY = 100 (.038)
  • ΔY = 3.8 bushels

9
Regression Coefficients as Multiplication Factors
(cont.)
  • 2. What will be the change in yield if a farmer
    irrigates his fields with 3 inches of water?
  • Answer: Only the amount of water will change,
    not the fertilizer. So use the multiple
    regression equation
  • ΔY = b2 ΔX2
  • ΔY = 3 (.83)
  • ΔY ≈ 2.5 bushels

10
Regression Coefficients as Multiplication Factors
(cont.)
  • 3. Say the farmer adds both 100 pounds of
    fertilizer and 3 inches of irrigation. Now what
    will the difference in yield be?
  • Answer: The change in yield will reflect the
    changes in both independent variables
  • ΔY = b1 ΔX1 + b2 ΔX2
  • ΔY = 0.038 (100) + (0.83) (3)
  • ΔY = 3.8 + 2.5
  • ΔY = 6.3 bushels
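
The multiplication-factor arithmetic in these examples can be sketched in a few lines of Python. The slopes .038 (fertilizer) and .83 (rain) are the multiple-regression values used in the answers above; the function name `predicted_change` and the dictionary layout are made up for illustration.

```python
def predicted_change(slopes, changes):
    """Sum of slope * change over the variables that change.

    Variables not listed in `changes` are treated as held constant.
    """
    return sum(slopes[name] * delta for name, delta in changes.items())

# Multiple-regression slopes quoted in the examples above.
slopes = {"fertilizer": 0.038, "rain": 0.83}

fert_only = predicted_change(slopes, {"fertilizer": 100})
rain_only = predicted_change(slopes, {"rain": 3})
both = predicted_change(slopes, {"fertilizer": 100, "rain": 3})

print(round(fert_only, 2), round(rain_only, 2), round(both, 2))
```

The unrounded rain-only change is 3 × .83 = 2.49, which the answers round to 2.5 bushels.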

11
Regression Coefficients as Multiplication Factors
(cont.)
  • 4. Now say that we know the rainfall has
    increased 3 inches and we know that fertilizer is
    not necessarily held constant.
  • Now what would your best guess be as to the
    difference in yield?

12
Regression Coefficients as Multiplication Factors
(cont.)
  • Answer: Since fertilizer is not held constant,
    we should use the single regression equation
  • ΔY = b ΔX
  • ΔY = 3 (1.5)
  • ΔY = 4.5 bushels.
  • What we want to do is develop a technique that
    allows us to disaggregate the effects caused
    directly by the increase in rainfall and
    indirectly by other factors.

13
Path Analysis
  • II. Path Analysis
  • A. Fiji Women
  • Say we have data on 4700 women from Fiji.
  • 1. Basic Model
  • We know for each woman
  • Age
  • Years of education, and
  • Number of children

14
Path Analysis (cont.)
  • a. Path Diagram
  • We might think that a woman's age and education
    correlate with how many children she has.
  • We can write a causal model that looks like this

15
Path Analysis (cont.)
  • b. Estimates
  • When we estimate these relationships, we get the
    results
  • CHILDREN = 3.4 + .059 AGE - .16 EDUC
  • We can represent these results as follows

16
Path Analysis (cont.)
  • 2. Additional Effects within the Model: Direct
    and Indirect
  • Now, let's say we think there might also be a
    relationship between a woman's age and education.
  • a. Estimated Equation
  • If we estimate this regression, we get the
    result
  • EDUC = 7.6 - .032 AGE.
  • Older women have less education than younger
    women.

17
Path Analysis (cont.)
  • b. Path Diagram
  • We now add this new information into the causal
    model

18
Path Analysis (cont.)
  • Question
  • 1. What is the change in the expected number of
    children due to 1 extra year of age, holding
    education constant?
  • 2. What is the change in the years of education
    from this same 1 extra year of age?

19
Path Analysis (cont.)
  • 3. Direct and Indirect Effects
  • Question
  • What's the change in number of children from one
    extra year of age, letting education change too?
  • The change in age has two effects: a direct and
    an indirect effect.

20
Path Analysis (cont.)
  • a) Direct Effect (Multiple regression
    coefficient)
  • The direct effect is captured in the coefficient
    leading from AGE to CHILDREN.
  • This is the multiple regression coefficient, and
    it represents the expected extra number of
    children from one extra year, holding education
    constant

21
Path Analysis (cont.)
  • b) Indirect Effect
  • We know that an extra year corresponds with -.032
    years of school.
  • Each extra year of school corresponds with -.16
    extra children.
  • We get the indirect effect by multiplying along
    the arrows leading from AGE to CHILDREN through
    EDUC
  • (-.032) (-.16) = .005.

22
Path Analysis (cont.)
  • c) Total Effect
  • So the total effect of AGE on CHILDREN letting
    EDUC vary too is the sum of the direct and
    indirect effects.
  • That is,
  • .059 + .005 = .064.
  • Question
  • What do you think would have happened if we ran a
    simple regression of CHILDREN on AGE? What would
    the coefficient have been?
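
The direct, indirect and total effects of AGE on CHILDREN can be reproduced with a short calculation. This is only a sketch of the arithmetic; the variable names are chosen here for readability, and the coefficients are the ones reported in the estimated equations above.

```python
# Coefficients from the estimated equations above:
#   CHILDREN = 3.4 + .059 AGE - .16 EDUC
#   EDUC     = 7.6 - .032 AGE
age_to_children = 0.059   # direct path AGE -> CHILDREN
age_to_educ = -0.032      # path AGE -> EDUC
educ_to_children = -0.16  # path EDUC -> CHILDREN

direct = age_to_children
# Multiply along the arrows AGE -> EDUC -> CHILDREN.
indirect = age_to_educ * educ_to_children
total = direct + indirect

print(f"direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

Before rounding, the indirect effect is .00512 and the total is .06412.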

23
Path Analysis (cont.)
  • Summary
  • A path diagram gives us some insight as to the
    relationship between simple and multiple
    regression.
  • Multiple regression gives us the partial effects
    of the independent variables on the dependent
    variable holding all else constant.
  • Simple regression gives us the total effect,
    which is the sum of the direct and the indirect
    effects.
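
The link between simple and multiple regression can be checked numerically. The sketch below uses simulated data, not the Fiji survey: the age range, noise levels and sample size are arbitrary assumptions, and only the coefficients used to generate the data come from the lecture. It fits the regressions with closed-form least-squares formulas and compares the simple-regression slope of CHILDREN on AGE with direct + indirect.

```python
import random

def centered_cross(u, v):
    """Sum of (u - mean(u)) * (v - mean(v))."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def simple_slope(x, y):
    """Slope from regressing y on x alone."""
    return centered_cross(x, y) / centered_cross(x, x)

def two_variable_slopes(x1, x2, y):
    """Slopes (b1, b2) from regressing y on x1 and x2 together."""
    s11, s22 = centered_cross(x1, x1), centered_cross(x2, x2)
    s12 = centered_cross(x1, x2)
    s1y, s2y = centered_cross(x1, y), centered_cross(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Simulated data (NOT the Fiji survey); range and noise are assumptions.
rng = random.Random(0)
age = [rng.uniform(15, 50) for _ in range(2000)]
educ = [7.6 - 0.032 * a + rng.gauss(0, 2) for a in age]
children = [3.4 + 0.059 * a - 0.16 * e + rng.gauss(0, 1)
            for a, e in zip(age, educ)]

direct, educ_effect = two_variable_slopes(age, educ, children)
age_to_educ = simple_slope(age, educ)
total_from_paths = direct + age_to_educ * educ_effect
total_from_simple = simple_slope(age, children)

print(total_from_paths, total_from_simple)
```

The two printed numbers agree up to floating-point error. With least squares this holds in any sample, not just on average, which is why the simple regression coefficient can be read as the total effect.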

24
Path Analysis (cont.)
  • B. Brady, Cooper and Hurley
  • 1. Defining Unity
  • Party unity scores are calculated as
  • (% voting in the majority - % voting in the
    minority)
  • 2. Building the Causal Model
  • Two components to party unity: internal and
    external factors.
  • So we can write a causal model like this

25
Path Analysis (cont.)
  • External factors define how homogeneous the
    constituent base of the party is.
  • Internal factors have to do with the strength of
    party leadership.

26
Path Analysis (cont.)
  • However, it is also thought that external factors
    influence internal factors.
  • That is, when legislators from a party are united
    on the issues, they are more likely to give their
    leaders power to get things done.
  • Thus we add another line to our model

27
Path Analysis (cont.)
  • 3. Results
  • When this model was estimated, the results were
  • PARTY STRENGTH = .61 INTERNAL + .58 EXTERNAL
  • INTERNAL = .66 EXTERNAL.
  • 4. Question
  • What is the effect of External factors on Party
    Unity?
  • Direct Effect = 0.58
  • Indirect Effect = (.66)(.61) = 0.40
  • Total Effect = .58 + .40 = .98

28
Path Analysis (cont.)
  • C. Commie Model from Shapiro
  • What determines people's attitudes towards
    whether communists should be allowed to teach
    college?
  • 1. How to Build a Causal Model
  • First of all, what constitutes a valid causal
    model?
  • For now, the answer is no cycles.
  • That is, you shouldn't be able to start at a
    point and follow arrows and end up back at the
    same point.
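
The "no cycles" rule can be checked mechanically. The sketch below is one possible way to do it, using a depth-first search over the arrows; the function name `has_cycle` and the dictionary representation of a path diagram are assumptions made for illustration. The first diagram is the Fiji model from earlier in the lecture; the second adds a made-up feedback arrow.

```python
def has_cycle(arrows):
    """Return True if following arrows can lead back to the starting variable.

    `arrows` maps each variable to the list of variables it points to.
    """
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    nodes = set(arrows) | {t for targets in arrows.values() for t in targets}
    color = {node: WHITE for node in nodes}

    def visit(node):
        color[node] = GRAY
        for nxt in arrows.get(node, []):
            if color[nxt] == GRAY:        # arrow back into the current path
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)

# The Fiji model: AGE -> EDUC, AGE -> CHILDREN, EDUC -> CHILDREN.
fiji = {"AGE": ["EDUC", "CHILDREN"], "EDUC": ["CHILDREN"]}
# A made-up model with a feedback arrow CHILDREN -> EDUC: not valid.
feedback = {"AGE": ["EDUC"], "EDUC": ["CHILDREN"], "CHILDREN": ["EDUC"]}

print(has_cycle(fiji), has_cycle(feedback))
```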

29
Path Analysis (cont.)
  • How to build a causal model? (cont.)
  • Second, once you have a causal model, how do you
    know which regressions to run?
  • For each variable, see what arrows are going into
    it. Then run a regression with those variables
    as the independent variables.
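
The rule for choosing regressions can be written as a small helper. This is a sketch only: `regressions_to_run` is a hypothetical name, and the arrows are those of the Fiji model from earlier in the lecture. For each variable it collects the variables with arrows going into it.

```python
def regressions_to_run(arrows):
    """Map each variable with incoming arrows to its independent variables.

    `arrows` is a list of (cause, effect) pairs read off the path diagram.
    """
    parents = {}
    for cause, effect in arrows:
        parents.setdefault(effect, []).append(cause)
    return parents

fiji_arrows = [("AGE", "EDUC"), ("AGE", "CHILDREN"), ("EDUC", "CHILDREN")]
plan = regressions_to_run(fiji_arrows)
for dependent, independents in plan.items():
    print(f"regress {dependent} on {', '.join(independents)}")
```

AGE has no arrows going into it, so no regression is run with AGE as the dependent variable.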

30
Path Analysis (cont.)
  • 2. Variables and the Causal Model
  • Our hypothesis is that attitudes towards teaching
    depend on attitudes towards communism in general,
    party ID, education, and age.
  • The full causal model can be written like this

31
Path Analysis (cont.)
  • Variables
  • Attitudes towards teaching are determined by all
    the other variables.
  • Attitude towards the communist system depends on
    party ID, education, and age.
  • Party ID depends on education and age.
  • Finally, education depends on age.

32
Path Analysis (cont.)
  • 3. Defining the Variables
  • First we make our own copies of all the
    variables.
  • 1. TeachCom is a dichotomous variable, coded 1 if
    the respondent thought it was OK for communists
    to teach college.
  • 2. Smarts is years of education.
  • 3. PartyOn is the respondent's party ID. 0
    stands for strongly Democrat, up to 6 for
    strongly Republican.

33
Path Analysis (cont.)
  • Variables (cont.)
  • 4. ComPhile is how you think about communism as a
    system of government. Higher values mean that
    it's a good system.
  • 5. Finally, Years is your age.

34
Path Analysis (cont.)
  • 4. Estimating the Model
  • a) Regression commands
  • How we specify our causal model determines what
    regression we run.
  • For instance, TeachCom has arrows going into it
    from all other variables, so we run the
    regression with all the variables.
  • Then we take ComPhile, and regress it on Years,
    Smarts and PartyOn.
  • And so on down the line.

35
Path Analysis (cont.)
  • b) Descriptive Statistics
  • We then report our descriptive statistics that
    we'll use
  • "Means" gives the mean of each variable.
  • "Stddev" gives their standard deviation.
  • N gives the number of valid observations.
  • "Corr" gives the correlations between variables.
  • "Sig" tells us the significance of each
    correlation.
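
For readers without the statistical package used in class, the same descriptive statistics can be approximated in plain Python. The numbers below are made-up toy values, not the survey data, and the `correlation` helper is written here for illustration. Significance levels ("Sig") are not computed in this sketch.

```python
import math
import statistics

def correlation(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up toy values, NOT the survey data used in the lecture.
data = {
    "Years":  [25, 40, 33, 61, 52, 47],
    "Smarts": [16, 12, 14, 8, 12, 10],
}

for name, values in data.items():
    print(name, "mean", statistics.mean(values),
          "stddev", round(statistics.stdev(values), 2), "N", len(values))

r = correlation(data["Years"], data["Smarts"])
print("corr(Years, Smarts)", round(r, 2))
```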

36
Path Analysis (cont.)
  • 5. Results
  • a) Means Table
  • Look at the means table.

37
Path Analysis (cont.)
  • b) Correlation Matrix
  • Next is the correlation matrix.
  • Smarts is negatively correlated with Years. That
    means that older people tend to have had fewer
    years of schooling.
  • PartyOn is negatively correlated with Years, so
    older people tend to be more Democratic (lower
    PartyOn values).

38
Path Analysis (cont.)
  • b) Correlation Matrix (cont.)
  • Comphile and Teachcom are also negatively related
    to Years.
  • Older people tend to have more negative attitudes
    toward the communist system and be against
    communists teaching college.
  • One-tailed p-values are reported beneath the
    correlation coefficient.

39
Path Analysis (cont.)
  • c) Regression Results

40
Path Analysis (cont.)
  • d) Question
  • What is the effect of Years on Teachcom?
  • 1. Direct Effect = -.003
  • 2. Indirect Effect via Comphile
    (-.006)(.16) = -.00096
  • 3. Indirect Effect via Partyon
  • Partyon alone (-.0047)(-.015) = .0000705
  • Partyon and Comphile
  • (-.0047)(-.029)(.16) = .0000218

41
Path Analysis (cont.)
  • 4. Indirect via Smarts
  • Smarts alone (-.044)(.035) = -.00154
  • Smarts and Partyon (-.044)(.053)(-.015)
    = .000035
  • Smarts and Comphile (-.044)(.032)(.16)
    = -.000225
  • Smarts, Partyon, and Comphile
  • (-.044)(.053)(-.029)(.16) = .000011
  • Total Indirect Effects = -.00259
  • Total Effects = Direct + Indirect
  • Total Effects = -.003 - .0026 = -.0056
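
With this many paths, it is easy to miss one when working by hand. The sketch below enumerates every directed path from Years to TeachCom and multiplies the coefficients along each one. The function `path_effects` is a hypothetical helper written for this illustration, and the coefficients are those quoted in the effect calculations above (with .16 for the arrow from ComPhile to TeachCom).

```python
def path_effects(coef, source, target):
    """Return {path: product of coefficients} for every directed path.

    `coef` maps (cause, effect) arrows to their regression coefficients.
    Assumes the diagram has no cycles.
    """
    effects = {}

    def walk(node, path, product):
        if node == target:
            effects[" -> ".join(path)] = product
            return
        for (cause, effect), b in coef.items():
            if cause == node:
                walk(effect, path + [effect], product * b)

    walk(source, [source], 1.0)
    return effects

# Coefficients as quoted in the effect calculations above.
coef = {
    ("Years", "TeachCom"): -0.003,
    ("Years", "ComPhile"): -0.006,
    ("Years", "PartyOn"): -0.0047,
    ("Years", "Smarts"): -0.044,
    ("Smarts", "TeachCom"): 0.035,
    ("Smarts", "PartyOn"): 0.053,
    ("Smarts", "ComPhile"): 0.032,
    ("PartyOn", "TeachCom"): -0.015,
    ("PartyOn", "ComPhile"): -0.029,
    ("ComPhile", "TeachCom"): 0.16,
}

effects = path_effects(coef, "Years", "TeachCom")
direct = effects.pop("Years -> TeachCom")   # the one-arrow path
indirect = sum(effects.values())            # the seven remaining paths
print(f"direct={direct:.4f} indirect={indirect:.5f} total={direct + indirect:.4f}")
```

The enumeration finds seven indirect paths, matching the seven products listed by hand.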

42
Homework
  • III. Homework
  • A. Recap
  • Your homework assignment is to write a Path
    Diagram.
  • B. Issues in the Article
  • There is a dispute between American and European
    researchers on the effectiveness of AZT.
    Americans say that it works, and Europeans say
    that there's not enough evidence.

43
Homework
  • 1. The U.S. View
  • The U.S. allowed AZT to be distributed to
    HIV-positive individuals on the basis of a study
    completed in 1989.
  • Usually the FDA requires that to release a drug
    the experimenters show
  • DRUG --------------> HEALTH
  • Instead of a direct link, researchers showed an
    indirect link.
  • AZT --------------> MARKERS ---------------> HEALTH
  • If both of these correlations are positive, then
    so should be the total effect from AZT to health.

44
Homework
  • 2. The European View
  • The European researchers said that although it's
    true that AZT raised the level of CD-4 markers,
    these markers didn't indicate any long-term
    improvement in health.
  • So they say that the model looks like this
  • AZT --------------> MARKERS ---------------> HEALTH
  • If there's no link between CD-4 and health, then
    the overall link between AZT and health is also 0
    on the basis of the information presented so far.

45
Homework
  • 3. How To Resolve the Dispute
  • What kind of evidence would they need to resolve
    this dispute?
  • First, they could do studies to show that AZT has
    a direct effect on health. These studies take
    longer, but their conclusions are more reliable
    since they show a direct link.
  • Or, they could find another marker. That is,
    another intermediate substance that AZT affects
    and that affects health.