4.1: Cautions on Regression

Provided by: peterj72
1
4.1 Cautions on Regression
2
  • Extrapolation: use of a regression line to
    predict far outside the range of x-values used
    to fit the line
    - Often very inaccurate
    - Ex. Using a linear model of age vs. height for
      the first ten years to predict height at age 30
  • Lurking variable: a variable not among the
    explanatory or response variables that may
    influence the interpretation of the relationship
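
The age vs. height extrapolation example can be tried numerically. The sketch below is only an illustration: the ages and heights are made-up values (not from the slides), and `fit_line` is a hypothetical helper that computes an ordinary least-squares line.

```python
# Hypothetical data: ages (years) and heights (cm) for ages 1-10.
# These numbers are invented for illustration; they are not from the slides.
ages = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
heights = [76, 87, 95, 102, 109, 115, 121, 127, 133, 138]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(ages, heights)

# Prediction inside the observed age range (1-10 years).
predicted_at_5 = intercept + slope * 5

# Prediction far outside the observed range: this is extrapolation.
predicted_at_30 = intercept + slope * 30

print(f"slope = {slope:.2f} cm/year, intercept = {intercept:.2f} cm")
print(f"predicted height at age 5:  {predicted_at_5:.1f} cm")
print(f"predicted height at age 30: {predicted_at_30:.1f} cm")
```

With these made-up numbers the line predicts a height of roughly 2.7 m at age 30, because it assumes childhood growth continues at the same rate for another twenty years.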

3
  • a. A lurking variable can falsely suggest a
    strong relationship
  • Ex. Men complaining of chest pains are more
    likely to get detailed tests and treatment than
    women
  • Lurking variable: age of patient. Women tend
    to have heart problems later in life, when
    aggressive tests and treatments might not be an
    option

4
  • b. A lurking variable can hide a relationship
    that exists
  • Ex. A British study examined the relationship
    between the amount of overcrowding and the lack
    of indoor toilets and found no relationship.
  • Lurking variable: public housing. A high amount
    of public housing (where there are indoor
    toilets) would hide the relationship.
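
A small constructed data set shows how a lurking variable can hide a relationship. Everything in this sketch is an assumption made for illustration: the overcrowding scores, the percentages of homes lacking an indoor toilet, and the split into two groups of areas are invented, and `pearson_r` is a hypothetical helper.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented values. x = overcrowding score, y = % of homes lacking an indoor toilet.
# Areas with little public housing: less crowded, more homes lacking toilets.
low_public_x = [1, 2, 3, 4, 5]
low_public_y = [7, 8, 9, 10, 11]

# Areas with a lot of public housing: more crowded, fewer homes lacking toilets.
high_public_x = [5, 6, 7, 8, 9]
high_public_y = [5, 6, 7, 8, 9]

r_low = pearson_r(low_public_x, low_public_y)
r_high = pearson_r(high_public_x, high_public_y)

# Pool the two groups, ignoring the lurking variable (amount of public housing).
r_pooled = pearson_r(low_public_x + high_public_x, low_public_y + high_public_y)

print(f"r within low-public-housing areas:  {r_low:.2f}")
print(f"r within high-public-housing areas: {r_high:.2f}")
print(f"r with both groups pooled:          {r_pooled:.2f}")
```

In this constructed example the relationship is perfectly linear within each group, yet the pooled correlation is zero. Real data would be noisier, but the mechanism is the same.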

5
  • Finding lurking variables is often a matter of
    common sense and content knowledge.
  • Many lurking variables will reveal themselves if
    you order the data by time rather than by x.
  • If there is a possible lurking variable, it does
    not mean the data are bad but that you should
    reconsider how you report and interpret your
    results.

6
  • For each of the following, determine a possible
    lurking variable as well as the effect it is
    having on the relationship.
  • People who use artificial sweeteners in place of
    sugar tend to be heavier than people who use
    sugar. You conclude that increased use of
    sweeteners will increase your weight.
  • A study showed that women who worked in a
    manufacturing plant have an increased number of
    miscarriages. The union concludes that there are
    unsafe work conditions.
  • A study finds that students who take algebra and
    geometry in high school are more successful in
    college. The researcher concludes that taking
    more math classes will increase your college
    grades.

7
  • Using averaged data
  • If a study uses the average of many individuals
    as one data point, you must be cautious about
    making predictions for individuals.
  • Ex. If we plot age vs. the average height of
    children, then we would expect to see a very
    strong positive correlation.
  • It doesn't make sense to use this relationship to
    predict the height of one individual child.
  • If we looked at each individual data point, we
    would see much more scatter and variability.
  • What effect would this have on r?
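
One way to explore the question about r is to compare the correlation computed from individual children with the correlation computed from the per-age averages. The sketch below uses invented heights (five children at each of five ages, spread evenly around a straight line) and a hypothetical `pearson_r` helper. It is an illustration under those assumptions, not data from the slides.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


ages = [4, 6, 8, 10, 12]
# Invented spread of individual children around the typical height at each age (cm).
deviations = [-12, -6, 0, 6, 12]

# One row per child: five children at each age.
ind_ages, ind_heights = [], []
for age in ages:
    for dev in deviations:
        ind_ages.append(age)
        ind_heights.append(80 + 6 * age + dev)

# One row per age: the average height of the five children at that age.
avg_heights = [
    sum(h for a, h in zip(ind_ages, ind_heights) if a == age) / len(deviations)
    for age in ages
]

r_individuals = pearson_r(ind_ages, ind_heights)
r_averages = pearson_r(ages, avg_heights)

print(f"r using individual children: {r_individuals:.2f}")
print(f"r using per-age averages:    {r_averages:.2f}")
```

Here r is about 0.89 for the individual children and 1 for the averages. Averaging removes the child-to-child scatter, so the correlation based on averages overstates how well age predicts the height of any one child.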

8
Causation
  • Think about this: A researcher randomly selects
    20 children and records the heights of their
    parents. He performs a linear regression of
    mother's height vs. father's height and finds a
    strong positive correlation. Should he conclude
    that one variable caused or explains the other?
  • What are some possible factors that might help
    explain these results?

9
I. Causation
  • When two variables have a direct cause-and-effect
    link.
  • Ex. Mother's Body Mass Index vs. daughter's Body
    Mass Index.
  • Heredity is the causal link.
  • Note: Even when direct causation is present, it
    is rarely a complete explanation of the
    association between two variables.

10
  • Ex. An experiment on the amount of saccharin in a
    rat's diet vs. the number of tumors in the rat's
    bladder has shown a causal relationship.
  • Note: We cannot generalize a causal relationship
    to settings other than the one observed!
  • The best evidence of causation comes from
    experiments that actually change one variable
    while keeping all others fixed.

11
II. Common Response
  • The observed association between two variables x
    and y is actually explained by a lurking
    variable z.
  • There could be any number of lurking variables;
    common sense often suggests another
    explanation.
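
Common response can be illustrated with a short simulation in which x and y have no direct link but both depend on z. The model below (standard normal noise, equal weights, a fixed seed) is an assumption chosen for illustration, and `pearson_r` is a hypothetical helper.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000

# z is the lurking variable. x and y each respond to z plus independent noise;
# x never influences y and y never influences x.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson_r(x, y)

# Because the data are simulated, z's contribution is known and can be removed.
x_without_z = [xi - zi for xi, zi in zip(x, z)]
y_without_z = [yi - zi for yi, zi in zip(y, z)]
r_without_z = pearson_r(x_without_z, y_without_z)

print(f"r between x and y:                {r_xy:.2f}")
print(f"r after removing z's contribution: {r_without_z:.2f}")
```

Under this model x and y show a moderate positive correlation, which largely disappears once the shared dependence on z is taken out. With real data z is usually unmeasured, which is why common sense and subject knowledge are needed to spot it.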

12
  • Common Response can create an association where
    one may not exist.
  • Ex. x = SAT score
  • y = freshman-year GPA
  • There would appear to be a positive association,
    but what is a lurking variable that could be
    having an effect on this relationship?

13
  • Ex. x = number of children
  • y = income
  • Possible lurking variables?
  • Ex. x = number of years of education
  • y = number of books in a personal library
  • Possible lurking variables?

14
III. Confounding
  • Two variables are confounded when their effects
    on a response variable cannot be distinguished
    from each other. The confounded variables may be
    either explanatory or lurking variables.
  • What are some possible confounding variables in
    the mother vs. daughter BMI case?

15
  • Ex. What might a confounding variable be in each
    case?
  • 1. x = whether a person regularly attends
    religious services
  • y = lifespan
  • Confounding variable:
  • 2. x = number of pieces of fruit eaten regularly
  • y = number of cavities
  • Confounding variable:

16
IV. Establishing Causation
  • Conduct a carefully designed experiment in which
    the effects of possible lurking variables are
    controlled. (Chapter 5)
  • Social/political issues: it is often hard to
    establish a causal relationship, since an
    experiment is not possible from an ethical
    standpoint.
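
The value of a designed experiment can be sketched with a simulation. In the hypothetical setup below, a treatment has no effect at all on the outcome, but a lurking variable z influences both the outcome and who chooses the treatment. The model, the seed, and the helper `diff_in_means` are assumptions made for illustration only.

```python
import random


def diff_in_means(treated_flags, outcomes):
    """Mean outcome of treated subjects minus mean outcome of untreated subjects."""
    treated = [y for t, y in zip(treated_flags, outcomes) if t]
    control = [y for t, y in zip(treated_flags, outcomes) if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)


rng = random.Random(7)  # fixed seed so the run is repeatable
n = 10000

# Lurking variable z drives the outcome; the treatment itself does nothing.
z = [rng.gauss(0, 1) for _ in range(n)]
outcome = [2 * zi + rng.gauss(0, 1) for zi in z]

# Observational study: subjects with higher z are more likely to take the treatment.
chose_treatment = [zi + rng.gauss(0, 1) > 0 for zi in z]

# Designed experiment: a coin flip assigns the treatment, independent of z.
assigned_treatment = [rng.random() < 0.5 for _ in range(n)]

observational_diff = diff_in_means(chose_treatment, outcome)
experimental_diff = diff_in_means(assigned_treatment, outcome)

print(f"observational difference in means: {observational_diff:.2f}")
print(f"randomized difference in means:    {experimental_diff:.2f}")
```

In the observational comparison the treated group looks much better off even though the treatment does nothing, because treatment choice is tangled up with z. Random assignment breaks that link, so the experimental difference stays close to zero.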

17
  • Ex. Power lines and leukemia: there is no way
    to conduct an experiment. Through research and
    case studies, only a chance connection has been
    shown.
  • Ex. Smoking vs. lung cancer: certain benchmarks
    have been met to infer causation.
  • Association is strong
  • Association is consistent in multiple countries
    and at different times
  • Higher doses are associated with stronger
    responses
  • Cause precedes the effect in time
  • Cause is plausible