Communicating Quantitative Information - PowerPoint PPT Presentation

Slides: 48
Provided by: Jeanin

Transcript and Presenter's Notes


1
Communicating Quantitative Information
  • Diagrams
  • Sampling issue
  • Risk: Communicating Risk
  • Smoking. Hormone Replacement Therapy
  • Dimension
  • Homework: Prepare/design diagram/chart. Postings

2
General comment
  • ASK QUESTIONS
  • There is often more than one way to
  • explain a problem
  • do a problem

3
Fitting together puzzle
  • 400 freshmen, 250 taking math

4
More information
  • 400 freshmen, 250 taking math
  • 100 taking neither math nor science, so they come out of
    the not-taking-math part
  • Note: this part is 400 - 250 = 150

5
Out of the not-math part
  • Dark blue represents the 100 taking neither

[Diagram labels: 50 taking science!; 100; 250]

6
next information
  • Of the ones taking math, 40% are taking science
  • 0.40 times 250 is 100
  • Divide the freshmen into those taking math and
    those not
  • Final answer: 100 + 50 = 150 taking science
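The puzzle arithmetic can be checked with a short Python sketch. The variable names are illustrative only, and the sketch assumes that the 40 on the slide means 40 percent of the students taking math.

```python
# Freshmen puzzle: how many are taking science?
total_freshmen = 400
taking_math = 250
taking_neither = 100            # neither math nor science
share_math_also_science = 0.40  # assumption: 40 percent of math students

# Everyone not taking math is either taking science only or taking neither.
not_taking_math = total_freshmen - taking_math
science_only = not_taking_math - taking_neither

# Students taking both math and science.
math_and_science = round(share_math_also_science * taking_math)

taking_science = math_and_science + science_only
print(not_taking_math, science_only, math_and_science, taking_science)
```

Splitting the freshmen into "math" and "not math" first is what makes the two pieces of information fit together.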

7
Special Survey
  • Will come back to topic of Sampling
  • Accuracy (confidence, error) of tests, surveys
    depends on
  • quality of sample
  • size of sample
  • My sister asked me: can young people identify
    Einstein?
  • Use my students / sample of my students
  • Last Spring
  • students responding to request to take survey in
    both my classes
  • Last Spring the quantity was pretty small (14 + 11)

8
Preview
  • Margin of error
  • Claim: the actual result (for the whole population) is
    within certain limits (answer plus/minus MoE)
  • Confidence
  • confidence that this particular sample is not so
    unusual as to make the results wrong, where wrong
    means the actual result is outside the margin of
    error.
  • Generally, the means (averages) of samples are
    distributed normally around the mean of the whole
    population and the SD of this distribution is
    smaller (tighter) than the SD of the distribution
    for this quantity for the whole population.

9
Thought experiment.
  • Want to get average height of people in the
    class.
  • Claim impossible to measure everyone, so use a
    sample.
  • Only time to measure 6 people
  • 62 72 68 66 69 71 for a sample mean of 68
  • Statistics say that (IF the sample is random) we
    can be 95% confident that the mean of the whole
    class is between 61.4 and 74.6
  • will spend some more time on this
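As a hedged illustration of how such an interval can be computed, here is a minimal Python sketch that applies the textbook Student's t interval to the six measurements. The slides do not say which method produced 61.4 to 74.6, so the method here is an assumption; the critical value 2.571 (two-sided 95%, 5 degrees of freedom) is hard-coded from a standard t table.

```python
import math
import statistics

heights = [62, 72, 68, 66, 69, 71]   # the six measurements from the slide
n = len(heights)

sample_mean = statistics.mean(heights)
sample_sd = statistics.stdev(heights)      # sample SD (divides by n - 1)
standard_error = sample_sd / math.sqrt(n)  # SD of the sample mean

# Assumption: two-sided 95% critical value of Student's t with
# n - 1 = 5 degrees of freedom, taken from a standard t table.
T_CRIT_95_DF5 = 2.571

margin = T_CRIT_95_DF5 * standard_error
low, high = sample_mean - margin, sample_mean + margin
print(f"mean = {sample_mean}, 95% interval = ({low:.1f}, {high:.1f})")
```

This t interval comes out at roughly 64.2 to 71.8, which is narrower than the range on the slide. The slide's range appears to match the sample mean plus or minus about two standard deviations of the six measurements.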

10
Thought experiment
  • Want to know favorable rating of the President
    from whole USA population.
  • Ask a sample
  • p (proportion) of sample view President
    favorably.
  • Want to make a statement with 99% confidence
  • Formula for the margin of error, call it E: we
    can be 95% sure that the proportion of the whole
    population favorable is within p - E and p + E
  • If we want to be 99% sure, then the formula will give
    a bigger margin of error, call this F: we can be
    99% sure that the proportion of the whole
    population favorable is within p - F and p + F.
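The slide does not spell out the formula, so the following is a sketch under an assumption: it uses the common normal-approximation margin of error for a proportion, z * sqrt(p(1 - p)/n). The poll numbers (45% favorable, sample of 1,000) are hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(p, n, confidence):
    """Normal-approximation margin of error for a sample proportion p."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical poll: 45% favorable in a random sample of 1,000 people.
p, n = 0.45, 1000
E = margin_of_error(p, n, 0.95)   # margin at 95% confidence
F = margin_of_error(p, n, 0.99)   # margin at 99% confidence

print(f"95%: {p - E:.3f} to {p + E:.3f}")
print(f"99%: {p - F:.3f} to {p + F:.3f}")
```

For the same sample, asking for more confidence always widens the interval, so F is larger than E.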

11
Another way
  • Random sample of size N means that each person in
    the whole population equally likely to be in the
    sample
  • There are many samples of size N
  • The results of a sample of size N vary, but
  • Some samples of size N are very different from
    the whole population, but most aren't
  • In most cases, the sample result will be close to
    the result for the whole population
  • What do I mean by close? Within the margin of
    error
  • What do I mean by most? This refers to the
    confidence level (19/20, 99/100)

12
Formally
  • The averages of samples of size N are normally
    distributed
  • The average (mean) is the mean of the whole
    population
  • The standard deviation is smaller by a factor of
    square root of N
  • Think of narrow mountain
  • To halve (reduce to ½ what it was) the required
    margin of error, you need to quadruple (×4) the
    sample size
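A small simulation can make the square-root rule concrete. This is a sketch with made-up settings: a normally distributed population with SD 10, 2,000 simulated samples per sample size, and a fixed random seed so the run is repeatable.

```python
import random
import statistics

def sd_of_sample_means(sample_size, population_sd=10.0, trials=2000, seed=1):
    """Simulate many random samples and return the SD of their means."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.gauss(0.0, population_sd) for _ in range(sample_size)])
        for _ in range(trials)
    ]
    return statistics.stdev(means)

sd_25 = sd_of_sample_means(25)     # theory: 10 / sqrt(25) = 2
sd_100 = sd_of_sample_means(100)   # theory: 10 / sqrt(100) = 1
print(f"N = 25: {sd_25:.2f}   N = 100: {sd_100:.2f}")
```

Going from N = 25 to N = 100 quadruples the sample size and cuts the spread of the sample means roughly in half, which is why halving the margin of error costs four times the sample.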

13
Note
  • The size of the whole population does not enter
    into these calculations!

14
Back to special survey
  • Objective: to find out if college students know
    about Einstein
  • Extra credit used as motivation
  • (didn't work that well for you?)
  • Is this factor independent of what is being
    tested?

15
Quality of sample
  • Does not mean how good you are in any way.
  • Does mean how representative of the general college
    population.
  • The former class was practically all seniors.
    That class and all since tend to be journalism,
    history, political science majors
  • Students who don't have specific required
    math/science courses
  • These factors mean sample is not representative
    of college population!

16
Quality of sample
  • Opportunity
  • subjects available to me in my classes. Are
    they/you typical of 'young people'? (My sister
    thought yes.)
  • Response bias
  • students who took up the offer. Are they/you more
    likely to 'know Einstein' than those who didn't?
  • higher level of general curiosity
  • diligence at obtaining extra credit

17
Tester reliability
  • I was generous in categorizing answers as
    correct.
  • Two questions considered separately
  • 23 out of 26
  • 24 out of 26

18
Reporting
  • Confidence at level alpha that the actual proportion
    is within the margin of error of the tested proportion
  • More confident with a larger interval
  • I am 95% confident (chances are only 1/20 that
    this is wrong) that the actual proportion that
    knows Einstein is at least 83%.
  • I am 99.5% confident (chances are 1/200 that this
    is wrong) that the actual proportion is at least
    78%

19
Formulas
  • Based on the finding that means of samples are
    close to normally distributed, with a standard
    deviation that is a function of the tested
    proportion and the size of the sample.
  • One-tailed test (just checking one side because
    the tested proportion is close to 1)
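The slides do not give the formula itself. As a sketch under that caveat, the one-sided lower bound below assumes the usual normal approximation, p - z * sqrt(p(1 - p)/n), applied to the 24-out-of-26 result from the survey. With a sample this small and a proportion this close to 1, the approximation is rough.

```python
import math
from statistics import NormalDist

def lower_bound(successes, n, confidence):
    """One-sided lower confidence bound for a proportion (normal approximation)."""
    p = successes / n
    z = NormalDist().inv_cdf(confidence)   # one-tailed critical value
    return p - z * math.sqrt(p * (1 - p) / n)

lb_95 = lower_bound(24, 26, 0.95)     # 1/20 chance of being wrong
lb_995 = lower_bound(24, 26, 0.995)   # 1/200 chance of being wrong
print(f"95%: at least {lb_95:.3f}   99.5%: at least {lb_995:.3f}")
```

Under these assumptions the bounds come out near 0.84 and 0.79, close to the 83 and 78 figures reported for the survey.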

20
Correlation (again)
  • Two variables
  • common examples
  • height and weight
  • mortality and a set of health risk factors (e.g.,
    smoking history)
  • Are the two correlated? Does value of one predict
    some of value of the other?

21
Linear model
  • Linear: line.
  • X and Y (standard names for two
    variables: variables, values that vary!)
  • Y = a + bX
  • if a = 0, b > 0; if a > 0, b > 0

Note: negative values of X and/or Y may or may
not be valid
22
Linear Model
  • a > 0, b < 0
  • (This will be basis of negative correlation.
    Still a relationship, but in the negative. As X
    gets big, Y gets small.)

23
Cab fare
  • (Numbers are not right, but the idea is)
  • $3 to get in
  • $2 every ¼ of a mile
  • Y is the fare/total cost (not including tip!) and
    X is distance, given in miles rounded up to the
    nearest quarter mile.
  • Fare = 3 + 2 × (miles × 4)
  • Example: rode ½ a mile. Fare is 3 + 2 × 2 = 7
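The fare rule is a linear model with a fixed starting charge plus a per-quarter-mile rate. A minimal Python sketch, with illustrative constant and function names and the slide's admittedly made-up prices:

```python
import math

BASE_FARE = 3          # charge to get in
PER_QUARTER_MILE = 2   # charge for each quarter mile (or part of one)

def cab_fare(miles):
    """Fare for a trip, with distance rounded up to the next quarter mile."""
    quarter_miles = math.ceil(miles * 4)
    return BASE_FARE + PER_QUARTER_MILE * quarter_miles

print(cab_fare(0.5))   # the slide's example: 3 + 2 * 2
print(cab_fare(0.3))   # 0.3 miles is charged as two quarter miles
```

Here the base fare plays the role of a and the per-quarter-mile charge the role of b in Y = a + bX, with X counted in quarter miles.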

24
(rough) graph of cab fare
  • Points
  • (0,3), (1,11)

25
Aside
  • Units (miles versus quarter miles, miles versus
    feet versus kilometers versus ...) need to be
    understood. Some stories/calculations/experiments
    succeed or fail based on getting the units
    right!
  • space flight that failed due to
    misunderstanding/lack of agreement on units.

26
Correlation
  • Two variables, X and Y.
  • Make a graph (the computer program does not make a
    graph; you think about a graph)
  • Process: determine the line that would be the best
    fit
  • defined as minimizing sum of the squares of the
    distances from the line ('least squares')
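To show what "least squares" means in practice, here is a minimal Python sketch of the standard closed-form fit for a line Y = a + bX. It assumes the distances are measured vertically (ordinary least squares), and the data points are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line Y = a + b*X that minimizes the
    sum of squared vertical distances to the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept: the line passes through the means
    return a, b

# Hypothetical data, roughly following Y = 1 + 2X.
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 9.0, 10.8]
a, b = least_squares_line(xs, ys)
print(f"Y = {a:.2f} + {b:.2f} X")
```

The fitted line always passes through the point (mean of X, mean of Y), which is a quick sanity check on any fit.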

27
Excel example
  • List two sets of numbers
  • Graph using scatter plot
  • Use
  • =CORREL(B2:B8, C2:C8) gives 0.96927
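Excel's CORREL returns the Pearson correlation coefficient. The same quantity can be sketched in Python as follows; the seven data pairs are hypothetical and are not the values from the spreadsheet shown in class.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: seven pairs, like rows 2 through 8 of a spreadsheet.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [2, 4, 5, 4, 6, 7, 9]
r = pearson_r(xs, ys)
print(f"r = {r:.5f}")
```

A value near 1 means the points lie close to a rising line, a value near -1 means they lie close to a falling line, and a value near 0 means no linear relationship.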

28
Other models
  • other relationships: quadratic, log,
    exponential, etc.

Say you know the deer population at two points in
time. Is/will the growth be linear or
exponential?
[Graph: Pop. versus Time]
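Two data points cannot tell the models apart: both a line and an exponential curve pass through them exactly, and they only disagree afterwards. A Python sketch with hypothetical deer counts (100 in year 0, 200 in year 10):

```python
def linear_projection(t, t0, p0, t1, p1):
    """Population at time t if growth adds a fixed amount per unit time."""
    slope = (p1 - p0) / (t1 - t0)
    return p0 + slope * (t - t0)

def exponential_projection(t, t0, p0, t1, p1):
    """Population at time t if growth multiplies by a fixed factor per unit time."""
    ratio = p1 / p0
    return p0 * ratio ** ((t - t0) / (t1 - t0))

# Hypothetical observations: 100 deer in year 0, 200 deer in year 10.
t0, p0, t1, p1 = 0, 100, 10, 200
for year in (0, 10, 20, 30):
    lin = linear_projection(year, t0, p0, t1, p1)
    exp = exponential_projection(year, t0, p0, t1, p1)
    print(year, round(lin), round(exp))
```

Both models agree at the two known years, but by year 30 the linear model gives 400 deer and the exponential model gives 800, so the choice of model matters a great deal for predictions.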
29
Caution
  • Correlation is not cause
  • coincidental
  • both caused by other factor
  • Cause is not absolute determination.
  • other factors

30
Terminology (reprise)
  • False positive: wrongly say someone/something has
    the condition.
  • False negative: wrongly say someone/something
    does NOT have the condition when, in fact, he or she
    or it does
  • Control group: group in the experiment that does not
    have the treatment.
  • Treatment group: group in the experiment that does
    have the treatment

31
Double-blind study
  • Randomly assign subjects to
  • treatment
  • control (may give placebo)
  • Subject does not know which.
  • Tester/evaluator does not know which
  • See what happens. Time period may be long.
  • Smoking cannot be studied using a double-blind
    study!

32
Retrospective study
  • Of the people who did/have X, ask how many did Y?
  • Not as reliable.
  • Also need to study the group that does not have X.
  • 85% of people with lung cancer report that they
    smoked.
  • (How many George Burns are there?)

33
Smoking and Lung Cancer
  • Strong correlation
  • more smoking increases chances of lung cancer
  • smoking comes before the cancer
  • many different studies
  • Women's incidence of lung cancer went up when
    women started smoking
  • Incidence going down in groups decreasing smoking
  • Biological evidence
  • nicotine experiments with animals
  • lab study of lungs, blood, blood pressure, etc.

34
(No Transcript)
35
What's likely to kill you
  • http://www.reason.com/blog/show/128501.html

36
(No Transcript)
37
Data dimension
  • The data that is worth presenting in graphics
    form (as opposed to clear text) is generally
    complex: multi-dimensional.
  • Dimension: measurement, extent, reach
  • the degree of manifoldness: time has 1 dimension,
    space has 3.
  • Edward Tufte (and others)
  • don't give data dimensions it doesn't have. Don't
    use 3D for bar graphs
  • read (borrow) his books

38
(No Transcript)
39
As with 3D bar charts when you only have points,
avoid rainbow colors when the data is
one-dimensional. (Note: the shades-of-blue chart is
better for color-blind visitors.)
40
Sign and dimension
  • Previous example contrasted height on land with
    depth of the ocean.
  • Next chart is problematic: one dimension of
    shading used for negative and positive values.
  • Is this a problem?

41
(No Transcript)
42
Dimension
  • Identifying dimension is important and may not be
    obvious.
  • Challenger disaster: problems were associated
    with temperature.
  • Values 'along' a dimension may be discrete or
    continuous or forced into discrete (quantized)
    categories or discrete but many values
  • weight and height are continuous, but we round
    (down or up) to a standard unit
  • registration is done by grouping credits earned

43
Recent news
  • Large study found that low-fat diet did not
    appear to have statistically significant effects
    on mortality due to cancer or heart disease
  • Subjects were older women (postmenopausal women)
  • Posting opportunity / Project I opportunity: find
    more than one article and write a good (better)
    report.
  • Will cover more on this topic.

44
Recent examples
  • China/world water
  • http://www.nytimes.com/interactive/2007/09/28/world/asia/choking_on_growth_2.html#story4
  • Mortgage foreclosures/Atlanta
  • http://www.nytimes.com/2007/07/09/business/09auctions.html
  • Iraq neighborhoods
  • http://www.nytimes.com/interactive/2007/09/06/world/middleeast/20070907_BUILDUP_MAIN_GRAPHIC.html

45
Small multiples
  • Several (many) graphs/diagrams of the same format

46
(No Transcript)
47
Homework
  • Identify complex topic (such as health risks,
    sports records, voting)
  • multiple dimensions/factors, multiple categories,
    timeline?, geography?
  • Find reputable source (more than one source even
    better)
  • Determine critical findings
  • Determine audience
  • Design/build diagram (chart, graph)
  • Bring to class to present AND to turn in. Be
    professional!
  • as appropriate, consider using 'small multiple'
    idea as done for 31 days
  • as appropriate, consider charts presented on
    health risks
  • This could be the topic for your Project I paper
    and charts
  • Continue postings.