STA 4322/STA5328 Introduction to Statistical Theory

Slides: 26
Provided by: drramonc

Transcript and Presenter's Notes

1
STA 4322/STA5328 Introduction to Statistical
Theory
  • Key Questions

2
A key question of statistics
  • Key questions are easy to understand but not easy
    to answer without knowledge of a special field.
  • Example 1. A key question of mathematics: Three
    angles of a triangle
  • Example 2. A key question of calculus: The volume
    of a ball is 4πr^3/3.

3
A key question of chemistry: Why O2, not O?
  • A key question in physics: What did Newton see in
    the laws of motion that people before him had not
    seen?
  • Speed (velocity) is a vector
  • A key question in computer science: How would
    electricity in wires understand sentences such as
    IF A?B then C2, otherwise C4?

4
Key Concept in Statistics
Natural cure rate of a disease: 50%. A drug is
invented and is to be tested.
Where does the "no" jump to "yes"?
5
There is no 100% correct statistical decision
Risk: the risk of making a wrong decision.
Accidental death rate: 10^-6 per day in the USA.
How many patients should we recruit in the
beginning?
6
Another key question
  • Suppose I wish to know the percentage of voters
    who support public health insurance.
  • I have no hypothesis to test, but I am interested
    in estimating this proportion.
  • Suppose I asked 100 randomly selected persons and
    65 answered yes. Or suppose I asked 1000 persons
    and 650 answered yes. In both cases, my answer
    would be 65%. I know the two estimates differ in
    accuracy, but by how much? Do we need a large
    sample to make the estimate even more accurate?

7
Beyond cure rates
  • Survival time improved by a drug
  • Patient differences in age, gender, tumor size,
    and/or genetic markers.
  • Cure rate in medicine corresponds to affected
    rate in plants and accident death rate in car
    insurance
  • Survival time in medicine corresponds to fruit
    weight in plants and accident payment in car
    insurance

8
The Purpose of Statistics
  • To make inference about unknown quantities from
    samples of data
  • For example: You want to know something about
    the age distribution of graduate students at the
    University of Florida. How many ages are < 22,
    < 23, < 24, etc.? Or, what is the average age?

9
Populations and Samples
  • In either case you want information about the set
    of ages of all UF graduate students. These ages
    would be the population of interest.
  • If it is infeasible to get the ages of all UF
    graduate students, i.e., you cannot observe the
    entire population, you may get ages of a subset
    of the population. The subset is called a
    sample. Then, you use the data in the sample to
    estimate what you want to know about the
    population.

10
Examples of Populations
  • Amounts of grapefruit on all trees in Florida
  • Serum zinc levels in dogs in Gainesville area
  • Strengths of concrete from given mix of sand,
    cement and gravel

11
Samples
  • Amounts of grapefruit on trees in plots drawn
    from the state of Florida
  • Serum zinc levels in dogs entering UF College of
    Veterinary Medicine Small Animal Clinic
  • Measurements from samples of concrete with known
    ingredients in concrete mix

12
Applications of Statistics
Key: when there is uncertainty in the response
  • Effectiveness of new drugs or treatments
  • DNA evidence in court
  • Estimating the bowhead whale population
  • Corn yield by different fertilizers
  • Quality control of light bulbs
  • Public opinion by polls

13
Key Exercise
  • If someone asks you what statistics is, can you
    point out a key question to him/her?

14
Success stories of polls
1992 US Presidential election predictions
Source: newspapers, a few days before the
election.
15
More on polls
Source: USA Today, Nov. 5 (Election Day morning)
In both 2000 and 2004, the candidates (Bush vs
Gore, Bush vs Kerry) were too close to call
(within ±3%). The actual results showed the same.
It is difficult to reduce ±3% by sample size
alone. From mathematics to practice: random
sampling, voters changing their minds, voters not
revealing their minds.
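
The remark that ±3% is hard to reduce by sample size alone can be illustrated with a rough calculation. The sketch below assumes a simple random sample, the normal approximation, a 95% confidence level, and the worst case p = 0.5. None of these settings is stated on the slide, and the function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def margin_of_error(n, p=0.5, confidence=0.95):
    """Approximate half-width of a confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


def sample_size_for_margin(margin, p=0.5, confidence=0.95):
    """Smallest n whose approximate margin of error is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Assumed targets: the slide's 3 points, and half of that.
n_3pct = sample_size_for_margin(0.03)
n_1_5pct = sample_size_for_margin(0.015)
print(n_3pct, n_1_5pct)
```

Because the margin shrinks like 1/sqrt(n), halving it takes roughly four times as many respondents, and a larger sample does nothing about the practical problems listed on the slide.
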
16
The next two elections, 2000 (Bush vs Gore) and
2004 (Bush vs Kerry) were too close to call
before the election. The final results confirmed
this fact. Now the 2008 election.
  • This map was drawn by the New York Times 1 to 3
    days before the election. All the state
    projections were correct. Toss-up states were
    extremely close.
  • It also predicted that Obama would get 52% ± 2%
    and McCain 41% ± 2%, with 7% undecided.
  • The actual result was Obama 52.5% and McCain 46%.
  • The total number of votes was 124,471,000.

17
Solution to the key question
What do you need to know beforehand?
  • What risk you can take on a wrong claim (claiming
    that an ineffective drug is effective).
  • What you consider a good drug, one that needs to
    be detected with high probability.
  • Let the first answer be α = 0.05.
  • Let the second answer be: if the cure rate
    becomes larger than 0.6 (p1), I want at least 0.9
    (1 - β) probability of detecting it.
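
One common way to turn these two answers into a number of patients is the normal-approximation sample-size formula for a one-sided test of a single proportion. The sketch below is not taken from the untranscribed solution slides. It assumes the null cure rate is the natural rate of 0.5 mentioned earlier, that the test is one-sided, and that the normal approximation is adequate. The function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_one_sided(p0, p1, alpha, power):
    """Approximate n for testing H0: p = p0 against H1: p = p1 > p0."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # controls the false-claim risk
    z_beta = NormalDist().inv_cdf(power)       # controls the detection probability
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1))
    return ceil((numerator / (p1 - p0)) ** 2)


# p0 = 0.5 is an assumed reading of the "natural cure rate" slide.
n_patients = sample_size_one_sided(p0=0.5, p1=0.6, alpha=0.05, power=0.9)
print(n_patients)
```

Under these assumptions the formula gives 211 patients. A smaller α, a higher required power, or a p1 closer to p0 would each push the number up.
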

18
Danger of treatment based on screening (I)
  • Source: New England Journal of Medicine, Sep. 12,
    2002, pp. 781-789.
  • Randomized clinical trial in early prostate
    cancer: radical prostatectomy group (n = 347),
    watchful waiting (n = 348). (Duration 1989-1999,
    median follow-up time 6.2 years)
  • It is obvious that there were fewer deaths due to
    prostate cancer in the surgical group, because
    the prostate had been removed. To claim
    effectiveness based on 62 vs. 53 is unreasonable.
  • Neither expense nor change in quality of life is
    reflected in this table.
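
To see why a difference of this size is weak evidence, one can run a pooled two-proportion z-test. The sketch below rests on an assumption: it reads the figure in the bullet above as 62 deaths among the 348 watchful-waiting patients and 53 among the 347 surgery patients. The table itself is not in the transcript, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z statistic and two-sided p-value for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Assumed counts: 62 of 348 (watchful waiting) vs 53 of 347 (surgery).
z_stat, p_value = two_proportion_z_test(62, 348, 53, 347)
print(round(z_stat, 2), round(p_value, 2))
```

With these assumed counts the z statistic is below 1 in absolute value, so the difference is well within what chance alone could produce at α = 0.05.
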

19
Danger of treatment based on screening (II)
  • Source: The Lancet, 2000, 355: 129-43. The
    Lancet, 2001, 358: 1340-42.
  • Randomized clinical trials in mammography for
    breast cancer.
  • Malmö (Sweden) study (1988-97; screened
    21,088; control 21,195)
  • Canada study (1981-97; screened 44,925;
    control 44,910)

20
The solution (1)
21
The solution (2)
22
The solution (3)
23
Solution to another key question
  • 65 yes in a sample of 100: we feel the real
    percentage is 65%.
  • 650 yes in a sample of 1000: we feel the real
    percentage is 65%.
  • Which one is more accurate?
  • Idea: In a single observation, we do not know,
    but on the whole, the larger sample gives a more
    accurate estimate.
  • How do we quantify this concept?
  • Let Y be the number of yes answers in a sample of
    size n, and let p be the true proportion of yes
    answers in the population.
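
A standard way to quantify this, sketched here because the following slides have no transcript, is to treat Y as Binomial(n, p), estimate p by Y/n, and measure accuracy by the estimated standard error sqrt(p̂(1 - p̂)/n). The binomial model and the function name are assumptions, not taken from the slides.

```python
from math import sqrt


def proportion_standard_error(y, n):
    """Estimated standard error of the sample proportion y / n."""
    p_hat = y / n
    return sqrt(p_hat * (1 - p_hat) / n)


# The two samples from the slide: same estimate (0.65), different accuracy.
se_100 = proportion_standard_error(65, 100)
se_1000 = proportion_standard_error(650, 1000)
print(round(se_100, 4), round(se_1000, 4))
```

Both samples give the same estimate, but the standard error falls from about 0.048 to about 0.015. Ten times the sample size improves accuracy by a factor of sqrt(10), roughly 3.2, not by a factor of 10.
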

24
(No Transcript)
25
(No Transcript)