Title: Stat 10x
1. Stat 10x
- J. Chang
- Tuesday, 10/3/00
2. Today: Producing Data: Sampling and Experimental Design
- 3 Principles of Experimental Design
- Simple random samples
- Bias, variance
- Stratified sampling and blocking
Moore and McCabe Chapter 3.
3. Observation versus experiment
- Both attempt to study the relationship between an explanatory variable and a response variable.
- Experiment: deliberately impose treatments on individuals to observe their responses.
- Observational study: observe and measure what participants do naturally.
4. An example experiment
- Wangensteen (1958): gastric freezing. Experiment reported in JAMA: treatment reduced ulcer pain. 24 patients; all said they felt better. Technique widely used for several years. OK?
- Several years later: a different, larger study with a control group. Results:
  - 34% in treatment group improved.
  - 38% in control group improved.
- Salk vaccine trial
5. Principle 1: Control or Comparison
- Comparison of different treatments.
- Want different treatment groups to be as similar as possible -- except for the treatments applied.
- Control effects of environmental or outside variables.
- Outside influences act the same on the different treatment groups (e.g. placebo effect).
6. Bias
- How to assign experimental units to treatments?
- E.g.
  - in comparing two medical treatments, don't want to assign one treatment to sicker patients
  - in comparing seed varieties, don't plant one in more fertile ground
- A study is biased if it systematically favors certain outcomes.
- How to avoid bias? Elaborate balancing?
7. Principle 2: Randomization
- Assign treatments randomly.
- Fair -- doesn't give any treatment a systematic advantage.
- But randomization balances out well only in the long run. So...
8. Principle 3: Replication, or Sample size
Use sample sizes big enough so that we will be
able to distinguish a real effect from random
luck.
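As a rough illustration of Principles 2 and 3, the sketch below randomly splits units into two treatment groups and measures how far apart the groups end up on a background variable. The function names, the made-up "sickness" scores (uniform on 0 to 1), and the group sizes are assumptions for illustration, not part of the lecture.

```python
import random
import statistics

def randomize(units, rng):
    """Randomly split units into two equal-sized treatment groups."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:2 * half]

def typical_imbalance(n_units, reps=500, seed=0):
    """Average absolute difference between group means of a background variable."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        # Hypothetical background variable (e.g. how sick each patient is).
        sickness = [rng.random() for _ in range(n_units)]
        group_a, group_b = randomize(sickness, rng)
        diffs.append(abs(statistics.mean(group_a) - statistics.mean(group_b)))
    return statistics.mean(diffs)

small = typical_imbalance(10)
large = typical_imbalance(1000)
print(f"typical imbalance with 10 units:   {small:.3f}")
print(f"typical imbalance with 1000 units: {large:.3f}")
```

Random assignment favors neither group on average, but with only 10 units a single assignment can still be noticeably lopsided; with 1000 units the groups are typically much closer.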
9. It's hard to be random
0 0 1 1 1 1 0 1 0 1 0 0 0 0 0 0 1 1 0 1 1 0 1 0
1 0 0 1 0 1 1 0 0 0 0 0 1 0 0 1 1 1 1 1 0 0 0 0 0
1 1 0 0 1 0 0 1 1 0 1 0 0 1 1 0 0 1 1 0 0 0 0 1
1 0 0 0 1 1 0 1 1 0 1 1 1 1 1 1 1 0 0 1 0 0 1 0
1 1 0 1 0 0 1 0 1 1 0 1 1 0 1 1 0 0 0 1 0 1 1 0
0 1 0 1 0 0 0 1 0 0 0 0 0 0 1 1 0 0 1 1 1 1 1 0
1 0 0 0 1 0 0 0 0 1 1 1 0 1 1 0 1 0 1 1 0 0 0 1
1 0 0 1 1 1 0 1 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 1
0 1 0 1 1 0 0
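One common way to examine a hand-written "random" sequence is to look at its runs (maximal stretches of the same digit). The sketch below does this for the first row of the sequence above and for computer-generated coin flips of the same length. Using runs as the summary is an assumption here; the text does not say how the sequence was judged. For n fair coin flips the expected number of runs is (n + 1) / 2.

```python
import random
from itertools import groupby

def run_lengths(bits):
    """Lengths of maximal runs of identical symbols, in order."""
    return [len(list(group)) for _, group in groupby(bits)]

# First row of the sequence on the slide.
first_row = "0 0 1 1 1 1 0 1 0 1 0 0 0 0 0 0 1 1 0 1 1 0 1 0".split()
slide_runs = run_lengths(first_row)
print("slide row:  runs =", len(slide_runs), " longest =", max(slide_runs))

# Reference: computer-generated fair coin flips of the same length.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
coin = [rng.choice("01") for _ in first_row]
coin_runs = run_lengths(coin)
print("coin flips: runs =", len(coin_runs), " longest =", max(coin_runs))
```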
10. It's hard to be random
[Same sequence as on the previous slide, annotated:]
not very creative
11. It's hard to be random
[Same sequence again, annotated:]
not very creative
getting tired
12. Simple random samples
- Def: A simple random sample of size n is a set of n individuals from a population chosen in such a way that each set of n individuals has an equal chance to be the sample actually selected.
Abbreviate simple random sample as SRS.
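13. How to randomize
A natural way: label individuals with 0, 1, 2, ..., 9 and read off random digits, skipping repeats. Take individual 1, then 9, then 2, then 3.
What if we had 25 individuals and wanted an SRS of size 4? Use two-digit labels and read the digits in pairs:
19, 22, 05, 13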
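The procedure can be sketched as follows. The digit row below is an assumption: it is not shown in the text, and was chosen because it reproduces both answers on the slide. The function name `srs_from_digits` and the choice of labels 01 to 25 for the 25 individuals are also illustrative.

```python
def srs_from_digits(digits, labels, n, width):
    """Read digits in groups of `width`; keep the first n distinct groups that are valid labels."""
    stream = "".join(digits.split())
    valid = set(labels)
    chosen = []
    for i in range(0, len(stream) - width + 1, width):
        label = int(stream[i:i + width])
        if label in valid and label not in chosen:
            chosen.append(label)
            if len(chosen) == n:
                break
    return chosen

# Assumed row of random digits (not given in the text).
row = "19223 95034 05756 28713"

print(srs_from_digits(row, range(10), n=4, width=1))     # → [1, 9, 2, 3]
print(srs_from_digits(row, range(1, 26), n=4, width=2))  # → [19, 22, 5, 13]
```

Without a printed table, `random.sample(range(1, 26), 4)` from Python's standard library draws an SRS of size 4 directly.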
14. Blobs
What is the average area?
E.g. throwing darts leads to size-biased sampling.
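A small worked example of size-biased sampling, assuming a dart is equally likely to land on any point covered by a blob, so that each blob is hit with probability proportional to its area. The blob areas are made up for illustration.

```python
from fractions import Fraction

def true_mean(areas):
    """Average area when every blob is equally likely to be picked."""
    return Fraction(sum(areas), len(areas))

def dart_mean(areas):
    """Average area when a blob is picked with probability proportional to its area."""
    total = sum(areas)
    return sum(Fraction(a, total) * a for a in areas)

areas = [1, 1, 2, 6]  # hypothetical blob areas
print(true_mean(areas))  # → 5/2
print(dart_mean(areas))  # → 21/5
```

The dart-based average is larger because big blobs are over-represented among the blobs that get hit.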
15. Buses
Suppose average time between bus arrivals at a
stop is 20 minutes. You arrive at a random time.
What is your average waiting time until the next
bus?
10 minutes?
No -- in general it's more.
Analogous to blobs
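To see why, here is a minimal sketch with two made-up bus schedules, both averaging 20 minutes between buses. A rider arriving at a uniformly random time lands in a gap with probability proportional to the gap's length (the same size bias as with the blobs) and then waits, on average, half of that gap.

```python
def mean_wait(gaps):
    """Average wait for a rider arriving at a uniformly random time.

    A gap of length g is hit with probability g / total, and the
    average wait inside it is g / 2.
    """
    total = sum(gaps)
    return sum((g / total) * (g / 2) for g in gaps)

# Hypothetical schedules (minutes between consecutive buses).
print(mean_wait([20, 20, 20, 20]))  # evenly spaced → 10.0
print(mean_wait([5, 35, 5, 35]))    # same 20-minute average gap → 15.625
```

Only perfectly even spacing gives 10 minutes; any variation in the gaps pushes the average wait higher.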
16. Sampling distributions
Say we want to estimate the parameter p = Prob(vote for Bush).
Here p = 0.5. Pretend we don't know this.
17. Sampling distributions (cont.)
List possible SRSs and the corresponding estimates.
Ind  Vote
1    Bush
2    Bush
3    Gore
4    Gore
18. Sampling distrib of p-hat from SRSs of size 2
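The sampling distribution can be listed by brute force: enumerate every possible SRS of size 2 from the four individuals and record the sample proportion voting for Bush. This is a sketch; the function name and data layout are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

votes = {1: "Bush", 2: "Bush", 3: "Gore", 4: "Gore"}

def sampling_distribution(votes, n):
    """Map each possible value of p-hat to its probability under an SRS of size n."""
    samples = list(combinations(votes, n))
    counts = Counter(
        Fraction(sum(votes[i] == "Bush" for i in s), n) for s in samples
    )
    return {p_hat: Fraction(c, len(samples)) for p_hat, c in sorted(counts.items())}

for p_hat, prob in sampling_distribution(votes, 2).items():
    print(f"p-hat = {p_hat}: probability {prob}")
```

Of the 6 equally likely samples, one gives p-hat = 0, four give p-hat = 1/2, and one gives p-hat = 1.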
19. Bias and variability of an estimator
E.g. recall true value was p = 0.5. Sampling distrib of p-hat from SRSs of size 2:
Unbiased: Mean of sampling distrib = 0.5 = true value
Variability: SD of sampling distrib ≈ 0.3
20. How about with SRSs of size 1?
Ind  Vote
1    Bush
2    Bush
3    Gore
4    Gore
21. Bias? Variability?
[Figure panels: n = 2, n = 1]
Neither is biased. Case n = 2 has less variability.
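The comparison can be checked numerically by enumerating all SRSs of each size and computing the mean and SD of p-hat. This is a sketch using the four-person population from the earlier slides; the function name is illustrative.

```python
from itertools import combinations
from math import sqrt

votes = ["Bush", "Bush", "Gore", "Gore"]  # individuals 1-4 from the slides

def mean_and_sd(n):
    """Mean and SD of p-hat over all equally likely SRSs of size n."""
    p_hats = [s.count("Bush") / n for s in combinations(votes, n)]
    mean = sum(p_hats) / len(p_hats)
    sd = sqrt(sum((p - mean) ** 2 for p in p_hats) / len(p_hats))
    return mean, sd

for n in (1, 2):
    mean, sd = mean_and_sd(n)
    print(f"n = {n}: mean = {mean:.2f}, SD = {sd:.2f}")
```

Both means equal the true value 0.5, while the SD drops from 0.50 for n = 1 to about 0.29 for n = 2.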
22. Bias and Variability
- Bias of an estimator = (mean of sampling distrib) - (true value of parameter). Statistic is unbiased if bias = 0.
- Variability of an estimator = (SD of sampling distrib). Depends on sample size.
23. An example of a simulation
- Bias of estimators of variance -- use Minitab.
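The lecture runs this simulation in Minitab; the sketch below is a Python stand-in, not the original demonstration. It assumes a normal population with variance 1 and samples of size 5, and compares the estimator that divides by n with the one that divides by n - 1. In theory the first averages (n - 1)/n = 0.8 times the true variance and the second averages the true variance.

```python
import random
import statistics

def average_variance_estimates(n=5, reps=10000, seed=0):
    """Average of the divide-by-n and divide-by-(n-1) estimates over many samples."""
    rng = random.Random(seed)
    by_n, by_n_minus_1 = [], []
    for _ in range(reps):
        # Assumed population: normal with mean 0 and variance 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        by_n.append(statistics.pvariance(sample))         # divides by n
        by_n_minus_1.append(statistics.variance(sample))  # divides by n - 1
    return statistics.mean(by_n), statistics.mean(by_n_minus_1)

avg_n, avg_n_minus_1 = average_variance_estimates()
print(f"divide by n:     average estimate = {avg_n:.3f}")
print(f"divide by n - 1: average estimate = {avg_n_minus_1:.3f}")
```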
24. Stratified sampling
- E.g. estimate avg. salary of engineers at a company. Suppose 2 types of engineers: junior and senior. Suppose company has 200 of each type. Want to est avg salary with a sample of size 10.
- Stratification idea: combine an SRS of size 5 from junior engineers, and an SRS of size 5 from senior engineers.
- Is this an SRS of size 10?
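One way to approach the last question is to count samples. In the sketch below the engineer IDs and the function name are hypothetical. An SRS of size 10 must give every 10-person subset of the 400 engineers the same chance, but the stratified scheme can only ever produce subsets with exactly 5 juniors and 5 seniors.

```python
import random
from math import comb

juniors = [f"J{i}" for i in range(200)]  # hypothetical IDs
seniors = [f"S{i}" for i in range(200)]

def stratified_sample(rng):
    """An SRS of 5 juniors combined with an SRS of 5 seniors."""
    return rng.sample(juniors, 5) + rng.sample(seniors, 5)

sample = stratified_sample(random.Random(0))
n_juniors = sum(name.startswith("J") for name in sample)
print(n_juniors)  # always 5 by construction

all_subsets = comb(400, 10)    # subsets an SRS of size 10 could produce
reachable = comb(200, 5) ** 2  # subsets the stratified scheme can produce
print(reachable < all_subsets)  # → True
```

A sample of 10 juniors, for instance, is possible under an SRS of size 10 but impossible here, so the stratified sample is not an SRS of size 10.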
25. Why stratify vs. take an SRS?
- What's the advantage of stratifying?
- Bias?
- Variability?
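A small simulation can suggest answers. The salary figures below are invented (in $1000s) and are not data from the lecture; the point is only that the two strata differ a lot from each other and little within themselves.

```python
import random
import statistics

rng = random.Random(0)
# Hypothetical salaries in $1000s: 200 juniors and 200 seniors.
juniors = [rng.gauss(60, 5) for _ in range(200)]
seniors = [rng.gauss(100, 5) for _ in range(200)]
everyone = juniors + seniors

def srs_estimate():
    return statistics.mean(rng.sample(everyone, 10))

def stratified_estimate():
    # Equal-sized strata and equal-sized subsamples, so a plain mean is the right weighting.
    return statistics.mean(rng.sample(juniors, 5) + rng.sample(seniors, 5))

true_mean = statistics.mean(everyone)
results = {}
for name, estimator in [("SRS", srs_estimate), ("stratified", stratified_estimate)]:
    estimates = [estimator() for _ in range(2000)]
    bias = statistics.mean(estimates) - true_mean
    sd = statistics.stdev(estimates)
    results[name] = (bias, sd)
    print(f"{name:>10}: bias = {bias:6.2f}, SD = {sd:5.2f}")
```

Under these assumptions both estimators come out essentially unbiased, and the stratified one has a much smaller SD because the junior/senior split no longer varies from sample to sample.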
26. Blocking in experimental design
3 types of seeds (treatments): A, B, C. And some land to try them on.
27. Blocking (cont.)
Suppose:
Worry about a fertility gradient: [figure: field layout]
Believe field homogeneous: [figure: field layout]
Partition experimental units into blocks. Assign treatments randomly within each block.
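A minimal sketch of a randomized block design for the seed example. The field layout (3 strips laid out along the suspected fertility gradient, 3 plots per strip), the plot names, and the function name are assumptions for illustration.

```python
import random

def randomized_block_design(blocks, treatments, rng):
    """Within each block, assign each treatment to one plot in random order."""
    layout = {}
    for block, plots in blocks.items():
        order = list(treatments)
        rng.shuffle(order)
        layout[block] = dict(zip(plots, order))
    return layout

# Hypothetical field: each strip is one block of similar fertility.
blocks = {
    "strip 1": ["plot 1", "plot 2", "plot 3"],
    "strip 2": ["plot 4", "plot 5", "plot 6"],
    "strip 3": ["plot 7", "plot 8", "plot 9"],
}
layout = randomized_block_design(blocks, "ABC", random.Random(0))
for block, assignment in layout.items():
    print(block, assignment)
```

Every seed type appears exactly once in every strip, so no treatment can end up concentrated on the more fertile end of the field.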