Title: Producing Data
1Last Time
2Data Collection
- Problems with Anecdotal Information
- Possibly biased
- (not representative of the overall trend)
- Tends to be dramatized and even inaccurate
- May be confounded with other variables
3We/they need systematic statistical designs to
collect data
- Sampling: ask a group of people
- Depending on how the group is chosen (sampled) we can
- make statements only about that group
- make statements (inferences) about a larger
population
- Experiments
- Manipulate a variable and
- have experimental units go through differential
treatments
- Depending on how the back treatments are
administered we can or cannot
- make causal statements about treatment
effectiveness
4We/they need systematic statistical designs to
collect data
- Goal: Draw conclusions
- about the population of potential/likely voters
- about the effectiveness of the back treatment
- i.e. make a
- statistical inference
- from a carefully collected set of data
- to a larger population
- and
- provide a statement about
- how confident we can be in the stated
conclusions.
5Sampling
Population (we want to make statements about this)
E.g., take list of registered voters and
number them. Then draw n numbers using a random
number table.
HOW??
Sample
Simple Random Sample (SRS): Every sample of size
n has the same chance of being drawn.
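The "number them, then draw n numbers" recipe above can be sketched in Python, with a pseudo-random generator standing in for the random number table. The population size `N`, the sample size `n`, and the seed are assumed values chosen only for illustration.

```python
import random

# Hypothetical sampling frame: registered voters numbered 1..N (N is assumed).
N = 10_000
n = 100
voter_ids = range(1, N + 1)

rng = random.Random(2024)  # fixed seed only so the sketch is reproducible

# random.sample draws n distinct items without replacement; every subset of
# size n is equally likely, which is the defining property of an SRS.
sample_ids = rng.sample(voter_ids, n)

print(sorted(sample_ids)[:10])
```

Only the voters whose numbers appear in `sample_ids` would then be contacted.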
6Sampling Distributions
i.e., distributions that matter in sampling
Population
Population characterized by parameters, e.g.,
- p: proportion of the population, say, opposing abortion
- mean and variance of a normal distribution of, say, people's heights
Sample Data
7Sampling Distributions
The politician wants to know the proportion p of
voters in the population who favor abortion. A
random sample of n voters is drawn and their
opinions recorded. Suppose in that random
sample 30% oppose abortion and 70% favor abortion.
The team that collected the random sample reports
to the politician that, based on their random
sample, their best guess at the population
proportion p is that p is 70%.
8Sampling Distributions
The politician is suspicious of statistics. He
asks a second team to investigate the
issue. Another random sample of n voters is
drawn and their opinions recorded. Suppose in
that random sample 88% oppose abortion and 12%
favor abortion.
The team that collected this random sample
reports to the politician that, based on their
random sample, their best guess at the
population proportion p is that p is 12%.
9Sampling Distributions
The sampling distribution of the sample
proportion is the histogram that you would
obtain if you generated a new sample (of the
given size) infinitely often.
We will study the sampling distribution of sample
proportions more precisely in a future class,
when we talk about the binomial distribution.
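A rough way to see such a histogram is to simulate it. The sketch below is not from the lecture: it assumes a true proportion `TRUE_P`, a sample size `N_SAMPLE`, and a large but finite number of repetitions `REPEATS` in place of "infinitely often". Each voter is modelled as an independent draw, which is a reasonable stand-in when the population is much larger than the sample.

```python
import random
from collections import Counter

TRUE_P = 0.5      # assumed population proportion (unknown in practice)
N_SAMPLE = 100    # assumed sample size n
REPEATS = 2000    # finite stand-in for "infinitely often"

rng = random.Random(1)  # fixed seed for reproducibility


def sample_proportion(p, n, rng):
    """Proportion of 'favor' answers in one random sample of size n."""
    return sum(rng.random() < p for _ in range(n)) / n


p_hats = [sample_proportion(TRUE_P, N_SAMPLE, rng) for _ in range(REPEATS)]

# Crude text histogram: bin the sample proportions to the nearest 0.05.
bins = Counter(round(p_hat * 20) / 20 for p_hat in p_hats)
for value in sorted(bins):
    print(f"{value:.2f} {'#' * (bins[value] // 20)}")
```

Different samples give different values of the sample proportion, but the values pile up around the assumed true p.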
10Statistics
- A statistic is something that you calculate from
a random sample.
- The value of the statistic varies from one random
sample (of size n) to another random sample (also
of size n).
- Over infinitely many samples (each of size n) the
statistic has a distribution.
- Many statistics have (at least approximately) a normal distribution.
11Unbiased Statistics (Centered at the true
parameter value)
Distribution of Sample Statistic (e.g., sample
proportion)
Population Parameter (e.g., true value of p)
12Variability of a Statistic
- Depends only on sample size and true parameter
value(s).
- Larger sample sizes provide smaller
variability of the sample statistics.
- Larger (more expensive) samples provide more accurate
assessment of the true population parameter.
- Does not depend on the size of the population that
we are making inferences about (provided the population
is much larger than the sample).
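The effect of sample size can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the slides: it uses the sample proportion as the statistic, an assumed true value `TRUE_P`, and the standard result that the standard deviation of a sample proportion is sqrt(p(1-p)/n). Note that no population size appears anywhere in the code.

```python
import math
import random
import statistics

TRUE_P = 0.3            # assumed true proportion
rng = random.Random(7)  # fixed seed for reproducibility


def simulated_sd(p, n, repeats=2000):
    """Standard deviation of the sample proportion over many samples of size n."""
    p_hats = [sum(rng.random() < p for _ in range(n)) / n for _ in range(repeats)]
    return statistics.pstdev(p_hats)


for n in (25, 100, 400):
    theory = math.sqrt(TRUE_P * (1 - TRUE_P) / n)
    print(f"n={n:4d}  simulated SD={simulated_sd(TRUE_P, n):.4f}  "
          f"sqrt(p(1-p)/n)={theory:.4f}")
```

Quadrupling the sample size roughly halves the variability of the statistic.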
13How to randomize?
- Assign numbers to all experimental participants.
- Use a random generator / random number table to
assign subjects to experimental groups.
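The two steps above can be sketched as follows. The number of participants, the two group labels, and the equal split are assumptions made for the example.

```python
import random

# Hypothetical participant numbers 1..20 and two hypothetical group labels.
participants = list(range(1, 21))
rng = random.Random(42)  # fixed seed only so the sketch is reproducible

shuffled = participants[:]
rng.shuffle(shuffled)    # put the participant numbers in random order

half = len(shuffled) // 2
groups = {
    "treatment": sorted(shuffled[:half]),
    "control": sorted(shuffled[half:]),
}
print(groups)
```

Because the order is random, neither the experimenter nor the participants decide who receives which treatment.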
14Today
- Chapter 4
- Probability
- The structure of randomness
15Simple Experiment
- A Simple Experiment is
- a process with an uncertain outcome,
- but with an outcome from a well specified
- set of possible outcomes.
The (well specified) set of possible outcomes of
a simple experiment is called the sample space of
the simple experiment and denoted by S.
16We are particularly concerned with simple
experiments in which the outcome is determined by
some chance process that we engage in. The text
uses the term random phenomena for simple experiments
in which the outcome is determined by a chance
mechanism.
17Examples
18- PROBABILITY
- subjective measure of belief
- (long run frequency, symmetry, historical
data, opinions)
- quantifying uncertainty
- There is a need for precision and clarity
regarding uncertainty.
What is the interpretation of probabilities in
weather forecasting?
19- VOCABULARY
- Experiment: die roll; coin toss; response to test item
- Elementary outcome (short: outcome; some say
simple event): 2; heads; Yes
- Sample space (the set of all possible
outcomes): S
- Event (a collection of outcomes): A, B, E,
e.g., A = even roll; A = heads; A = a definite answer
20More examples
- Deck of cards
- S = {H, D, C, S}
- S = {Red, Black}
- S = set of all cards
- Measure reaction time of a person
- S = set of all reaction times, say S = {0, 1, ..., 100}
- Sum of two tossed dice
- S = {2, 3, ..., 12}
- Flip coin three times
- S = {HHH, HHT, HTH, HTT, THH, TTH, THT, TTT}
- S = {0, 1, 2, 3}
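Small sample spaces like these can be listed mechanically. The sketch below enumerates the coin-flip and dice examples; the variable names are made up for the illustration.

```python
from itertools import product

# Sample space for three coin flips, written as strings such as "HHT".
S_flips = ["".join(flips) for flips in product("HT", repeat=3)]
print(S_flips)

# A coarser sample space for the same experiment: the number of heads.
S_heads = sorted({outcome.count("H") for outcome in S_flips})
print(S_heads)

# Sample space for the sum of two tossed dice.
S_sum = sorted({a + b for a, b in product(range(1, 7), repeat=2)})
print(S_sum)
```

The same experiment (three flips) has more than one valid sample space, depending on what is recorded as the outcome.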
21More examples of Events
- Deck of Cards
- A = Red = {H, D}
- A = S = {Red, Black}
- A = Ace
- Measure reaction time of a person
- A = {50, 51, ..., 60}
- Sum of two tossed dice
- A = even number
- Flip coin three times
- A = at least two heads
- A = at least two heads
22Probabilities of Events
- We will assign a number to each event to indicate
the probability that this event occurs.
- Example: Randomly draw a card from a deck
- S = deck of cards, A = red, B = face card
- P(S)
- P(A)
- P(B)
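One way to work out the three probabilities is to count cards. The sketch assumes a standard 52-card deck in which every card is equally likely to be drawn, and it counts jacks, queens, and kings as face cards; none of these details are spelled out on the slide.

```python
from fractions import Fraction
from itertools import product

ranks = ["A"] + [str(k) for k in range(2, 11)] + ["J", "Q", "K"]
suits = ["H", "D", "C", "S"]
deck = list(product(ranks, suits))  # the sample space S: 52 (rank, suit) pairs


def prob(event):
    """P(event) for equally likely cards: favourable count / total count."""
    return Fraction(len(event), len(deck))


A = [card for card in deck if card[1] in ("H", "D")]       # red cards
B = [card for card in deck if card[0] in ("J", "Q", "K")]  # face cards

print(prob(deck), prob(A), prob(B))  # 1 1/2 3/13
```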
23- PROPERTIES (AXIOMS) OF PROBABILITY
- Notation: P(A) denotes the probability of event A
- The probability of an event A cannot be less
than 0 or more than 100%
- 0 ≤ P(A) ≤ 1
- The total amount of probability over all
elementary outcomes must be 100%
- (one and only one of the elementary outcomes
must happen).
- P(S) = 1
- (Recall that S denotes the sample space, i.e.,
the set of all elementary outcomes)
- If A and B are disjoint events (i.e., if they
share no outcomes), then
- P(A or B) = P(A U B) = P(A) + P(B)
24Example of P(A U B) for disjoint events A, B
- A = even roll on a die
- B = roll a one on a die
- A is disjoint from B
- A U B = {1, 2, 4, 6}
- P(A U B) = P(A) + P(B)
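The addition rule for this example can be verified by counting outcomes. The sketch assumes a fair six-sided die, so that all six outcomes are equally likely.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space of one die roll (fair die assumed)


def prob(event):
    """P(event) when all outcomes in S are equally likely."""
    return Fraction(len(event & S), len(S))


A = {2, 4, 6}  # even roll
B = {1}        # roll a one

assert A.isdisjoint(B)  # the addition rule below requires disjoint events
print(sorted(A | B), prob(A | B), prob(A) + prob(B))  # [1, 2, 4, 6] 2/3 2/3
```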
25The calculations we just made intuitively have a
solid foundation
When A is finite, P(A) is the sum of the probabilities
of the elementary outcomes in A.
This follows from repeated use of the third axiom
of probability: If A, B are disjoint events, then
P(A or B) = P(A U B) = P(A) + P(B)
26VENN DIAGRAMS: graphical depiction of the sample
space and events
S
The box represents the sample space S.
A
Shaded area represents event A.
The probability of an event, P(A), corresponds to
its area in the diagram. (Thus, the area of the
box is equal to one because P(S) = 1.)
27VENN DIAGRAMS: graphical depiction of the sample
space and events
[Figure: Venn diagram showing event A within the sample space S]
28THE COMPLEMENT of an event A is the event
consisting of all outcomes that are not in A.
Notation: Ā or A^c (read: A-bar, A-complement, not-A)
[Figure: Venn diagram of event A]
29THE COMPLEMENT of an event A is the event
consisting of all outcomes that are not in A.
Notation: Ā or A^c (read: A-bar, A-complement, not-A)
[Figure: Venn diagram of event A]
P(A^c) = (Total area) - (area of A) = P(S) - P(A) = 1 - P(A)
P(A^c) = 1 - P(A)
30THE COMPLEMENT
Experiment    Event A           What is A^c?
die roll      {2, 4}
coin toss     {heads, tails}
The probability of the complement
Experiment    Event A           P(A)    Event A^c    P(A^c)
die roll      {2, 4}
coin toss     {heads, tails}
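The two rows of the table can be worked through by computing set differences. This is a sketch under the assumption that the die and the coin are fair (equally likely outcomes); the helper names are invented for the example.

```python
from fractions import Fraction


def complement(event, sample_space):
    """All outcomes of the sample space that are not in the event."""
    return sample_space - event


def prob(event, sample_space):
    """Probability under the assumption of equally likely outcomes."""
    return Fraction(len(event), len(sample_space))


examples = {
    "die roll": ({1, 2, 3, 4, 5, 6}, {2, 4}),
    "coin toss": ({"heads", "tails"}, {"heads", "tails"}),
}

for name, (S, A) in examples.items():
    A_c = complement(A, S)
    # P(A^c) = 1 - P(A) holds in both rows.
    assert prob(A_c, S) == 1 - prob(A, S)
    print(name, sorted(A_c), prob(A, S), prob(A_c, S))
```

In the coin-toss row the event is the whole sample space, so its complement is the empty event with probability 0.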
31THE INTERSECTION of events A and B is the event
consisting of all outcomes that are in both A and B.
Notation: A ∩ B (book notation: A and B), read "A intersect B"
[Figures: Venn diagrams of events A and B]
32THE INTERSECTION
P(A ∩ B) ≤ P(A) and P(A ∩ B) ≤ P(B)
Example (die roll): A = even = {2, 4, 6},
B = high = {4, 5, 6}, P(A ∩ B) = ?
33DISJOINT / MUTUALLY EXCLUSIVE EVENTS: Events A and
B are disjoint / mutually exclusive if they
cannot occur simultaneously (i.e., when one event
occurs, the other cannot). Thus, A and B are
mutually exclusive if and only if A ∩ B = ∅.
A collection of (two or more) events is mutually
exclusive if all pairs of events (from the
collection) are mutually exclusive.
34- MUTUALLY EXCLUSIVE EVENTS
- Example: die roll
- Collection: Disjoint/Mutually exclusive? (Why?)
- {2, 4, 6}, {1, 6}
- {1, 3, 5}, {2, 4, 6}
- {2}, {5}, {4, 6}
- {1, 2}, {4, 5}, {2, 6}
- {1}, {2}, {3}, {4}, {5}, {6}
- Are events A and A^c mutually exclusive?
- Are elementary outcomes mutually exclusive?
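The pairwise definition translates directly into a check over all pairs of events. The sketch below applies it to the die-roll collections, read as sets of outcomes; that grouping is an assumption about how the flattened slide is meant to be parsed.

```python
from itertools import combinations


def mutually_exclusive(events):
    """True if every pair of events in the collection shares no outcome."""
    return all(a.isdisjoint(b) for a, b in combinations(events, 2))


collections = [
    [{2, 4, 6}, {1, 6}],
    [{1, 3, 5}, {2, 4, 6}],
    [{2}, {5}, {4, 6}],
    [{1, 2}, {4, 5}, {2, 6}],
    [{1}, {2}, {3}, {4}, {5}, {6}],
]

for events in collections:
    print(events, mutually_exclusive(events))
```

A collection fails the check as soon as one pair overlaps, for example {2, 4, 6} and {1, 6}, which share the outcome 6.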
35THE UNION of events A and B is the event
consisting of all outcomes that are in A or B (or
both).
Notation: A ∪ B (book notation: A or B), read "A union B"
[Figure: Venn diagram of events A and B]
We will see later that P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
36THE UNION: other possibilities
[Figures: Venn diagrams showing other arrangements of events A and B]
- Example (die roll): A = even roll = {2, 4, 6}, B = high
roll = {4, 5, 6}
- A ∪ B = ?  P(A ∪ B) = ?
- A ∩ B = ?  P(A ∩ B) = ?
- In general
- P(A ? )
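For the die-roll example, union and intersection can be computed with Python's set operators, and the relation P(A ∪ B) = P(A) + P(B) - P(A ∩ B) can be checked by counting. A fair die is assumed.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space of one die roll (fair die assumed)


def prob(event):
    """P(event) when all outcomes in S are equally likely."""
    return Fraction(len(event), len(S))


A = {2, 4, 6}  # even roll
B = {4, 5, 6}  # high roll

union = A | B         # outcomes in A or B (or both)
intersection = A & B  # outcomes in both A and B

print(sorted(union), prob(union))                # [2, 4, 5, 6] 2/3
print(sorted(intersection), prob(intersection))  # [4, 6] 1/3

# Subtracting P(A ∩ B) corrects for outcomes that would be counted twice.
assert prob(union) == prob(A) + prob(B) - prob(intersection)
```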
37- COLLECTIVELY EXHAUSTIVE EVENTS
- A collection of events is collectively
exhaustive if one of the events must occur.
- Thus, events A1, A2, ..., Ak are collectively
exhaustive if and only if
- A1 ∪ A2 ∪ ... ∪ Ak = S
- Is the collection of all elementary outcomes
collectively exhaustive?
- Are A and A^c collectively exhaustive?
Example (die roll): A = {1, 2, 3}, B = {4, 5, 6},
C = {2, 4, 6}, D = {1, 3, 5}, E = {1, 5, 6}. Which of the
following are collectively exhaustive?
- {C, D}
- {B, D, E}
- {A, C, E}
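The "union equals S" criterion can be applied to the three collections in the example with a short check. The function name is made up for the illustration.

```python
S = {1, 2, 3, 4, 5, 6}  # sample space of one die roll

A = {1, 2, 3}
B = {4, 5, 6}
C = {2, 4, 6}
D = {1, 3, 5}
E = {1, 5, 6}


def collectively_exhaustive(events, sample_space):
    """True if the union of the events covers the whole sample space."""
    return set().union(*events) == sample_space


for label, events in [("C,D", [C, D]), ("B,D,E", [B, D, E]), ("A,C,E", [A, C, E])]:
    print(label, collectively_exhaustive(events, S))
```

The collection B, D, E is not exhaustive because none of its events contains the outcome 2.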