Title: Probability
1 Probability
Statistics 111 - Lecture 6
Introduction to Probability, Conditional Probability, and Random Variables
2 Administrative Note
- Homework 2 due Monday, June 8th
- Look at the questions now!
- Prepare to have your minds blown today
3 Course Overview
- Collecting Data
- Exploring Data
- Probability Intro.
- Inference
- Comparing Variables
- Relationships between Variables
- Means
- Proportions
- Regression
- Contingency Tables
4 Why do we need Probability?
- We have several graphical and numerical statistics for summarizing our data
- We want to make probability statements about the significance of our statistics
- Eg. In Stat111, mean(height) = 66.7 inches
- What is the chance that the true mean height of Penn students is between 60 and 70 inches?
- Eg. r = -0.22 for draft order and birthday
- What is the chance that the true correlation is significantly different from zero?
5 Deterministic vs. Random Processes
- In deterministic processes, the outcome can be predicted exactly in advance
- Eg. Force = mass x acceleration. If we are given values for mass and acceleration, we know the value of force exactly
- In random processes, the outcome is not known exactly, but we can still describe the probability distribution of possible outcomes
- Eg. 10 coin tosses: we don't know exactly how many heads we will get, but we can calculate the probability of getting a certain number of heads
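The coin-toss example can be made concrete with a short sketch. It assumes a fair coin and independent tosses, and it uses the binomial formula, which this lecture has not introduced; `prob_heads` is a name chosen here for illustration.

```python
from math import comb

def prob_heads(k, n=10, p=0.5):
    """Probability of exactly k heads in n independent tosses (binomial formula)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability distribution of the number of heads in 10 tosses.
for k in range(11):
    print(k, round(prob_heads(k), 4))
```

The individual outcome stays unpredictable, but the eleven probabilities printed above add up to 1 and describe the process completely.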
6 Events
- An event is an outcome or a set of outcomes of a random process
- Example: Tossing a coin three times
- Event A = getting exactly two heads = $\{HTH, HHT, THH\}$
- Example: Picking real number X between 1 and 20
- Event A = chosen number is at most 8.23 = $\{X \le 8.23\}$
- Example: Tossing a fair die
- Event A = result is an even number = $\{2, 4, 6\}$
- Notation: P(A) = Probability of event A
- Probability Rule 1:
- $0 \le P(A) \le 1$ for any event A
7 Sample Space
- The sample space S of a random process is the set of all possible outcomes
- Example: one coin toss
- $S = \{H, T\}$
- Example: three coin tosses
- $S = \{HHH, HTH, HHT, TTT, HTT, THT, TTH, THH\}$
- Example: roll a six-sided die
- $S = \{1, 2, 3, 4, 5, 6\}$
- Example: Pick a real number X between 1 and 20
- S = all real numbers between 1 and 20
- Probability Rule 2: The probability of the whole sample space is 1
- $P(S) = 1$
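For small discrete processes, the sample space and an event can be listed by machine rather than by hand. The sketch below enumerates the three-coin-toss example; the probability at the end assumes a fair coin, so that all eight outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# Sample space for three coin tosses: every sequence of H and T of length 3.
S = ["".join(t) for t in product("HT", repeat=3)]

# Event A: exactly two heads.
A = [s for s in S if s.count("H") == 2]

# Assumption: fair coin, so each of the 8 outcomes has probability 1/8.
p_A = Fraction(len(A), len(S))

print(S)
print(A, p_A)
```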
8 Combinations of Events
- The complement $A^c$ of an event A is the event that A does not occur
- Probability Rule 3:
- $P(A^c) = 1 - P(A)$
- The union of two events A and B is the event that either A or B or both occur
- The intersection of two events A and B is the event that both A and B occur
[Figure: Venn diagrams showing event A, the complement of A, the union of A and B, and the intersection of A and B]
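Complement, union, and intersection map directly onto set operations. The sketch below uses one roll of a fair die; event `B` ("result is at most 3") is a hypothetical second event added only for illustration, and `prob` assumes equally likely outcomes.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space: one roll of a die
A = {2, 4, 6}           # event A: even result
B = {1, 2, 3}           # hypothetical event B: result is at most 3

def prob(event):
    """P(event), assuming every outcome in S is equally likely."""
    return Fraction(len(event), len(S))

complement_A = S - A    # A does not occur
union = A | B           # A or B or both
intersection = A & B    # both A and B

print(prob(complement_A), 1 - prob(A))  # Rule 3: the two values agree
print(union, intersection)
```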
9 Disjoint Events
- Two events are called disjoint if they cannot happen at the same time
- Events A and B are disjoint means that the intersection of A and B is empty
- Example: coin is tossed twice
- $S = \{HH, TH, HT, TT\}$
- Events $A = \{HH\}$ and $B = \{TT\}$ are disjoint
- Events $A = \{HH, HT\}$ and $B = \{HH\}$ are not disjoint
- Probability Rule 4: If A and B are disjoint events then
- $P(A \text{ or } B) = P(A) + P(B)$
10 Independent events
- Events A and B are independent if knowing that A occurs does not affect the probability that B occurs
- Example: tossing two coins
- Event A = first coin is a head
- Event B = second coin is a head
- Disjoint events (with nonzero probabilities) cannot be independent!
- If A and B cannot occur together (disjoint), then knowing that A occurs does change the probability that B occurs
- Probability Rule 5: If A and B are independent
- $P(A \text{ and } B) = P(A) \times P(B)$
- This is the multiplication rule for independent events
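Rule 5 can be checked by enumeration on the two-coin example. The sketch assumes fair coins (equally likely outcomes); `first_tail` is a hypothetical third event, added to show that a pair of disjoint events fails the multiplication rule.

```python
from fractions import Fraction
from itertools import product

# Sample space for two coin tosses: (first coin, second coin).
S = list(product("HT", repeat=2))

def prob(event):
    """P(event), assuming all outcomes in S are equally likely."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

def first_head(s):   # event A
    return s[0] == "H"

def second_head(s):  # event B
    return s[1] == "H"

def first_tail(s):   # hypothetical event C, disjoint from A
    return s[0] == "T"

p_a_and_b = prob(lambda s: first_head(s) and second_head(s))
p_a_and_c = prob(lambda s: first_head(s) and first_tail(s))

print(p_a_and_b == prob(first_head) * prob(second_head))  # independent events
print(p_a_and_c == prob(first_head) * prob(first_tail))   # disjoint events
```

The first comparison holds and the second does not: A and C can never happen together, so their joint probability is 0 rather than 1/4.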
11 Equally Likely Outcomes Rule
- If all possible outcomes from a random process have the same probability, then
- P(A) = (# of outcomes in A) / (# of outcomes in S)
- Example: One Die Tossed
- P(even number) = (# of outcomes in {2, 4, 6}) / (# of outcomes in {1, 2, 3, 4, 5, 6}) = 3/6 = 1/2
- Note: the equally likely outcomes rule only works if the number of outcomes is finite
- Eg. of a process where it fails is sampling any real number between 0 and 1. Impossible to count all possible values!
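As a rough check on the rule, the exact answer for the die example can be compared with a simulated long-run frequency. The simulation is an addition for illustration, not part of the lecture; the seed and the number of rolls are arbitrary choices.

```python
import random
from fractions import Fraction

S = [1, 2, 3, 4, 5, 6]
A = [x for x in S if x % 2 == 0]

# Equally likely outcomes rule: count outcomes in A and in S.
exact = Fraction(len(A), len(S))

# Simulated fair die rolls; the fraction of even results should be close to the exact value.
random.seed(0)
n_rolls = 100_000
hits = sum(1 for _ in range(n_rolls) if random.choice(S) % 2 == 0)
estimate = hits / n_rolls

print(exact, estimate)
```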
12 Combining Probability Rules Together
- Initial screening for HIV in the blood first uses an enzyme immunoassay test (EIA)
- Even if an individual is HIV-negative, EIA has probability of 0.006 of giving a positive result
- Suppose 100 people are tested who are all HIV-negative. What is the probability that at least one will show positive on the test?
- First, use complement rule
- $P(\text{at least one positive}) = 1 - P(\text{all negative})$
13 Combining Probability Rules Together
- Now, we assume that each individual is independent and use the multiplication rule for independent events
- $P(\text{all negative}) = P(\text{test 1 negative}) \times \cdots \times P(\text{test 100 negative})$
- $P(\text{test negative}) = 1 - P(\text{test positive}) = 0.994$
- $P(\text{all negative}) = 0.994 \times \cdots \times 0.994 = (0.994)^{100}$
- So, finally we have
- $P(\text{at least one positive}) = 1 - (0.994)^{100} = 0.452$
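The two-step calculation (complement rule, then the multiplication rule for independent events) is short enough to reproduce directly. The function name `prob_at_least_one` is chosen here for illustration.

```python
def prob_at_least_one(p_single, n):
    """P(at least one of n independent trials is positive), via the complement rule."""
    p_all_negative = (1 - p_single) ** n  # multiplication rule for independent events
    return 1 - p_all_negative

# EIA false-positive probability of 0.006, 100 HIV-negative people tested.
p = prob_at_least_one(0.006, 100)
print(round(p, 3))  # 0.452
```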
14 Curse of the Bambino
- Boston Red Sox won the World Series in 1918, sold Babe Ruth after the 1919 season, and did not win a World Series again until 2004 (86 years later)
- What are the chances that a team will go 86 years without winning a World Series?
- Simplifying assumptions:
- Baseball has always had 30 teams
- Each team has equal chance of winning each year
15 Curse of the Bambino
- With 30 teams that are equally likely to win in a year, we have
- $P(\text{no WS in a year}) = 29/30 \approx 0.97$
- If we also assume that each year is independent, we can use multiplication rule
- $P(\text{no WS in 86 years})$
- $= P(\text{no WS in year 1}) \times \cdots \times P(\text{no WS in year 86})$
- $= (29/30) \times \cdots \times (29/30)$
- $= (29/30)^{86} \approx 0.05$ (only about a 5% chance!)
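The drought probability is sensitive to rounding, which a quick computation makes visible. The sketch uses the slide's simplifying assumptions (30 equally likely teams, independent years) and compares the unrounded 29/30 with the rounded 0.97.

```python
n_teams = 30
n_years = 86

p_no_win = (n_teams - 1) / n_teams    # 29/30, chance of not winning in one year
p_drought_exact = p_no_win ** n_years  # multiplication rule over 86 independent years
p_drought_rounded = 0.97 ** n_years    # same calculation starting from the rounded 0.97

print(round(p_drought_exact, 3), round(p_drought_rounded, 3))
```

Keeping 29/30 unrounded gives roughly 0.054, which matches the quoted 5%; rounding to 0.97 first pushes the answer up to roughly 0.073.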
16 Break
17 Outline
- Moore, McCabe and Craig Sections 4.3, 4.5
- Conditional Probability
- Discrete Random Variables
- Continuous Random Variables
- Properties of Random Variables
- Means of Random Variables
- Variances of Random Variables
18 Conditional Probabilities
- The notion of conditional probability can be found in many different types of problems
- Eg. imperfect diagnostic test for a disease
- What is the probability that a person has the disease? Answer: 40/100 = 0.4
- What is the probability that a person has the disease given that they tested positive?
- More complicated!
19 Definition: Conditional Probability
- Let A and B be two events in sample space S
- The conditional probability that event B occurs given that event A has occurred is
- $P(B \mid A) = P(A \text{ and } B) / P(A)$
- Eg. probability of disease given test positive
- $P(\text{disease} \mid \text{test}+) = P(\text{disease and test}+) / P(\text{test}+) = (30/100)/(40/100) = 0.75$
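The diagnostic-test example can be recomputed from counts. The counts below are inferred from the probabilities quoted on the slides (40 of 100 people have the disease, 40 of 100 test positive, 30 of 100 do both); the original table is not reproduced in this transcript, so treat them as an assumption.

```python
from fractions import Fraction

# Counts per 100 people, inferred from the slide's probabilities (assumption).
n_total = 100
n_disease = 40
n_positive = 40
n_disease_and_positive = 30

p_disease = Fraction(n_disease, n_total)
p_positive = Fraction(n_positive, n_total)
p_both = Fraction(n_disease_and_positive, n_total)

# Conditional probability: restrict attention to people who tested positive.
p_disease_given_positive = p_both / p_positive

print(p_disease, p_disease_given_positive)  # 2/5 and 3/4
```

Knowing the test result moves the probability of disease from 0.4 to 0.75, which is why conditioning matters for an imperfect test.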
20 Independent vs. Non-independent Events
- If A and B are independent, then
- $P(A \text{ and } B) = P(A) \times P(B)$
- which means that the conditional probability is
- $P(B \mid A) = P(A \text{ and } B) / P(A) = P(A) P(B) / P(A) = P(B)$
- We have a more general multiplication rule for events that are not independent
- $P(A \text{ and } B) = P(B \mid A) \times P(A)$
21 Random variables
- A random variable is a numerical outcome of a random process or random event
- Example: three tosses of a coin
- $S = \{HHH, THH, HTH, HHT, HTT, THT, TTH, TTT\}$
- Random variable X = number of observed tails
- Possible values for X = $\{0, 1, 2, 3\}$
- Why do we need random variables?
- We use them as a model for our observed data
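A random variable is just a function from outcomes to numbers, which the sketch below makes explicit for X = number of tails in three tosses. The probabilities assume a fair coin, so that each of the eight outcomes is equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for three coin tosses.
S = ["".join(t) for t in product("HT", repeat=3)]

# Random variable X maps each outcome to its number of tails.
counts = Counter(s.count("T") for s in S)

# Probability distribution of X, assuming equally likely outcomes.
dist = {x: Fraction(c, len(S)) for x, c in sorted(counts.items())}

for x, p in dist.items():
    print(x, p)
```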
22 Discrete Random Variables
- A discrete random variable has a finite or countable number of distinct values
- Discrete random variables can be summarized by listing all values along with their probabilities
- Called a probability distribution
- Example: number of members in US families
23 Another Example
- Random variable X = the sum of two dice
- X takes on values from 2 to 12
- Use equally-likely outcomes rule to calculate the probability distribution
- If a discrete r.v. takes on many values, it is better to use a probability histogram
24 Probability Histograms
- Probability histogram of sum of two dice
- Using the disjoint addition rule, probabilities for discrete random variables are calculated by adding up the bars of this histogram
- $P(\text{sum} > 10) = P(\text{sum} = 11) + P(\text{sum} = 12) = 3/36$
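The distribution of the sum of two dice, and the bar-adding step, can be sketched as follows. It assumes two fair dice (36 equally likely ordered pairs) and prints a rough text version of the probability histogram.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered pairs from two fair dice.
rolls = list(product(range(1, 7), repeat=2))
counts = Counter(a + b for a, b in rolls)

# Probability distribution of the sum.
dist = {s: Fraction(c, len(rolls)) for s, c in sorted(counts.items())}

# Text histogram: one '#' per outcome that produces the sum.
for s in dist:
    print(f"{s:2d} {'#' * counts[s]} {counts[s]}/36")

# Disjoint addition rule: add the bars for 11 and 12.
p_gt_10 = dist[11] + dist[12]
print(p_gt_10)
```

`Fraction` reduces 3/36 to 1/12 when printing; the value is the same.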
25 Continuous Random Variables
- Continuous random variables have an uncountable number of values
- Can't list the entire probability distribution, so we use a density curve instead of a histogram
- Eg. Normal density curve
26 Calculating Continuous Probabilities
- Discrete case: add up bars from probability histogram
- Continuous case: we have to use integration to calculate the area under the density curve
- Although it seems more complicated, it is often easier to integrate than add up discrete bars
- If a discrete r.v. has many possible values, we often treat that variable as continuous instead
27 Example: Normal Distribution
- We will use the normal distribution throughout this course for two reasons
- It is usually a good approximation to real data
- We have tables of calculated areas under the normal curve, so we avoid doing integration!
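Software can play the same role as the printed table. The sketch below computes areas under the normal curve with the error function from Python's standard library; `normal_cdf` is a helper name chosen here, not something from the course materials.

```python
from math import erf, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under the normal density curve to the left of x."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Area within one standard deviation of the mean.
p_within_one_sd = normal_cdf(1) - normal_cdf(-1)
print(round(p_within_one_sd, 4))
```

The result is about 0.68, the figure a standard normal table gives for the area between -1 and 1.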
28 Mean of a Random Variable
- Average of all possible values of a random variable (often called expected value)
- Notation: we don't want to confuse random variables with our collected data variables
- $\mu$ = mean of random variable
- $\bar{x}$ = mean of a data variable
- For continuous r.v., we again need integration to calculate the mean
- For discrete r.v., we can calculate the mean by hand since we can list all probabilities
29 Mean of Discrete random variables
- Mean is the sum of all possible values, with each value weighted by its probability
- $\mu = \sum_i x_i P(x_i) = x_1 P(x_1) + \cdots + x_{12} P(x_{12})$
- Example: X = sum of two dice
- $\mu = 2 \cdot (1/36) + 3 \cdot (2/36) + 4 \cdot (3/36) + \cdots + 12 \cdot (1/36) = 252/36 = 7$
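The weighted sum for the two-dice example can be verified with exact fractions. The sketch assumes two fair dice and rebuilds the distribution before taking the mean.

```python
from fractions import Fraction
from itertools import product

# Probability distribution of the sum of two fair dice.
dist = {}
for a, b in product(range(1, 7), repeat=2):
    dist[a + b] = dist.get(a + b, 0) + Fraction(1, 36)

# Mean: each possible value weighted by its probability.
mu = sum(x * p for x, p in dist.items())
print(mu)  # 7
```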
30 Variance of a Random Variable
- Spread of all possible values of a random variable around its mean $\mu$
- Again, we don't want to confuse random variables with our collected data variables
- $\sigma^2$ = variance of random variable
- $s^2$ = variance of a data variable
- For continuous r.v., again need integration to calculate the variance
- For discrete r.v., can calculate the variance by hand since we can list all probabilities
31 Variance of Discrete r.v.s
- Variance is the sum of the squared deviations away from the mean of all possible values, weighted by the value's probability
- $\sigma^2 = \sum_i (x_i - \mu)^2 P(x_i) = (x_1 - \mu)^2 P(x_1) + \cdots + (x_{12} - \mu)^2 P(x_{12})$
- Example: X = sum of two dice
- $\sigma^2 = (2 - 7)^2 (1/36) + (3 - 7)^2 (2/36) + \cdots + (12 - 7)^2 (1/36) = 210/36 = 5.83$
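The variance calculation follows the same pattern as the mean, with squared deviations in place of the values. The sketch again assumes two fair dice and uses exact fractions so the result can be compared with 210/36.

```python
from fractions import Fraction
from itertools import product

# Probability distribution of the sum of two fair dice.
dist = {}
for a, b in product(range(1, 7), repeat=2):
    dist[a + b] = dist.get(a + b, 0) + Fraction(1, 36)

mu = sum(x * p for x, p in dist.items())

# Variance: squared deviation from the mean, weighted by probability.
sigma2 = sum((x - mu) ** 2 * p for x, p in dist.items())
print(sigma2, round(float(sigma2), 2))
```

`Fraction` prints the reduced form 35/6, which equals 210/36, or about 5.83.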
32 Next Class - Lecture 7
- Standardization and the Normal Distribution
- Moore and McCabe Sections 4.3, 1.3