Title: Statistics
1Statistics Data Analysis
- Course Number B01.1305
- Course Section 31
- Meeting Time Wednesday 6-8:50 pm
CLASS 4
2Class 4 Outline
- Brief review of last class
- Questions on homework
- Chapter 5 Special Distributions
3Review of Last Class
- Probability trees
- Probability distribution functions
- Expected value
- Standard deviation
4Chapter 5
- Some Special Probability Distributions
5Chapter Goals
- Introduce some special, often used distributions
- Understand methods for counting the number of sequences
- Understand situations consisting of a specified number of distinct success/failure trials
- Understand random variables that follow a bell-shaped distribution
6Counting Possible Outcomes
- In order to calculate probabilities, we often need to count how many different ways there are to do some activity
- For example, how many different outcomes are there from tossing a coin three times?
- To help us count accurately, we need to learn some counting rules
- Multiplication Rule: If there are m ways of doing one thing and n ways of doing another thing, there are m times n ways of doing both
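The multiplication rule can be sketched in R for the coin-tossing question. This is only an illustration; the helper name `count_outcomes` is hypothetical and not part of base R.

```r
# Multiplication rule: multiply the number of ways for each step.
# count_outcomes is a hypothetical helper name, not part of base R.
count_outcomes <- function(ways) prod(ways)

# Three coin tosses, two outcomes (H or T) per toss.
n_coin <- count_outcomes(c(2, 2, 2))

# Cross-check by listing the whole sample space.
tosses <- expand.grid(toss1 = c("H", "T"),
                      toss2 = c("H", "T"),
                      toss3 = c("H", "T"))

print(n_coin)        # 8
print(nrow(tosses))  # 8
```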
7Example
- An auto dealer wants to advertise that for $20G you can buy either a convertible or 4-door car with your choice of either wire or solid wheel covers.
- How many different arrangements of models and wheel covers can the dealer offer?
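One way to check the dealer example is to list every model and wheel-cover pairing. The sketch below assumes exactly two models and two wheel-cover types, as the slide states; the variable names are illustrative.

```r
models <- c("convertible", "4-door")
covers <- c("wire", "solid")

# Pair every model with every wheel cover (2 x 2 by the multiplication rule).
arrangements <- as.vector(outer(models, covers, paste, sep = " / "))

print(arrangements)
print(length(arrangements))  # 4
```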
8Counting Rules
- Recall the classical interpretation of probability: P(event) = number of outcomes favoring event / total number of outcomes
- Need methods for counting possible outcomes without the labor of listing the entire sample space
- Counting methods arise as answers to:
- How many sequences of k symbols can be formed from a set of r distinct symbols using each symbol no more than once?
- How many subsets of k symbols can be formed from a set of r distinct symbols using each symbol no more than once?
- Difference between a sequence and a subset is that order matters for a sequence, but not for a subset
9Counting Rules (cont)
- Create all k = 3 letter subsets and sequences of the r = 5 letters A, B, C, D and E
- How many sequences are there?
- How many subsets are there?
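The two counts can be sketched in R with the standard formulas: r!/(r - k)! for sequences and r!/(k!(r - k)!) for subsets. The variable names are illustrative; `choose()` and `combn()` are base R functions.

```r
letters5 <- c("A", "B", "C", "D", "E")
r <- length(letters5)
k <- 3

# Sequences (order matters): r! / (r - k)!
n_sequences <- factorial(r) / factorial(r - k)

# Subsets (order does not matter): r! / (k! (r - k)!)
n_subsets <- choose(r, k)

# List the subsets explicitly; each column of combn() is one subset.
subsets <- combn(letters5, k)

print(n_sequences)  # 60
print(n_subsets)    # 10
print(apply(subsets, 2, paste, collapse = ""))
```

Each subset of 3 letters can be ordered in 3! = 6 ways, which is why there are 6 times as many sequences as subsets.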
10Counting Rules (cont)
11Review Sequence and Subset
- For a sequence, order matters: the same objects in a different order count as a different outcome
- For a subset, the order of the objects is not important
12Example
- A group of three electronic parts is to be assembled into a plug-in unit for a TV set
- The parts can be assembled in any order
- How many different ways can they be assembled?
- There are eight machines but only three spaces on the machine shop floor.
- How many different ways can eight machines be arranged in the three available spaces?
- The paint department needs to assign color codes for 42 different parts. Three colors are to be used for each part. How many colors, taken three at a time, would be adequate to color-code the 42 parts?
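A possible way to work these three questions in R is sketched below. It assumes that a color code is an unordered set of three colors (a subset, not a sequence); the slide does not say this explicitly.

```r
# Three parts assembled in any order: 3! orderings.
n_assemblies <- factorial(3)

# Eight machines, three spaces, order matters: 8! / (8 - 3)!
n_machine_layouts <- factorial(8) / factorial(8 - 3)

# ASSUMPTION: a color code is an unordered set of three colors.
# Find the smallest number of colors n with choose(n, 3) >= 42.
n_colors <- 3
while (choose(n_colors, 3) < 42) n_colors <- n_colors + 1

print(n_assemblies)       # 6
print(n_machine_layouts)  # 336
print(n_colors)           # 8, since choose(7, 3) = 35 and choose(8, 3) = 56
```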
13Binomial Distribution
- Percentages play a major role in business
- When a percentage is determined by counting the number of times something happens out of the total possibilities, the occurrences might follow a binomial distribution
- Examples:
- Number of defective products out of 10 items
- Of 100 people interviewed, number who expressed intention to buy
- Number of female employees in a group of 75 people
- Of all the stocks traded on the NYSE, the number that went up yesterday
14Binomial Distribution (cont)
- Each time the random experiment is run, either the event happens or it doesn't
- The random variable X, defined as the number of occurrences of a particular event out of n trials, has a binomial distribution if:
- For each of the n trials, the event always has the same probability π of happening
- The trials are independent of one another
15Example Binomial Distribution
- You are interested in the next n = 3 calls to a catalog order desk and know from experience that 60% of calls will result in an order
- What can we say about the number of calls that will result in an order?
- Questions:
- Create a probability tree
- Create a probability distribution table
- What is the expected number of calls resulting in an order?
- What is the standard deviation?
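The distribution table, expected value and standard deviation for this example can be sketched in R as follows. The moments are computed directly from the table, using the expected-value and standard-deviation definitions reviewed from last class; variable names are illustrative.

```r
n <- 3
prob_order <- 0.60   # 60% of calls result in an order

# Probability distribution table for X = number of calls that result in an order.
x <- 0:n
dist_table <- data.frame(x = x, p = dbinom(x, size = n, prob = prob_order))
print(dist_table)

# Expected value and standard deviation computed from the table.
expected <- sum(dist_table$x * dist_table$p)
std_dev  <- sqrt(sum((dist_table$x - expected)^2 * dist_table$p))

print(expected)  # 1.8
print(std_dev)   # about 0.849
```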
16Binomial Distribution the Easy Way
                     Number of Occurrences, X    Proportion or Percentage
Mean                 E(X) = nπ                   E(p) = π
Standard Deviation   σ_X = (nπ(1-π))^0.5         σ_p = (π(1-π)/n)^0.5
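The shortcut formulas in the table can be wrapped in a small R function. This is a sketch; `binomial_summary` is a hypothetical helper name, and the inputs reuse the catalog example (n = 3, 60%).

```r
# binomial_summary is a hypothetical helper, not a base R function.
binomial_summary <- function(n, prob) {
  list(mean_x = n * prob,                      # E(X)
       sd_x   = sqrt(n * prob * (1 - prob)),   # standard deviation of X
       mean_p = prob,                          # E(p), where p = X / n
       sd_p   = sqrt(prob * (1 - prob) / n))   # standard deviation of p
}

s <- binomial_summary(n = 3, prob = 0.60)
print(s)
```

Because p = X/n, the standard deviation of the proportion is the standard deviation of X divided by n.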
17Finding Binomial Probabilities
18Example Binomial Probabilities
- How many of your n = 6 major customers will call tomorrow?
- There is a 25% chance that each will call
- Questions:
- How many do you expect to call?
- What is the standard deviation?
- What is the probability that exactly 2 call?
- What is the probability that more than 4 call?
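A sketch of these four answers in R, assuming the calls are independent so that the binomial model applies:

```r
n <- 6
prob_call <- 0.25

expected <- n * prob_call                          # 1.5
std_dev  <- sqrt(n * prob_call * (1 - prob_call))  # about 1.06

# P(X = 2): exactly two customers call.
p_exactly_2 <- dbinom(2, size = n, prob = prob_call)        # about 0.2966

# P(X > 4) = 1 - P(X <= 4): five or six customers call.
p_more_than_4 <- 1 - pbinom(4, size = n, prob = prob_call)  # about 0.0046

print(c(expected, std_dev, p_exactly_2, p_more_than_4))
```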
19Example
- It's been a terrible day for the capital markets, with losers beating winners 4 to 1
- You are evaluating a mutual fund comprised of 15 randomly selected stocks and will assume a binomial distribution for the number of securities that lost value
- Questions:
- What assumptions are being made?
- What is the random variable?
- How many securities do you expect to lose value?
- What is the standard deviation of the random variable?
- Find the probability that 8 securities lose value
- What is the probability that 12 or more lose value?
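The numerical parts of this example might be worked in R as below. The sketch reads "4 to 1" as a probability of 4/5 = 0.8 that any one stock lost value, which is an interpretation rather than something the slide states.

```r
n <- 15
prob_loss <- 4 / (4 + 1)   # ASSUMPTION: "4 to 1" means 80% of stocks lost value

expected <- n * prob_loss                          # 12
std_dev  <- sqrt(n * prob_loss * (1 - prob_loss))  # about 1.55

# P(X = 8): exactly 8 of the 15 lose value.
p_8 <- dbinom(8, size = n, prob = prob_loss)       # about 0.0138

# P(X >= 12) = P(X > 11), taken from the upper tail.
p_12_or_more <- pbinom(11, size = n, prob = prob_loss, lower.tail = FALSE)  # about 0.648

print(c(expected, std_dev, p_8, p_12_or_more))
```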
20The Normal Distribution
21Normal Distribution
- The normal distribution is sometimes called a Gaussian Distribution, after its inventor, C. F. Gauss (1777-1855).
- Well-known bell-shaped distribution
- Mean and standard deviation determine center and spread of the distribution curve
- The mathematical formula for the normal f(y) is given in HO, p. 157. We won't be needing this formula, just tables of areas under the curve.
- The empirical rule holds for all normal distributions
- Probability of an event corresponds to area under the distribution curve
22Standard Normal Distribution
- Normal Distribution with μ = 0 and σ = 1
- Letter Z is used to denote a random variable that follows a Standard Normal Distribution
23Visualization
[Figure: symmetrical bell curve with a tail on each side; the mean, median and mode are marked at the center]
24Characteristics
- Bell-shaped with a single peak at the exact center of the distribution
- Mean, median and mode are equal and located at the peak
- Symmetrical about the mean
- Falls off smoothly in both directions, but the curve never actually touches the X-axis
25Why It's Important
- Many psychological and educational variables are distributed approximately normally
- Measures of reading ability, introversion, job satisfaction, and memory are among the many psychological variables approximately normally distributed
- Although the distributions are only approximately normal, they are usually quite close.
- It is easy for mathematical statisticians to work with
- This means that many kinds of statistical tests can be derived for normal distributions
- Almost all statistical tests discussed in this text assume normal distributions
- These tests work very well even if the distribution is only approximately normal.
26More Visualizations
σ = 3.1 years, Plant A
σ = 3.9 years, Plant B
σ = 5 years, Plant C
27Z-score
- Compute probabilities using tables or computer
- Convert to z-score
- Look up the cumulative probability in the table
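The z-score conversion is z = (x - μ)/σ. The sketch below shows the two routes in R: converting to z and using the standard normal (the computer equivalent of a table lookup), or passing the mean and standard deviation directly. The numbers are hypothetical illustration values, not taken from the slides.

```r
# z-score: how many standard deviations x lies from the mean.
z_score <- function(x, mu, sigma) (x - mu) / sigma

# Hypothetical illustration values.
x <- 130
mu <- 100
sigma <- 15

z <- z_score(x, mu, sigma)                    # 2
p_table  <- pnorm(z)                          # P(Z <= z), what a cumulative table gives
p_direct <- pnorm(x, mean = mu, sd = sigma)   # same probability without converting by hand

print(c(z, p_table, p_direct))
```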
28Determining Probabilities
29LOOKUP Table
Standard Normal Lookup Table
30Example
- Sales forecasts are assumed to follow a normal distribution
- Target, or expected value, is $20M with a $3M standard deviation
- What is the probability of sales lower than $15M?
- What is the probability sales exceed $25M?
- What is the probability sales are between $15M and $25M?
- What is the value of k such that the probability that sales exceed k is 60%?
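These questions can be answered with `pnorm` and `qnorm`. A minimal sketch, with amounts in millions and illustrative variable names:

```r
mu <- 20     # expected sales, in millions
sigma <- 3   # standard deviation, in millions

p_below_15 <- pnorm(15, mu, sigma)                         # about 0.048
p_above_25 <- 1 - pnorm(25, mu, sigma)                     # about 0.048, by symmetry
p_between  <- pnorm(25, mu, sigma) - pnorm(15, mu, sigma)  # about 0.904

# P(sales > k) = 0.60 is the same as P(sales <= k) = 0.40.
k <- qnorm(0.40, mu, sigma)                                # about 19.24

print(c(p_below_15, p_above_25, p_between, k))
```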
31Example
- Benefits compensation costs for employees with a certain financial services firm are approximately normally distributed with a mean of $18,600 and standard deviation of $2,700.
- Find the probability that an employee chosen at random has a benefits package that costs less than $15,000
- Find the probability that an employee chosen at random has a benefits package that costs more than $21,000
- What is the value of k such that the probability that benefits compensation exceeds k is 95%?
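A sketch of this example in R, this time using the `lower.tail = FALSE` argument for the upper-tail probability instead of subtracting from 1:

```r
mu <- 18600
sigma <- 2700

p_less_15000 <- pnorm(15000, mu, sigma)                      # about 0.091
p_more_21000 <- pnorm(21000, mu, sigma, lower.tail = FALSE)  # about 0.187

# P(cost > k) = 0.95 is the same as P(cost <= k) = 0.05.
k <- qnorm(0.05, mu, sigma)                                  # about 14159

print(c(p_less_15000, p_more_21000, k))
```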
32Example
- A telephone-sales firm is considering purchasing a machine that randomly selects and automatically dials telephone numbers
- The firm would be using the machine to call residences during the evening; calls to business phones would be wasted.
- The manufacturer of the machine claims that its programming reduces the business-phone rate to 15%
- As a test, 100 phone numbers are to be selected at random from a very large set of possible numbers
- Are the binomial assumptions satisfied?
- Find the probability that at least 24 of the numbers belong to business phones
- If in fact 24 of the 100 numbers turn out to be business phones, does that cast serious doubt on the manufacturer's claim?
- Find the expected value and standard deviation of the number of business phone numbers in the sample
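Assuming the binomial model holds with the manufacturer's claimed rate of 15%, the numerical parts might be computed as below. The variable names are illustrative.

```r
n <- 100
prob_business <- 0.15   # manufacturer's claimed business-phone rate

expected <- n * prob_business                              # 15
std_dev  <- sqrt(n * prob_business * (1 - prob_business))  # about 3.57

# P(X >= 24) = P(X > 23), taken from the upper tail.
p_at_least_24 <- pbinom(23, size = n, prob = prob_business, lower.tail = FALSE)

# How far 24 lies above the expected value, in standard deviations.
z_24 <- (24 - expected) / std_dev                          # about 2.52

print(c(expected, std_dev, p_at_least_24, z_24))
```

A small value of `p_at_least_24` would mean that 24 business phones out of 100 is unlikely if the 15% claim is true.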
33Example
- Assume the stock market closed at 8,000 yesterday.
- Today you expect the market to rise a mean of 1 point, with a standard deviation of 34 points. Assume a normal distribution.
- What is the probability the market goes down tomorrow?
- What is the probability the market goes up more than 10 points tomorrow?
- What is the probability the market goes up more than 40 points tomorrow?
- What is the probability the market goes up more than 60 points tomorrow?
- Find the probability that the market changes by more than 20 points in either direction.
- What is the value of k such that the probability that the market close exceeds k is 75%?
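A sketch of this example in R, modelling the change in the market as normal with mean 1 and standard deviation 34. The helper `p_up_more_than` is a hypothetical name used only here.

```r
mu <- 1                  # expected change in points
sigma <- 34              # standard deviation of the change
close_yesterday <- 8000

# The market goes down when the change is below 0.
p_down <- pnorm(0, mu, sigma)   # about 0.488

# Hypothetical helper: P(change > points).
p_up_more_than <- function(points) pnorm(points, mu, sigma, lower.tail = FALSE)
p_up <- sapply(c(10, 40, 60), p_up_more_than)

# Change of more than 20 points in either direction.
p_move_20 <- pnorm(-20, mu, sigma) + p_up_more_than(20)

# P(close > k) = 0.75 is the same as P(change <= k - 8000) = 0.25.
k <- close_yesterday + qnorm(0.25, mu, sigma)   # about 7978

print(p_down)
print(p_up)
print(p_move_20)
print(k)
```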
34Using R
- factorial(n): n!
- dbinom(x, n, p): binomial probability distribution function
- pbinom(x, n, p): binomial cumulative distribution function
- pnorm(q, mean, sd): normal cumulative distribution function
- qnorm(p, mean, sd): inverse CDF
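A short sketch calling each function once, with illustrative inputs. In R's own documentation the second and third arguments of `dbinom` and `pbinom` are named `size` and `prob`; passing them by position, as on the slide, works the same way.

```r
f5   <- factorial(5)                        # 5! = 120
d2   <- dbinom(2, size = 6, prob = 0.25)    # P(X = 2) for n = 6, p = 0.25
c2   <- pbinom(2, size = 6, prob = 0.25)    # P(X <= 2), the sum of dbinom over 0, 1, 2
p196 <- pnorm(1.96, mean = 0, sd = 1)       # P(Z <= 1.96), about 0.975
q975 <- qnorm(0.975, mean = 0, sd = 1)      # value with 97.5% of the area below it, about 1.96

print(c(f5, d2, c2, p196, q975))
```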
35Homework 4
- Hildebrand/Ott
- 5.2, page 141
- 5.3, page 141
- 5.9, page 150
- 5.14, page 150
- 5.32, page 163
- 5.33, page 163
- 5.34, page 163
- Reading: Chapters 6 (all) and 7 (all).