Title: ECON 2300 LEC
1ECON 2300 LEC 4
2Discussion Topics
- Elements and Variables
- Cross sectional and Time series data
- Sample and Population
- Relative and Percent Frequency distribution
- Cumulative distribution and ogive
- Examples Last class topics
3Elements vs. Variables
- Elements: entities on which data are collected
- Variable: a characteristic of interest for the elements
6Statistical Inference
- Population: the set of all elements of interest in a particular study
- Sample: a subset of the population
- Estimating and testing hypotheses about the characteristics of a population using data from a sample
- Example: heights collected for 10 students in the class: 5'8", 5'9", 6', 6'2", 5'5", 5'7", 5'9", 6'1", 5'2", 6'
7Statistical Inference
- Data can be used to estimate and test hypotheses about the height of the whole class
- The average height of the 10 students is equal to 5'8"
- Using this value, we might say that the average height of the students in the class is 5'8" with a margin of ±2 inches, giving an interval from 5'6" to 5'10"
8Example
- In the fall of 2003, Arnold Schwarzenegger challenged Governor Gray Davis for the governorship of California. A Policy Institute of California survey of registered voters reported Arnold Schwarzenegger in the lead with an estimated 54% of the vote.
- What was the population for this survey?
- What was the sample for this survey?
- Why was a sample used in this situation? Explain.
9Types of data
- Cross-sectional and time series data
- Cross-sectional
- Data collected at the same or approximately the same point in time
- Example: stock prices on the same day
- Time series
- Data collected over a number of time periods
- Example: average sales over six months
11Frequency Distribution
12Relative frequency Percent frequency
- Relative frequency: the fraction or proportion of items in each class
- Relative frequency of a class = frequency of the class / n, where n is the total number of items
- Percent frequency = relative frequency × 100
13Relative frequency Percent frequency
14Example
15Frequency distribution
- Number of classes: 50 data items, 5 classes
- Width of the class
- Approximate width = (33 - 12) / 5 = 4.2
- Round it up to 5
- In practice, the above two parameters are chosen by trial and error
- Class limits: 10-14, 15-19, 20-24, 25-29, 30-34
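The width calculation and class limits on this slide can be reproduced with a short Python sketch. The smallest and largest values (12 and 33) are read off the slide's (33 - 12) term, and the first lower limit of 10 is taken from the slide's class limits. The variable names are illustrative only.

```python
import math

smallest, largest = 12, 33   # extremes implied by the slide's (33 - 12)
num_classes = 5

approx_width = (largest - smallest) / num_classes   # 4.2
width = math.ceil(approx_width)                     # rounded up to 5

first_lower = 10   # starting point taken from the slide's class limits
class_limits = [
    (first_lower + k * width, first_lower + (k + 1) * width - 1)
    for k in range(num_classes)
]
print(class_limits)
# [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]
```

Starting the first class at 10 rather than 12 is a judgment call, which is the trial-and-error step the slide mentions.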
16Frequency distribution
17Relative frequency Percent frequency
18Cumulative distributions
- Shows the number of data items with values less than or equal to the upper class limit of each class
- Can be used to graphically represent frequency, relative frequency, and percent frequency distributions
19Cumulative distributions
20Ogive
- Graph of a cumulative distribution
- Data values on the x-axis and cumulative frequencies on the y-axis
- A point is plotted corresponding to the cumulative frequency of each class
- Gaps between classes are eliminated by plotting points halfway between class limits
22Example
- Sorting through unsolicited e-mail and spam affects the productivity of office workers. An Insight/Express survey monitored office workers to determine the unproductive time per day devoted to unsolicited e-mail and spam. The following data show a sample of time in minutes devoted to this task.
- 2 4 8 4 8 1 2 32 12 1 5 7 5 5 3 4 24 19 4 14
- Summarize the data by constructing
- A frequency distribution (classes 1-5, 6-10, 11-15, 16-20, and so on)
- A relative frequency distribution
- A cumulative frequency distribution
- A cumulative relative frequency distribution
- An ogive
- What percentage of office workers spend 5 minutes or less on unsolicited e-mail and spam? What percentage of office workers spend more than 10 minutes a day on this task?
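One possible way to work this example is sketched below in Python, using the 20 data values and the class width of 5 given on the slide. The variable names and the table layout are illustrative choices, not part of the original exercise.

```python
from collections import Counter

times = [2, 4, 8, 4, 8, 1, 2, 32, 12, 1, 5, 7, 5, 5, 3, 4, 24, 19, 4, 14]
width = 5
n = len(times)

# Class index 0 covers 1-5, index 1 covers 6-10, and so on.
counts = Counter((t - 1) // width for t in times)
num_classes = max(counts) + 1

rows = []
cumulative = 0
for k in range(num_classes):
    lower, upper = k * width + 1, (k + 1) * width
    freq = counts.get(k, 0)
    cumulative += freq
    # (class, frequency, relative freq, cumulative freq, cumulative relative freq)
    rows.append((f"{lower}-{upper}", freq, freq / n, cumulative, cumulative / n))

for label, freq, rel, cum, cum_rel in rows:
    print(f"{label:>6} {freq:>3} {rel:>6.2f} {cum:>4} {cum_rel:>6.2f}")

pct_5_or_less = 100 * sum(t <= 5 for t in times) / n      # 60.0
pct_more_than_10 = 100 * sum(t > 10 for t in times) / n   # 25.0
print(pct_5_or_less, pct_more_than_10)
```

The cumulative frequency column is what an ogive would plot against the class limits.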
23Detecting Outliers
- Outliers: observations with unusually large or small values compared to the rest of the data set
- Could be
- Incorrectly recorded data (can be removed)
- Incorrectly included data (can be removed)
- Correct but unusual data (cannot be removed)
- z-scores can be used to identify outliers
- Any value with a z-score less than -3 or greater than +3 is treated as an outlier (empirical rule)
24Detecting Outliers
- Such data values need to be evaluated to
determine their validity in belonging to the data
set.
25z-scores
- Gives the relative location of a value within a data set
- z-score ($z_i$): the number of standard deviations $x_i$ is from the mean
- $z_i = \frac{x_i - \bar{x}}{s}$, where $\bar{x}$ is the sample mean and $s$ is the sample standard deviation
- Two observations in two different data sets with the same z-score have the same relative location (the same number of standard deviations from the mean)
26z-score
Consider the data for class sizes: 46, 54, 42, 46, 32
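As a sketch, the z-scores for these five class sizes can be computed with Python's `statistics` module, using z = (x - mean) / s. This assumes the sample standard deviation (n - 1 divisor), which is what `statistics.stdev` computes. The outlier check applies the ±3 cutoff from the outlier slides.

```python
from statistics import mean, stdev

class_sizes = [46, 54, 42, 46, 32]

x_bar = mean(class_sizes)   # sample mean: 44
s = stdev(class_sizes)      # sample standard deviation (n - 1 divisor): 8.0

z_scores = [(x - x_bar) / s for x in class_sizes]
print(z_scores)             # [0.25, 1.25, -0.25, 0.25, -1.5]

# Flag values more than 3 standard deviations from the mean.
outliers = [x for x, z in zip(class_sizes, z_scores) if abs(z) > 3]
print(outliers)             # []
```

None of the class sizes is flagged, since the largest absolute z-score is 1.5.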
28Probability
- Real-world problems
- Lots of uncertainty
- Decision making is difficult
- Managers' decisions: analysis of uncertainties
- What are the chances that sales will decrease if we increase prices?
- What is the likelihood that the new car design will be a success?
- How likely is it that the project will be finished on time?
- Probability: a numerical measure of the likelihood that an event will occur
- A measure of the degree of uncertainty associated with events
29Probability
- Always assigned a value on a scale from 0 to 1
- Values near zero: event unlikely to occur
- Values near one: event almost certain to occur
- Problem: A spinner has 4 equal sectors colored yellow, blue, green, and red. What are the chances of landing on blue after spinning the spinner? What are the chances of landing on red?
- Solution: Any guesses?
- The chances of landing on blue are 1 in 4
- What about red?
- The chances of landing on red are again 1 in 4
30Definitions
- Experiment: a situation involving chance or probability that leads to results called outcomes
- Examples: spinning the spinner, rolling a die
- Outcome: the result of a single trial of an experiment
- Examples: 1, 2, 3, 4, 5, or 6 when rolling a die
- Yellow, blue, red, or green when spinning the spinner
- Head or tail when tossing a coin
31Probability
- Event: one or more outcomes of an experiment, for example landing on red
- Sample space: the set of all experimental outcomes
- {yellow, red, blue, green}
- {1, 2, 3, 4, 5, 6}
- Sample point: a single experimental outcome
32Probability
- When all outcomes are equally likely, the probability of an event A is the number of ways event A can occur divided by the total number of possible outcomes
- Experiment: A spinner has 4 equal sectors colored yellow, blue, green, and red. After spinning, what is the probability of landing on each color?
- Outcomes: the possible outcomes are yellow, red, blue, and green
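A minimal Python sketch of this counting definition, applied to the spinner, is given below. The helper name `classical_probability` is hypothetical, and the calculation assumes the four sectors are equally likely, as the slide states.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Number of outcomes in the event divided by the total number of outcomes."""
    favorable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favorable), len(sample_space))

spinner = ["yellow", "blue", "green", "red"]
probabilities = {color: classical_probability({color}, spinner) for color in spinner}

for color, p in probabilities.items():
    print(color, p)   # each color has probability 1/4
```

Using `Fraction` keeps the results exact, so the four probabilities sum to exactly 1.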
33Probability
34Probability
- Experiment: A coin is tossed. What is the probability of each outcome?
- Outcomes: the possible outcomes of this experiment are head and tail
- Probabilities: for a fair coin, P(head) = 1/2 and P(tail) = 1/2
35Probability
36Counting Rules
- Identifying and counting experimental outcomes is a necessary step in assigning probabilities
- Multistep experiment: an experiment performed in multiple steps
- For example, tossing two coins
- Experimental outcomes: the pattern of heads and tails appearing on the upward faces of the two coins
37Counting Rules
- The experiment of tossing two coins is a two-step process
- Sample space for this experiment
- $S = \{(H,H), (H,T), (T,H), (T,T)\}$
- (H,H): head appearing on the first coin and head appearing on the second coin
- Counting rule
- For an experiment consisting of $k$ steps with $n_1$ possible outcomes on the first step, $n_2$ possible outcomes on the second step, and so on
- The total number of possible outcomes is given by $(n_1)(n_2)\cdots(n_k)$
38Counting Rules
- For the experiment of tossing two coins, $k = 2$
- Tossing the first coin: $n_1 = 2$
- Tossing the second coin: $n_2 = 2$
- The total number of possible outcomes is $(2)(2) = 4$
- Tree diagram: a graphical representation that helps in visualizing a multistep experiment
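The counting rule for the two-coin experiment can be checked by enumerating the sample space, as in the Python sketch below. `itertools.product` plays the role of the tree diagram here, and the variable names are illustrative.

```python
from itertools import product
from math import prod

# One tuple of possible outcomes per step of the experiment.
steps = [("H", "T"), ("H", "T")]

sample_space = list(product(*steps))
print(sample_space)
# [('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')]

# Counting rule: multiply the number of outcomes at each step.
total_outcomes = prod(len(step) for step in steps)
print(total_outcomes)   # 4
```

Adding a third tuple to `steps` extends the same check to three coins.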
40Combinations
- Useful when an experiment involves selecting n objects from a set of N objects
- Counting rule for combinations
- The number of combinations of N objects taken n at a time is given by $C_n^N = \binom{N}{n} = \frac{N!}{n!\,(N-n)!}$
41Combinations
Example: Suppose you had to choose 2 coins at random out of a total of 4 coins. The number of ways to do this is $\frac{4!}{2!\,2!} = 6$
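As a quick check of the coin example, Python's `math.comb(N, n)` evaluates N! / (n! (N - n)!), and `itertools.combinations` lists the selections themselves. The coin labels below are made up for illustration.

```python
from itertools import combinations
from math import comb

coins = ["c1", "c2", "c3", "c4"]   # hypothetical labels for the 4 coins

num_ways = comb(len(coins), 2)
print(num_ways)                    # 6

selections = list(combinations(coins, 2))
print(selections)
# [('c1', 'c2'), ('c1', 'c3'), ('c1', 'c4'), ('c2', 'c3'), ('c2', 'c4'), ('c3', 'c4')]
```

Order does not matter here, so ('c1', 'c2') and ('c2', 'c1') count as the same selection.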
42Permutations
- Used to compute the number of experimental outcomes when n objects are to be selected from a set of N objects and the order of selection is important
- The number of permutations of N objects taken n at a time is given by $P_n^N = n!\binom{N}{n} = \frac{N!}{(N-n)!}$
43Permutations
Example: Suppose you have to select 2 parts out of 5 for inspection. Let the parts be labeled A, B, C, D, E. The number of permutations is $\frac{5!}{3!} = 20$. The possible permutations are AB, BA, AC, CA, AD, DA, AE, EA, BC, CB, BD, DB, BE, EB, CD, DC, CE, EC, DE, ED
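The list of ordered selections in this example can be verified with a short Python sketch. `math.perm(N, n)` evaluates N! / (N - n)!, and `itertools.permutations` enumerates the ordered pairs, although in a different order from the slide.

```python
from itertools import permutations
from math import perm

parts = "ABCDE"

num_permutations = perm(len(parts), 2)
print(num_permutations)   # 20

ordered_pairs = ["".join(p) for p in permutations(parts, 2)]

# The 20 permutations as listed on the slide.
slide_list = ("AB BA AC CA AD DA AE EA BC CB "
              "BD DB BE EB CD DC CE EC DE ED").split()
print(set(ordered_pairs) == set(slide_list))   # True
```

Because order matters, AB and BA are counted separately, which is why the count is twice the 10 combinations of 5 parts taken 2 at a time.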
44Assigning Probabilities
- Basic requirements
- The probability assigned to each experimental outcome must be between 0 and 1, inclusive
- The sum of the probabilities of all the experimental outcomes must equal 1.0
45Assigning Probabilities
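- For n experimental outcomes $E_1, E_2, \ldots, E_n$: $0 \le P(E_i) \le 1$ for all $i$, and $P(E_1) + P(E_2) + \cdots + P(E_n) = 1$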
46Assigning Probabilities - Approaches
- Three most commonly used approaches
- Classical method: appropriate when all experimental outcomes are equally likely
- Relative frequency method: appropriate when data are available to estimate the proportion of the time an experimental outcome will occur if the experiment is repeated a large number of times
- Subjective method: appropriate when relevant data are not available and outcomes are not equally likely
47Events and their probabilities
- Event: a collection of sample points (outcomes)
- Example: Kentucky Power & Light Company is starting a project designed to increase the generating capacity of one of its plants in northern Kentucky. The project is divided into two sequential stages: stage 1 (design) and stage 2 (construction)
48Events and their probabilities
- Management cannot predict beforehand the exact time required to complete each stage of the project
- Analysis of similar construction projects revealed possible completion times of 2, 3, or 4 months for the design stage and 6, 7, or 8 months for the construction stage
- Thus the event of completing the project in 10 months or less, denoted by A, consists of the following sample points
- $A = \{(2,6), (2,7), (2,8), (3,6), (3,7), (4,6)\}$
51Events and their probabilities
- $P(A) = P(2,6) + P(2,7) + P(2,8) + P(3,6) + P(3,7) + P(4,6)$
- $= 0.15 + 0.15 + 0.05 + 0.10 + 0.20 + 0.05 = 0.70$
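The calculation above can be sketched in Python by enumerating the sample space and summing over the sample points in A. The pairing of each probability with a sample point is an assumption: the six values are matched, in order, to the six sample points listed for event A. Probabilities for the remaining sample points are not given on the slides and are left out.

```python
from itertools import product

design_months = (2, 3, 4)
construction_months = (6, 7, 8)
sample_space = list(product(design_months, construction_months))

# Event A: the project is completed in 10 months or less.
event_a = [point for point in sample_space if sum(point) <= 10]
print(event_a)
# [(2, 6), (2, 7), (2, 8), (3, 6), (3, 7), (4, 6)]

# Assumed pairing of the six probabilities with the six sample points in A.
prob = {(2, 6): 0.15, (2, 7): 0.15, (2, 8): 0.05,
        (3, 6): 0.10, (3, 7): 0.20, (4, 6): 0.05}

p_a = sum(prob[point] for point in event_a)
print(round(p_a, 2))   # 0.7
```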
52Basic Relationships - Probability
- Complement of an event A
- The event consisting of all sample points that are not in A, denoted by $A^c$
- Sample space: the set of all possible outcomes
53Basic Relationships - Probability
- $P(A) + P(A^c) = 1$
- Therefore $P(A) = 1 - P(A^c)$
54Addition Law
55Mutually Exclusive Events
- Two events are said to be mutually exclusive if
they have no sample points in common
56Union of events
- The union of A and B is the event containing all sample points belonging to A or B or both. It is denoted by $A \cup B$
57Intersection of events
- The intersection of events A and B is the event containing the sample points belonging to both A and B. It is denoted by $A \cap B$
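Writing $A \cup B$ for the union and $A \cap B$ for the intersection, the addition law named on the earlier slide can be stated in its standard textbook form. The formula is supplied here from general probability theory rather than from the slide text.

\[
P(A \cup B) = P(A) + P(B) - P(A \cap B)
\]

The intersection term is subtracted because sample points in both A and B would otherwise be counted twice. For mutually exclusive events there are no common sample points, so $P(A \cap B) = 0$ and the law reduces to

\[
P(A \cup B) = P(A) + P(B)
\]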
58Conditional Probability
- Sometimes the probability of an event is influenced by whether a related event has already occurred
- The probability of event A given that event B has occurred is called a conditional probability
- It is denoted by $P(A \mid B)$
59Conditional Probability
- Example: Consider the promotion status of male and female police officers. The police force consists of 1200 officers, 960 men and 240 women. Over the past two years, 324 officers received promotions (288 male officers and 36 female officers). Female officers argued that this reflects discrimination against women
60Conditional Probability
- Let
- M = event an officer is a man
- W = event an officer is a woman
- A = event an officer is promoted
- $A^c$ = event an officer is not promoted
61Conditional Probability
62Conditional Probability
Thus the conditional probabilities do not by themselves prove that discrimination exists, but they support the argument presented by the female officers
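The conditional probabilities behind this conclusion can be recomputed from the counts in the example. The sketch below uses the standard definition P(A | B) = P(A and B) / P(B); the variable names are illustrative.

```python
from fractions import Fraction

total = 1200
men, women = 960, 240
promoted_men, promoted_women = 288, 36

# Probabilities of being a man / a woman.
p_m = Fraction(men, total)     # 4/5
p_w = Fraction(women, total)   # 1/5

# Joint probabilities: promoted and a man, promoted and a woman.
p_a_and_m = Fraction(promoted_men, total)     # 6/25  (0.24)
p_a_and_w = Fraction(promoted_women, total)   # 3/100 (0.03)

p_a_given_m = p_a_and_m / p_m   # 3/10 -> 0.30
p_a_given_w = p_a_and_w / p_w   # 3/20 -> 0.15

print(float(p_a_given_m), float(p_a_given_w))   # 0.3 0.15
```

In these data a male officer was promoted with probability 0.30 and a female officer with probability 0.15, which is the gap the argument rests on.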
63Independent Events
- Two events A and B are independent if the fact that A occurs does not affect the probability of B occurring
- In mathematical terms, two events A and B are independent if $P(A \mid B) = P(A)$ or, equivalently, $P(B \mid A) = P(B)$
64Multiplication Law
- Used to compute the probability of the intersection of two events
- Based on the definition of conditional probability: $P(A \cap B) = P(B)\,P(A \mid B) = P(A)\,P(B \mid A)$
65Multiplication Law
Example: In a Ford dealership, the manager knows that 60% of customers buy a Mustang.
- A = event that the first customer buys a Mustang
- B = event that the second customer buys a Mustang
- Find the probability that the next two customers both buy a Mustang
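One way to work this example is sketched below, under the assumption that the two customers' purchases are independent. The slide does not state this explicitly. With independence, P(B | A) = P(B), so the multiplication law reduces to P(A and B) = P(A) P(B).

```python
from fractions import Fraction

p_a = Fraction(60, 100)   # first customer buys a Mustang
p_b = Fraction(60, 100)   # second customer buys a Mustang

# Assumption: A and B are independent, so P(B | A) = P(B).
p_b_given_a = p_b

p_both = p_a * p_b_given_a
print(p_both, float(p_both))   # 9/25 0.36
```

If the purchases were not independent, `p_b_given_a` would have to be replaced by the actual conditional probability.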