Title: ST3905
1ST3905
- Lecturer: Supratik Roy
- Email: s.roy_at_ucc.ie
- (Unix): supratik_at_stat.ucc.ie
- Phone: ext. 3626
2What do we want to do?
- What is statistics?
- Describing Information
- Summarization, Visual and non-Visual representation
- Drawing conclusions from information
- Managing uncertainty and incompleteness of information
3Resources
- Recommended textbook: Probability and Statistics for Engineering and the Sciences, Jay L. Devore. International Thomson Publishing.
- Software: R, homepage www.r-project.home
4Describing Information
- Why summarization of information?
- Visual representation (aka graphical Descriptive Statistics)
- Non-visual representation (numerical measures)
- Classical techniques vs modern IT
5Stem and Leaf Plot
Decimal point is 2 places to the right of the colon
0 : 8
1 : 000011122233333333333344444
1 : 55555566666677777778888888899999999999
2 : 0000000111111111111222222233333333444444444
2 : 555556666666666777778889999999999999999
3 : 000000001111112222333333333444
3 : 55555555666667777777888888899999999
4 : 0122234
4 : 55555678888889
5 : 111111134
5 : 555667
6 : 44
6 : 7
6Pie-Chart
7DotChart
8Histogram
9Histogram-Categorical
10Rules for Histograms
- Height of rectangle proportional to frequency of class
- No. of classes proportional to sqrt(total no. of observations); not a hard and fast rule
- In case of categorical data, keep rectangle widths identical, and bases of rectangles separate.
- Best, if possible, let the software do it.
11Data
 [1]  -0.053626486  -0.828128399   0.214910482   0.346570399
 [5]  -0.849316517   0.001077376   0.736191791   1.417540397
 [9]  -2.382332275  -2.699019949  -0.111907192   1.384903284
[13]   2.113286699  -1.828108272  -1.108280724   0.131883612
[17]  -0.394494473   0.829806888   0.023178033   0.019839537
[21]  -0.346280222  -0.251981108   1.159853307  -0.249501904
[25]  -1.342704742  -2.012653224  -1.535503208   0.869806233
[29]  -1.313495887  -0.244408426  -0.998886998  -1.446769605
[33]   1.224528053  -0.410163230   0.032230907  -0.137297112
[37]  -2.717620031  -0.728570438   0.034697116   2.202863874
[41]  -0.170794163   0.353651680  -0.673296374   3.136364814
[45]  -1.260108638  -0.367334893  -0.652217259  -0.301847039
[49]   0.315180215   0.190766333
12Tabulation
Class    Tally                  Freq
-3,-2    ////                   4
-2,-1    //// //                7
-1,0     //// //// //// ///     18
0,1      //// //// ////         14
1,2      ////                   4
2,3      //                     2
3,4      /                      1
Total                           50
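The tabulation above can be reproduced mechanically. What follows is a minimal Python sketch (Python is a choice made here for illustration; the course software is R). It assumes each class is the half-open interval [k, k+1), which the slides do not state explicitly, and it also evaluates the square-root rule of thumb for the number of classes. All variable names are illustrative.

```python
import math
from collections import Counter

# The 50 observations from the "Data" slide.
data = [
    -0.053626486, -0.828128399, 0.214910482, 0.346570399,
    -0.849316517, 0.001077376, 0.736191791, 1.417540397,
    -2.382332275, -2.699019949, -0.111907192, 1.384903284,
    2.113286699, -1.828108272, -1.108280724, 0.131883612,
    -0.394494473, 0.829806888, 0.023178033, 0.019839537,
    -0.346280222, -0.251981108, 1.159853307, -0.249501904,
    -1.342704742, -2.012653224, -1.535503208, 0.869806233,
    -1.313495887, -0.244408426, -0.998886998, -1.446769605,
    1.224528053, -0.410163230, 0.032230907, -0.137297112,
    -2.717620031, -0.728570438, 0.034697116, 2.202863874,
    -0.170794163, 0.353651680, -0.673296374, 3.136364814,
    -1.260108638, -0.367334893, -0.652217259, -0.301847039,
    0.315180215, 0.190766333,
]

# Assumption: class k is the half-open interval [k, k+1),
# so floor(x) gives the lower end point of the class containing x.
counts = Counter(math.floor(x) for x in data)

# Square-root rule of thumb for the number of classes.
suggested_classes = round(math.sqrt(len(data)))

for lower in sorted(counts):
    tally = "/" * counts[lower]          # one stroke per observation
    print(f"{lower},{lower + 1}\t{tally}\t{counts[lower]}")
print("Total", sum(counts.values()))
```

With 50 observations the square-root rule suggests about 7 classes, which matches the number of classes used in the table.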
13Box-Plot - I
14Box Plot II
15Box Plot III
16Non-Visual (numerical measures)
- Pictures vs. quantitative measures
- Criteria for selection of a measure: purpose of study
- Qualities that a measure should have
- We live in an uncertain world: chances of error
17Measures of Location
- Mean
- Mode
- Median
18Location mean, median
Algebra test scores
Students  1-10:  43 50 41 69 52 38 51 54 43 47
Students 11-20:  54 51 70 58 44 54 52 32 42 70
Students 21-25:  50 49 56 59 38
Mean of scores: 50.68
10% trimmed mean of scores: 50.33333
Median: 51
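The three location measures quoted for the algebra scores can be checked with a short Python sketch. The helper trimmed_mean is a hypothetical name introduced here; it assumes the usual convention of dropping floor(n x proportion) observations from each end of the sorted data before averaging.

```python
import statistics

# The 25 algebra test scores from the slide.
scores = [43, 50, 41, 69, 52, 38, 51, 54, 43, 47,
          54, 51, 70, 58, 44, 54, 52, 32, 42, 70,
          50, 49, 56, 59, 38]

def trimmed_mean(values, proportion):
    """Mean after dropping floor(n * proportion) values from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)      # number trimmed per tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

mean = statistics.mean(scores)
median = statistics.median(scores)
trimmed = trimmed_mean(scores, 0.10)

print(mean, trimmed, median)
```

Here two scores are trimmed from each tail (32, 38 and 70, 70), so the trimmed mean is an average of the middle 21 scores.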
19Location Non-classical
An M-estimate of location is a solution mu of the equation sum(psi((y-mu)/s)) = 0.
Data set: car.miles
(bisquare) 204.5395
(Huber's) 204.2571
20Tabular method of computing
Class    Freq   Class-midpt   Rel. freq   r.f × midpt
-3,-2    4      -2.5          0.08        -0.20
-2,-1    7      -1.5          0.14        -0.21
-1,0     18     -0.5          0.36        -0.18
0,1      14      0.5          0.28         0.14
1,2      4       1.5          0.08         0.12
2,3      2       2.5          0.04         0.10
3,4      1       3.5          0.02         0.07
Total    50                               -0.16
21Tabular method of computing
With origin A = -0.5 and class width d = 1, write x' = (x-A)/d.
Class    Freq   Class-midpt (x)   x'    Rel. freq   r.f × x'
-3,-2    4      -2.5              -2    0.08        -0.16
-2,-1    7      -1.5              -1    0.14        -0.14
-1,0     18     -0.5               0    0.36         0
0,1      14      0.5               1    0.28         0.28
1,2      4       1.5               2    0.08         0.16
2,3      2       2.5               3    0.04         0.12
3,4      1       3.5               4    0.02         0.08
Total    50                                          0.34
Mean = A + d × 0.34 = -0.5 + 0.34 = -0.16
22Measures of Scale (aka Dispersion)
- Variance (unbiased): sum((x-mean(x))^2)/(N-1)
- Variance (biased): sum((x-mean(x))^2)/N
- Standard Deviation: sqrt(variance)
23Tabular method of computing
With origin A = -0.5 and class width d = 1, write x' = (x-A)/d.
Class    Class-midpt (x)   x'    x'^2   Rel. freq   r.f × x'^2
-3,-2    -2.5              -2    4      0.08        0.32
-2,-1    -1.5              -1    1      0.14        0.14
-1,0     -0.5               0    0      0.36        0
0,1       0.5               1    1      0.28        0.28
1,2       1.5               2    4      0.08        0.32
2,3       2.5               3    9      0.04        0.36
3,4       3.5               4    16     0.02        0.32
Total                                               1.74
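The tabular (coded) method of the last few slides can be written out as a short Python sketch. It assumes the biased form of the variance (divisor N), since relative frequencies are used, and it recovers the grouped mean and variance from the column totals. The variable names are illustrative and do not come from the slides.

```python
# Class midpoints and frequencies from the tabulation slides.
midpoints = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]
freqs = [4, 7, 18, 14, 4, 2, 1]

A = -0.5   # origin used on the slides
d = 1.0    # class width

n = sum(freqs)
rel = [f / n for f in freqs]                    # relative frequencies
coded = [(x - A) / d for x in midpoints]        # coded values (x - A) / d

# Column totals of the two tables.
sum_rf_coded = sum(r * u for r, u in zip(rel, coded))
sum_rf_coded_sq = sum(r * u * u for r, u in zip(rel, coded))

# Undo the coding: mean = A + d * total, variance = d^2 * (total2 - total^2).
grouped_mean = A + d * sum_rf_coded
grouped_var = d * d * (sum_rf_coded_sq - sum_rf_coded ** 2)   # divisor N

print(sum_rf_coded, sum_rf_coded_sq, grouped_mean, grouped_var)
```

Coding with a convenient origin keeps the hand arithmetic in small integers; the final step simply shifts and rescales the totals back to the original units.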
24Robust measures of scale
- The MAD scale estimate generally has very small bias compared with other scale estimators when there is "contamination" in the data.
- Tau-estimates and A-estimates also have 50% breakdown, but are more efficient for Gaussian data.
- The A-estimate that scale.a computes is redescending, so it is inappropriate if it is necessary that the scale estimate always be increasing as the size of a datapoint is increased. However, the A-estimate is very good if all of the contamination is far from the "good" data.
25Comparison of scale measures
MAD(corn.yield)          4.15128
scale.tau(corn.yield)    4.027753
scale.a(corn.yield)      4.040902
var(corn.yield)          19.04191
sqrt(var(corn.yield))    4.363703
N.B. To really compare you have to compare for various probability distributions as well as various sample sizes.
26Probability
- Concept of an Experiment on Random observables
- Sets and Events, Random variables, Probability
(a) Set of all basic outcomes: Sample space S
(b) An element of S or union of elements in S: an event (a singleton event is a simple event, else compound)
(c) A numerical function that associates an event with a number(s): Random Variable
(d) A map from E (the collection of events) onto [0,1] obeying certain rules: probability
27Examples of Probability
- Consider toss of single coin
- A single throw: only two possible outcomes, Head or Tail
- Two consecutive throws: four possible outcomes (Head, Head), (Head, Tail), (Tail, Head), (Tail, Tail)
- Unbiased coin: $P(\text{Head turns up}) = 0.5$
- Define R.V. X by $X(\text{Head}) = 1$, $X(\text{Tail}) = 0$. Then $P(X=1) = 0.5$, $P(X=0) = 0.5$.
28Axioms of Probability
- $0 \le P(A) \le 1$ for any event A
- $P(A \cup B) = P(A) + P(B)$ if A, B are disjoint sets/events
- $P(S) = 1$
29Basic Formulae-I
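- $P(A') = 1 - P(A)$, where $A'$ is the complement of A
- $P(A \cap B) = 0$ if A, B are disjoint
- $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
- $P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$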
30Basic Formulae-I-Examples-1
- Consider the coin tossing experiment with three consecutive tosses, and Head or Tail being equally likely in any throw.
- Sample space: {HHH, HHT, HTH, HTT, THT, THH, TTH, TTT}
- Define A = {there are at least 2 Heads}: $P(A) = 0.5$
- Define B = {there is at least 1 Tail}: $P(B) = 0.875$
- $A \cap B$ = {HHT, THH, HTH}, so $P(A \cap B) = 3/8$
- $P(A \cup B) = P(A) + P(B) - P(A \cap B) = 1$
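The three-toss example can be verified by brute-force enumeration. The Python sketch below builds the sample space, defines the events A and B as sets of outcomes, and checks the addition formula with exact fractions. The function name prob is hypothetical and assumes all 8 outcomes are equally likely, as the slide does.

```python
from fractions import Fraction
from itertools import product

# Sample space for three tosses: HHH, HHT, ..., TTT (8 outcomes).
sample_space = ["".join(t) for t in product("HT", repeat=3)]

def prob(event):
    """Probability of an event (a set of outcomes) under equal likelihood."""
    return Fraction(len(event), len(sample_space))

A = {s for s in sample_space if s.count("H") >= 2}   # at least 2 Heads
B = {s for s in sample_space if s.count("T") >= 1}   # at least 1 Tail

# Addition formula: P(A or B) = P(A) + P(B) - P(A and B).
p_union = prob(A) + prob(B) - prob(A & B)

print(prob(A), prob(B), sorted(A & B), p_union)
```

The union has probability 1 because the only outcome not in B (HHH) belongs to A.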
31Basic Formulae-I-Examples-2
Venn Diagrams
A?B
B
A
32Basic Formulae-I-Examples-3
Venn Diagrams
A?B
B
A
33Basic Formulae-I-Examples-4
Venn Diagrams
B or complement of B
B
A
34Basic Formulae I-Examples
- A family that owns 2 cars is selected, and for both the older car and the newer car we note whether the car was manufactured in America, Europe or Asia. (a) What are the possible outcomes of this experiment? (b) Which outcomes are contained in the event that one car is American and the other is non-American? (c) Which outcomes are contained in the event that at least one car is non-American?
- In a certain residential suburb, 60% of all households subscribe to the metropolitan newspaper published in a nearby city, 80% subscribe to the local afternoon paper, and 50% of all households subscribe to both papers. If a household is selected at random, what is the probability that it subscribes to (1) at least one of the two newspapers, (2) exactly one newspaper?
35Basic Formulae - II
- Counting Principle: For an ordered sequence to be formed from N groups $G_1, G_2, \ldots, G_N$ with sizes $k_1, k_2, \ldots, k_N$, the total no. of sequences that can be formed is $k_1 \times k_2 \times \cdots \times k_N$.
- For any positive integer m, $m!$ is read as "m-factorial" and defined by $m! = m(m-1)(m-2) \cdots 3 \cdot 2 \cdot 1$.
- An ordered sequence of k objects taken from a set of n distinct objects is called a Permutation of size k of the objects; the number of such permutations is denoted by $P_{k,n} = n(n-1) \cdots (n-k+1) = n!/(n-k)!$.
- Any unordered subset of size k from a set of n distinct objects is called a Combination; the number of such combinations is denoted by $C_{k,n} = P_{k,n}/k! = n!/(k!\,(n-k)!)$.
36Basic Formulae II-Example
- A student wishes to commute first to a junior college for two years and then to a state college campus. Within commuting range there are four junior colleges and three state colleges. How many choices of junior college and state college are available to her? If junior colleges are denoted by 1, 2, 3, 4 and state colleges by a, b, c, the choices are (1,a), (1,b), ..., (4,c), a total of 12 choices. With $n_1 = 4$ and $n_2 = 3$, $N = n_1 n_2 = 12$ without a list.
- There are 8 teaching assistants available for grading papers in a particular course. The first exam consists of 4 questions, and the professor wishes to select a different assistant to grade each question (only 1 assistant per question). In how many ways can assistants be chosen to grade the exam? Ans. $P_{4,8} = (8)(7)(6)(5) = 1680$.
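Both answers on this slide can be checked in Python. This is only a sketch: the lists junior and state are labels chosen here to mirror the slide, and math.perm assumes Python 3.8 or later.

```python
import math
from itertools import product

junior = [1, 2, 3, 4]          # four junior colleges
state = ["a", "b", "c"]        # three state colleges

# Counting principle: every junior college can be paired with every state college.
choices = list(product(junior, state))    # (1, 'a'), (1, 'b'), ..., (4, 'c')

# P_{4,8}: ordered selection of 4 different graders out of 8 assistants.
ways_to_grade = math.perm(8, 4)

print(len(choices), ways_to_grade)
```

Listing the pairs and multiplying the group sizes give the same count, which is the point of the counting principle.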
37Basic Formulae II-Examples
- Consider the set {A, B, C, D, E}. We know that there are $5!/(5-3)! = 60$ permutations of size 3. There are 6 permutations of size 3 consisting of the elements A, B, C, since these 3 can be ordered in $3 \cdot 2 \cdot 1 = 3! = 6$ ways: (A,B,C), (A,C,B), (B,A,C), (B,C,A), (C,A,B) and (C,B,A). These 6 permutations are equivalent to the single combination {A,B,C}. Similarly, for any other combination of size 3, there are 3! permutations, each obtained by ordering the 3 objects. Thus,
- $60 = P_{3,5} = C_{3,5} \cdot 3!$, so $C_{3,5} = 60/3! = 10$.
- These 10 combinations are {A,B,C}, {A,B,D}, {A,B,E}, {A,C,D}, {A,C,E}, {A,D,E}, {B,C,D}, {B,C,E}, {B,D,E}, {C,D,E}.
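The relationship between permutations and combinations in this example can be seen by enumerating both. The Python sketch below uses the standard itertools module and groups the ordered triples by the unordered set they represent; the dictionary name groups is illustrative.

```python
import math
from itertools import combinations, permutations

letters = "ABCDE"

perms = list(permutations(letters, 3))    # ordered triples
combs = list(combinations(letters, 3))    # unordered triples

# Group the permutations by the set of letters they contain.
groups = {}
for p in perms:
    groups.setdefault(frozenset(p), []).append(p)

print(len(perms), len(combs))
for c in combs:
    print("".join(c), "has", len(groups[frozenset(c)]), "orderings")
```

Each combination collects exactly 3! orderings, which is why dividing the permutation count by 3! gives the combination count.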
38Basic Formulae II-Example
- The student Engineers' council at a certain college has one student representative from each of the 6 engineering majors (civil, food, electrical, industrial, materials, and mechanical). In how many ways can (a) both a council president and a vice-president be selected? (b) a president, a vice-president, and a secretary be selected? (c) two members be selected for the President's Council?
- A real estate agent is showing homes to a prospective buyer. There are 10 homes in the desired price range listed in the area. The buyer has time to visit only 3 of them. (a) In how many ways could the 3 homes be chosen if the order of visiting is considered? (b) In how many ways could the 3 homes be chosen if the order is unimportant? (c) If 4 of the homes are new and 6 have been previously occupied, and if the 3 homes to visit are randomly chosen, what is the probability that all 3 are new?
39Basic Formulae-III
- $P_{k,n} = n!/(n-k)!$
- $C_{k,n} = n!/(k!\,(n-k)!)$
- For any two events A and B with $P(B) > 0$, the Conditional Probability of A given (that) B (has occurred) is defined by $P(A \mid B) = P(A \cap B)/P(B)$; it is taken to be 0 if $P(B) = 0$.
- Let A, B be disjoint events with $A \cup B = S$, and let C be any event with $P(C) > 0$. Then $P(C) = P(C \mid A)P(A) + P(C \mid B)P(B)$: Law of Total Probability
- Let A, B be disjoint events with $A \cup B = S$, and let C be any event with $P(C) > 0$. Then $P(A \mid C) = \dfrac{P(C \mid A)P(A)}{P(C \mid A)P(A) + P(C \mid B)P(B)}$: Bayes' Theorem
40Basic Formulae-III-examples
- Suppose that of all individuals buying a certain PC, 60% include a word processing program in their purchase, 40% include a spreadsheet program, and 30% include both types of programs. Consider randomly selecting a purchaser and let A = {word processing program included} and B = {spreadsheet program included}. Then $P(A) = 0.6$, $P(B) = 0.4$, and $P(\text{both included}) = P(A \cap B) = 0.30$. Given that the selected individual included a spreadsheet program, the probability that a word processing program was also included is $P(A \mid B) = P(A \cap B)/P(B) = 0.30/0.40 = 0.75$.
41Basic Formulae-III-examples
- A chain of video stores sells 3 different brands of VCRs. Of its VCR sales, 50% are brand 1 (the least expensive), 30% are brand 2, and 20% are brand 3. Each manufacturer offers a 1-year warranty on parts and labour. It is known that 25% of brand 1's VCRs require warranty repair work, whereas the corresponding percentages for brands 2 and 3 are 20% and 10%, respectively. (a) What is the probability that a randomly selected purchaser has a VCR that will need repair while under warranty?
- Let $A_i$ = {brand i is purchased} for i = 1, 2, 3. Let B = {needs repair}. The given data imply $P(B \mid A_1) = 0.25$, $P(B \mid A_2) = 0.20$, $P(B \mid A_3) = 0.10$.
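Part (a) is an application of the Law of Total Probability, extended from two disjoint events to the three brands (which are disjoint and cover all purchases). The Python sketch below works it through and also applies Bayes' Theorem to get the chance that a VCR needing repair came from each brand. The dictionary names are illustrative and not part of the original example.

```python
# Prior probabilities P(A_i) and conditional repair rates P(B | A_i).
p_brand = {1: 0.50, 2: 0.30, 3: 0.20}
p_repair_given_brand = {1: 0.25, 2: 0.20, 3: 0.10}

# Law of Total Probability: P(B) = sum over i of P(B | A_i) * P(A_i).
p_repair = sum(p_repair_given_brand[i] * p_brand[i] for i in p_brand)

# Bayes' Theorem: P(A_i | B) = P(B | A_i) * P(A_i) / P(B).
posterior = {
    i: p_repair_given_brand[i] * p_brand[i] / p_repair for i in p_brand
}

print(p_repair)
print(posterior)
```

The posterior probabilities sum to 1, since a VCR that needs repair must be one of the three brands.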
42Basic Formulae-III-examples
- Only 1 in 1000 adults is afflicted with a rare disease for which a diagnostic test has been developed. The test is such that when an individual actually has the disease, a positive result will occur 99% of the time, while an individual without the disease will show a positive test result only 2% of the time. If a randomly selected individual is tested and the result is positive, what is the probability that the individual has the disease? Let $A_1$ = {individual has the disease}, $A_2$ = {individual does not have the disease}, B = {positive test result}. Then $P(A_1) = 0.001$, $P(A_2) = 0.999$, $P(B \mid A_1) = 0.99$, and $P(B \mid A_2) = 0.02$. Hence $P(B) = 0.02097$ and $P(A_1 \mid B) = P(A_1 \cap B)/P(B) = 0.047$.
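The diagnostic-test calculation is a direct use of Bayes' Theorem with two disjoint events (disease, no disease). A minimal Python sketch follows; the function bayes_posterior and its parameter names (sensitivity, false_positive_rate) are labels introduced here for illustration, using the probabilities quoted on the slide.

```python
def bayes_posterior(prior, sensitivity, false_positive_rate):
    """Return (P(disease | positive), P(positive)).

    prior               = P(A1), probability of having the disease
    sensitivity         = P(B | A1), positive result given disease
    false_positive_rate = P(B | A2), positive result given no disease
    """
    # Law of Total Probability for P(B).
    p_positive = sensitivity * prior + false_positive_rate * (1.0 - prior)
    # Bayes' Theorem.
    return sensitivity * prior / p_positive, p_positive

posterior, p_positive = bayes_posterior(0.001, 0.99, 0.02)
print(p_positive, posterior)
```

Even with an accurate test, the posterior stays below 5% because the disease is so rare that false positives far outnumber true positives.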
43Basic Formulae-IV
- Two events A and B are independent if $P(A \mid B) = P(A \cap B)/P(B) = P(A)$, and are dependent otherwise.
- Two events A and B are independent if and only if $P(A \cap B) = P(A)P(B)$.
44Random Variables - Discrete
- A discrete set is a set that is either finite or can be mapped one-to-one into the set of Natural numbers.
- A discrete random variable is a r.v. which takes values in a discrete set consisting of numbers.
- The probability distribution or probability mass function (pmf) of a discrete r.v. X is defined for every number x by $p(x) = P(X = x) = P(\{s \in S : X(s) = x\})$.
$P(X = x)$ is read "the probability that the r.v. X assumes the value x". Note: $p(x) \ge 0$, and the sum of p(x) over all possible x is 1.
45Random Variables - Discrete
- Bernoulli trials (Coin toss is a particular example). The random variable X takes two values, 1 and 0.
- Notation: $P(X = 1) = p$, $0 < p < 1$ (Note that this automatically implies $P(X = 0) = 1 - p$)
- A general (arbitrary) discrete random variable can be denoted by an uppercase letter, say, X
- The discrete values that can be taken by X are $x_1, x_2, x_3, \ldots, x_n$ (assuming that total no. of values possible is n)
- Typically, the corresponding probability masses are denoted by $p_1, p_2, \ldots, p_n$
46Cumulative Distribution Function
- The probability distribution or probability mass function of a discrete r.v. is defined for every number x by $p(x) = P(X = x) = P(\{s \in S : X(s) = x\})$.
- The Cumulative distribution function (cdf) F(x) of a discrete r.v. X with pmf (probability mass function) p(x) is defined for every number x by $F(x) = P(X \le x) = \sum_{y:\, y \le x} p(y)$
- For any number x, F(x) is the probability that the observed value of X will be at most x.
- For any two numbers a, b with $a \le b$, $P(a \le X \le b) = F(b) - F(a-)$, where $a-$ represents the largest possible X value that is strictly less than a.
47Discrete R.V.-illustration
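- Consider the Bernoulli r.v. X with $P(X = 1) = p$, $0 < p < 1$. The probability mass function can be given by $p(x) = p^x (1-p)^{1-x}$, x = 0, 1.
- The Cumulative distribution function (cdf) is $F(x) = P(X \le x) = \sum_{y:\, y \le x} p(y) = (1-p)^{\mathbf{1}[x < 1]} \cdot \mathbf{1}[x \ge 0]$, i.e. $F(x) = 0$ for $x < 0$, $F(x) = 1 - p$ for $0 \le x < 1$, and $F(x) = 1$ for $x \ge 1$.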
48Operations on RVs
- Expectation of a RV
- Expectations of functions of RVs
- Special Cases Moments, Covariance
49Expected Values of Random Variables
- Let X be a discrete r.v. with set of possible values D and pmf p(x). The expected value or mean value of X, denoted by $E(X)$ or $\mu_X$, is $E(X) = \mu_X = \sum_{x \in D} x \, p(x)$
- Note that E(X) may not always exist. Consider $p(x) = k/x^2$ for x = 1, 2, 3, ...; then $\sum x \, p(x) = k \sum 1/x$, which diverges.
- For Bernoulli X, $E(X) = p \cdot 1 + (1-p) \cdot 0 = p$
- $E(a + bX) = a + bE(X)$: linearity property of expectation
50Expected Values of functions of Random Variables
- Let X be a discrete r.v. with set of possible values D and pmf p(x). The expected value or mean value of f(X), denoted by $E(f(X))$ or $\mu_{f(X)}$, is $E(f(X)) = \sum_{x \in D} f(x) \, p(x)$
- Example: Variance. $Var(X) = V(X) = E[(X - E(X))^2] = E(X^2) - [E(X)]^2$
- Variance of Bernoulli X: $E[(X - p)^2] = E(X^2) - p^2 = 1 \cdot p - p^2 = p(1-p)$
- Classical expression of variance of n numbers $x_1, x_2, \ldots, x_n$ is simply the variance of a r.v. X that takes the values $x_1, x_2, \ldots, x_n$, each with probability 1/n.
51Expected Values of functions of Random Variables
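- $E[a + bX] = a + bE[X]$, $Var(a + bX) = b^2 Var(X)$
- Standard deviation aka s.d. is $\sqrt{Var(X)}$
- Let X be the r.v. with pmf
x      3    4    5
p(x)   .3   .4   .3
$E(X) = 3 \times 0.3 + 4 \times 0.4 + 5 \times 0.3 = 4.0$
$Var(X) = (3-4)^2 \times 0.3 + (4-4)^2 \times 0.4 + (5-4)^2 \times 0.3 = 0.6$
$s.d.(X) = \sqrt{0.6} \approx 0.77$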
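The worked example can be reproduced with a small helper that evaluates E[f(X)] as a weighted sum over the pmf. This Python sketch is illustrative: the function name expectation is hypothetical, and the pmf is stored as a dictionary mapping each value x to p(x).

```python
import math

# pmf of X from the slide: values 3, 4, 5 with probabilities .3, .4, .3.
pmf = {3: 0.3, 4: 0.4, 5: 0.3}

def expectation(pmf, f=lambda x: x):
    """E[f(X)] = sum over x of f(x) * p(x)."""
    return sum(f(x) * p for x, p in pmf.items())

mean = expectation(pmf)
variance = expectation(pmf, lambda x: (x - mean) ** 2)        # E[(X - E X)^2]
variance_shortcut = expectation(pmf, lambda x: x * x) - mean ** 2   # E[X^2] - (E X)^2
sd = math.sqrt(variance)

print(mean, variance, variance_shortcut, sd)
```

The two variance expressions agree up to floating-point rounding, which illustrates the identity Var(X) = E(X^2) - [E(X)]^2.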
52Expected Values of functions of Random Variables
- Let X be the r.v. with pmf
x 0 1 2 3 4
p(x) .08 .15 .45 0.27 0.05
Find E(X), Var(X), s.d.(X)
53R.V.D - Binomial
- Binomial experiment: total number of a particular outcome in a sequence of trials with only two possible outcomes.
- The Binomial r.v. X, with parameters (n,p), denoted for short by BIN(n,p), is defined by $P(X = x) = C_{x,n} \, p^x (1-p)^{n-x}$, $x = 0, 1, 2, \ldots, n$
- $X = X_1 + X_2 + \cdots + X_n$, where the $X_k$'s are independent Bernoulli r.v.s.
- $E[X] = np$ (Exercise!), $Var[X] = np(1-p)$
54R.V.D Binomial-2
Consider the outcome for a binomial experiment
with 4 trials
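For a binomial experiment with 4 trials, the whole distribution is small enough to list. The Python sketch below evaluates the BIN(n,p) probabilities with math.comb (Python 3.8 or later) and checks the mean and variance numerically. The success probability p = 0.5 is an assumed value chosen only for illustration; the slides do not specify it.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ BIN(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 4, 0.5      # p = 0.5 is an assumption for illustration

pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
mean = sum(x * q for x, q in enumerate(pmf))
variance = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))

for x, q in enumerate(pmf):
    print(x, q)
print("mean", mean, "variance", variance)
```

The computed mean and variance can be compared against np and np(1-p) for any other choice of n and p.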
55R.V.D Binomial-3
56R.V.D Binomial-4
57R.V.D Binomial-5
58R.V.D Poisson
- Poisson r.v. can be thought of as a limit of Binomial experiment where n is very large, and np approaches a limit, say $\lambda$.
- The Poisson r.v. X, with parameter $\lambda$, denoted for short by POI($\lambda$), is defined by $P(X = x) = e^{-\lambda} \lambda^x / x!$, $x = 0, 1, 2, \ldots$
- $E[X] = \lambda$ (Exercise!), $Var[X] = \lambda$
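The statement that the Poisson distribution arises as a limit of the Binomial can be explored numerically. The Python sketch below holds np fixed at an assumed value (written lam in the code, chosen as 2.0 purely for illustration) and measures the largest gap between the two probability mass functions over x = 0, ..., 10 as n grows. The function names are hypothetical.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ BIN(n, p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson r.v. with parameter lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 2.0   # assumed limiting value of n * p

def max_gap(n):
    """Largest |binomial - Poisson| probability over x = 0..10, with p = lam / n."""
    p = lam / n
    return max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(11))

gaps = {n: max_gap(n) for n in (10, 100, 1000)}
for n, gap in gaps.items():
    print(n, gap)
```

The gap shrinks as n increases with np held fixed, which is the sense in which the Binomial approaches the Poisson.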
59R.V.D Poisson-2
60R.V.D Poisson-3
61R.V.D Poisson-4
62Random Variables - Continuous
- A continuous random variable is a r.v. which takes values in an interval on the real number line. (If multivariate then on the two dimensional real plane, etc.).
- The probability distribution or probability density function (pdf) of a continuous r.v. X is defined by a function, say f(x), such that $P(a \le X \le b) = \int_a^b f(x)\,dx$
- i.e. the probability that X lies between a and b is given by the area under the graph of f(x) enclosed on the x-axis by a and b.
- If X is a continuous r.v., then for any constant c, $P(X = c) = 0$.
63Cumulative distribution functions
- The cumulative distribution function (c.d.f) of a continuous r.v. X with pdf f(x) is defined by $F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$
- The density f(x) is obtained by differentiating F(x) as a function of x.
- The expectation of a continuous r.v. X with pdf f(x) is defined by $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
- The variance of a continuous r.v. X with pdf f(x) and expectation $\mu$ is defined by $Var(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$
64R.V.C - Uniform
- The Uniform r.v. X, with parameters (a,b), denoted for short by UNIF(a,b), with $a < b$, is defined by the density $f(x) = 1/(b-a)$ if $a < x < b$, and 0 otherwise.
- $F(x) = 0$ if $x < a$; $F(x) = (x-a)/(b-a)$ if $a < x < b$; $F(x) = 1$ if $x > b$
- $E[X] = (a+b)/2$ (Exercise!), $Var[X] = ?$ (Find out!)
65R.V.C - Exponential
- The Exponential r.v. X, with parameter $\lambda$, denoted for short by EXP($\lambda$), with $\lambda > 0$, is defined by the density $f(x) = (1/\lambda)\exp(-x/\lambda)$ if $0 < x$, and 0 otherwise.
- $F(x) = 0$ if $x < 0$; $F(x) = 1 - \exp(-x/\lambda)$ if $0 < x$
- $E[X] = \lambda$ (Exercise!), $Var[X] = ?$ (Find out!)
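The expectation integrals for the Uniform and Exponential densities can be checked numerically before attempting them by hand. The Python sketch below uses a simple midpoint rule; the parameter values (a = 2, b = 5, and an exponential parameter of 1.5, written lam) are assumptions chosen only for illustration, and the helper integrate is hypothetical. It deliberately computes only the total probability and the mean, leaving the variances as the exercise the slides intend.

```python
import math

def integrate(g, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

a, b = 2.0, 5.0    # assumed UNIF(a, b) parameters
lam = 1.5          # assumed exponential parameter (the mean in this parameterisation)

def unif_pdf(x):
    return 1.0 / (b - a) if a < x < b else 0.0

def exp_pdf(x):
    return math.exp(-x / lam) / lam if x > 0 else 0.0

unif_total = integrate(unif_pdf, a, b)
unif_mean = integrate(lambda x: x * unif_pdf(x), a, b)

# The exponential tail beyond 60 * lam is negligible, so the integral is truncated there.
exp_total = integrate(exp_pdf, 0.0, 60 * lam)
exp_mean = integrate(lambda x: x * exp_pdf(x), 0.0, 60 * lam)

print(unif_total, unif_mean, exp_total, exp_mean)
```

The uniform mean comes out at the midpoint of the interval, (a+b)/2, and the exponential mean at the parameter value, up to the error of the numerical rule.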
66R.V.C Normal or Gaussian
- The Normal r.v. X, with parameters $(\mu, \sigma^2)$, denoted for short by $N(\mu, \sigma^2)$, with $\sigma^2 > 0$ and $-\infty < \mu < \infty$, is defined by the density
- $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left(-\dfrac{(x-\mu)^2}{2\sigma^2}\right)$
- $E[X] = \mu$ (Exercise!), $Var[X] = \sigma^2$ (Exercise!)
- It has a symmetric density function (about its mean). All the measures of central tendency, mean, mode, median, are the same.
- It occurs as the most common limiting distribution for averages of random variables, i.e., averages of large no. of r.v.s can be well approximated by it, for most r.v.s.
67R.V.C Normal or Gaussian-2
- If X follows $N(\mu_1, \sigma_1^2)$ and Y follows $N(\mu_2, \sigma_2^2)$, with X and Y independent, then $aX + bY$, where a, b are real constants, follows $N(a\mu_1 + b\mu_2,\ a^2\sigma_1^2 + b^2\sigma_2^2)$
- The above result can be extended to any finite number of independent Normal random variables.
- If X follows N(0, 1), then X is called a standard normal r.v., and the corresponding distribution is called a standard normal distribution.
- If X follows $N(\mu, \sigma^2)$, then $Y = (X - \mu)/\sigma$ follows N(0, 1).
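Standardisation is what makes Normal probabilities computable from a single table or function. The Python sketch below expresses the standard normal cdf through the error function math.erf and uses the transformation Y = (X - mean)/s.d. to get probabilities for a general Normal r.v. The parameter values (mean 100, s.d. 15) are assumptions for illustration, and the function names are hypothetical.

```python
import math

def std_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), via the standardised value (x - mu) / sigma."""
    return std_normal_cdf((x - mu) / sigma)

mu, sigma = 100.0, 15.0    # assumed parameters for illustration

# Probability that X falls within one standard deviation of its mean.
p_within_one_sd = normal_cdf(mu + sigma, mu, sigma) - normal_cdf(mu - sigma, mu, sigma)

print(std_normal_cdf(0.0), std_normal_cdf(1.96), p_within_one_sd)
```

Because the result depends on x only through (x - mu)/sigma, the probability of lying within one standard deviation of the mean is the same for every Normal distribution.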
68Percentiles of a continuous R.V.
1. Let p be a no. between 0 and 1. The (100p)th percentile of the distribution of a continuous r.v. X, with density f(x), denoted by $\eta(p)$, is defined by $p = \int_{-\infty}^{\eta(p)} f(x)\,dx$
69Gaussian or Normal Distribution
70Sample as Random Observables
71Parametric Inference
72Tests of Hypothesis
73Hypothesis Tests for Normal Population