Title: From histograms to probability densities (cont.)
1. LECTURE 6
- From histograms to probability densities (cont.)
- Probability density definition and computing rules
- Some important probability density function (pdf) families and their application
- Symmetric pdfs (normal, logistic, Cauchy)
- Location and scale parameters
- Fitting a normal density to data (msft.returns.5yr)
- Quantiles of distributions
2. P(data < a) = sum of relative frequencies of cells to the left of a
P(b < data < c) = sum of relative frequencies of cells between b and c
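The rule above can be sketched in R (syntax assumed compatible with the S-PLUS functions used in this course). The data are simulated stand-ins, not course data, and the cut points a, b and cc (standing for c) are hypothetical values chosen to fall on cell boundaries:

```r
# Stand-in data: 1000 simulated values (hypothetical, not from the lecture)
set.seed(1)
x <- rnorm(1000)

# Histogram cells of width 0.5; plot = FALSE returns the counts without drawing
breaks <- seq(-6, 6, by = 0.5)
h <- hist(x, breaks = breaks, plot = FALSE)
rel.freq <- h$counts / sum(h$counts)   # relative frequency of each cell

# Hypothetical cut points on cell boundaries (cc plays the role of c)
a <- -1; b <- 0; cc <- 1.5

lower <- breaks[-length(breaks)]   # left edge of each cell
upper <- breaks[-1]                # right edge of each cell

# P(data < a): add up the cells to the left of a
p.less.a <- sum(rel.freq[upper <= a])

# P(b < data < c): add up the cells between b and c
p.b.to.c <- sum(rel.freq[lower >= b & upper <= cc])
```

Because the cut points sit on cell boundaries, both sums agree with the direct proportions mean(x < a) and mean(x > b & x < cc).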
3. Probability density: smooth limit of a histogram
Hand draw during lecture
4. Probability density function (pdf)
Definition
Basic computing rules
Disjoint intervals
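For reference, the standard textbook statements behind these three headings can be written as follows. The notation (f for the density, X for the data) is an assumption and may differ from what was written on the slide:

```latex
% Definition: f is a probability density if it is nonnegative with total area 1
\[ f(x) \ge 0 \ \text{for all } x, \qquad \int_{-\infty}^{\infty} f(x)\,dx = 1 \]

% Basic computing rule: probabilities are areas under f
\[ P(a < X < b) = \int_a^b f(x)\,dx \]

% Disjoint intervals (a,b) and (c,d): the probabilities add
\[ P(a < X < b \ \text{or}\ c < X < d) = \int_a^b f(x)\,dx + \int_c^d f(x)\,dx \]
```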
5. Terminology
- The following terms all mean the same thing:
- probability density (or density for short)
- probability density function (or pdf for short)
- We say that the distribution of the data is described by the density
- We refer to the distribution of the data by various names, e.g., normal, exponential, and say things like "the data has a normal distribution" or "the data has a normal density"
6. Qualitative features of histograms/densities
- Devore and Farnum, p. 18 (show picture)
- Unimodal versus bimodal
- Symmetric (about a center/location)
- Positively skewed, negatively skewed
- Strictly positive
7. 3.4 Probability Density Families
- Symmetric pdfs
- Uniform on finite interval
- Normal (Gaussian)
- Logistic
- Cauchy
- Asymmetric pdfs (discuss in Lecture 7)
- Log-normal
- Exponential
- Weibull
8. Histogram for 1000 spins of an ideal spinner (a = 0, b = 1)
Uniform Density Family
[Figure: uniform density on the interval from a to b]
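The spinner experiment can be imitated with a short R sketch (syntax assumed compatible with S-PLUS). The uniform density has constant height 1/(b - a) on the interval from a to b; the seed and the choice of 10 cells are arbitrary assumptions:

```r
# Simulate 1000 spins of an ideal spinner: uniform on (a, b) with a = 0, b = 1
set.seed(6)
a <- 0; b <- 1
spins <- runif(1000, a, b)

# Density-scaled histogram with 10 equal cells: bar areas sum to 1
hist(spins, breaks = seq(a, b, length.out = 11), probability = TRUE,
     xlab = "spin", main = "1000 SPINS OF AN IDEAL SPINNER")

# Overlay the uniform density, which has constant height 1/(b - a)
x <- seq(a, b, .01)
lines(x, dunif(x, a, b), lwd = 2)
```

The bars scatter around the flat line at height 1, and the scatter shrinks as the number of spins grows.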
9. Comments on the uniform distribution
- It is primarily used as a textbook example, because of its simplicity, to illustrate various concepts and computations
- However, there are some important real-world applications!
- Round-off errors in numerical computing
- Random phases in communications systems
10. Three Location and Scale Families: Normal (Gaussian), Logistic, Cauchy
General form of these densities
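The usual way to write a location and scale family is sketched here. The symbols (mu for location, s for scale, f_0 for the standard density with location 0 and scale 1) are assumed notation:

```latex
\[ f(x;\mu,s) = \frac{1}{s}\, f_0\!\left(\frac{x-\mu}{s}\right), \qquad s > 0 \]
```

Changing mu slides the curve along the axis without changing its shape; changing s stretches it horizontally, and the factor 1/s keeps the total area equal to 1.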
11. Examples of Changing Location and Scale
Add pictures to slide or sketch during lecture
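One possible picture, sketched in R (syntax assumed compatible with S-PLUS) using the normal family; the particular locations and scales are arbitrary choices for illustration:

```r
x <- seq(-6, 10, .1)

# Reference curve: standard normal (location 0, scale 1)
plot(x, dnorm(x, 0, 1), type = "l", ylim = c(0, .5), xlab = "", ylab = "", lwd = 2)

# Changing location only shifts the curve: location 3, scale 1
lines(x, dnorm(x, 3, 1), lty = 2, lwd = 2)

# Changing scale only spreads the curve and lowers its peak: location 0, scale 2
lines(x, dnorm(x, 0, 2), lty = 4, lwd = 2)

title(main = "NORMAL DENSITIES WITH DIFFERENT LOCATION AND SCALE")
```

The same three calls with dlogis or dcauchy in place of dnorm show the effect for the other two families.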
12. Normal Density (pdf)
Standard normal density
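The normal density in its usual textbook form, with mu as location and sigma as scale (symbol choices assumed), followed by the standard case mu = 0, sigma = 1:

```latex
\[ f(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \]

\[ \phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \]
```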
13. Logistic Density (pdf)
Standard form
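A common way to write the logistic density, with the standard form (location 0, scale 1) first; the symbols mu and s are assumed notation:

```latex
\[ f_0(x) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^2} \]

\[ f(x;\mu,s) = \frac{1}{s}\, \frac{e^{-(x-\mu)/s}}{\left(1 + e^{-(x-\mu)/s}\right)^2} \]
```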
14. Cauchy Density (pdf)
Standard form
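The Cauchy density written the same way, standard form first (mu and s again assumed notation). Its tails fall off only like 1/x^2, far more slowly than the exponential-type tails of the normal and logistic:

```latex
\[ f_0(x) = \frac{1}{\pi\left(1 + x^2\right)} \]

\[ f(x;\mu,s) = \frac{1}{\pi s \left[1 + \left(\frac{x-\mu}{s}\right)^2\right]} \]
```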
15. Standard normal
Standard logistic
Standard Cauchy
But the above is not a fair comparison because
the densities are not matched, e.g., they do
not all have the same quartiles
16. A script to make the previous plot
x <- seq(-4, 4, .1)
plot(x, dnorm(x), type="l", ylim=c(0,.5), xlab="", ylab="", lwd=2)
# standard logistic and standard Cauchy: location 0, scale 1
lines(x, dlogis(x, 0, 1), lty=2, lwd=3)
lines(x, dcauchy(x, 0, 1), lty=4, lwd=2)
title(main="STANDARD NORMAL, LOGISTIC AND CAUCHY DENSITIES")
Nearly identical to the instructions in Computer Assignment 1. Differs by creating x so you don't have to type so much later on.
17. Standard normal (solid)
Matched logistic (dotted)
Matched Cauchy (dashed)
Tails decay at different rates
Now the relative tail thickness/heaviness is clearer
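The difference in tail heaviness can also be put in numbers. This R sketch (syntax assumed compatible with S-PLUS) compares lower-tail probabilities for the three quartile-matched distributions; the cutoffs 2, 3 and 4 are arbitrary choices:

```r
# Scales that give the logistic and Cauchy the same quartiles as the standard normal
slogis  <- qnorm(.25) / qlogis(.25)    # about .6139
scauchy <- qnorm(.25) / qcauchy(.25)   # about .6745

# Lower-tail probabilities P(X < -k) under each matched distribution
k <- c(2, 3, 4)
tails <- cbind(normal   = pnorm(-k),
               logistic = plogis(-k, 0, slogis),
               cauchy   = pcauchy(-k, 0, scauchy))
rownames(tails) <- paste("k =", k)
print(tails)
```

In every row the normal tail probability is the smallest and the Cauchy the largest, and the gap widens as k grows.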
18. A script to make the previous plot
x <- seq(-4, 4, .1)
plot(x, dnorm(x), type="l", ylim=c(0,.5), xlab="", ylab="", lwd=2)
qnorm.25 <- qnorm(.25)
qlogis.25 <- qlogis(.25)
qcauchy.25 <- qcauchy(.25)
slogis <- qnorm.25/qlogis.25
scauchy <- qnorm.25/qcauchy.25
lines(x, dlogis(x, 0, slogis), lty=2, lwd=3)
lines(x, dcauchy(x, 0, scauchy), lty=4, lwd=2)
title(main="NORMAL, LOGISTIC AND CAUCHY DENSITIES\n WITH MATCHED QUARTILES")
19. The Previous Slide
- It tells you how to make the plots so that each density has the same lower and upper quartiles, -.6745 and .6745
- But it does not explain why it works
- We will explain this a little later, when we discuss quantiles of distributions
20. A good way to get a feeling for these distributions
- Use the S-PLUS functions
- rnorm(n,0,1)
- rlogis(n,0,.6139)
- rcauchy(n,0,.6745)
- to generate samples of sizes n = 25, 50, 100, etc., and make histograms
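A minimal sketch of this exercise in R (syntax assumed compatible with S-PLUS). The seed, the 3-by-3 page layout and the plot titles are arbitrary choices, not part of the lecture:

```r
set.seed(20)
par(mfrow = c(3, 3))   # one row per sample size, one column per distribution

for (n in c(25, 50, 100)) {
  # Samples from the three quartile-matched distributions
  x.norm   <- rnorm(n, 0, 1)
  x.logis  <- rlogis(n, 0, .6139)
  x.cauchy <- rcauchy(n, 0, .6745)

  hist(x.norm,   main = paste("normal, n =", n),   xlab = "")
  hist(x.logis,  main = paste("logistic, n =", n), xlab = "")
  hist(x.cauchy, main = paste("Cauchy, n =", n),   xlab = "")
}
```

Rerunning with different seeds shows how much a histogram of only 25 points can vary, and how often the Cauchy samples contain a few extreme values that stretch the horizontal axis.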
21. Fitting Densities to Data
- This is a topic you will learn much more about later on in the course
- For now we focus on fitting a normal density in the classic way, using the sample mean and standard deviation to estimate the location and scale parameters
- We will justify this later on
- Our main goal now is to indicate how this gives you a predictive model for tail probabilities
22. Fitting a Normal Density
An unjustified rule. Just do it for now. Later we find out it is the best you can do if the data is really normally distributed.
msft.returns.5yr: Fit is quite good!
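The classic fit can be sketched in R (syntax assumed compatible with S-PLUS). Since msft.returns.5yr is not reproduced here, the vector returns below is a simulated, hypothetical stand-in; its length, mean and spread are made-up values:

```r
# Hypothetical stand-in for msft.returns.5yr (simulated, not the real data)
set.seed(22)
returns <- rnorm(60, mean = 2, sd = 10)

# Classic fit: sample mean estimates location, sample std. deviation estimates scale
mu.hat <- mean(returns)
s.hat  <- sqrt(var(returns))

# Density-scaled histogram with the fitted normal density overlaid
hist(returns, probability = TRUE, xlab = "return", main = "FITTED NORMAL DENSITY")
x <- seq(min(returns), max(returns), length.out = 200)
lines(x, dnorm(x, mu.hat, s.hat), lwd = 2)
```

With the real data in place of the stand-in, the same lines give the kind of overlay used to judge whether the fit is good.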
23. The Predictive Normal Distribution
msft.returns.5yr
P(lose > 15) = .02
P(gain > 15) = .09
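Once location and scale are estimated, tail probabilities like these come straight from the normal cumulative probability function. In this R sketch (syntax assumed compatible with S-PLUS) the values of mu.hat and s.hat are made up for illustration and are not the fitted values for msft.returns.5yr:

```r
# Hypothetical fitted values (illustrative only, not the fit to msft.returns.5yr)
mu.hat <- 3
s.hat  <- 9

# Predicted probability of a loss worse than 15: area to the left of -15
p.lose <- pnorm(-15, mu.hat, s.hat)

# Predicted probability of a gain better than 15: area to the right of 15
p.gain <- 1 - pnorm(15, mu.hat, s.hat)
```

Because the fitted location is positive, the upper tail beyond 15 carries more probability than the lower tail beyond -15, which is the same kind of asymmetry shown on the slide.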
24. Comments on Fitting a Normal Density
- The method is time-honored and works very well when the data is symmetric and there are no outliers
- But you can get burned if the data contains outliers
- Alternative symmetric densities such as the logistic and Cauchy are often useful models when outliers are present
- But do not use the sample mean and std. deviation!
- You need to estimate location and scale differently
25. Quantiles of a Distribution (definition)
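The standard definition for a continuous density can be written as follows. The notation x(p) follows the axis label "quantile function x(p)" used in the plotting scripts; f for the density and X for the data are assumed symbols:

```latex
% The p-th quantile x(p) is the point with probability p to its left
\[ P\bigl(X \le x(p)\bigr) = \int_{-\infty}^{x(p)} f(t)\,dt = p, \qquad 0 < p < 1 \]
```

The lower quartile, median and upper quartile are x(.25), x(.5) and x(.75).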
26. Shapes of quantile functions
p <- seq(.01, .99, .01)
q <- qunif(p, -1, 2)
plot(p, q, type="l", xlab="probability p", ylab="quantile function x(p)")
title(main="QUANTILES OF A UNIFORM DENSITY ON (-1,2)")
27.
p <- seq(.01, .99, .01)
q <- qnorm(p, 10, 2)
plot(p, q, type="l", xlab="probability p", ylab="quantile function x(p)")
title(main="QUANTILES OF A NORMAL DISTRIBUTION N(10,2)")
abline(10, 0, lty=4)
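Two properties of quantile functions are worth checking numerically, and the second is what makes the matched-quartile plot work. This is a sketch in R (syntax assumed compatible with S-PLUS); the variable names are illustrative:

```r
p <- c(.25, .5, .75)

# The quantile function inverts the cumulative probability: P(X <= x(p)) = p
q <- qnorm(p, 10, 2)
p.back <- pnorm(q, 10, 2)        # recovers p

# Location-scale rule: x(p) = location + scale * (standard quantile)
q.rule <- 10 + 2 * qnorm(p)      # same values as q

# With location 0 the quantiles are proportional to the scale, so choosing
# slogis = qnorm(.25)/qlogis(.25) gives the logistic the normal's lower quartile
slogis <- qnorm(.25) / qlogis(.25)
q.logis.25 <- qlogis(.25, 0, slogis)   # equals qnorm(.25)
```

By symmetry about 0 the upper quartiles then match as well, and the same argument with qcauchy gives the Cauchy scale.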
28. Exponential Density (pdf)
The most important asymmetric density! A model
for failure times, inter-arrival times
Standard form
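The exponential density is commonly written as below, with the standard form first. The scale parametrization with symbol s is an assumption; many texts instead use a rate parameter equal to 1/s:

```latex
\[ f_0(x) = e^{-x} \ \text{for } x \ge 0, \qquad f_0(x) = 0 \ \text{for } x < 0 \]

\[ f(x;s) = \frac{1}{s}\, e^{-x/s} \ \text{for } x \ge 0, \qquad s > 0 \]
```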