Bernoulli Trials - PowerPoint PPT Presentation

Slides: 16
Provided by: Wombat
Transcript and Presenter's Notes

Title: Bernoulli Trials


1
Bernoulli Trials
A Bernoulli trial is an experiment with only two
outcomes:
  • Success: this event occurs with probability p
  • Failure: this event occurs with probability
    $q = 1 - p$
  • A coin flip is the classic example of a Bernoulli
    trial
  • In the context of finding a thymine (T) or a
    pyrimidine (C or T) in DNA, it could mean:
  • T is success, any other nucleotide is failure
  • A pyrimidine is success, a purine is failure

When we refer to trials in the plural, it is
assumed they are independent and that they have
the same probabilities
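As a rough illustration of the two codings above, the sketch below scores each base of a sequence as one Bernoulli trial. The sequence and the helper name bernoulli_outcomes are invented for this example, and real DNA positions are not guaranteed to be independent trials.

```python
def bernoulli_outcomes(sequence, success_bases):
    """Score each base as one Bernoulli trial: 1 = success, 0 = failure."""
    return [1 if base in success_bases else 0 for base in sequence.upper()]


# Hypothetical 10-base sequence, invented for illustration only.
seq = "ATGCTTACGT"

t_trials = bernoulli_outcomes(seq, {"T"})         # T is success
pyr_trials = bernoulli_outcomes(seq, {"C", "T"})  # a pyrimidine is success

# Observed success fraction under each coding.
t_fraction = sum(t_trials) / len(t_trials)
pyr_fraction = sum(pyr_trials) / len(pyr_trials)
print(t_fraction, pyr_fraction)
```

The same sequence gives different success counts depending on which outcome is labelled a success, which is the point of the two codings.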
2
Geometric Distribution
The geometric distribution follows immediately
from the idea of conducting multiple Bernoulli
trials
Q: What is the probability that it takes k trials
to get a success?
  • Before we can succeed at trial k, we must first
    have had k-1 failures
  • Each failure occurred with probability q, so
    there is a term $q^{k-1}$
  • Finally, a single success occurs with probability
    p, so there is a term $p^1$

The trials are mutually independent, so we can
multiply these terms and write
$\Pr(X = k) = q^{k-1} p$
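The geometric probability can be evaluated directly. This is a minimal sketch; the function name geometric_pmf and the value p = 0.25 (T assumed to occur with probability 1/4 at each position) are assumptions made for illustration.

```python
def geometric_pmf(k, p):
    """Pr(X = k): k - 1 failures followed by one success."""
    if k < 1:
        raise ValueError("k counts trials, so it must be at least 1")
    q = 1.0 - p
    return q ** (k - 1) * p


p = 0.25  # assumed: T occurs with probability 1/4 at each position
for k in range(1, 6):
    print(k, geometric_pmf(k, p))

# The probabilities over all k sum to 1 (checked here up to a large cutoff).
total = sum(geometric_pmf(k, p) for k in range(1, 500))
print(total)
```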
3
Geometric Distribution
Geometric distribution for p = 1/5
Image from encyclopedia of math
$\Pr(X = k) = q^{k-1} p$
What is the probability that the random variable
X takes on the value k? X represents the number
of trials required to get a success
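One way to check the geometric formula for p = 1/5 is to simulate it: repeat Bernoulli trials until the first success and tally how many trials that took. The sketch below uses only the standard library; the seed, the number of runs, and the helper name trials_until_success are arbitrary choices for this example.

```python
import random
from collections import Counter


def trials_until_success(rng, p):
    """Run Bernoulli trials until the first success; return the trial count."""
    k = 1
    while rng.random() >= p:  # failure, which happens with probability 1 - p
        k += 1
    return k


p = 1 / 5
rng = random.Random(42)  # fixed seed so the run is repeatable
n_runs = 50_000
counts = Counter(trials_until_success(rng, p) for _ in range(n_runs))

# Compare the simulated frequency of each k with q**(k - 1) * p.
for k in range(1, 6):
    observed = counts[k] / n_runs
    expected = (1 - p) ** (k - 1) * p
    print(k, round(observed, 4), round(expected, 4))
```

The observed and expected columns should agree to about two decimal places at this number of runs.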
4
Binomial Distribution
The binomial distribution also arises naturally
from the idea of conducting multiple Bernoulli
trials
Q: What is the probability that we will get k
successes in n trials?
  • Again, trials are mutually independent, so for
    one particular sequence of k successes and n-k
    failures we can write
  • $q^{n-k} p^k$

BUT WAIT! We're not done!
5
Binomial Distribution
We also need to consider how many different ways
we can generate those k successes from n trials.
Here we show all the ways you can get 3 Ts in a
total of 5 nucleotides (the symbol V here is the
non-T nucleotide ambiguity code, a failure in
our Bernoulli trials): 'TTTVV', 'TTVTV',
'TTVVT', 'TVTTV', 'TVTVT', 'TVVTT', 'VTTTV',
'VTTVT', 'VTVTT', 'VVTTT'
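The list of arrangements can be generated rather than written out by hand. This sketch picks the positions of the 3 Ts with itertools.combinations and fills the remaining positions with V; the variable names are chosen for this example.

```python
from itertools import combinations
from math import comb

n, k = 5, 3
arrangements = []
for t_positions in combinations(range(n), k):
    # Put T at the chosen positions and V (any non-T base) everywhere else.
    arrangements.append("".join("T" if i in t_positions else "V" for i in range(n)))

print(arrangements)
# The number of arrangements equals the binomial coefficient "n choose k".
print(len(arrangements), comb(n, k))
```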
6
Binomial Distribution
Putting it all together:
$\Pr(X = k) = \binom{n}{k} q^{n-k} p^k$
What is the probability that the random variable
X takes on the value k? X represents the number
of trials, k out of a total of n, that were
successes
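The binomial probability is the number of arrangements multiplied by the probability of any one arrangement. A minimal sketch, assuming for illustration that each position is T with probability 1/4; the function name binomial_pmf is made up here.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Pr(X = k): probability of exactly k successes in n independent trials."""
    q = 1.0 - p
    return comb(n, k) * q ** (n - k) * p ** k


# Assumed for illustration: each position is T with probability 1/4.
prob_3_of_5 = binomial_pmf(3, 5, 0.25)
print(prob_3_of_5)

# Sanity check: the probabilities for k = 0..n add up to 1.
total = sum(binomial_pmf(k, 5, 0.25) for k in range(6))
print(total)
```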
One small fly in the ointment: DNA has four
bases, not just two, so really we want a multinomial
distribution, a generalization of the binomial
distribution. But close enough for government
work, eh?
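For reference, the multinomial probability of seeing counts k_1, ..., k_m in n trials is n! / (k_1! ... k_m!) times the product of p_i^k_i. The sketch below evaluates it for four bases, assuming equal base probabilities of 1/4 purely for illustration; the function name multinomial_pmf is hypothetical.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of observing the given count for each category."""
    n = sum(counts)
    ways = factorial(n)
    for c in counts:
        ways //= factorial(c)  # the division is exact at every step
    return ways * prod(p ** c for p, c in zip(probs, counts))


# Assumed equal base frequencies, in the order A, C, G, T.
probs = [0.25, 0.25, 0.25, 0.25]
# One A, one C, one G and two Ts among 5 nucleotides.
four_base = multinomial_pmf([1, 1, 1, 2], probs)
print(four_base)

# With only two categories (non-T, T) it reduces to the binomial case.
two_category = multinomial_pmf([2, 3], [0.75, 0.25])
print(two_category)
```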
7
Binomial Distribution
Image from zoonek2.free.fr
n = 100, p = 0.5
What is the expected value of these
distributions? Normal curve with same mean and SD
drawn over top
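The expected value of a binomial distribution is np and its standard deviation is sqrt(npq), so for n = 100 and p = 0.5 the mean is 50 and the SD is 5. The sketch below confirms this numerically from the probabilities themselves and compares the height of the binomial at the mean with a normal curve of the same mean and SD. The helper name normal_pdf is made up for this example.

```python
from math import comb, exp, pi, sqrt

n, p = 100, 0.5
q = 1 - p
pmf = [comb(n, k) * q ** (n - k) * p ** k for k in range(n + 1)]

# Mean and variance computed directly from the probabilities.
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
sd = sqrt(variance)
print(mean, sd)  # compare with n * p and sqrt(n * p * q)


def normal_pdf(x, mu, sigma):
    """Density of a normal curve with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))


# Height of the binomial bar and of the overlaid normal curve at the mean.
print(pmf[50], normal_pdf(50, mean, sd))
```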
8
Poisson Distribution
Another common limiting case of the binomial is when
we have large N and small p such that the
expected (mean) value Np is a moderate number
(between 0 and 5-10). Then the distribution is
close to a Poisson distribution
Plot labels: Binomial(10, 0.1), Poisson(1)
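The two plotted cases can be compared term by term. This sketch tabulates Binomial(10, 0.1) next to a Poisson distribution with the same mean, using only the standard library; the variable names are chosen for this example.

```python
from math import comb, exp, factorial

n, p = 10, 0.1
lam = n * p  # Poisson mean matched to the binomial mean

binom_probs = [comb(n, k) * (1 - p) ** (n - k) * p ** k for k in range(n + 1)]
poisson_probs = [exp(-lam) * lam ** k / factorial(k) for k in range(n + 1)]

for k in range(6):
    print(k, round(binom_probs[k], 4), round(poisson_probs[k], 4))

# Largest pointwise disagreement between the two distributions.
max_gap = max(abs(b - q) for b, q in zip(binom_probs, poisson_probs))
print(max_gap)
```

With N as small as 10 the agreement is only approximate; it tightens as N grows and p shrinks with Np held fixed.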
9
Characteristics of Poisson
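Single parameter (mean): $\lambda = Np$
$P(k \mid \lambda) = e^{-\lambda} \lambda^{k} / k!$
Variance = Mean = $\lambda$, SD = $\sqrt{\lambda}$
For $\lambda > 10$, the Normal approximation
$N(\lambda, \lambda)$ is fine
Plot labels: $\lambda = 3$, $\lambda = 7$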
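The claim that the Poisson variance equals its mean can be checked numerically for the two plotted cases (means 3 and 7). This is a sketch under the assumption that truncating the sum at k = 200 loses nothing noticeable; poisson_pmf and results are names made up for the example.

```python
from math import exp, lgamma, log, sqrt


def poisson_pmf(k, lam):
    """P(k | lam) = exp(-lam) * lam**k / k!, computed in log space."""
    return exp(-lam + k * log(lam) - lgamma(k + 1))


results = {}
for lam in (3, 7):
    # Terms beyond k = 200 are negligible for these small means.
    pmf = [poisson_pmf(k, lam) for k in range(200)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
    results[lam] = (mean, variance)
    print(lam, round(mean, 6), round(variance, 6), round(sqrt(variance), 6))
```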
10
Scientific Computing in Python
SciPy, NumPy, Matplotlib ... and more
http://scipy.org
11
Scientific Computing in Python
NumPy implements very efficient low-level
n-dimensional array processing and other basic
numerical routines
Our interest in NumPy is mostly restricted to the
fact that both the SciPy library and Matplotlib
depend on NumPy
http://docs.scipy.org/doc/scipy-0.13.0/reference/
12
Scientific Computing in Python
SciPy is the name for both the whole ecosystem and a
specific scientific computing library!
The SciPy library has many numerical algorithms,
but also domain-specific toolboxes. Our interest
is primarily in the statistics toolbox
http://docs.scipy.org/doc/scipy-0.13.0/reference/
13
Distributions in scipy.stats
SciPy supports both continuous and discrete
random variables and associated distributions.
Each distribution in turn supports a number of
methods, e.g.
  • rvs: Random variates
  • pdf: Probability Density Function (discrete
    distributions provide pmf, the Probability Mass
    Function, instead)
  • cdf: Cumulative Distribution Function
  • sf: Survival Function (1 - CDF)
  • ppf: Percent Point Function (inverse of CDF)
  • isf: Inverse Survival Function (inverse of SF)
  • stats: Return mean, variance, (Fisher's) skew, or
    (Fisher's) kurtosis
  • moment: non-central moments of the distribution

http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html
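A short sketch of these methods on a small binomial distribution, assuming SciPy is installed. The parameters n = 10 and p = 0.5 and the variable names are arbitrary choices for this example; note that the discrete distribution uses pmf where a continuous one such as norm uses pdf.

```python
from scipy.stats import binom, norm

rv = binom(10, 0.5)  # "frozen" distribution with n and p fixed

sample = rv.rvs(size=5, random_state=0)  # random variates
pmf_5 = rv.pmf(5)                        # Pr(X = 5)
cdf_5 = rv.cdf(5)                        # Pr(X <= 5)
sf_5 = rv.sf(5)                          # Pr(X > 5) = 1 - CDF
median = rv.ppf(0.5)                     # smallest k with CDF(k) >= 0.5
upper_half = rv.isf(0.5)                 # inverse of the survival function
mean, var = rv.stats(moments="mv")       # mean and variance
second_moment = rv.moment(2)             # non-central moment E[X**2]

# A continuous distribution exposes a density (pdf) instead of a pmf.
density_at_zero = norm.pdf(0.0)

print(sample, pmf_5, cdf_5, sf_5, median, upper_half)
print(float(mean), float(var), second_moment, density_at_zero)
```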
14
Distributions in Python
The scipy.stats library supports a rich
collection of distributions and their methods.
One example:

    from scipy.stats import binom

    n, p = 100, 0.5
    myList = []
    random_var = binom(n, p)  # binomial distribution with n and p fixed
    for k in range(100):  # range replaces Python 2's xrange
        # .pmf is the probability mass function
        myList.append(random_var.pmf(k))

myList here will contain the probabilities
associated with the first 100 values of k (0
through 99), and if plotted would recapitulate
the earlier binomial distribution histogram
http://docs.scipy.org/doc/scipy/reference/stats.html
15
The matplotlib python library
A very powerful tool for professional-quality
plots
Many usage examples are given in the documentation
http://matplotlib.org/
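As one possible usage example, the sketch below draws the binomial probabilities for n = 100 and p = 0.5 as a bar chart. It assumes both SciPy and Matplotlib are installed; the Agg backend, the axis labels, and the variable names are choices made for this example rather than anything prescribed by the slides.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from scipy.stats import binom

n, p = 100, 0.5
k_values = list(range(n + 1))
probabilities = binom.pmf(k_values, n, p)  # one probability per value of k

fig, ax = plt.subplots()
bars = ax.bar(k_values, probabilities)
ax.set_xlabel("k (number of successes)")
ax.set_ylabel("Pr(X = k)")
ax.set_title("Binomial distribution, n = 100, p = 0.5")
```

Calling fig.savefig with a file name writes the figure to disk; in an interactive session, dropping the Agg line and calling plt.show() displays it instead.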