Title: Genomics, Computing, Economics
1 Genomics, Computing, Economics
10 AM Tue 13-Feb
Harvard Biophysics 101 (MIT-OCW Health Sciences and Technology 508)
http://openwetware.org/wiki/Harvard:Biophysics_101/2007
2 Binomial, Poisson, Normal
3 Binomial frequency distribution as a function of $X \in \{0, 1, \dots, n\}$
p and q: $0 \le p \le q \le 1$, $q = 1 - p$; two types of object or event.
Factorials: $0! = 1$, $n! = n\,(n-1)!$
Combinatorics (C = number of subsets of size X possible from a set of total size n): $C(n,X) = \dfrac{n!}{X!\,(n-X)!}$
$B(X) = C(n,X)\,p^X q^{n-X}$, $\mu = np$, $\sigma^2 = npq$, $(p+q)^n = \sum_X B(X) = 1$
$B(X = 350,\ n = 700,\ p = 0.1) = 1.53148 \times 10^{-157}$
= PDF[BinomialDistribution[700, 0.1], 350] (Mathematica)
0.00 = BINOMDIST(350, 700, 0.1, 0) (Excel)
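The binomial term for X = 350, n = 700, p = 0.1 is far smaller than most spreadsheet displays will show, so it helps to evaluate it in log space. The following is a minimal Python sketch, not the course's own code; the function names `log_binomial_pmf` and `binomial_pmf` are illustrative, and it assumes 0 < p < 1.

```python
import math

def log_binomial_pmf(x, n, p):
    """Natural log of B(X) = C(n, X) p^X q^(n-X), with q = 1 - p (assumes 0 < p < 1)."""
    q = 1.0 - p
    # lgamma(k + 1) equals ln(k!), so ln C(n, X) is computed without huge integers.
    log_c = math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    return log_c + x * math.log(p) + (n - x) * math.log(q)

def binomial_pmf(x, n, p):
    """B(X) itself; terms too small for a double underflow to 0.0 rather than raising."""
    return math.exp(log_binomial_pmf(x, n, p))

if __name__ == "__main__":
    n, p = 700, 0.1
    terms = [binomial_pmf(x, n, p) for x in range(n + 1)]
    print("B(350):", terms[350])
    print("sum of B(X):", sum(terms))                            # should be close to 1
    print("mean:", sum(x * b for x, b in enumerate(terms)))      # should be close to np
```

The printed B(350) can be compared with the value quoted on the slide, and the sum and mean with $(p+q)^n = 1$ and $\mu = np$.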
4 Poisson frequency distribution as a function of $X \in \{0, 1, 2, \dots\}$
$P(X) = P(X-1)\,\mu / X = \mu^X e^{-\mu} / X!$, $\sigma^2 = \mu$
n large and p small: $P(X) \approx B(X)$ with $\mu = np$
For example, estimating the expected number of positives in a library of a given size (cDNAs, genomic clones, combinatorial chemistry, etc.).
X = number of hits. Zero-hit term = $e^{-\mu}$
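The recurrence $P(X) = P(X-1)\,\mu/X$ starting from the zero-hit term $e^{-\mu}$ can be sketched in a few lines of Python. The library size and per-clone hit probability below (n = 100000, p = 3e-5) are hypothetical numbers chosen only for illustration, as are the function names.

```python
import math

def poisson_terms(mu, x_max):
    """Return [P(0), ..., P(x_max)] using P(0) = e^-mu and P(X) = P(X-1) * mu / X."""
    terms = [math.exp(-mu)]
    for x in range(1, x_max + 1):
        terms.append(terms[-1] * mu / x)
    return terms

def binomial_pmf(x, n, p):
    """Exact binomial term, used here only to check the large-n, small-p approximation."""
    return math.comb(n, x) * p**x * (1.0 - p) ** (n - x)

if __name__ == "__main__":
    n, p = 100_000, 3e-5          # hypothetical library size and hit probability
    mu = n * p                    # expected number of hits
    poisson = poisson_terms(mu, 10)
    print("zero-hit term e^-mu:", poisson[0])
    for x in range(0, 11, 2):
        print(x, poisson[x], binomial_pmf(x, n, p))
```

With these assumed numbers the zero-hit term is $e^{-3}$, so about 5% of such libraries would be expected to contain no positive at all.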
5 Normal frequency distribution as a function of $X \in (-\infty, \infty)$
$Z = (X - \mu)/\sigma$: normalized (standardized) variable
$N(X) = \exp(-Z^2/2) / (2\pi\sigma^2)^{1/2}$: probability density function
npq large: $N(X) \approx B(X)$
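A quick numerical check of the claim that the normal density approximates the binomial when npq is large, reusing the n = 700, p = 0.1 case from the binomial slide. This is a sketch under those assumptions; the helper names are illustrative.

```python
import math

def normal_pdf(x, mu, sigma):
    """N(X) = exp(-Z^2 / 2) / (sigma * sqrt(2 pi)), with Z = (X - mu) / sigma."""
    z = (x - mu) / sigma
    return math.exp(-z * z / 2.0) / (sigma * math.sqrt(2.0 * math.pi))

def binomial_pmf(x, n, p):
    """B(X) evaluated in log space so that large n does not overflow (assumes 0 < p < 1)."""
    log_c = math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    return math.exp(log_c + x * math.log(p) + (n - x) * math.log(1.0 - p))

if __name__ == "__main__":
    n, p = 700, 0.1
    mu = n * p                               # mean np
    sigma = math.sqrt(n * p * (1.0 - p))     # standard deviation sqrt(npq)
    for x in (50, 60, 70, 80, 90):
        print(x, binomial_pmf(x, n, p), normal_pdf(x, mu, sigma))
```

The agreement is closest near the mean and degrades in the tails, where the skew of the binomial (p far from 0.5) becomes visible.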
6 Mean, variance, linear correlation coefficient
Expectation E (rth moment) of a random variable X for any distribution f(X): $E(X^r) = \sum_X X^r f(X)$
First moment = mean: $\mu = E(X)$
Variance $\sigma^2$ and standard deviation $\sigma$: $\sigma^2 = E[(X-\mu)^2]$
Pearson correlation coefficient: $C = \mathrm{cov}(X,Y)/(\sigma_X \sigma_Y)$, where $\mathrm{cov}(X,Y) = E[(X-\mu_X)(Y-\mu_Y)]$
Independent X, Y implies C = 0, but C = 0 does not imply independent X, Y (e.g. $Y = X^2$).
P = TDIST(C*sqrt((N-2)/(1-C^2))) with dof = N-2 and two tails, where N is the sample size.
www.stat.unipg.it/IASC/Misc-stat-soft.html
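The definitions above translate directly into code. The sketch below computes the Pearson coefficient C and the t statistic $C\sqrt{(N-2)/(1-C^2)}$ that the TDIST formula takes as its argument; it stops short of the P value because the Python standard library has no t-distribution function. The sample data are made up for illustration.

```python
import math

def mean(values):
    """First moment of the sample."""
    return sum(values) / len(values)

def pearson(xs, ys):
    """C = cov(X, Y) / (sigma_X * sigma_Y), using population (1/N) moments throughout."""
    mx, my = mean(xs), mean(ys)
    n = len(xs)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

def t_statistic(c, n):
    """t = C * sqrt((N - 2) / (1 - C^2)), to be referred to a t distribution with N - 2 dof."""
    return c * math.sqrt((n - 2) / (1.0 - c * c))

if __name__ == "__main__":
    xs = [-3, -2, -1, 0, 1, 2, 3]
    # Y is fully determined by X, yet the linear correlation vanishes.
    print("C for Y = X^2:", pearson(xs, [x * x for x in xs]))

    # Hypothetical noisy, roughly linear data.
    us = [1, 2, 3, 4, 5, 6, 7]
    vs = [1, 3, 2, 5, 4, 7, 6]
    c = pearson(us, vs)
    print("C:", c, "t:", t_statistic(c, len(us)))
```

The first example illustrates the slide's point that C = 0 does not imply independence.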
7 One form of HIV-1 Resistance
8 Association test for CCR-5 HIV resistance
Samson et al. Nature 1996; 382:722-5
9 Association test for CCR-5 HIV resistance
Samson et al. Nature 1996; 382:722-5
10 But what if we test more than one locus?
The future of genetic studies of complex human diseases. Ref
(Note: above graphs are active spreadsheets -- just click)
GRR = Genotypic relative risk
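One way to see why testing more than one locus matters: if m loci are each tested at significance level alpha, and the tests are assumed independent, the chance of at least one false positive is $1 - (1-\alpha)^m$. The sketch below uses that formula together with the Bonferroni correction (alpha / m), which is a standard remedy chosen here for illustration and is not named on the slide. The values of alpha and m are hypothetical.

```python
def family_wise_error(alpha, m):
    """Probability of at least one false positive among m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m

def bonferroni_threshold(alpha, m):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / m

if __name__ == "__main__":
    alpha = 0.05                          # hypothetical per-test significance level
    for m in (1, 10, 100, 100_000):       # hypothetical numbers of loci tested
        print(m, family_wise_error(alpha, m), bonferroni_threshold(alpha, m))
```

Under these assumptions, 100 loci tested at 0.05 each almost guarantee a spurious hit unless the per-test threshold is tightened.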
11 Class outline
(1) Topic priorities for homework since last class
(2) Quantitative exercises so far: psycho-statistics, combinatorials, exponential/logistic, bits, association and multi-hypotheses
(3) Project-level presentation and discussion
(4) Discuss communication/presentation tools
Spontaneous chalkboard discussions of t-test, genetic code, non-coding RNAs, and predicting deleteriousness of various mutation types.