Title: Maximum Likelihood
1Maximum Likelihood
- L(hypothesis | data) = Pr(data | hypothesis)
- Allows you to solve problems and test hypotheses
that would be extremely difficult in any other way
2Example
- What proportion of this class has shoplifted an
item worth more than $10?
- Flip a coin
- Don't tell ANYONE the result
- If heads, answer heads
- If tails, answer heads if you've shoplifted
something, tails otherwise
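The coin-flip protocol above lets the class-wide proportion be estimated by maximum likelihood without anyone revealing their own answer. The following is a minimal sketch, assuming a fair coin, so that Pr(answer is "heads") = 0.5 + 0.5 p, where p is the proportion who have shoplifted. The function names and the counts (62 "heads" answers out of 100 students) are hypothetical, chosen only for illustration.

```python
import math

def log_likelihood(p, n_heads, n_total):
    """Binomial log-likelihood of proportion p (constant term dropped).

    Assumes a fair coin, so Pr(answer is "heads") = 0.5 + 0.5 * p.
    """
    q = 0.5 + 0.5 * p
    n_tails = n_total - n_heads
    ll = n_heads * math.log(q)
    if n_tails > 0:
        ll += n_tails * math.log(1.0 - q)
    return ll

def mle_proportion(n_heads, n_total):
    """Closed-form MLE: solve 0.5 + 0.5 * p = n_heads / n_total, clipped to [0, 1]."""
    p_hat = 2.0 * n_heads / n_total - 1.0
    return min(1.0, max(0.0, p_hat))

def grid_mle(n_heads, n_total, steps=1000):
    """Brute-force check: the grid value of p that maximizes the log-likelihood."""
    grid = [i / steps for i in range(steps)]  # p = 1 is excluded to avoid log(0)
    return max(grid, key=lambda p: log_likelihood(p, n_heads, n_total))

# Hypothetical class: 100 students, 62 of whom answer "heads".
p_hat = mle_proportion(62, 100)
p_grid = grid_mle(62, 100)
print(round(p_hat, 3), round(p_grid, 3))
```

The closed form and the grid search should agree; the grid version is there only to show that the estimate really is the value of p that makes the observed answers most probable.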
3Pseudoreplication
- The error that occurs when samples are not
independent, but are treated as though they are
4Example: The Transylvania effect
A study of 130,000 calls for police assistance
in 1980 found that they were more likely than
chance to occur during a full moon.
Problem: There may have been 130,000 calls in
the data set, but there were only 13 full moons
in 1980. These data are not independent.
6Pseudoreplication
- We are making a false claim about the number of
independent samples in our data
- Very common mistake in biology
- Easiest solution: use the average of all the
pseudoreplicates
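The averaging fix can be sketched in a few lines: collapse the repeated measurements to one mean per independent unit, then analyse those means. This is only an illustration; the plant and leaf data and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def average_pseudoreplicates(measurements):
    """Collapse (unit, value) pairs to one mean value per independent unit."""
    by_unit = defaultdict(list)
    for unit, value in measurements:
        by_unit[unit].append(value)
    return {unit: mean(values) for unit, values in by_unit.items()}

# Hypothetical data: three plants (the independent units), several leaves each.
leaves = [
    ("plant_A", 4.0), ("plant_A", 5.0), ("plant_A", 6.0),
    ("plant_B", 7.0), ("plant_B", 9.0),
    ("plant_C", 3.0),
]
unit_means = average_pseudoreplicates(leaves)
print(unit_means)       # one value per plant
print(len(unit_means))  # sample size for the analysis: 3 plants, not 6 leaves
```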
7Very small samples and assumptions
- Question from the class
- Say there's a test which you desire to carry
out which is expensive, and therefore you can
afford only 2 treatments, each with two
replicates. How would we go about analysing any
difference, because our sample would be so small
that we wouldn't be able to know if our data
followed a normal distribution, right? And would
these tests be worth carrying out, since they
would have pretty low power?
- Answer: most scientists will just proceed with
the test
- Interpret the results as if our assumptions are
true (and we have no idea), then
8Very small samples and assumptions
- Example: does the Earth have more species of
living things than other planets in the solar
system?
- Data: Earth = 10,000,000-100,000,000
- Mercury, Venus, Mars, Jupiter, Saturn, Uranus,
Neptune = 0 (as far as we know)
9Hypothesis testing
- Null hypotheses are usually very simple, and
often known beforehand to be false
- You will eventually reject them if you have a big
enough sample size
10Example
- Study on logging
- H0: The density of large trees is greater in
unlogged versus logged areas
11Fewer trees
12Statistical significance ≠ Biological importance
- Statistically significant means P < 0.05
- But it does not necessarily mean important!
- Likewise, nonsignificant results can be
biologically important
- It's always useful to estimate a parameter or
effect size, with a confidence interval
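Reporting an effect size with its confidence interval can be sketched as follows. This assumes a simple difference between two group means and a large-sample normal approximation (z = 1.96 for roughly 95% coverage); with small samples a t critical value would give a wider interval. The measurements and the function name are hypothetical.

```python
import math
from statistics import mean, variance

def mean_difference_ci(a, b, z=1.96):
    """Difference in means with an approximate confidence interval.

    Uses the unpooled standard error and a normal critical value z,
    which is only a rough approximation for small samples.
    """
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return diff, diff - z * se, diff + z * se

# Hypothetical measurements from two groups.
group_a = [10.1, 9.8, 10.4, 10.0, 9.9, 10.3, 10.2, 9.7]
group_b = [9.6, 9.9, 9.5, 10.0, 9.4, 9.8, 9.7, 9.3]
diff, low, high = mean_difference_ci(group_a, group_b)
print(f"difference = {diff:.2f}, approx. 95% CI = ({low:.2f}, {high:.2f})")
```

The interval shows how large or small the true effect could plausibly be, which a P-value alone does not convey.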
13Examples
- Some studies of thousands of children have found
statistically significant associations of IQ with
birth order
- These differences are on the order of 1-2 IQ
points
- Such differences are not biologically important
for individuals, and can't explain why your
sister is smarter than you!
14Examples
- Large study of hormone replacement therapy showed
no significant benefit of HRT to post-menopausal
women
- Confidence interval for the effect size suggested
that any possible undetected effect is likely to
be extremely small
15Correlation does not require causation
16Correlation and Causation
Ice cream
Violent crime
Hot weather
17Data for many countries
18Confounding variables
- Variables that mask or distort the association
between measured variables in a study
- Two approaches
- Try to measure them all
- Do an experiment
19Make a Plan
- Develop a clear statement of the question
- List possible outcomes
- Develop an experimental plan
- Keep the design as simple as possible
- Check for common design problems
- Is sample size big enough?
- Discuss with other people!
20The importance of controls
- Placebo effect: an improvement in a medical
condition that results from the psychological
effects of medical treatment
- Most people get better over time
- Humans like to please others, including their
doctors
- Benefits of doctors beyond drugs
- Direct psychological effects on health
21The importance of controls
- Well-documented for pain relief
- Up to 40% of people report improvement in pain
when given sugar pills
- Drugs and treatments must be analyzed in this
context
22Head On stick of wax
23"I'm addicted to placebos. I could quit but it
wouldn't matter."
Steven Wright
24Mistakes
- Two types of mistakes
- Experimental mistakes
- Statistical mistakes (Type III error)
26Experimental Mistakes
28Statistical Mistakes
- 1/3 to 1/2 of scientific papers that use
statistics make at least minor mistakes
- 8% make major mistakes, enough to alter the
conclusions of the paper
- Be careful when reading papers
- Be careful with your own work!
29Data dredging
- The process of carrying out statistical tests on
your data until you come up with a statistically
significant result.
30P 0.05
second digit
32Beware multiple comparisons
Probability of at least one Type I error in N tests:
1 - (1 - α)^N
For 20 tests at α = 0.05, the probability of at least one
Type I error is about 64%.
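The formula above is easy to evaluate directly. The sketch below assumes the tests are independent and that every null hypothesis is true; the function name and the list of test counts are illustrative only.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one Type I error in n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

# Per-test significance level of 0.05, for an increasing number of tests.
for n in (1, 5, 20, 100):
    print(n, round(familywise_error_rate(0.05, n), 3))
```

The rate climbs quickly with the number of tests, which is why a single significant result out of many comparisons is weak evidence.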
33Example - ESP
34Six or more correct answers: you have ESP!
35Bonferroni correction
Anyone in the class have 8 or more correct?
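A Bonferroni correction simply divides the significance level by the number of tests, so that the chance of any false positive across the whole set stays at or below the original level. A minimal sketch, with hypothetical P-values and function names:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after a Bonferroni correction."""
    return alpha / n_tests

def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag which P-values remain significant once corrected for multiple tests."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]

# Hypothetical P-values from 5 tests; four are below 0.05 before correction.
p_values = [0.001, 0.012, 0.03, 0.04, 0.2]
flags = significant_after_bonferroni(p_values)
print(bonferroni_threshold(0.05, len(p_values)))
print(flags)
```

The stricter threshold is the same idea as raising the bar for "ESP" when a whole class is tested at once.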
36Garbage-in, garbage-out
- Small P-values do not rescue a poor measurement
- Example: IQ test bias
37Aboriginal-based IQ Test
- 1. What number comes next in the sequence, one,
two, three, __________?
MANY
38Aboriginal-based IQ Test
- 2. As wallaby is to animal so cigarette is to
__________
TREE
39Aboriginal-based IQ Test
- 3. Three of the following items may be classified
with salt-water crocodile. Which are they?
- marine turtle
- brolga
- frilled lizard
- black snake
40Fraud happens
Original
Haeckel's copy
(echidna embryos)
41Recent Fraud Example
- Woo Suk Hwang, human cloning
- Much of the data suspected to be fabricated
42Regression to the mean
- When repeated measurements are taken over time
- Individuals with extreme values for the first
measurement tend to be nearer to the mean for the
second measurement
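Regression to the mean can be seen in a small simulation. The sketch below assumes each measurement is a stable true value plus independent measurement noise; the means, standard deviations, sample size and seed are arbitrary choices for illustration, not values from the lecture.

```python
import random
from statistics import mean

def simulate_two_measurements(n, seed=1):
    """Return n (first, second) pairs: same true value, independent noise each time."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        true_value = rng.gauss(100.0, 10.0)
        first = true_value + rng.gauss(0.0, 10.0)
        second = true_value + rng.gauss(0.0, 10.0)
        pairs.append((first, second))
    return pairs

pairs = simulate_two_measurements(10_000)
overall = mean(first for first, _ in pairs)

# Select the individuals with the most extreme (top 10%) first measurements.
pairs.sort(key=lambda pair: pair[0], reverse=True)
top = pairs[:1000]
top_first = mean(first for first, _ in top)
top_second = mean(second for _, second in top)

print(f"overall mean:              {overall:.1f}")
print(f"top group, 1st measurement: {top_first:.1f}")
print(f"top group, 2nd measurement: {top_second:.1f}")
```

The top group is still above average on the second measurement, but noticeably closer to the overall mean, even though nothing about the individuals changed.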
43Regression to the Mean
44Regression to the Mean
The sophomore slump
45Publication bias
Papers are more likely to be published if P < 0.05
This causes a bias in the science reported in the
literature.
46Meta-analysis
- Compiles all known scientific studies testing the
same null hypothesis and quantitatively combines
them to give an overall estimate of the effect
and its statistical properties
- This is a GREAT honours project
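One common way to combine studies quantitatively is inverse-variance (fixed-effect) weighting, in which more precise studies count for more. The slides do not say which method is meant, so treat this as an assumed illustration; the effect estimates, standard errors and function name are hypothetical.

```python
import math

def fixed_effect_meta(estimates, standard_errors):
    """Inverse-variance weighted pooled estimate and its standard error.

    Assumes all studies estimate the same underlying effect (fixed-effect model).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical effect estimates and standard errors from three studies.
estimates = [0.30, 0.10, 0.25]
standard_errors = [0.10, 0.20, 0.10]
pooled, pooled_se = fixed_effect_meta(estimates, standard_errors)
print(f"pooled effect = {pooled:.3f} (SE {pooled_se:.3f})")
```

The pooled standard error is smaller than that of any single study, which is the main statistical payoff of combining them.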