Mahasin Mujahid - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Mahasin Mujahid


1

Biostatistics Academic Preview Session 1
Introduction to Biostatistics
  • Mahasin Mujahid
  • Doctoral Candidate
  • Department of Epidemiology

2
Outline
  • Survey
  • Math refresher quiz
  • Introduction to Statistics
  • The what and why of biostatistics
  • Misconceptions and misuses of biostatistics
  • How to properly use biostatistics
  • Overview of BIOS 503/553
  • Study Design
  • Types of epidemiologic studies
  • Statistical Sampling
  • Probability
  • Basic definitions and rules of probability
  • Commonly used probability distributions

3
What is biostatistics
  • Statistics is the science and art of collecting,
    summarizing, and analyzing data that are subject
    to random variation
  • Biostatistics is the application of statistics
    and mathematical methods to the design and
    analysis of health, biomedical, and biological
    studies

4
Statistics is an art
  • Whether you apply statistics to biological or
    other processes, it is the art of decision making
    in the face of uncertainty.

5
Components of Biostatistics
6
Why Biostatistics
  • It's a required element of your training here at
    the University of Michigan School of Public
    Health
  • You will all either do statistical analysis or
    hire a biostatistician as a public health
    professional
  • If you don't know it and understand it, there
    will be consequences

7
What biostatistics has to offer
  • Help in developing concrete objectives and data
    acquisition methods that meet the objectives
  • Appropriate experimental and study design
  • Source of bias
  • Measurement issues
  • Efficiency/power
  • Maximizing use of a given number of subjects
  • Interpretability of findings
  • Reproducibility of analyses

8
What biostatistics has to offer (cont)
  • Increase the likelihood that the sample will
    yield estimates of adequate precision to make
    experiments conclusive/affect medical practice
  • More efficient use of the data
  • Formulate analysis plans without making
    inappropriate assumptions
  • Estimate sample size

9
What biostatistics has to offer (cont)
  • Place limits on effects of chance in small sample
    experiments
  • Determine sample size needed to detect clinically
    relevant effects
  • Control for confounding variables
  • Measure intangible information such as
    intelligence, depression, stress, and well-being

10
Example
  • How are my 10 patients doing after I put them on
    an anti-hypertensive medication?
  • Describe the results of your 10 patients
  • How do patients with high blood pressure respond
    after being on anti-hypertensive medications (for
    some follow up period)?
  • Need to take a sample of patients of interest to
    approximate what would be observed had all such
    patients been treated that way

11
Example
  • What is the in-hospital mortality rate after open
    heart surgery at my hospital so far this year?
  • Describe the mortality
  • What is the in-hospital mortality after open
    heart surgery likely to be this year, given
    results from last year?
  • Estimate probability of death for patients like
    those seen in the previous year.

12
The bottom line
  • When analyzing data, your goal is simple: you
    wish to make the strongest possible conclusions
    from limited amounts of data.
  • How does one achieve this goal?

13
Biostatistics and Public Health
Epidemiology
Biostatistics
Health Management and Policy
Health Behavior and Health Education
Environmental Health Sciences
14
Misuse of statistics
  • About 25% of biological research is flawed
    because of incorrect conclusions drawn from
    confounded experimental designs and misuse of
    statistical methods

15
Misuse
  • Statistics is not your worst nightmare or the
    answer to all of your problems
  • Statistical significance does not equal clinical
    or theoretical significance.

16
How to properly use Biostatistics
  • Develop an underlying question of interest
  • Generate a hypothesis
  • Design a study
  • Collect Data
  • Analyze Data
  • Descriptive statistics
  • Statistical Inference

17
Define problems, Questions, and Research aims
The Role of Statistics in the Scientific Method
Review the literature
Statistical Methods, Measurement tools and
models
Develop a hypothesis
Design experiments or other tests
Revise or modify Protocol
Collect and record data
Peer review
Replication of results
Analyze and interpret data
Public understanding of research
Scientific impact of research
Disseminate results
Investigating Research Integrity (2001)
18
Screening Statistics
  • Who said so?
  • How do they know?
  • What's missing?
  • Does the conclusion make sense?
  • Association does not establish causation

19
How to Lie with Statistics by Darrell Huff
  • The secret language of statistics, so appealing
    in a fact-minded culture, is employed to
    sensationalize, inflate, confuse, and
    oversimplify

20
A little humor!!!!!
  • There are three types of lies: white lies, damn
    lies, and statistics (Benjamin Disraeli)
  • Statistics are like a bikini. What they reveal is
    suggestive, but what they conceal is vital.
    (Aaron Levenstein)

21
A little humor (cont)!!!!!
  • Top ten reasons to be a statistician
  • Estimating parameters is easier than dealing with
    real life.
  • Statisticians are significant
  • I always wanted to learn the entire Greek
    alphabet.
  • The probability a statistician major will get a
    job is > .9999.
  • If I flunk out I can always transfer to
    Engineering.
  • We do it with confidence, frequency, and
    variability.
  • You never have to be right - only close.
  • We're normal and everyone else is skewed.
  • The regression line looks better than the
    unemployment line.
  • No one knows what we do so we are always right.
  • http://www.workjoke.com/projoke48.htm

22
Choosing BIOS 503/553
  • Biostatistics 503 and 553 are both introductory
    courses for non-majors that assume no prior
    course work in biostatistics or statistics.
    Either course satisfies the school-wide
    requirement for biostatistics; however, some
    departments require 553 instead of 503. While
    both courses cover the same statistical topics
    and methods, 553 assumes stronger preparation in
    mathematics and goes into greater depth in
    statistics. The prerequisite for 503 is
    elementary (high school) algebra. Students taking
    553 need to have had one term of calculus and be
    comfortable with function notation and algebra.
    The stronger mathematical prerequisite for 553
    allows time for more detailed study. Students who
    have satisfied the calculus prerequisite are
    strongly encouraged to enroll in 553.
    Biostatistics 503 and 553 are offered only in the
    fall term.

23
Biostatistics 503: Applied Biostatistics
Course Outline, Fall 2003
  • Lecture: Four days a week, M-Th, SPH II
    Auditorium. Section 1 meets 8-9am, Section 2
    meets 9-10am. Students may attend either lecture
    regardless of how they are registered.
  • Lab: Meets once per week. Students must attend
    the lab they are registered for. Labs start
    Thursday, September 4.
  • Required texts: Introduction to the Practice of
    Statistics, 4th Edition, Moore & McCabe.
    CoursePack Computer Lab Manual, available at
    Ulrich's.
  • Recommended text: SPSS Manual for Moore &
    McCabe's Introduction to the Practice of
    Statistics, 4th Ed., Rogness, Stephenson &
    Stephenson.
  • Prerequisites: Knowledge of algebra (GRE/GMAT
    quantitative score above the 50th percentile, or
    passing the algebra placement exam).
  • Calculator: square root, natural log,
    exponential, and y to the x functions.
  • Web site: We make extensive use of a Coursetools
    web site. Go to www.coursetools.ummu.umich.edu,
    then go to Biostatistics 503. The site is also
    linked from my homepage,
    www.sph.umich.edu/nichols.
24
Final Grade
The final grade is an equal weighting of homework
and labs (25%) and each of the three exams
(3 x 25%).
  • Homework and labs: 25%. Handed in every week.
  • Exam 1: 25%. Covers weeks 1-5.
  • Exam 2: 25%. Covers weeks 6-10.
  • Final: 25%. Comprehensive, but emphasis on
    weeks 11-15.
25
BIOS 503 Outline
Introduction to the Practice of Statistics, 4th
Edition, Moore & McCabe
26
Success in Biostatistics 503/553
  • Do's
  • Do required reading before class
  • Attend all lectures
  • Get very familiar with your calculator
  • Write out all given information when working out
    a problem
  • Ask questions!!!!!!!!
  • Don'ts
  • Skip class
  • Skip reading assignments
  • Skip homework assignments
  • Rely only on the lecture to prepare for exams
  • Wait until after the 1st exam to panic

27
Study Design
28
Designing a study
  • Statistical designs for producing trustworthy
    data are perhaps the single most influential
    contribution of statistics to the advancement of
    knowledge (Moore and McCabe 1993)

29
Statistical Considerations in Study Design
  • Type of study design
  • Experiment
  • Observational study
  • Sample size/power analysis
  • How many individuals to include in your study
  • Sampling techniques
  • How to identify a sample of individuals to
    include in your study

30
Epidemiologic Study Designs
Observational Studies
  • Descriptive: Case Report, Case Series,
    Cross-sectional, Correlative
  • Analytic: Cohort, Case-Control
Experimental Studies
  • Laboratory, Clinical Trials, Field Trials,
    Intervention Trials
31
Sampling
  • The purpose of sampling is to examine some
    portion of the population and to extend the
    knowledge obtained from the sample to the
    population at large.

32
Sampling (cont)
  • It may not be practical or feasible to analyze
    the entire population
  • Physically impossible
  • Ethical Considerations

33
The language of sampling
  • population: the entire collection of things of
    interest
  • population parameter: a number that results from
    measuring all the units in the population
  • sampling frame: the specific data from which the
    sample is drawn
  • unit of analysis: the type of object of interest
    (persons with condition x, animals, genes/cells)
  • sample: a subset of the units in the
    population
  • statistic: a number that results from
    measuring all the units in the sample

34
Relationship between population and sample
35
Example
  • For example, to find out the average age of all
    motor vehicles in the state in 1997:
  • Population: all motor vehicles in the state in
    1997
  • Sampling frame: all motor vehicles registered with
    the DMV on July 1, 1997
  • Unit of analysis: motor vehicle
  • Sample: 300 motor vehicles
  • Statistic: the average age of the 300 motor
    vehicles in the sample
  • Parameter: the true average age of all motor
    vehicles in the state in 1997
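
The distinction between a statistic and a parameter can be sketched in a few lines of Python. This is only an illustration: the vehicle ages below are simulated (the uniform 0 to 20 year range, the frame size of 100,000, and the random seed are all assumptions, not DMV data).

```python
import random
import statistics

random.seed(1997)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: ages (in years) of 100,000 registered vehicles.
# These values are simulated for illustration only.
frame_ages = [random.uniform(0, 20) for _ in range(100_000)]

# Parameter: the true average age over the whole frame (normally unknown).
parameter = statistics.mean(frame_ages)

# Sample: 300 vehicles drawn at random without replacement.
sample_ages = random.sample(frame_ages, 300)

# Statistic: the average age of the 300 sampled vehicles.
statistic = statistics.mean(sample_ages)

print(f"parameter (frame mean):  {parameter:.2f}")
print(f"statistic (sample mean): {statistic:.2f}")
```

The sample mean will usually land close to, but not exactly on, the frame mean; that gap is the random variation that statistical inference has to account for.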

36
Sampling Techniques
Population
  • Simple Random Sample
  • Systematic Sampling
  • Stratified Random Sample
  • Convenience Sampling
  • Cluster Sampling
(Diagram: each technique is labeled as yielding
either a bias-free or a biased sample.)
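
A rough Python sketch of four of these techniques, using only the standard library. The population of 1,000 numbered units, the two strata "A" and "B", and the sample size of 100 are hypothetical choices made for illustration; cluster sampling is left out to keep the sketch short.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of 1,000 units, each tagged with a stratum.
population = [{"id": i, "stratum": "A" if i < 600 else "B"} for i in range(1000)]
n = 100

# Simple random sample: every unit has the same chance of selection.
simple = random.sample(population, n)

# Systematic sample: random start, then every k-th unit.
k = len(population) // n
start = random.randrange(k)
systematic = population[start::k]

# Stratified random sample: draw from each stratum in proportion to its size.
stratified = []
for label in ("A", "B"):
    members = [u for u in population if u["stratum"] == label]
    share = round(n * len(members) / len(population))
    stratified.extend(random.sample(members, share))

# Convenience sample: whoever is easiest to reach (here, the first n units).
convenience = population[:n]

for name, s in [("simple", simple), ("systematic", systematic),
                ("stratified", stratified), ("convenience", convenience)]:
    in_a = sum(u["stratum"] == "A" for u in s)
    print(f"{name:12s} n={len(s):3d}  share in stratum A = {in_a / len(s):.2f}")
```

In this toy setup the convenience sample contains only stratum A units, even though stratum A makes up 60% of the population, which is the kind of systematic distortion the next slide calls bias.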
37
Bias
  • Any trend in the collection, analysis,
    interpretation, publication or review of data
    that can lead to conclusions that are
    systematically different from the truth (Last,
    2001)
  • A systematic error in design or conduct of a
    study (Szklo et al., 2000)

38
Handling bias
  • Design stage
  • Choosing a strong study design (RCT)
  • Selection stage
  • Random sampling, matched pairs
  • Measurement stage
  • Blinding
  • Analysis stage
  • Multivariate analysis

39
Probability
40
Consider the following scenario
  • One theory concerning the etiology of breast
    cancer states that white women are at greater
    risk of developing breast cancer.
  • Suppose we wish to test this hypothesis. We
    identify 1500 women (45-50 years) free of breast
    cancer at baseline.
  • 500 white
  • 500 African American
  • 500 Asian

41
Scenario cont
  • We follow women for 10 years
  • 20 cases of breast cancer (white women)
  • 15 cases of breast cancer (African American
    women)
  • 10 cases of breast cancer (Asian women)
  • Is this difference among the groups enough to
    make you conclude that white women are at greater
    risk?
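
As a first step, the observed counts can be turned into 10-year risks for each group. The sketch below uses only the numbers given in the scenario; describing them as cumulative incidence and comparing them with risk ratios is a framing added here, not something stated on the slide.

```python
# Counts from the scenario: 500 women per group, followed for 10 years.
cases = {"White": 20, "African American": 15, "Asian": 10}
group_size = 500

# 10-year cumulative incidence (risk) in each group.
risk = {group: n / group_size for group, n in cases.items()}

for group, r in risk.items():
    print(f"{group}: {r:.1%}")

# Risk ratios comparing white women with each of the other groups.
rr_vs_african_american = risk["White"] / risk["African American"]
rr_vs_asian = risk["White"] / risk["Asian"]
print(f"Risk ratio, White vs African American: {rr_vs_african_american:.2f}")
print(f"Risk ratio, White vs Asian: {rr_vs_asian:.2f}")
```

The observed risks are 4.0%, 3.0%, and 2.0%. These numbers describe the sample only; whether differences of this size could plausibly arise by chance is a separate question that probability is needed to answer.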

42
And the answer is
  • Probability can help you rule out chance as an
    explanation.
  • Definitions
  • Probability: The measure of how likely it is that
    an event will occur
  • Sample space: all possible outcomes
  • Event: outcome of interest

43
Venn diagram for event E
Sample space
44
Basic Properties of Probabilities
  • Property 1: The probability of an event is always
    between 0 and 1, inclusive.
  • Property 2: The probability of an event that
    cannot occur is 0. (An event that cannot occur is
    called an impossible event.)
  • Property 3: The probability of an event that must
    occur is 1. (An event that must occur is called a
    certain event.)
45
Two events A and B are mutually exclusive (or
disjoint) if they cannot both happen at the same
time.
When two events A and B are mutually exclusive,
the probability of A or B occurring is
P(A ∪ B) = P(A) + P(B)
46
When two events A and B can both occur
simultaneously, the two events are not mutually
exclusive.
If P(A ∩ B) = 0, then events A and B are mutually
exclusive. If P(A ∩ B) ≠ 0, then A and B are not
mutually exclusive.
47
The complement of an event E, written E^c, is the
event that E does not occur.
P(E^c) = 1 - P(E)
48
The multiplicative law of probability
  • Two events A and B are said to be independent if
    the fact that A occurs has no effect on the
    probability of B occurring.
  • P(A ∩ B) = P(A) × P(B)
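
The addition rule for mutually exclusive events, the complement rule, and the multiplication rule for independent events can all be checked by counting outcomes. The sketch below uses a fair six-sided die as an assumed example (it does not come from the slides), and the helper `prob` is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# One roll of a fair six-sided die (an assumed example).
outcomes = range(1, 7)

def prob(event, space):
    """Probability of an event as (favorable outcomes) / (all outcomes)."""
    space = list(space)
    return Fraction(sum(1 for o in space if event(o)), len(space))

# Mutually exclusive events: rolling a 1 and rolling a 2 cannot both happen.
p_one = prob(lambda o: o == 1, outcomes)
p_two = prob(lambda o: o == 2, outcomes)
p_one_or_two = prob(lambda o: o in (1, 2), outcomes)
assert p_one_or_two == p_one + p_two  # addition rule

# Complement: P(not E) = 1 - P(E).
p_not_one = prob(lambda o: o != 1, outcomes)
assert p_not_one == 1 - p_one

# Independent events: the results of the first and second of two rolls.
two_rolls = list(product(outcomes, repeat=2))
p_first_six = prob(lambda r: r[0] == 6, two_rolls)
p_second_six = prob(lambda r: r[1] == 6, two_rolls)
p_both_six = prob(lambda r: r == (6, 6), two_rolls)
assert p_both_six == p_first_six * p_second_six  # multiplication rule

print(p_one_or_two, p_not_one, p_both_six)  # 1/3 5/6 1/36
```

Exact fractions are used instead of floating-point numbers so that the equalities hold exactly.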

49
Example 1 Considering a deck of playing cards
50
P (king is selected)
51
P (face card is selected)
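
Both probabilities can be obtained by counting cards. The sketch below assumes a standard 52-card deck, one card drawn at random, and jacks, queens, and kings counted as face cards; these assumptions are stated here because the transcript does not spell them out.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]

# Standard deck: every rank in every suit, 52 cards in total.
deck = [(rank, suit) for rank in ranks for suit in suits]

# P(king) = number of kings / number of cards.
kings = sum(1 for rank, _ in deck if rank == "K")
p_king = Fraction(kings, len(deck))

# P(face card) = number of jacks, queens, and kings / number of cards.
face_cards = sum(1 for rank, _ in deck if rank in {"J", "Q", "K"})
p_face = Fraction(face_cards, len(deck))

print(f"P(king) = {kings}/{len(deck)} = {p_king}")            # 4/52 = 1/13
print(f"P(face card) = {face_cards}/{len(deck)} = {p_face}")  # 12/52 = 3/13
```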
52
Probability distributions
  • Probability distributions are fundamental to the
    practice of statistics
  • Used in descriptive statistics (e.g., the mean is
    based on assuming a normal distribution)
  • Used in inferential statistics
  • Estimation (e.g., constructing confidence
    intervals)
  • Inference (calculating test statistics and
    p-values for hypothesis testing)

53
Probability distribution
  • A probability distribution describes the possible
    events in a sample space and the frequency at
    which they occur
  • For a random variable, it describes the possible
    values and how often each occurs

54
Random Variables
A random variable, x, is the numerical outcome of
a probability experiment.
  • x = The number of people in a hospital
  • x = The time it takes to exercise
  • x = The number of trips to the doctor you make
    per year

55
Types of Random Variables
A random variable is discrete if the number of
possible outcomes is finite or countable.
Discrete random variables are determined by a
count.
A random variable is continuous if it can take on
any value within an interval. The possible
outcomes cannot be listed. Continuous random
variables are determined by a measure.
56
Types of Random Variables
Identify each random variable as discrete or
continuous.
  • x = The number of people in a car
  • x = The time it takes to drive from home to
    school
  • x = The number of trips to school you make per
    week

57
Types of probability distributions
  • A probability distribution can either be
    discrete or continuous based on the type of
    random variable it represents.

Discrete
  • Binomial Distribution
  • Poisson Distribution
Continuous
  • Normal Distribution
  • T or F Distribution
  • Chi-square Distribution
58
Discrete Probability Distributions
A discrete probability distribution lists each
possible value of the random variable, together
with its probability.
A survey asks a sample of families how many
vehicles each owns.
number of vehicles
59
Example 1
  • Example: Suppose that a coin is tossed twice, so
    that the sample space is {HH, HT, TH, TT}. Let X
    (the discrete random variable) be the number of
    heads which can come up. Write out the
    probability distribution for the random
    variable X.
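
One way to work this example is to list every outcome and count heads. The sketch below assumes the coin is fair, so that all four outcomes are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two tosses of a fair coin: HH, HT, TH, TT.
sample_space = list(product("HT", repeat=2))

# X = number of heads in each outcome; count how many outcomes give each value.
counts = Counter(outcome.count("H") for outcome in sample_space)

# P(X = x) = (outcomes with x heads) / (all outcomes).
distribution = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
# P(X = 0) = 1/4
# P(X = 1) = 1/2
# P(X = 2) = 1/4
```

The probabilities sum to 1, as they must for any probability distribution.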

60
Binomial Experiments
Characteristics of a Binomial Experiment
  • There are a fixed number of trials. (n)
  • The n trials are independent and repeated under
    identical conditions
  • Each trial has 2 outcomes, S = Success or F =
    Failure.
  • The probability of success on a single trial is
    p. P(S) = p
  • The probability of failure is q. P(F) = q,
    where p + q = 1
  • The central problem is to find the probability of
    x successes out of n trials, where x = 0, 1,
    2, ..., n.

The random variable x is a count of the number
of successes in n trials.
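
The probability of x successes in n trials is given by the standard binomial formula, P(x) = C(n, x) p^x q^(n - x), which the slide does not state explicitly. A minimal sketch follows; the function name `binomial_pmf` and the values n = 10 and p = 0.3 are hypothetical choices for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials, success probability p)."""
    q = 1 - p  # probability of failure on a single trial
    return comb(n, x) * p**x * q**(n - x)

# Hypothetical experiment: n = 10 trials, each with success probability p = 0.3.
n, p = 10, 0.3
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, prob_x in enumerate(pmf):
    print(f"P(x = {x:2d}) = {prob_x:.4f}")

print(f"total = {sum(pmf):.4f}")  # the probabilities over x = 0..n sum to 1
```

With n = 2 and p = 0.5, the same function gives the probabilities for the number of heads in two tosses of a fair coin.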
61
Continuous Probability Distributions
A continuous probability distribution provides a
shape describing the distribution of a continuous
random variable X. Thus X can take on a range of
values within an interval.
62
Normal Distribution
63
Normal distribution
  • bell-shaped
  • symmetrical about the mean
  • total area under curve = 1
  • approximately 68% of distribution is within one
    standard deviation of the mean
  • approximately 95% of distribution is within two
    standard deviations of the mean
  • approximately 99.7% of distribution is within 3
    standard deviations of the mean
  • Mean = Median = Mode
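
The three percentages can be checked numerically. For any normal distribution, the area within k standard deviations of the mean equals erf(k / sqrt(2)); the sketch below uses that identity with Python's standard library, and the function name `within_k_sd` is a hypothetical helper.

```python
from math import erf, sqrt

def within_k_sd(k):
    """Area under a normal curve within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {within_k_sd(k):.4f}")
# within 1 SD of the mean: 0.6827
# within 2 SD of the mean: 0.9545
# within 3 SD of the mean: 0.9973
```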

64
Empirical Rule
65
The Standard Score
The standard score, or z-score, represents the
number of standard deviations a random variable x
falls from the mean.
The test scores for a civil service exam are
normally distributed with a mean of 152 and
standard deviation of 7. Find the standard
z-score for a person with a score of (a) 161
(b) 148 (c) 152
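
A short sketch of the calculation, using z = (x - mean) / standard deviation with the mean of 152 and standard deviation of 7 given above. The function name `z_score` is a hypothetical helper.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

mean, sd = 152, 7  # civil service exam scores

for label, score in (("a", 161), ("b", 148), ("c", 152)):
    print(f"({label}) x = {score}: z = {z_score(score, mean, sd):.2f}")
# (a) x = 161: z = 1.29
# (b) x = 148: z = -0.57
# (c) x = 152: z = 0.00
```

A positive z-score means the score is above the mean, a negative one means it is below, and a score equal to the mean has z = 0.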
66
The Standard Normal Distribution
The standard normal distribution has a mean of 0
and a standard deviation of 1.
Using z-scores, any normal distribution can be
transformed into the standard normal distribution.
z = (x - μ) / σ, where μ is the mean and σ is the
standard deviation
67
Basic Properties of the Standard Normal Curve
68
Next session
  • Descriptive statistics
  • The what and why of descriptive statistics
  • Types of variables
  • Formulas and interpretations of commonly used
    descriptive statistics
  • Pictorial representations of descriptive
    statistics
  • Examining the relationship between two or more
    variables