MD 5108 Biostatistics for Basic Research - PowerPoint PPT Presentation

1
MD 5108 Biostatistics for Basic Research
  • Lecturer: Dr K. Mukherjee
  • Office: S16-06-100
  • Tel: 874 2764
  • Email: stamk_at_nus.edu.sg

2
Objectives: To train practitioners of the
biomedical sciences in the use and interpretation
of statistical data analysis.
  • explore and present data using tables, charts
    and graphs
  • do simple statistical calculations
    with a calculator
  • carry out data analysis using a statistical
    package such as SPSS
  • pick the right procedure for analysing a set
    of data
  • interpret results correctly and report
    findings
  • avoid misuse and abuse of statistics
  • understand statistical contents of papers in
    medical journals
  • judge claims and statements critically
  • discuss and communicate ideas in a
    quantitative manner

3
Teaching approach
  • nonmathematical introduction
  • explanation of concepts rather than proofs
  • emphasis on methodology and procedures
  • emphasis on use of a statistical package rather
    than manual calculation
  • emphasis on choosing the right procedure
  • emphasis on correct interpretation of results
  • examples from clinical research literature

4
Topic 1: What is statistics?
  • A branch of mathematics dealing with the analysis and
    interpretation of masses of numerical data
    (Merriam-Webster Dictionary)
  • The field of study that involves the collection and analysis of
    numerical facts or data of any kind (Oxford Dictionary)
  • The study of how information should be employed to reflect on,
    and give guidance for action, in a practical situation involving
    uncertainty (Vic Barnett)
Biostatistics: application of statistical methods
to the biological, medical and health sciences
5
Why the need for statistics in biomedicine?
  • Two main reasons: variation and sampling
  • Variation: attributes differ not only among individuals but
    also within the same individual over time
  • Sampling: biomedical research projects are mostly carried out
    on small numbers of study subjects, so it is a challenging
    problem to project results from small-sample studies to
    individuals at large

6
Biological variation necessitates the use of statistical methods in
biomedicine to put numerical data into a context
by which we can better judge their meaning
7
From sample to population
Statistical methods are used to produce statistical
inferences about a population based on
information from a sample derived from that
population
(Diagram: sample -> inductive statistical methods -> population)
8
Altman (1991) Practical Statistics for Medical
Research, Chapman and Hall.
9
Bailar and Mosteller (1986) Medical Uses of
Statistics, NEJM Books.
10
Many studies have been done on misuse of
statistics in medicine
11
From Altman (1991)
12
Schor and Karten (1966, J. Am. Med. Assoc.)
  • 149 papers classed as analytical studies in 3
    issues of 11 most frequently read medical
    journals
  • assessment criteria: validity with respect to
    design of experiment, type of analysis performed,
    and applicability of statistical test used

13
Findings of Schor and Karten
  • 28% of papers acceptable
  • 68% deficient but acceptable if reviewed
  • 4% unsalvageable

Lesson
CARE must be exercised when reading scientific papers
in biomedical journals! Knowledge of basic
biostatistics is required.
14
  • There are three kinds of lies: lies, damned
    lies and statistics. (Benjamin Disraeli)
  • It is easy to lie with statistics, but it is easier to
    lie without them. (Frederick Mosteller)
  • Statistical thinking will one day be as necessary for
    efficient citizenship as the ability to read and
    write. (H.G. Wells)
15
Types of statistical methods
1. Descriptive statistical methods
  • data collection and organization
  • summarizing data and describing its characteristics
  • presentation and publication
2. Exploratory data analysis
  • play around and get a feel for the data
  • preliminary analysis, often graphical
  • looking for patterns and possible relationships
  • are assumptions satisfied?
  • which model and procedure to use?
16
3. Inductive (inferential) statistical methods
Statistical inferences about a population based
on information from a sample derived from that
population
  • estimation, confidence intervals
  • hypothesis testing
  • prediction, forecasting
  • classification

(Diagram: sample -> inductive statistical methods -> population)
17
Topic 2: Types of data
Sources of data, the raw materials of statistics
  • Routinely kept records, e.g., hospital medical records
  • Surveys
  • Experiments
  • Clinical trials
  • Databases
  • Published reports
Any characteristic that can be measured or
classified into categories is called a variable.
18
Types of variables
(1) Qualitative variables
  • cannot be measured numerically
  • categorical in nature, e.g., gender
  • categories must not overlap and must cover all possibilities
Nominal variables (no inherent ordering of categories)
  • M/F, Yes/No
  • Blood group (A, B, AB, O)
  • Ethnic group (Chinese, Malay, Indian, Others)
Ordinal variables (categories are ordered in some sense)
  • response to treatment: unimproved, improved, much improved
  • pain severity: no pain, slight pain, moderate pain,
    severe pain
19
(2) Quantitative variables
  • can be measured numerically, e.g., weight, height,
    concentration
  • can be continuous or discrete
  • a continuous variable can take on any value
    (subject to precision of measuring instrument)
    within some range or interval, e.g., weight,
    height, blood pressure, cholesterol level
  • a discrete variable is usually a count of something
    and hence takes on integer values only, e.g.,
    number of admissions to NUH
Variable types and measurement types
  • have implications for how data should be displayed or
    summarized
  • determine the kind of statistical procedures that
    should be used
20
SUMMARY: Types of variables and measurement scales
  • Qualitative or categorical, nominal (not ordered),
    e.g. ethnic group
  • Qualitative or categorical, ordinal (ordered),
    e.g. response to treatment
  • Quantitative (measurement), discrete (count data),
    e.g. number of admissions
  • Quantitative (measurement), continuous (real-valued),
    e.g. height
21
Topic 3: Presenting data graphically
Advantages of graphical data display
  • Let data speak for itself
  • Get a good feel for the data before formal
    analysis
  • Graphs and plots are easier to understand and
    interpret
  • Reveal patterns in data which may shed light
    on the appropriate model/analysis to use

e.g., skewed or symmetric distribution,
multiple peaks/modes, presence of
outliers, relationship between variables.
22
Graphs for categorical data
26
Comparison of methods
  • Bar charts can be read more accurately and
    offer better distinction between close together
    values
  • Pie charts are especially useful for showing
    percentage distribution
  • Pie charts can display large and small values
    simultaneously without a scale break
  • A single bar chart is preferable to a single
    segmented bar chart
  • A series of segmented bar charts is easier
    to read than a series of pie charts or ordinary
    bar charts

28
Variation of the basic bar chart
32
Plotting by sector rather than by profession
  • Look at the data from a different angle
  • Highlight different aspects of the data
37
A back to back bar chart
Source: JAMA, 1978, vol 239, no 21
38
Comparison of methods
  • A stacked bar chart is also a bar chart for the combined data
  • Some of the bars in a stacked bar chart are not aligned
  • Bars in clustered bar charts are aligned, but it is
    harder to visualize how the component bars would
    stack up
  • Back to back bar charts are applicable
    when there are 2 groups only; the aggregated bars
    are not aligned
  • Series of stacked or segmented bar charts are
    useful in showing time trend
39
Time trend
Starting the vertical axis at 8 rather than 0
visually exaggerates the increase in
prescriptions written per person
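The effect of a truncated axis can be put in numbers. The sketch below is illustrative only: the two values (9 and 11 prescriptions per person) are hypothetical, since the slide's data are not in the transcript, and `apparent_ratio` is a name made up for this example. It compares how many times taller the last bar looks than the first when the axis starts at 0 and when it starts at 8.

```python
def apparent_ratio(first, last, baseline):
    """Ratio of drawn bar heights when the axis starts at `baseline`."""
    if first <= baseline:
        raise ValueError("first value must lie above the axis baseline")
    return (last - baseline) / (first - baseline)

# Hypothetical values, NOT taken from the slide.
first_year, last_year = 9.0, 11.0

ratio_from_zero = apparent_ratio(first_year, last_year, baseline=0.0)
ratio_from_eight = apparent_ratio(first_year, last_year, baseline=8.0)

print(f"axis from 0: last bar looks {ratio_from_zero:.2f} times as tall")
print(f"axis from 8: last bar looks {ratio_from_eight:.2f} times as tall")
```

With these made-up numbers the true increase is about 22%, but on an axis starting at 8 the last bar is drawn three times as tall as the first.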
40
Stacked bar chart of yearly mortality rate per
1000 births
Pagano and Gauvreau (1999) Principles of
Biostatistics, Duxbury.
41
Response under two treatments

             Response to treatment
Treatment    None   Partial   Complete   Total
A               3        15          9      27
B               2        22         30      54
42
A misleading bar chart
By design, there are twice as many patients
receiving treatment B, so the raw counts are not directly comparable
43
The response type percentages for the two treatments
can be compared instead
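As a minimal sketch of that comparison, the counts from the "Response under two treatments" table can be converted to percentages within each treatment. The function name `row_percentages` is chosen for this example and is not from the lecture.

```python
# Counts from the "Response under two treatments" table.
counts = {
    "A": {"None": 3, "Partial": 15, "Complete": 9},
    "B": {"None": 2, "Partial": 22, "Complete": 30},
}

def row_percentages(row):
    """Convert one treatment's counts to percentages of its own total."""
    total = sum(row.values())
    return {response: 100.0 * n / total for response, n in row.items()}

percentages = {t: row_percentages(row) for t, row in counts.items()}

for treatment, pct in percentages.items():
    rounded = {response: round(p, 1) for response, p in pct.items()}
    print(treatment, rounded)
```

Each treatment's percentages sum to 100, so the two groups can be compared even though treatment B has twice as many patients.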
45
Graphs for quantitative data
  • Histogram
  • Frequency polygon
  • Box plot
46
Histogram: Divide the range of the data into a
suitably chosen number of intervals (bins), all of
the same width. The number of observations that
fall within each interval is plotted.
Relative frequency histogram: Plot the proportions
of observations that fall within the class
intervals.
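The binning step can be sketched in a few lines. This is an illustration under assumptions, not the procedure of any particular package: the data values are hypothetical, the function name `histogram` is made up, and the convention that each bin includes its left edge (with the maximum placed in the last bin) is one choice among several.

```python
def histogram(values, n_bins):
    """Count observations in n_bins equal-width bins spanning the data range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; cannot form bins")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Bins are [left, right); the maximum goes into the last bin.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    relative = [c / len(values) for c in counts]
    return counts, relative

# Hypothetical data, for illustration only.
data = [1, 2, 2, 3, 5, 6, 7, 9]
counts, relative = histogram(data, n_bins=4)
print(counts)    # frequencies per bin
print(relative)  # proportions per bin (relative frequency histogram)
```

The relative frequencies are the counts divided by the number of observations, so they sum to 1.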

47
Wild and Seber (2000) Chance Encounters, Wiley.
50
Comparison of methods
  • Histograms are good at revealing distributional
    shape such as symmetry, skewness, number of peaks
    etc., but are difficult to superimpose or draw side by
    side
  • Frequency polygons can be superimposed
    for easy comparison
51
Wild and Seber (2000, p.59)
52
Can be superimposed
Pagano and Gauvreau (1999)
53
Wild and Seber (2000)
54
Median and quartiles
Sort the data in increasing order
  • The median is the middle value (if n is odd) or
    the average of the two middle values (if n is
    even); it is a measure of the center of the
    data
  • Quartiles divide the set of ordered values
    into 4 equal parts
  • Q1, Q2 and Q3 separate the first, second, third
    and fourth 25% of the ordered values
  • Q2, the second quartile, is the median
  • IQR: interquartile range (Q3 - Q1)
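A short sketch of these definitions follows. Note the assumption: several quartile conventions exist and statistical packages differ; this example takes Q1 and Q3 as the medians of the lower and upper halves of the sorted data (excluding the middle value when n is odd), which may not match the convention used in the lecture or in SPSS. The data are hypothetical.

```python
def median(sorted_values):
    """Middle value (n odd) or average of the two middle values (n even)."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    s = sorted(values)
    n = len(s)
    half = n // 2
    lower = s[:half]
    # Skip the middle value when n is odd.
    upper = s[half:] if n % 2 == 0 else s[half + 1:]
    return median(lower), median(s), median(upper)

# Hypothetical data, for illustration only.
data = [7, 1, 3, 9, 5, 11, 13, 15]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q2, q3, iqr)
```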
55
Box plot
  • Draw a box from the lower quartile to
    the upper quartile and a line to mark the
    position of the median
  • Extend lines from both edges of
    the box by 1.5 × IQR, then pull back the lines until
    they hit an observation
  • Observations more than 1.5 × IQR away from
    the lower or upper quartile are
    marked out as outside values for further
    investigation and checking
56
How a boxplot is constructed (Wild and Seber, 2000,
p.73)
5-number summary: min, lower quartile, median,
upper quartile, max
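The construction can be sketched numerically as follows. Assumptions: quartiles are taken as medians of the lower and upper halves of the sorted data (one of several conventions), the function name `box_plot_summary` is made up for this example, the data are hypothetical, and at least two observations are needed.

```python
from statistics import median

def box_plot_summary(values):
    """Five-number summary, whisker ends and outside values (1.5 x IQR rule)."""
    s = sorted(values)
    n = len(s)
    half = n // 2
    q1 = median(s[:half])
    q3 = median(s[half + (n % 2):])  # skip the middle value when n is odd
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in s if lower_fence <= v <= upper_fence]
    outside = [v for v in s if v < lower_fence or v > upper_fence]
    return {
        "five_number": (s[0], q1, median(s), q3, s[-1]),
        # Whiskers are pulled back to the most extreme observations
        # that still lie within the fences.
        "whiskers": (inside[0], inside[-1]),
        "outside": outside,
    }

# Hypothetical data, for illustration only.
summary = box_plot_summary([2, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

In this made-up data set the value 30 lies more than 1.5 × IQR above the upper quartile, so it is flagged as an outside value and the upper whisker stops at 9.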
58
Advantages of box plot
  • quick visual summary of a data set
  • captures prominent features like
    location, spread, skewness and outliers
  • can easily draw a series of box plots side by side
    (not so for histograms)
59
Dataset: Hotdogs
60
Graphical Analysis of the Hotdogs data.
61
Parallel box plots can be quite revealing
Rice (1995) Mathematical Statistics and Data
Analysis, Duxbury Press.
(Figure axis labels: 1969, 1972)
  • Reduction in concentration through time
  • Higher during winter months
  • Skewed toward higher values
  • Spread increases with level
(Parallel histograms are much harder to visualise)