Title: INTRODUCTION TO BIOSTATISTICS
1INTRODUCTION TO BIOSTATISTICS
- Dr. S. Shaffi Ahamed
- Asst. Professor
- Dept. of Family and Comm. Medicine
- KKUH
2This session covers
- Background and need to know Biostatistics
- Definition of Statistics and Biostatistics
- Types of data
- Graphical representation of data
- Frequency distribution of data
3 4Dynamic nature of the Universe
- The continuous change in Nature brings
uncertainty and variability to each and every
sphere of the Universe
5We can by no means control or overpower the
factor of uncertainty, but we are capable of
measuring it in terms of probability
6Sources of Medical Uncertainties
- Natural variation due to biological,
environmental and sampling factors
- Natural variation among methods, observers,
instruments etc.
- Errors in measurement or assessment, or errors in
knowledge
- Incomplete knowledge
7- Biostatistics is the science which helps in
managing health care uncertainties
8- Statistics is the science which deals with
collection, classification and tabulation of
numerical facts as the basis for explanation,
description and comparison of phenomena.
(Lovitt)
9BIOSTATISTICS
- (1) Statistics arising out of biological
sciences, particularly from the fields of
medicine and public health.
- (2) The methods used in the fields of medicine,
biology and public health for planning and
conducting investigations and for analyzing the
data which arise from them.
10Reasons to know about biostatistics
- Medicine is becoming increasingly quantitative.
- The planning, conduct and interpretation of much
of medical research are becoming increasingly
reliant on statistical methodology.
- Statistics pervades the medical literature.
11CLINICAL MEDICINE
- Documentation of medical history of diseases.
- Planning and conduct of clinical studies.
- Evaluating the merits of different procedures.
- Providing methods for defining normal and
abnormal.
12PREVENTIVE MEDICINE
- To provide the magnitude of any health problem
in the community.
- To find out the basic factors underlying
ill-health.
- To evaluate the health programs which were
introduced in the community (success/failure).
- To introduce and promote health legislation.
13BASIC CONCEPTS
- Data: Set of values of one or more variables
recorded on one or more observational units
- Sources of data: 1. Routinely kept records
2. Surveys (census) 3. Experiments 4. External
source
- Categories of data:
1. Primary data: observation, questionnaire,
record form, interviews, survey
2. Secondary data: census, medical record,
registry
14TYPES OF DATA
- QUALITATIVE DATA
- DISCRETE QUANTITATIVE
- CONTINUOUS QUANTITATIVE
15QUALITATIVE
- Nominal
- Example: Sex (M, F)
- Exam result (P, F)
- Blood Group (A, B, O or AB)
- Color of Eyes (blue, green, brown, black)
16- ORDINAL
- Example:
- Response to treatment (poor, fair, good)
- Severity of disease (mild, moderate, severe)
- Income status (low, middle, high)
17- QUANTITATIVE (DISCRETE)
- Example: The no. of family members
- The no. of heart beats
- The no. of admissions in a day
- QUANTITATIVE (CONTINUOUS)
- Example: Height, Weight, Age, BP, Serum
Cholesterol and BMI
18Discrete data -- Gaps between possible values
Example: Number of children
Continuous data -- Theoretically, no gaps between
possible values
Example: Hb
19Scale of measurement
Qualitative variable: A categorical variable
- Nominal (classificatory) scale - gender, marital
status, race
- Ordinal (ranking) scale - severity scale,
good/better/best
20Scale of measurement
Quantitative variable: A numerical variable
(discrete or continuous)
- Interval scale
- Data is placed in meaningful intervals and order.
The units of measurement are arbitrary.
- Temperature: differences are meaningful
(37 °C - 36 °C and 38 °C - 37 °C are equal),
but there is no implication of ratio
(30 °C is not twice as hot as 15 °C)
21- Ratio scale
- Data is presented in frequency distribution in
logical order. A meaningful ratio exists.
- Examples: age, weight, height, pulse rate
- A pulse rate of 120 is twice as fast as 60
- A person with a weight of 80 kg is twice as heavy
as one with a weight of 40 kg.
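The distinction between the interval and ratio scales can be checked with a little arithmetic. The sketch below is only an illustration (the helper name `celsius_to_kelvin` is ours, not from the slides): the ratio of two Celsius readings changes when the same temperatures are expressed in kelvin, because the Celsius zero is arbitrary, whereas the ratio of two pulse rates has a true zero behind it.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to kelvin (a scale with a true zero)."""
    return celsius + 273.15


# Interval scale: the ratio depends on where the zero was placed.
ratio_celsius = 30 / 15
ratio_kelvin = celsius_to_kelvin(30) / celsius_to_kelvin(15)

# Ratio scale: zero beats per minute really means "no beats".
ratio_pulse = 120 / 60

print(f"30 C / 15 C in Celsius units: {ratio_celsius:.3f}")
print(f"Same temperatures in kelvin:  {ratio_kelvin:.3f}")
print(f"Pulse 120 / pulse 60:         {ratio_pulse:.3f}")
```

The Celsius ratio of 2 is an artifact of the scale; in kelvin the same two temperatures differ by only about 5%.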
22Scales of Measure
- Nominal - qualitative classification of equal
value: gender, race, color, city
- Ordinal - qualitative classification which can
be rank ordered: socioeconomic status of
families
- Interval - numerical or quantitative data which
can be rank ordered and sizes compared:
temperature
- Ratio - quantitative interval data along with a
meaningful ratio: time, age.
23CONTINUOUS DATA converted to QUALITATIVE DATA
- Wt. (in kg): underweight, normal, overweight
- Ht. (in cm): short, medium, tall
25CLINIMETRICS
- Clinimetrics is a science in which qualities
are converted to meaningful quantities by using
a scoring system.
- Examples:
- (1) Apgar score, based on appearance, pulse,
grimace, activity and respiration, is used for
neonatal prognosis.
- (2) Smoking Index: no. of cigarettes, duration,
filter or not, whether pipe, cigar etc.
- (3) APACHE (Acute Physiology and Chronic Health
Evaluation) score, used to quantify the severity
of the condition of a patient
29INVESTIGATION
30- Frequency Distributions
- A Picture is Worth a Thousand Words
31Frequency Distributions
- What is a frequency distribution? A frequency
distribution is an organization of raw data in
tabular form, using classes (or intervals) and
frequencies.
- What is a frequency count? The frequency or the
frequency count for a data value is the number of
times the value occurs in the data set.
32 Frequency Distributions
- data distribution: pattern of variability
- the center of a distribution
- the ranges
- the shapes
- simple frequency distributions
- grouped and ungrouped frequency distributions
33Categorical or Qualitative Frequency Distributions
- What is a categorical frequency distribution?
- A categorical frequency distribution
represents data that can be placed in specific
categories, such as gender, blood group, hair
color, etc.
34Categorical or Qualitative Frequency
Distributions -- Example
- Example: The blood types of 25 blood donors are
given below. Summarize the data using a
frequency distribution.
AB B  A  O  B
O  B  O  A  O
B  O  B  B  B
A  O  AB AB O
A  B  AB O  A
35Categorical Frequency Distribution for the Blood
Types -- Example Continued
Note: The classes for the distribution are the
blood types.
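The table for this slide did not survive in the transcript, so here is a minimal sketch of how the categorical frequency distribution could be tallied from the 25 donor blood types listed in the example. The variable names and the added percent column are our own choices, not part of the original slide.

```python
from collections import Counter

# The 25 blood types from the example, row by row.
blood_types = (
    "AB B A O B "
    "O B O A O "
    "B O B B B "
    "A O AB AB O "
    "A B AB O A"
).split()

counts = Counter(blood_types)   # class -> frequency
total = len(blood_types)

print(f"{'Class':<6}{'Frequency':>10}{'Percent':>9}")
for blood_type in ["A", "B", "O", "AB"]:
    f = counts[blood_type]
    print(f"{blood_type:<6}{f:>10}{100 * f / total:>8.0f}%")
print(f"{'Total':<6}{total:>10}")
```

Each class is one blood type, and the frequencies add up to the 25 donors.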
36Quantitative Frequency Distributions -- Ungrouped
- What is an ungrouped frequency distribution?
- An ungrouped frequency distribution simply
lists the data values with the corresponding
frequency counts with which each value occurs.
37Quantitative Frequency Distributions Ungrouped
-- Example
- Example: The at-rest pulse rates for 16 athletes
at a meet were 57, 57, 56, 57, 58, 56, 54, 64,
53, 54, 54, 55, 57, 55, 60, and 58. Summarize
the information with an ungrouped frequency
distribution.
38Quantitative Frequency Distributions Ungrouped
-- Example Continued
Note: The (ungrouped) classes are the observed
values themselves.
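As the slide's table is missing from the transcript, the following sketch shows one way the ungrouped frequency distribution for the 16 pulse rates could be produced. It is an illustration with names of our own choosing, not the author's worked table.

```python
from collections import Counter

# At-rest pulse rates of the 16 athletes from the example.
pulse_rates = [57, 57, 56, 57, 58, 56, 54, 64,
               53, 54, 54, 55, 57, 55, 60, 58]

frequency = Counter(pulse_rates)   # observed value -> frequency count

print(f"{'Pulse rate':<12}{'Frequency':>9}")
for value in sorted(frequency):
    print(f"{value:<12}{frequency[value]:>9}")
print(f"{'Total':<12}{sum(frequency.values()):>9}")
```

Only values that actually occur appear as classes, so 59 and 61 to 63 are absent from the table.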
39Example of a simple frequency distribution
(ungrouped)
- Data: 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1
Score  f
9      3
8      2
7      2
6      1
5      4
4      4
3      3
2      3
1      3
Σf = 25
40Relative Frequency Distribution
- Proportion of the total N
- Divide the frequency of each score by N
- Rel. f = f/N
- Sum of relative frequencies should equal 1.0
- Gives us a frame of reference
41Relative Frequency
Example: The relative frequency for the ungrouped
class of 57 will be 4/16 = 0.25.
42Relative Frequency Distribution
Note: The relative frequency for a class is
obtained by computing f/N.
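A short sketch of the f/N computation, applied to the 16 athletes' pulse rates from the earlier example. The names are ours; the class of 57 should come out as 4/16 = 0.25, and the relative frequencies should sum to 1.

```python
from collections import Counter

pulse_rates = [57, 57, 56, 57, 58, 56, 54, 64,
               53, 54, 54, 55, 57, 55, 60, 58]

n = len(pulse_rates)                      # N, the total number of observations
frequency = Counter(pulse_rates)

# Relative frequency of each class: f / N
relative_frequency = {value: frequency[value] / n for value in sorted(frequency)}

print(f"{'Pulse rate':<12}{'f':>3}{'rel. f':>9}")
for value, rel_f in relative_frequency.items():
    print(f"{value:<12}{frequency[value]:>3}{rel_f:>9.4f}")
print(f"{'Total':<12}{n:>3}{sum(relative_frequency.values()):>9.4f}")
```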
43Example of a simple frequency distribution
- Data: 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1
Score  f   rel f
9      3   .12
8      2   .08
7      2   .08
6      1   .04
5      4   .16
4      4   .16
3      3   .12
2      3   .12
1      3   .12
Σf = 25   Σ rel f = 1.0
44Cumulative Frequency and Cumulative Relative
Frequency
- NOTE Sometimes frequency distributions are
displayed with cumulative frequencies and
cumulative relative frequencies as well.
45Cumulative Frequency and Cumulative Relative
Frequency
- What is a cumulative frequency for a class? The
cumulative frequency for a specific class in a
frequency table is the sum of the frequencies for
all values at or below the given class.
46Cumulative Frequency and Cumulative Relative
Frequency
- What is a cumulative relative frequency for a
class? The cumulative relative frequency for a
specific class in a frequency table is the sum of
the relative frequencies for all values at or
below the given class.
47Cumulative Frequency and Cumulative Relative
Frequency
Note Table with relative and cumulative relative
frequencies.
48Example of a simple frequency distribution
(ungrouped)
- Data: 5 7 8 1 5 9 3 4 2 2 3 4 9 7 1 4 5 6 8 9 4 3 5 2 1
Score  f   cf   rel f  rel. cf
9      3   25   .12    1.00
8      2   22   .08    .88
7      2   20   .08    .80
6      1   18   .04    .72
5      4   17   .16    .68
4      4   13   .16    .52
3      3    9   .12    .36
2      3    6   .12    .24
1      3    3   .12    .12
Σf = 25   Σ rel f = 1.0
Note: cf and rel. cf are accumulated from the lowest
score upward (all values at or below the given score).
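The four columns can be derived mechanically from the raw scores. The sketch below follows the definition given earlier in this session (the cumulative frequency of a class is the sum of the frequencies of all values at or below it), so it accumulates from the lowest score upward and then prints the rows from the highest score down. Variable names are our own.

```python
from collections import Counter
from itertools import accumulate

scores = [5, 7, 8, 1, 5, 9, 3, 4, 2, 2, 3, 4, 9,
          7, 1, 4, 5, 6, 8, 9, 4, 3, 5, 2, 1]

n = len(scores)
frequency = Counter(scores)
values = sorted(frequency)                       # ascending: 1, 2, ..., 9

# Running total of f, starting from the lowest score.
cumulative = list(accumulate(frequency[v] for v in values))

# One row per score: (score, f, cf, rel f, rel cf)
rows = [
    (v, frequency[v], cf, frequency[v] / n, cf / n)
    for v, cf in zip(values, cumulative)
]

print(f"{'Score':<6}{'f':>3}{'cf':>5}{'rel f':>8}{'rel cf':>8}")
for score, f, cf, rel_f, rel_cf in reversed(rows):   # highest score first
    print(f"{score:<6}{f:>3}{cf:>5}{rel_f:>8.2f}{rel_cf:>8.2f}")
```

The cumulative frequency of the highest score always equals N, and its cumulative relative frequency equals 1.0, which is a quick check on the arithmetic.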
49Quantitative Frequency Distributions -- Grouped
- What is a grouped frequency distribution? A
grouped frequency distribution is obtained by
constructing classes (or intervals) for the data,
and then listing the corresponding number of
values (frequency counts) in each interval.
50Tabulate the hemoglobin values of 30 adult male
patients listed below
Patient No Hb (g/dl) Patient No Hb (g/dl) Patient No Hb (g/dl)
1 12.0 11 11.2 21 14.9
2 11.9 12 13.6 22 12.2
3 11.5 13 10.8 23 12.2
4 14.2 14 12.3 24 11.4
5 12.3 15 12.3 25 10.7
6 13.0 16 15.7 26 12.5
7 10.5 17 12.6 27 11.8
8 12.8 18 9.1 28 15.1
9 13.2 19 12.9 29 13.4
10 11.2 20 14.6 30 13.1
51Steps for making a table
- Step 1: Find the Minimum (9.1) and Maximum (15.7)
- Step 2: Calculate the difference: 15.7 - 9.1 = 6.6
- Step 3: Decide the number and width of the
classes (7 class intervals): 9.0-9.9,
10.0-10.9, ...
- Step 4: Prepare a dummy table with the columns
Hb (g/dl), Tally marks, No. of patients
52DUMMY TABLE
Tally Marks TABLE
53Table Frequency distribution of 30 adult male
patients by Hb
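The finished table is not in the transcript. As a hedged sketch, the steps for making a table can be applied to the 30 Hb values like this, assuming the seven classes of width 1 g/dl (9.0-9.9 up to 15.0-15.9) suggested in Step 3; the variable names and the tally rendering are ours.

```python
import math
from collections import Counter

# Hb (g/dl) of the 30 adult male patients, in patient-number order.
hb_values = [
    12.0, 11.9, 11.5, 14.2, 12.3, 13.0, 10.5, 12.8, 13.2, 11.2,
    11.2, 13.6, 10.8, 12.3, 12.3, 15.7, 12.6, 9.1, 12.9, 14.6,
    14.9, 12.2, 12.2, 11.4, 10.7, 12.5, 11.8, 15.1, 13.4, 13.1,
]

# Steps 1 and 2: minimum, maximum and their difference.
lowest, highest = min(hb_values), max(hb_values)
value_range = round(highest - lowest, 1)

# Step 3: classes of width 1 g/dl, keyed by their lower limit (9, 10, ..., 15).
class_counts = Counter(math.floor(hb) for hb in hb_values)

# Step 4: fill in the table (class, tally marks, number of patients).
table = []
print(f"{'Hb (g/dl)':<11}{'Tally':<12}{'No. of patients':>15}")
for lower in range(math.floor(lowest), math.floor(highest) + 1):
    label = f"{lower}.0-{lower}.9"
    count = class_counts[lower]
    table.append((label, count))
    print(f"{label:<11}{'|' * count:<12}{count:>15}")
print(f"{'Total':<11}{'':<12}{len(hb_values):>15}")
```

A different choice of class width would give a different table, so the counts here hold only for the 1 g/dl classes assumed above.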
54Table Frequency distribution of adult patients
by Hb and gender
55Elements of a Table
Ideal table should have: Number, Title, Column
headings, Foot-notes
- Number: Table number for identification in a
report
- Title, place: Describe the body of the table,
variables, time period (what, how classified,
where and when)
- Column heading: Variable name, No.,
Percentages (%), etc.
- Foot-note(s): to describe some column/row
headings, special cells, source, etc.
56Table II. Distribution of 120 (Madras)
Corporation divisions according to annual death
rate based on registered deaths in 1975 and 1976
Figures in parentheses indicate percentages
57DIAGRAMS/GRAPHS
- Discrete data
- --- Bar charts (one or two groups)
- Continuous data
- --- Histogram
- --- Frequency polygon (curve)
- --- Stem-and-leaf plot
- --- Box-and-whisker plot
58Example data
68 63 42 27 30 36 28 32 79 27 22 28 24 25 44 65
43 25 74 51 36 42 28 31 28 25 45 12 57 51 12 32
49 38 42 27 31 50 38 21 16 24 64 47 23 22 43
27 49 28 23 19 11 52 46 31 30 43 49 12
59Histogram
Figure 1 Histogram of ages of 60 subjects
60Polygon
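The histogram and polygon figures are not in the transcript. The sketch below shows how the class counts behind such figures could be obtained from the 60 ages in the example data. The class width of 10 years is our assumption (the original figure's classes are unknown), and the text bars only stand in for a real plot.

```python
from collections import Counter

# Ages of the 60 subjects from the example data slide.
ages = [
    68, 63, 42, 27, 30, 36, 28, 32, 79, 27, 22, 28, 24, 25, 44, 65,
    43, 25, 74, 51, 36, 42, 28, 31, 28, 25, 45, 12, 57, 51, 12, 32,
    49, 38, 42, 27, 31, 50, 38, 21, 16, 24, 64, 47, 23, 22, 43,
    27, 49, 28, 23, 19, 11, 52, 46, 31, 30, 43, 49, 12,
]

class_width = 10  # assumed; not stated in the slides
counts = Counter((age // class_width) * class_width for age in ages)
class_lowers = range(10, 80, class_width)

# Histogram: one bar per class, bar height = class frequency.
histogram = [(lower, lower + class_width - 1, counts[lower]) for lower in class_lowers]

# Frequency polygon: join the points (class midpoint, frequency),
# taking the midpoint of the interval [lower, lower + width).
polygon_points = [(lower + class_width / 2, counts[lower]) for lower in class_lowers]

for lower, upper, f in histogram:
    print(f"{lower}-{upper}  {'#' * f} ({f})")
print("Polygon points:", polygon_points)
```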
61Example data
68 63 42 27 30 36 28 32 79 27 22 28 24 25 44 65
43 25 74 51 36 42 28 31 28 25 45 12 57 51 12 32
49 38 42 27 31 50 38 21 16 24 64 47 23 22 43
27 49 28 23 19 11 52 46 31 30 43 49 12
62Stem and leaf plot
Stem-and-leaf of Age   N = 60
Leaf Unit = 1.0
  6   1   122269
 19   2   1223344555777788888
 11   3   00111226688
 13   4   2223334567999
  5   5   01127
  4   6   3458
  2   7   49
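A stem-and-leaf display like the one above can be built by splitting each age into a tens digit (stem) and a units digit (leaf). This is a minimal sketch with names of our own; the first printed column is the number of leaves on each stem.

```python
from collections import defaultdict

# Ages of the 60 subjects from the example data slide.
ages = [
    68, 63, 42, 27, 30, 36, 28, 32, 79, 27, 22, 28, 24, 25, 44, 65,
    43, 25, 74, 51, 36, 42, 28, 31, 28, 25, 45, 12, 57, 51, 12, 32,
    49, 38, 42, 27, 31, 50, 38, 21, 16, 24, 64, 47, 23, 22, 43,
    27, 49, 28, 23, 19, 11, 52, 46, 31, 30, 43, 49, 12,
]

stems = defaultdict(list)
for age in sorted(ages):                 # sorting keeps the leaves in order
    stems[age // 10].append(age % 10)    # stem = tens digit, leaf = units digit

# Join the leaves of each stem into one string, e.g. stem 7 -> "49".
lines = {stem: "".join(str(leaf) for leaf in leaves) for stem, leaves in stems.items()}

print(f"Stem-and-leaf of Age   N = {len(ages)}")
print("Leaf Unit = 1.0")
for stem in sorted(lines):
    print(f"{len(stems[stem]):>3} {stem:>3}   {lines[stem]}")
```

Unlike a histogram, the display keeps every original value readable: stem 7 with leaves 4 and 9 stands for the ages 74 and 79.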
63Box plot
64Descriptive statistics report: Boxplot
- minimum score
- maximum score
- lower quartile
- upper quartile
- median
- mean
- the skew of the distribution:
positive skew: mean > median, high-score whisker
is longer
negative skew: mean < median, low-score whisker
is longer
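The quantities a boxplot reports can be computed directly. The sketch below does so for the 60 ages of the example data, using Python's `statistics` module. Quartile conventions differ between textbooks and software, so the `method="inclusive"` choice is an assumption and other packages may give slightly different quartiles.

```python
import statistics

# Ages of the 60 subjects from the example data slide.
ages = [
    68, 63, 42, 27, 30, 36, 28, 32, 79, 27, 22, 28, 24, 25, 44, 65,
    43, 25, 74, 51, 36, 42, 28, 31, 28, 25, 45, 12, 57, 51, 12, 32,
    49, 38, 42, 27, 31, 50, 38, 21, 16, 24, 64, 47, 23, 22, 43,
    27, 49, 28, 23, 19, 11, 52, 46, 31, 30, 43, 49, 12,
]

minimum, maximum = min(ages), max(ages)

# Three cut points dividing the sorted data into four equal parts.
lower_quartile, median, upper_quartile = statistics.quantiles(
    ages, n=4, method="inclusive"
)
mean = statistics.mean(ages)

# Rule of thumb from the slide: mean > median suggests positive skew.
if mean > median:
    skew = "positive"
elif mean < median:
    skew = "negative"
else:
    skew = "none"

print(f"min={minimum}  Q1={lower_quartile}  median={median}  "
      f"Q3={upper_quartile}  max={maximum}")
print(f"mean={mean:.2f}  skew={skew}")
```

For these ages the mean exceeds the median, which matches the longer upper tail visible in the stem-and-leaf display.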
65Pie Chart
- Circular diagram: total = 100%
- Divided into segments, each representing a
category
- Decide adjacent category
- The size of each slice of the pie is proportional
to the amount for its category
Example: The prevalence of different degrees of
hypertension in the population
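To make "proportional" concrete, each slice's angle is its share of the total multiplied by 360 degrees. The hypertension figures are not in the transcript, so this sketch uses the counts tallied from the 25 blood donors in the earlier example as stand-in data; the names are ours.

```python
# Stand-in data: blood type counts tallied from the 25-donor example.
blood_type_counts = {"A": 5, "B": 8, "O": 8, "AB": 4}

total = sum(blood_type_counts.values())

# For each category: (percent of total, angle of the slice in degrees).
slices = {
    category: (100 * count / total, 360 * count / total)
    for category, count in blood_type_counts.items()
}

for category, (percent, angle) in slices.items():
    print(f"{category:<3} {percent:5.1f}%  {angle:6.1f} degrees")
```

The percentages add up to 100 and the angles to 360, whatever the categories are.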
66Bar Graphs
- Heights of the bars indicate frequency
- Frequency on the Y axis and categories of the
variable on the X axis
- The bars should be of equal width and should not
touch the other bars
Example: The distribution of risk factors among
cases with cardiovascular diseases
67HIV cases enrolment in USA by gender
Bar chart
68HIV cases Enrollment in USA by gender
Stacked bar chart
69Graphic Presentation of Data
the frequency polygon (quantitative data)
the histogram (quantitative data)
the bar graph (qualitative data)
71General rules for designing graphs
- A graph should have a self-explanatory legend
- A graph should help the reader to understand the
data
- Axes should be labeled, with units of measurement
indicated
- Scales are important. Start with zero (otherwise
use a // break)
- Avoid graphs with a three-dimensional impression;
they may be misleading (the reader visualizes
them less easily)