Title: Introduction to Biostatistics
1 Introduction to Biostatistics
By Desta Markos (MPH, Epidemiology and Biostatistics)
Wolaita Sodo University, School of Public Health
Department of Epidemiology and Biostatistics
2 Learning Objectives
- Define statistics and biostatistics
- Identify the branches of statistics
- Enumerate the importance and limitations of statistics
- Define and identify the different types of data and understand why we need to classify variables
- Identify the different methods of data collection and the criteria used to select a method of data collection
- Define a questionnaire, identify the different parts of a questionnaire and indicate the procedures for preparing one
3 Introduction to biostatistics
- Statistics is the science of gaining information from data through:
  - Collection
  - Organization
  - Presentation
  - Analysis and drawing conclusions (inferences) from data
- Statistics is also the summary of information (data) in a meaningful fashion, and its appropriate presentation.
4 Introduction to biostatistics
- Biostatistics is the segment of statistics that deals with data arising from biological processes or medical experiments.
- When the data being analyzed are derived from the biological sciences and medicine, we use the term biostatistics.
- It has a central role in medical investigations.
- It is concerned with the interpretation of biological data and the communication of information about the data.
5 Branches of Statistics
- 1. Descriptive statistics
- Refers to the different methods applied to organize, summarize and present data in a form that makes them easier to analyze and interpret (tabulation, graphical presentation, computation of averages as well as measures of variability).
- Ways of organizing and summarizing data.
- Helps to identify the general features and trends in a set of data and to extract useful information.
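As a minimal sketch of the "computation of averages as well as measures of variability" mentioned above, the following uses Python's standard `statistics` module. The blood pressure readings are hypothetical values made up for illustration, not data from this lecture.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg); illustrative data only.
systolic_bp = [118, 122, 130, 125, 140, 135, 128, 122, 150, 120]

# Averages (measures of central tendency).
mean_bp = statistics.mean(systolic_bp)
median_bp = statistics.median(systolic_bp)

# Measures of variability.
sd_bp = statistics.stdev(systolic_bp)           # sample standard deviation (n - 1 denominator)
range_bp = max(systolic_bp) - min(systolic_bp)  # largest minus smallest value

print(f"mean={mean_bp}, median={median_bp}, sd={sd_bp:.2f}, range={range_bp}")
```

These four numbers summarize the ten readings without drawing any conclusion beyond the data at hand, which is what separates description from inference.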
6 Branches of Biostatistics
- 2. Inferential statistics
- Is the process of generalizing or drawing conclusions about the target population on the basis of results obtained from a sample.
- Inferences are drawn from particular properties of the sample to the corresponding properties of the population.
- Inferential statistics builds upon descriptive statistics.
- Examples: principles of probability, estimation, hypothesis testing, etc.
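To illustrate estimation, one of the inferential techniques named above, here is a rough sketch that estimates a population proportion from a sample and attaches an approximate 95% confidence interval. It assumes the normal-approximation (Wald) interval, which is only one of several possible methods; the function name `proportion_ci` and the sample counts are hypothetical.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Sample proportion with a normal-approximation (Wald) confidence interval.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    p = successes / n
    standard_error = math.sqrt(p * (1 - p) / n)
    return p, p - z * standard_error, p + z * standard_error

# Hypothetical sample: 60 of 200 surveyed adults have the attribute of interest.
p_hat, lower, upper = proportion_ci(60, 200)
print(f"estimate={p_hat:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

The sample proportion is a descriptive summary; the interval is the inferential step, since it makes a statement about the population from which the sample was drawn.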
7 Statistical Methods
[Diagram: Biostatistics divides into two branches]
- Descriptive statistics: collection, organizing, summarizing and presenting of data
- Inferential statistics: making inferences, hypothesis testing, determining relationships, making predictions
8 Uses of biostatistics
- Provide a way of organizing information
- Assessment of health status
- Health program evaluation
- Resource allocation
- Magnitude of association
  - Strong vs weak association between exposure and outcome
9 Uses of biostatistics
- Assessing risk factors
  - Cause-effect relationship
- E.g. evaluation of a new vaccine or drug
  - How effective is the vaccine (drug)?
  - Is the effect due to chance or some bias?
- Drawing of inferences
  - Information from sample to population
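The question "How effective is the vaccine?" can be made concrete with a small calculation. The sketch below assumes the usual definition of vaccine efficacy as one minus the risk ratio; the trial counts and the helper name `risk` are hypothetical and serve only as illustration.

```python
def risk(cases, group_size):
    """Proportion of a group that developed the disease."""
    return cases / group_size

# Hypothetical trial counts, made up for illustration.
risk_vaccinated = risk(cases=10, group_size=1000)
risk_unvaccinated = risk(cases=50, group_size=1000)

# Risk ratio: how the risk in the vaccinated compares with the unvaccinated.
risk_ratio = risk_vaccinated / risk_unvaccinated

# Vaccine efficacy: proportionate reduction in risk among the vaccinated.
vaccine_efficacy = 1 - risk_ratio

print(f"risk ratio={risk_ratio:.2f}, efficacy={vaccine_efficacy:.0%}")
```

A risk ratio far from 1 suggests a strong association between exposure and outcome. Whether the observed effect could be due to chance is a separate question that calls for inferential methods such as hypothesis testing.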
10 Limitations of statistics
- It deals only with those subjects of inquiry that are capable of being quantitatively measured and numerically expressed.
- It deals with aggregates of facts, and no importance is attached to individual items.
- Statistical data are only approximately, not mathematically, correct.
11 Variable
- A variable is a characteristic of a person, object or phenomenon that can take on different values.
- Any aspect of an individual or object that is measured (e.g. BP) or recorded (e.g. age, sex) and can take any value.
- There may be one variable in a study or many.
- Variables can be broadly classified into:
  - Categorical (or qualitative) variables and
  - Numerical (or quantitative) variables.
12 1. Categorical variable
- A variable which cannot be measured in quantitative form but can only be sorted by name or category.
- It cannot be measured the way we measure height or weight.
- The notion of magnitude is absent or implicit.
13 Categorical variables are divided into two types
- I) Nominal
- The simplest type of data, in which the values fall into unordered categories or classes.
- Uses names, labels or symbols to assign each measurement.
- Examples: blood type, sex, race, marital status
- II) Ordinal
- The observations are classified into categories that can be ordered in an ascending series.
- Although non-numerical, the categories can have a natural ordering.
- The spaces or intervals between the categories are not necessarily equal.
- Examples: patient status, cancer stages, social class
14 2. Quantitative variable
- A variable that can be measured or counted and expressed numerically.
- E.g. height, weight of children, etc.
- Has the notion of magnitude.
15 Quantitative variables are divided into two types
- 1. Discrete variable
- The numbers represent actual measurable quantities rather than mere labels.
- Discrete data are restricted to taking only specified values, often integers or counts, that differ by fixed amounts.
- E.g. the number of episodes of diarrhoea a child has had in a year.
- Characterized by gaps or interruptions in the values.
- Both the order and the magnitude of the values matter.
- The values are not just labels, but are actual measurable quantities.
16 2. Continuous variable
- Represents measurable quantities that are not restricted to taking on certain specific values, i.e. fractional values are possible.
- It can have an infinite number of possible values in any given interval.
- Both the magnitude and the order of the values matter.
- Does not possess gaps or interruptions.
- E.g. weight, height, etc.
17 Types of Variable
[Diagram: classification of variables]
- Qualitative or categorical
  - Nominal (not ordered), e.g. ethnic group
  - Ordinal (ordered), e.g. response to treatment
- Quantitative (measurement)
  - Discrete (count data), e.g. number of admissions
  - Continuous (real-valued), e.g. height
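The four-way classification above can be sketched as a small helper. This is a rough heuristic under stated assumptions, not a definitive rule: it assumes categories are stored as text, counts as integers and measurements as decimals, so numerically coded categories would be misclassified. The names `VariableType` and `classify` are hypothetical.

```python
from enum import Enum

class VariableType(Enum):
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    DISCRETE = "discrete"
    CONTINUOUS = "continuous"

def is_number(value, kinds):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, kinds) and not isinstance(value, bool)

def classify(values, ordered_levels=None):
    """Guess the variable type from how the values are stored (heuristic only)."""
    if all(isinstance(v, str) for v in values):
        # Text categories are ordinal only if the analyst supplies an ordering.
        return VariableType.ORDINAL if ordered_levels else VariableType.NOMINAL
    if all(is_number(v, int) for v in values):
        return VariableType.DISCRETE
    if all(is_number(v, (int, float)) for v in values):
        return VariableType.CONTINUOUS
    raise TypeError("mixed or unsupported value types")

print(classify(["A", "B", "O", "AB"]))                      # blood group
print(classify(["mild", "severe"],
               ordered_levels=["mild", "moderate", "severe"]))
print(classify([0, 2, 5]))                                  # counts of admissions
print(classify([61.5, 70.2]))                               # weight in kg
```

In practice the type of a variable is decided from what it measures, not from how it happens to be stored, so a helper like this can only serve as a first check.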
18 Depending on the scale of measurement
- Not all measurements are the same.
- Measuring weight, e.g. 40 kg
- Measuring the status of a patient on a scale: improved, stable, not improved.
- There are four types of scales of measurement.
19 1. Nominal scale
- The simplest type of data, in which the values fall into unordered categories or classes.
- Uses names, labels or symbols to assign each measurement.
- The categories of the variable are exhaustive and mutually exclusive.
- It is the lowest level of measurement.
- Examples: blood type, sex, race, marital status
20 Example of nominal scale
- Race/Ethnicity
  - Black
  - White
  - Latino
  - Other
- Any numbers used to code these categories have NO meaning
- They are labels only
21
- If nominal data take only two possible values, they are called dichotomous or binary.
- E.g. sex is dichotomous (male or female).
- Yes/no questions
- E.g. cured from TB at 6 months of treatment (Rx)
- With nominal scale data the obvious and intuitive descriptive summary measure is the proportion or percentage of subjects who exhibit the attribute.
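The proportion or percentage described above can be computed by counting how often each category occurs. The following is a minimal sketch with hypothetical treatment outcomes; the data are invented for illustration.

```python
from collections import Counter

# Hypothetical outcomes for ten TB patients at 6 months of treatment.
outcomes = ["cured", "cured", "not cured", "cured", "not cured",
            "cured", "cured", "cured", "not cured", "cured"]

counts = Counter(outcomes)   # frequency of each category
n = len(outcomes)

# Proportion of subjects in each category.
proportions = {category: count / n for category, count in counts.items()}
percent_cured = 100 * proportions["cured"]

print(counts)
print(f"{percent_cured:.0f}% cured")
```

No mean or median is computed here because the categories of a nominal variable have no order or magnitude.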
22 2. Ordinal scale
- Assigns each measurement to one of a limited number of categories that are ranked in terms of order.
- Although non-numerical, the categories can be considered to have a natural ordering.
- Examples: patient status; severity of an illness may be categorized as mild, moderate or severe.
23 Example of ordinal scale
- The numbers have LIMITED meaning: 4 > 3 > 2 > 1 is all we know, apart from their utility as labels
- Pain level
  - None
  - Mild
  - Moderate
  - Severe
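The pain-level example can be sketched in code to show what an ordinal scale does and does not allow. The rank numbers 1 to 4 and the patient responses below are hypothetical; only the ordering of the categories comes from the slide.

```python
import statistics

# Categories listed from lowest to highest; the ranks carry order only.
PAIN_LEVELS = ["none", "mild", "moderate", "severe"]
RANK = {level: position for position, level in enumerate(PAIN_LEVELS, start=1)}

# Hypothetical responses from seven patients.
responses = ["mild", "severe", "none", "moderate", "mild", "moderate", "severe"]
ranks = [RANK[response] for response in responses]

# Order comparisons are meaningful on an ordinal scale.
severe_is_worse_than_mild = RANK["severe"] > RANK["mild"]

# The median category is meaningful because it depends only on order.
median_level = PAIN_LEVELS[statistics.median_low(ranks) - 1]

print(severe_is_worse_than_mild, median_level)
```

A mean of the ranks is deliberately not computed: because the intervals between categories are not necessarily equal, the step from "mild" to "moderate" cannot be assumed to equal the step from "moderate" to "severe".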
24 3. Interval scale
- Measured on a continuum
- Differences between any two numbers on the scale are of known size.
25 4. Ratio scale
- Measurement begins at a true zero point and the scale has equal intervals.
- Examples: height, weight, BP, etc.
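The practical difference between the interval and ratio scales shows up when values are divided. The sketch below uses temperature as an assumed example: Celsius is an interval scale because its zero point is arbitrary, while Kelvin starts at a true zero. The two temperatures are hypothetical.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius (arbitrary zero) to Kelvin (true zero)."""
    return celsius + 273.15

cool, warm = 10.0, 20.0   # hypothetical temperatures in degrees Celsius

# Differences are meaningful on an interval scale and survive the conversion.
difference_c = warm - cool
difference_k = celsius_to_kelvin(warm) - celsius_to_kelvin(cool)

# Ratios are not: 20 °C is not "twice as hot" as 10 °C.
naive_ratio = warm / cool
true_ratio = celsius_to_kelvin(warm) / celsius_to_kelvin(cool)

print(difference_c, round(difference_k, 2), naive_ratio, round(true_ratio, 4))
```

The difference is 10 degrees on both scales, but the ratio changes from 2.0 to about 1.035 once a true zero is used. For ratio-scale variables such as weight, by contrast, 40 kg really is twice 20 kg.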
26 [Diagram: degree of precision in measuring increases from Nominal to Ordinal to Interval to Ratio]
27 Exercises
- Give the correct scale of measurement for each variable:
  - Blood group
  - Temperature (Celsius)
  - Hair colour
  - Job satisfaction index (1-5)
  - Number of heart attacks
  - Calendar year
  - Serum uric acid (mg/100ml)
  - Number of accidents in a 3-year period
  - Number of cases of each reportable disease reported by a health worker
  - The average weight gain of six 1-year-old dogs with a special diet supplement was 950 grams last month
  - Ethnic group
28 Definition of terms
- Census: complete enumeration of the population
- Sampling: the technique of selecting a representative portion of the entire population
- Data
  - Numbers which can be measurements or can be obtained by counting
  - The raw material for statistics
  - Can be obtained from routinely kept records, surveys, counting, experiments and reports
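As an illustration of sampling versus a census, the sketch below draws a simple random sample from a numbered list. Simple random sampling is assumed here as one possible technique; the slide does not name a specific method. The household IDs, sample size and seed are hypothetical.

```python
import random

# Hypothetical sampling frame: households numbered 1 to 100.
population = list(range(1, 101))

# A census would study all 100 households; a sample studies only a portion.
rng = random.Random(42)                 # fixed seed so the draw can be repeated
sample = rng.sample(population, k=10)   # simple random sample without replacement

sampling_fraction = len(sample) / len(population)
print(sorted(sample), sampling_fraction)
```

Because every household has the same chance of selection, the sample is expected to be representative on average, although any single draw can still differ from the population by chance.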
29 Sources of Data
- Primary sources of data: require the involvement of the researcher himself or herself. Censuses and sample surveys are sources of primary data.
- Secondary sources of data: the data are obtained from already collected sources such as newspapers, magazines, CSA, DHS, hospital records and existing data like:
  - Mortality reports
  - Morbidity reports
  - Epidemic reports
  - Reports of laboratory utilization (including laboratory test results)
30 Techniques of Primary Data collection
- Data collection is a crucial stage in the planning and implementation of a study.
- If the data collection has been superficial, biased or incomplete, data analysis becomes difficult, and the research report will be of poor quality.
- Therefore, we should concentrate all possible efforts on developing appropriate tools, and should test them several times.
31 Techniques of Primary Data collection
- Observation
- Is a technique that involves systematically selecting, watching and recording behavior and characteristics of living beings, objects or phenomena.
- It can be undertaken in different ways:
  - Participant observation: the observer takes part in the situation he or she observes.
  - Non-participant observation: the observer watches the situation, openly or concealed, but does not participate.
32 Techniques of Primary Data collection
- Observations can give additional, more accurate information on the behavior of people than interviews or questionnaires.
- Observations can also be made on objects.
- For example, the presence or absence of a latrine and its state of cleanliness may be observed.
- In such cases, observation would be the major research technique.
33 Techniques of Primary Data collection
- Interview (face-to-face)
- Is a data-collection technique that involves oral questioning of respondents, either individually or as a group.
- Answers to the questions posed during an interview can be recorded by:
  - writing them down (either during the interview itself or immediately after the interview), or
  - tape-recording the responses, or
  - a combination of both.
34 Techniques of Primary Data collection
- Administered written questionnaire: a data collection tool in which written questions are presented that are to be answered by the respondents in written form.
- It can be administered in different ways, such as by:
  - Sending questionnaires by mail with clear instructions on how to answer the questions and asking for mailed responses
  - Gathering all or part of the respondents in one place at one time, giving oral or written instructions, and letting the respondents fill out the questionnaires
  - Hand-delivering questionnaires to respondents and collecting them later
35 Techniques of Primary Data collection
- Focus group discussions
- Allow a group of 8-12 informants to freely discuss a certain subject with the guidance of a facilitator or reporter.
- In-depth interview
- It is a conversation between the researcher and the respondent about the research area or topic.
- It is designed to allow respondents to tell their story in their own way.
- Issues are covered in detail; the respondent leads the interview and sets the agenda; there is no fixed order.
36 Types of questions
- Depending on how questions are asked and recorded, we can distinguish two major possibilities.
- 1. Open-ended questions (allowing for completely open as well as partially categorized answers).
- They permit free responses, which should be recorded in the respondents' own words.
37 Types of questions
- Such questions are useful for obtaining in-depth information on:
  - facts with which the researcher is not very familiar,
  - opinions, attitudes and suggestions of informants, or
  - sensitive issues.
38 Types of questions
- Example
- 'What is your opinion on the services provided in the ANC?' (Explain why.)
- 'What do you think are the reasons some adolescents in this area start using drugs?'
- 'What would you do if you noticed that your daughter (school girl) had a relationship with a teacher?'
39 Types of questions
- Advantages of open-ended questions
  - Allow you to probe more deeply into issues of interest being raised.
  - Information provided in the respondents' own words might be useful.
- Risks of completely open-ended questions
  - A big risk is incomplete recording of all relevant issues covered in the discussion.
  - Analysis is time-consuming and requires experience; otherwise important data may be lost.
40 Types of questions
- 2. Closed questions have a list of possible options or answers from which the respondents must choose.
- Closed questions are most commonly used for background variables such as age, marital status or education, although in the case of age and education you may also take the exact values and categorize them during data analysis.
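Recording exact values and categorizing them during analysis can be sketched as follows. The age cut-points, group labels and respondent ages are hypothetical choices made for illustration; a real study would take its categories from the study objectives.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical cut-points: each value is the lower bound of the next age group.
AGE_CUTPOINTS = [15, 25, 35, 45]
AGE_LABELS = ["<15", "15-24", "25-34", "35-44", "45+"]

def age_group(age):
    """Map an exact age in years to its age-group label."""
    return AGE_LABELS[bisect_right(AGE_CUTPOINTS, age)]

# Exact ages as recorded in a hypothetical questionnaire.
ages = [14, 15, 24, 25, 37, 45, 62]
groups = [age_group(age) for age in ages]

print(Counter(groups))
```

Keeping the exact values means the grouping can be changed later without returning to the respondents, which is not possible if only a pre-coded category was recorded.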
41 Types of questions
- Example
- 'Women who have induced abortion should be severely punished.'
42 Types of questions
- Advantages of closed-ended questions
  - It saves time.
  - Comparing responses of different groups, or of the same group over time, becomes easier.
- Risks of closed-ended questions
  - In the case of illiterate respondents, bias will be introduced.
  - Many choices can be confusing.
  - Can't tell if the respondent misinterpreted the question.
  - Fine distinctions may be lost.
43 Steps in designing questionnaire
- Step 1: Content
- Step 2: Formulating questions
- Step 3: Sequencing the questions
- Step 4: Formatting the questionnaire
- Step 5: Translation
- Step 6: Pre-test
44 Steps in designing questionnaire
- 1. Content: take your objectives and variables as a starting point.
- Decide what questions will be needed to measure or to define your variables and reach your objectives.
- 2. Formulating questions: formulate one or more questions that will provide the information needed for each variable.
- Check whether each question measures one thing at a time.
- Avoid leading questions.
- Ask sensitive questions in a socially acceptable way.
- Take care that questions are specific and precise enough that different respondents do not interpret them differently.
45 Cont
- For example, the question 'How large an interval would you and your husband prefer between two successive births?' would better be divided into two questions, because husband and wife may have different opinions on the preferred interval.
- A question is leading if it suggests a certain answer. For example, the question 'Do you agree that the district health team should visit each health center monthly?' hardly leaves room for 'no' or for other options.
- Better would be: 'Do you think that district health teams should visit each health center? If yes, how often?'
46 Steps in designing questionnaire
- 3. Sequencing the questions: design your interview schedule or questionnaire to be 'informant friendly'.
- Arrange questions in a logical sequence.
- Group questions by topic, and place a few sentences of transition between topics.
47 Cont
- Pose more sensitive questions as late as possible in the interview (e.g. questions pertaining to income, sexual behavior, or diseases with stigma attached to them).
- Use simple everyday language.
48 Cont'd
- 4. Formatting the questionnaire
- When you finalize your questionnaire, be sure that:
- A separate, introductory page is attached to each questionnaire,
49 Steps in designing questionnaire
  - explaining the purpose of the study,
  - requesting the informant's consent to be interviewed,
  - assuring confidentiality of the data obtained.
- Each questionnaire has a heading and space to insert the number, date and location of the interview.
- You may add the name of the interviewer, to facilitate quality control.
50 Steps
- 5. Translation
- If interviews will be conducted in one or more local languages, the questionnaire has to be translated to standardize the way questions will be asked.
- After having it translated, you should have it retranslated into the original language. You can then compare the two versions for differences and make a decision concerning the final phrasing of difficult concepts.
- 6. Pre-test
- Include a thank you after the last question.
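The comparison of the original and retranslated versions described under Translation can be given a rough first pass in code. The sketch below uses `difflib.SequenceMatcher` from the Python standard library to flag questions whose back-translation differs noticeably from the original. The question texts, question IDs and the 0.9 threshold are hypothetical, and a word-overlap score is only a crude screen that does not replace review by people who know both languages.

```python
from difflib import SequenceMatcher

def similarity(original, back_translation):
    """Word-level similarity between two versions, from 0.0 to 1.0."""
    return SequenceMatcher(
        None, original.lower().split(), back_translation.lower().split()
    ).ratio()

# Hypothetical pairs: (original question, version retranslated into the original language).
pairs = {
    "Q1": ("How old are you?", "How old are you?"),
    "Q2": ("How many children do you have?", "How many kids do you have?"),
}

THRESHOLD = 0.9   # assumed cut-off; below this the question is reviewed by hand

flagged = [
    question_id
    for question_id, (original, back) in pairs.items()
    if similarity(original, back) < THRESHOLD
]

print(flagged)
```

A flagged question is not necessarily mistranslated; here "children" and "kids" mean the same thing, and deciding that is exactly the judgement about final phrasing that is left to the research team.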