Title: HIM 3200 Midterm Review
1 HIM 3200 Midterm Review
2 Mid-term review
- Types of data
- Normal distribution
- Variance
- Standard deviation and z scores
- 2 X 2 table
- Hypothesis testing H0 HA
- t-test
- Pearson r/Linear regression
- Chi square
3 Measurements
- Frequency
- Incidence
  - The frequency of new occurrences of disease, injury, or death in the study population during the time being examined.
- Prevalence
  - The number of persons in a defined population who have a specified disease or condition.
  - Point prevalence: prevalence at a particular point in time.
  - Period prevalence: the sum of the point prevalence at the beginning of the interval plus the incidence during the interval.
4 Measurements
- Frequency
- Incidence
- Prevalence
- Risk
  - The proportion of persons who are unaffected at the beginning of a study period but who undergo the risk event during the study period.
5
- Risk event
  - Death
  - Disease
  - Injury
- Cohort
  - Persons at risk for the event.
6 Measurements
- Frequency
- Incidence
- Prevalence
- Risk
  - The proportion of persons who are unaffected at the beginning of a study period but who undergo the risk event during the study period.
- Rates
  - The frequency of events that occur in a defined time period, divided by the average population at risk.
7 Rates
Rate = (Numerator / Denominator) x Constant multiplier
- The constant multiplier is usually 100, 1,000, 10,000, or 100,000.
- Types of rates
  - Incidence rates (e.g., per 1,000)
  - Prevalence rates (proportional, e.g., 20%)
  - Incidence density (frequency of new events per person-time)
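The rate formula can be sketched in a few lines of Python. This is only an illustration: the function name `rate` and the case counts are hypothetical and do not come from the course material.

```python
def rate(numerator: float, denominator: float, multiplier: int = 1000) -> float:
    """Return the number of events per `multiplier` persons at risk."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    # Multiply first so that whole-number inputs stay exact where possible.
    return numerator * multiplier / denominator


# Hypothetical example: 15 new cases in an average population at risk of 5,000.
incidence_per_1000 = rate(15, 5000, multiplier=1000)
print(f"Incidence rate: {incidence_per_1000:.1f} per 1,000")
```

Changing `multiplier` to 100 or 100,000 expresses the same proportion as a percentage or as a rate per 100,000.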
8
- Equations for the most commonly used population data
  - Mortality: Table 1 10, p. 18, Osborn text
  - Morbidity: Table 1 11, p. 21, Osborn text
9 Differential and nondifferential error
- Bias is a differential error
  - A nonrandom, systematic, or consistent error in which the values tend to be inaccurate in a particular direction.
- Nondifferential errors are random errors
10 Bias
- Three most problematic forms of bias in medicine
- 1. Selection (Sampling) Bias
  The following biases distort results because of the selection process:
  - Admission rate (Berkson's) bias
    - Distortions in risk ratios occur when hospital admission rates differ among cases with the risk factor, cases without the risk factor, and controls with the risk factor, so that greatly different risk-factor probabilities interfere with the outcome of interest.
  - Nonresponse bias
    - e.g., noncompliance of people who have scheduled interviews in their home.
  - Lead time bias
    - A time differential between diagnosis and treatment among sample subjects may result in erroneous attribution of higher survival rates to superior treatment rather than early detection.
11 Bias
- Three most problematic forms of bias in medicine
- 1. Selection (Sampling) Bias
  - Admission rate (Berkson's) bias
  - Nonresponse bias
  - Lead time bias
- 2. Information (misclassification) Bias
  - Recall bias
    - Differentials in memory capabilities of sample subjects.
  - Interview bias
    - Blinding of interviewers to diseased and control subjects is often difficult.
  - Unacceptability bias
    - Patients reply with desirable answers.
12 Bias
- Three most problematic forms of bias in medicine
- 1. Selection (Sampling) Bias
  - Admission rate (Berkson's) bias
  - Nonresponse bias
  - Lead time bias
- 2. Information (misclassification) Bias
  - Recall bias
  - Interview bias
  - Unacceptability bias
- 3. Confounding
  - A confounding variable has a relationship with both the dependent and independent variables that masks or potentiates the effect of the variable on the study.
13 Neyman bias
- Called late-look bias if it results in selecting fewer individuals with severe disease because they died before detection.
- Called length bias in screening programs, which tend to select less aggressive cases for treatment.
14 2 x 2 table comparing the test results of two observers

                              Observer No. 1
                          Positive   Negative   Total
Observer No. 2  Positive     a          b       a + b
                Negative     c          d       c + d
                Total      a + c      b + d     a + b + c + d
15

                 Disease +   Disease -   Total
Test +              A           B        A + B
Test -              C           D        C + D
Total             A + C       B + D      A + B + C + D

- Sensitivity = A / (A + C)
- Specificity = D / (B + D)
- False-positive rate = B / (B + D)
- False-negative rate = C / (A + C)
- Positive predictive value = A / (A + B)
- Negative predictive value = D / (C + D)
- Accuracy = (A + D) / (A + B + C + D)
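The seven formulas can be checked with a short Python sketch. It assumes the usual cell layout (A = true positives, B = false positives, C = false negatives, D = true negatives). The function name `diagnostic_metrics` and the counts are hypothetical, chosen only for illustration.

```python
def diagnostic_metrics(a: int, b: int, c: int, d: int) -> dict:
    """Compute 2 x 2 table measures.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "false_positive_rate": b / (b + d),
        "false_negative_rate": c / (a + c),
        "positive_predictive_value": a / (a + b),
        "negative_predictive_value": d / (c + d),
        "accuracy": (a + d) / (a + b + c + d),
    }


# Hypothetical counts, not taken from the course material.
metrics = diagnostic_metrics(a=80, b=10, c=20, d=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and the false-negative rate always sum to 1, as do specificity and the false-positive rate, which is a quick way to catch an arithmetic slip.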
16 Types of Variation
- Nominal variables
- Dichotomous (Binary) variables
- Ordinal (Ranked) variables
- Continuous (Dimensional) variables
- Ratio variables
- Risks and Proportions as variables
17 Nominal
- Blood type: A, B, AB, O
- Social Security Number: 123 45 6789, 312 65 8432, 555 44 7777
18 Types of Variation
- Nominal variables
- Dichotomous (Binary) variables
- Ordinal (Ranked) variables
- Continuous (Dimensional) variables
- Ratio variables
- Risks and Proportions as variables
19 Dichotomous (Binary) variables
- WNL / Not WNL
- Normal / Abnormal
- Accept / Reject
20 Types of Variation
- Nominal variables
- Dichotomous (Binary) variables
- Ordinal (Ranked) variables
- Continuous (Dimensional) variables
- Ratio variables
- Risks and Proportions as variables
21 Ordinal (Ranked) variables
- Strongly agree: a or 1
- Agree: b or 2
- Neutral: c or 3
- Disagree: d or 4
- Strongly disagree: e or 5
22 Types of Variation
- Nominal variables
- Dichotomous (Binary) variables
- Discrete variables
- Ordinal (Ranked) variables
- Continuous (Dimensional) variables
- Ratio variables
- Risks and Proportions as variables
23 Continuous (Dimensional) variables
- Temperature (e.g., 32 °F)
- Height
- Blood pressure
- Weight
24 Types of Variation
- Nominal variables
- Dichotomous (Binary) variables
- Discrete variables
- Ordinal (Ranked) variables
- Continuous (Dimensional) variables
- Ratio variables
- Risks and Proportions as variables
25 Ratio variables
- A continuous scale that has a true zero point
26 Measures of Central Tendency
- Mode: the value with the highest number of observations in a data set.
- Median: the middle observation when data have been arranged from highest to lowest.
- Mean (arithmetic): the average value of all observed values.
Mean = x̄ = Σ(xi) / N
where Σ = sum, xi = observed values, and N = total number of observations.
27 Raw data and results of cholesterol levels in 26 subjects (p. 115)
- Number of observations, N = 26
- Initial HDL values: 31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57, 58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90 mg/dl
- Highest value = 90 mg/dl
- Lowest value = 31 mg/dl
- Modes = 47, 48, 58, 60 mg/dl
- Median = (57 + 58)/2 = 57.5 mg/dl
- Sum of the values, Σ(xi) = 1496 mg/dl
- Mean, x̄ = 1496/26 = 57.5 mg/dl
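As a quick check of these figures, the sketch below recomputes the mode, median, and mean with Python's standard `statistics` module. The list is the set of 26 HDL values that is consistent with the stated N, sum, and modes (48 appears twice). Variable names are illustrative.

```python
import statistics

# 26 HDL values in mg/dl; their sum is 1496, as stated on the slide.
hdl = [31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57,
       58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90]

n = len(hdl)                        # number of observations
total = sum(hdl)                    # sum of the values
mean = statistics.mean(hdl)         # arithmetic mean
median = statistics.median(hdl)     # average of the two middle values when n is even
modes = statistics.multimode(hdl)   # every value tied for the highest count

print(f"N = {n}, sum = {total}")
print(f"mean = {mean:.1f}, median = {median}, modes = {modes}")
```

Because four values each occur twice, the data set has four modes rather than one, which is why `multimode` is used instead of `mode`.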
28 Percentiles (quantiles)
- The median is the 50th percentile.
- The 75th percentile is the point where 75% of observations lie below and 25% are above (3rd quartile, Q3).
- The 25th percentile is the point where 25% of observations lie below and 75% are above (1st quartile, Q1).
- Interquartile range = Q3 - Q1
29 Raw data and results of cholesterol levels in 26 subjects (p. 115)
- Number of observations, N = 26
- Initial HDL values: 31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57, 58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90 mg/dl
- Highest value = 90 mg/dl
- Lowest value = 31 mg/dl
- Modes = 47, 48, 58, 60 mg/dl
- Median = (57 + 58)/2 = 57.5 mg/dl
- Sum of the values, Σ(xi) = 1496 mg/dl
- Mean, x̄ = 1496/26 = 57.5 mg/dl
- Interquartile range = 64 - 48 = 16 mg/dl
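The interquartile range of 16 mg/dl can be reproduced by taking Q1 and Q3 as the medians of the lower and upper halves of the sorted data. This is one common convention and is assumed here; other quartile definitions (including the defaults in some software) give slightly different values. The list is the set of 26 HDL values whose sum is 1496 mg/dl.

```python
import statistics

# 26 HDL values in mg/dl (sum = 1496).
hdl = sorted([31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57,
              58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90])

half = len(hdl) // 2
lower_half = hdl[:half]    # the 13 smallest values
upper_half = hdl[-half:]   # the 13 largest values

q1 = statistics.median(lower_half)   # 25th percentile (median-of-halves convention)
q3 = statistics.median(upper_half)   # 75th percentile (median-of-halves convention)
iqr = q3 - q1

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr} mg/dl")
```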
30 Measures of dispersion based on the mean
- Mean deviation
- Variance
- Standard deviation s
31 Raw data and results of cholesterol levels in 26 subjects (p. 115)
- Number of observations, N = 26
- Initial HDL values: 31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57, 58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90 mg/dl
- Highest value = 90 mg/dl
- Lowest value = 31 mg/dl
- Modes = 47, 48, 58, 60 mg/dl
- Median = (57 + 58)/2 = 57.5 mg/dl
- Sum of the values, Σ(xi) = 1496 mg/dl
- Mean, x̄ = 1496/26 = 57.5 mg/dl
- Interquartile range = 64 - 48 = 16 mg/dl
- Sum of squares (TSS) = 4,298.46 (mg/dl)²
- Variance, s² = 171.94 (mg/dl)²
- Standard deviation, s = √171.94 = 13.1 mg/dl
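The dispersion figures follow from TSS = Σ(xi - x̄)², s² = TSS / (N - 1), and s = √s². A minimal Python sketch, using the 26 HDL values whose sum is 1496 mg/dl, reproduces them. The N - 1 denominator is inferred from the slide's own numbers (4,298.46 / 25 = 171.94).

```python
import math

# 26 HDL values in mg/dl (sum = 1496).
hdl = [31, 41, 44, 46, 47, 47, 48, 48, 49, 52, 53, 54, 57,
       58, 58, 60, 60, 62, 63, 64, 67, 69, 70, 77, 81, 90]

n = len(hdl)
mean = sum(hdl) / n

# Total sum of squares: squared deviations of each value from the mean.
tss = sum((x - mean) ** 2 for x in hdl)

# Sample variance uses N - 1 in the denominator.
variance = tss / (n - 1)
sd = math.sqrt(variance)

print(f"TSS = {tss:.2f}, variance = {variance:.2f}, SD = {sd:.1f}")
```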
32 Theoretical normal (Gaussian) distribution
- μ stands for the mean in a theoretical distribution.
- σ stands for the standard deviation in a theoretical population.
33 Theoretical normal distribution with standard deviations
[Figure: normal curve; horizontal axis marked at -3σ, -2σ, -σ, μ, σ, 2σ, 3σ, with the corresponding z scores -3, -2, -1, 0, 1, 2, 3]
34 Three Common Areas Under the Curve
- Three Normal distributions with different areas
35 Process of Testing Hypotheses
- Tests are designed to determine the probability that a finding represents a true deviation from what is expected.
- This chapter focuses on the justification for and interpretation of the p value, which is designed to minimize type I error.
- Science is based on the following principles:
  - Previous experience serves as the basis for developing hypotheses.
  - Hypotheses serve as the basis for developing predictions.
  - Predictions must be subjected to experimental or observational testing.
36 Hypothesis testing

                          Truth
Decision       H0 True             H0 False
Accept H0      a: Correct          b: Type II error
Reject H0      c: Type I error     d: Correct

Alpha (type I) error: rejecting the null hypothesis H0 when it is true
Beta (type II) error: accepting the null hypothesis H0 when it is false
37 The power of a test
- Power (the probability that a test detects differences that actually exist) can be determined by using the formula 1 - beta (1 - β).
- 80% is usually acceptable.
38 Hypothesis Testing
- 1. State the question in terms of
  - H0: there is no difference or relationship (null)
  - HA: there is a difference or relationship (alternative)
- 2. Decide on the appropriate research design and statistic
- 3. Select the significance (alpha) level and N
- 4. Collect data
- 5. Analyze and perform calculations to get the P-value
- 6. Draw and state conclusions by comparing alpha with the P-value
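39 Theoretical normal distribution with standard deviations
[Figure: normal curve; horizontal axis marked at -3σ, -2σ, -σ, μ, σ, 2σ, 3σ, with the corresponding z scores -3, -2, -1, 0, 1, 2, 3]

Probability
z score        1        2        3
Upper tail     .1587    .0228    .0013
Two-tailed     .3173    .0455    .0027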
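The tail probabilities for z = 1, 2, and 3 can be recomputed from the standard normal distribution with the standard library's `math.erfc`. The helper names `z_score`, `upper_tail`, and `two_tailed` are illustrative, and z = (x - mean) / SD is the standard definition of a z score, assumed here.

```python
import math


def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def two_tailed(z: float) -> float:
    """P(|Z| > |z|): the area in both tails beyond +/- z."""
    return 2 * upper_tail(abs(z))


for z in (1, 2, 3):
    print(f"z = {z}: upper tail = {upper_tail(z):.4f}, two-tailed = {two_tailed(z):.4f}")
```

One minus the two-tailed value gives the area within that many standard deviations of the mean.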
40 When is a specific test used?
- Student's t-test: to compare the means of two small (n < 30) independent samples.
- Paired t-test: to compare the means of two paired samples (e.g., before and after).
- F-test: to compare the means of three or more samples or groups.
- Chi-square test: to compare two or more independent proportions.
- Correlation coefficient: measures the strength of the association between two variables.
- Regression analysis: provides an equation that estimates the change in a dependent variable (y) per unit change in an independent variable (x).
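For the last two items, a minimal sketch of the Pearson correlation coefficient and a least-squares regression line is given below. It uses the standard textbook formulas rather than any particular statistics package. The function name `pearson_r_and_line` and the five data points are hypothetical.

```python
import math


def pearson_r_and_line(xs, ys):
    """Return (r, slope, intercept) for paired observations xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)           # strength of the linear association
    slope = sxy / sxx                        # change in y per unit change in x
    intercept = mean_y - slope * mean_x      # predicted y when x = 0
    return r, slope, intercept


# Hypothetical paired data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r, slope, intercept = pearson_r_and_line(xs, ys)
print(f"r = {r:.3f}, y = {intercept:.2f} + {slope:.2f}x")
```

The slope is the quantity the regression bullet describes: the estimated change in y for each one-unit increase in x.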