Title: Materials and Methods
1 Materials and Methods
Abe E. Sahmoun, Ph.D., Assistant Professor of Epidemiology
Department of Internal Medicine
2 Contents: Materials and Methods Section
- Function
- Content
- Study design
- Variables definition
- Data analysis
- Statistical details
- Length
- Examples
- What variables should be collected and why?
- P-value
- Commonly Used Statistical Tests
- Correlation coefficient
- T-test
- Chi-Square test
3 Materials and Methods (MM): Function
- The aim of the MM section is to tell the reader what
you did in terms of study design, setting, and
variables collected to answer your question.
- The MM section should include sufficient detail and
references to permit a scientist to evaluate your
work fully or to repeat the study exactly as you
did.
4 Materials and Methods: Study Design
- What you did
- Study design: this should include the following
information
- Independent variables
- Dependent variable
- Sample size
- What the association consisted of
- Outcome
- Exposure
- Duration
- Outcome
- Exposure
5 Materials and Methods: Variables Definition and IRB
Statement
- State how you calculated derived variables (e.g.
BMI, drug).
- Human subjects: give enough information about
age, gender, race, BMI, disease, and specific
medical and surgical management to be of use to
researchers who want to compare your data with
theirs, or to clinicians who want to see if your
findings are applicable to their patients.
- IRB statement, for example: "The study was approved
by the Institutional Review Boards of the Hospital
and the University of North Dakota."
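As an illustration of stating how a derived variable was calculated, here is a minimal sketch of a BMI calculation. The function names and the example weights and heights are hypothetical; only the standard definition (weight in kilograms divided by height in meters squared) is assumed.

```python
LB_PER_KG = 2.20462  # pounds per kilogram
M_PER_IN = 0.0254    # meters per inch


def bmi_metric(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_imperial(weight_lb, height_in):
    """Convert pounds and inches to metric units, then compute BMI."""
    return bmi_metric(weight_lb / LB_PER_KG, height_in * M_PER_IN)


# Hypothetical subject: 70 kg and 1.75 m
print(round(bmi_metric(70.0, 1.75), 1))
```

Writing the formula and the unit conversions down in the Methods lets another researcher reproduce the derived variable from the raw measurements.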
6 Materials and Methods: Data Analysis
- State how you summarized your data.
- Provide information about both the magnitude of
the data and the variability.
- When data are normally distributed, use the mean
and standard deviation to summarize them. The mean
describes the overall magnitude of the data; the
standard deviation measures the variability in the
sample.
- If the data have a skewed distribution, report the
median and the interquartile range (the range
between the 25th and 75th percentiles).
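The two ways of summarizing data described above can be sketched with Python's standard library. The helper name `summarize` and the sample values are hypothetical, and the quartiles use the default method of `statistics.quantiles`; other software may compute percentiles slightly differently.

```python
import statistics


def summarize(values, skewed=False):
    """Return mean and SD, or median and quartiles for skewed data."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    if skewed:
        # quantiles(n=4) returns the three quartile cut points
        q1, _, q3 = statistics.quantiles(values, n=4)
        return {"median": statistics.median(values), "q1": q1, "q3": q3}
    return {"mean": statistics.mean(values), "sd": statistics.stdev(values)}


# Hypothetical systolic blood pressures (roughly symmetric)
print(summarize([118, 122, 125, 130, 135]))

# Hypothetical lengths of hospital stay in days (right-skewed)
print(summarize([1, 2, 2, 3, 4, 5, 21], skewed=True))
```

In the skewed example the single long stay pulls the mean upward, which is why the median and interquartile range give a fairer summary.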
7 Materials and Methods: Statistical Details
- State which software you used to analyze your
data (including version or release number).
- State the p-value at which you considered
differences statistically significant.
- A p-value alone is not always sufficient to decide
whether to reject a hypothesis. A difference can be
statistically significant because the sample size
is large rather than because a treatment has a
large effect.
- Assess the size of the difference relative to the
variability in the sample by calculating the
95% C.I.
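To see how sample size alone can make a small difference statistically significant, consider this rough sketch. It assumes two groups of equal size, a common standard deviation, and the large-sample normal approximation (multiplier 1.96); the function name and all numbers are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def ci_difference(diff, sd, n_per_group):
    """Approximate 95% C.I. for a difference between two group means."""
    se = sd * math.sqrt(2.0 / n_per_group)  # standard error of the difference
    return diff - Z_95 * se, diff + Z_95 * se


# The same 1-unit difference (SD = 10) at two sample sizes
small = ci_difference(1.0, 10.0, 50)
large = ci_difference(1.0, 10.0, 5000)
print("n = 50 per group:  ", small)
print("n = 5000 per group:", large)
```

With 50 subjects per group the interval includes zero; with 5,000 per group it excludes zero, although the difference itself is unchanged. The interval shows both the size of the effect and its precision, which a p-value alone does not.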
8 Materials and Methods: Length
- The methods section should be as long as
necessary to describe fully and accurately what
was done and how it was done.
- Methods are reported in the past tense (e.g. "we
measured...").
9 Example: The right way
Magura L, et al. Hypercholesterolemia and
prostate cancer: a hospital-based case-control
study. Cancer Causes Control 2008
Dec;19(10):1259-66.
10 Methods
- We performed a retrospective analysis of medical
charts of patients newly diagnosed with prostate
cancer between 2004 and 2006. Cases were
identified from the cancer registry of Meritcare
hospital, North Dakota, USA. Controls were
identified from the primary care database of the
same hospital. This facility serves the Fargo
Metropolitan Area comprising all of Cass County,
North Dakota and Clay County, Minnesota. Its
population, according to the 2006 estimate, is
approximately 200,000. The majority (96%) of the
population served in this area is White. The
North Dakota Cancer Registry releases annual
cancer statistics when the registry's data is
estimated to be 95% complete for any given
cancer-reporting year. The study was approved by
the Institutional Review Boards of the Hospital
and the University of North Dakota.
11 Demographic and Clinical Variables
- Data on age, family history of prostate cancer,
histology, stage at diagnosis (TNM system), body
mass index, occupation, smoking status, Prostate
Specific Antigen (PSA), Gleason score, lipid
profiles, statin use, non-steroidal
anti-inflammatory drug (NSAID) use, comorbidities,
and multivitamin use were abstracted using
electronic records and medical charts.
- Covariate information was obtained for the
period prior to diagnosis for cases and prior to
exam for controls. Inclusion and exclusion
criteria were as follows:
12 Inclusion and exclusion criteria: Cases
- The inclusion criteria for cases were men with
incident, histologically confirmed prostate
cancer as a primary site with cancer diagnosed
between 2004 and 2006 using a pathology report
present in the medical records, age between 50
and 74, and date of lipid profile tests within a
year prior to the diagnosis of prostate cancer.
- The exclusion criteria included diagnosis of any
cancer other than primary prostate cancer and
race other than Caucasian (excluded because of
small numbers: <6% of residents of Fargo-Moorhead
are non-Caucasian).
13 Inclusion and exclusion criteria: Controls
- The inclusion criteria for controls were men who
had an annual physical exam between 2004 and 2006
at the same hospital as cases, age between 50 and
74, without cancer seen at the same hospital as
cases, and date of lipid profile tests within a
year of the annual physical exam.
- The exclusion criteria included diagnosis of any
cancer, a prostate specific antigen cutoff of
4 ng/l (in order to exclude undiagnosed prostate
cancer), and race other than Caucasian.
14 Exposure Definition
- We used the NCEP definition of
hypercholesterolemia as total cholesterol greater
than 5.17 (mmol/l) [23]. For comparison with
previous studies, the prevalence of
hypercholesterolemia was also calculated using a
cutpoint of 6.2 (mmol/l).
- Statin use was classified as hydrophobic only
users (lovastatin, simvastatin, atorvastatin, or
fluvastatin) or hydrophilic only users
(pravastatin or rosuvastatin) as reported
elsewhere [24]. No other lipid lowering agents
were in use among this study population.
- Factors that may confound the association between
cholesterol and prostate cancer, such as family
history of prostate cancer, body mass index,
statin use, smoking, type 2 diabetes, and
multivitamin use, were included in our analyses as
potential confounders.
15 Statistical Analyses
- Odds ratios (OR) and 95% confidence intervals
(CI) were estimated using unconditional multiple
logistic regression, including terms for age,
family history of prostate cancer, body mass
index (BMI), smoking, type 2 diabetes and
multivitamin use.
- All p-values are two-sided (p < 0.05 is
considered significant). All two-way interactions
involving hypercholesterolemia were assessed.
Tests for interaction were assessed by
introducing a multiplicative term between the two
variables in the multivariable model using a Wald
test.
- Analyses were performed using SAS software V9.1.3
(SAS Institute, Cary, NC, USA).
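For readers unfamiliar with odds ratios, the following sketch computes a crude (unadjusted) odds ratio and its Wald 95% confidence interval from a 2x2 table. It is not the adjusted logistic regression used in the paper, which was run in SAS; the function name and the counts are hypothetical and chosen only for illustration.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald confidence interval.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study
print(odds_ratio_ci(20, 80, 10, 90))
```

A multiple logistic regression produces the same kind of estimate for each term, but adjusted for the other variables in the model.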
16 Examples: Need improvements
17 METHODS. Study Population: The patients reviewed
were those who presented to a local clinic and
received a RADT during the months of March,
April, and May of 2004. The patients were
categorized by ages of <15 years, 15-45 years and
>45 years. Of the 211 subjects, 37.4% were <15
years old. The majority of patients, 53.6%, fell
into the 15-45 year age group. And only 9% were
>45 years old. Of the patients to receive an
RADT, 24.1% of those less than 15 years old
tested positive. 19.9% of those in the 15-45 age
range tested positive. And 10.5% of the patients
older than 45 years of age were positive.
18 METHODS. Data obtained for this study was taken
from the North Dakota Department of Health,
Division of Vital Statistics birth records from
January 1, 1996 through December 31, 2003.
During this timeframe, 63,344 live births
occurred; 53,416 of these records were
included in this study's data set due to
exclusions.
19 ACCP Guidelines
20 Example: Codification of the variables
21 Codification of the variables
- Gender: Male = M, Female = F
- Or Gender: Male = 1, Female = 2
- Age: number in years.
- Height: number in inches.
- Or Height: number in meters.
- Weight: number in pounds.
- Or Weight: number in kilograms.
- Do not compute BMI yourself!
- Family history of cancer: Yes, No
- Or Family history of cancer: Yes = 1, No = 2
23 Why should we collect other variables in addition
to the exposure?
24 Mortality in Area A and Area B
Suppose you surveyed how many people died per
year in Areas A and B (each with a population of
10,000), and the results were as indicated in the
table. Do people in Area B have a higher risk of
death?
25 If you categorize the populations by age and
compare the mortality of Area A with that of Area B
If you categorize the populations by age, Areas A
and B have the same risk of death under 60 years
of age. There are no people over 60 years old in
Area A. The difference in mortality on the
previous slide is due to the difference in the age
distribution of the two populations.
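The effect of age distribution on crude mortality can be sketched numerically. The counts below are hypothetical (the original table is not reproduced here); they are chosen only to match the pattern described: two areas of 10,000 people, identical risk under age 60, and nobody aged 60 or over in Area A.

```python
# (deaths, population) per age stratum; all counts are hypothetical
areas = {
    "A": {"under 60": (50, 10_000), "60 and over": (0, 0)},
    "B": {"under 60": (40, 8_000), "60 and over": (100, 2_000)},
}


def rate_per_1000(deaths, population):
    """Deaths per 1,000 people, or None for an empty stratum."""
    if population == 0:
        return None
    return 1000 * deaths / population


def crude_rate(strata):
    """Overall rate, ignoring age."""
    deaths = sum(d for d, _ in strata.values())
    population = sum(p for _, p in strata.values())
    return rate_per_1000(deaths, population)


for name, strata in areas.items():
    print(name, "crude:", crude_rate(strata))
    for age_group, (deaths, population) in strata.items():
        print("  ", age_group, rate_per_1000(deaths, population))
```

The crude rate in Area B is higher, yet the age-specific rate under 60 is the same in both areas. This is why age must be collected and taken into account alongside the exposure.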
26 Association Between Smoking and Lung Cancer
- Smokers: 1,000
- Non-smokers: 1,000
- Same age range and gender ratio.
- Lung cancer status was observed for 5 years.
27 Results in 5 years
P-value < 0.0001
50 lung cancer cases in smokers and 10 in
non-smokers were observed. Comparing the
morbidity of smokers with that of non-smokers
gives 5/1 = 5, which means that smokers are 5
times more likely to have lung cancer than
non-smokers. If the p-value = 0.05, what do you
conclude?
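The ratio and the p-value on this slide can be checked with a short sketch. It assumes the uncorrected Pearson chi-square test for a 2x2 table (the slide does not say which test produced the p-value), and the function names are hypothetical. For one degree of freedom, the upper-tail probability of the chi-square distribution equals erfc(sqrt(x / 2)).

```python
import math


def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square statistic and p-value (1 df).

    a, b: cases and non-cases among the exposed
    c, d: cases and non-cases among the unexposed
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# 50 of 1,000 smokers and 10 of 1,000 non-smokers developed lung cancer
print(risk_ratio(50, 1000, 10, 1000))
print(chi_square_2x2(50, 950, 10, 990))
```

The statistic is about 27.5, giving a p-value far below 0.0001, consistent with the value shown on the slide.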
29 If the sample size of both groups is not 1,000,
but 100
P = 0.212 by Fisher's exact test
Suppose that the sample size of both groups is
not 1,000 but 100, and that 5 lung cancer cases in
smokers and 1 in non-smokers were observed.
In this case, the ratio smokers/non-smokers =
5/1 = 5 stays the same. However, a statistical
test gives a p-value of 0.212. From this study,
you cannot conclude that there is a significant
association between smoking and lung cancer. How
did this happen? In an epidemiologic study,
relatively large sample sizes are required to
detect a given difference in outcome. If the study
is conducted with a small sample like this one, a
true association may go undetected. In such a
case, the study is nothing more than a waste of
time and money. We have to be careful about sample
size when drawing conclusions.
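The p-value for the smaller study can be reproduced with a minimal implementation of the two-sided Fisher's exact test. This is a sketch using only the standard library, with a hypothetical function name; it sums the probabilities of all tables (with the same margins) that are no more likely than the observed one, which is the usual two-sided convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of cases
    n = a + b + c + d

    def weight(k):
        # Number of ways to place k of the cases in the first group
        return comb(col1, k) * comb(n - col1, row1 - k)

    k_min = max(0, row1 + col1 - n)
    k_max = min(row1, col1)
    observed = weight(a)
    # Integer comparison avoids floating-point ties
    extreme = sum(weight(k) for k in range(k_min, k_max + 1)
                  if weight(k) <= observed)
    return extreme / comb(n, row1)


# 5 of 100 smokers and 1 of 100 non-smokers developed lung cancer
print(round(fisher_exact_two_sided(5, 95, 1, 99), 3))
```

The same fivefold ratio that was highly significant with 1,000 subjects per group is not significant with 100 per group.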
30 The factors that should be taken into
consideration when you look at an association
- Age
- Gender
- Comorbidity
- Behavioral factors
- ...
31 Commonly Used Statistical Tests
- Correlation: a linear association.
- T-test: difference between means.
- Chi-square test: for proportions.
32 Correlation Coefficient
- The correlation coefficient is a summary of the
strength of a linear association between two
variables. If the variables tend to go up and
down together, the correlation coefficient will
be positive. If the variables tend to go up and
down in opposition, with low values of one
variable associated with high values of the
other, the correlation coefficient will be
negative.
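The sign behavior described above can be seen in a small sketch of the Pearson correlation coefficient. The function name and the data are hypothetical and serve only as an illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: y rises with x, then y falls as x rises
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

The coefficient ranges from -1 to 1 and measures linear association only; a strong curved relationship can still give a value near zero.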
33 t-Test
- Suppose a sample of eight 35- to 39-year-old
non-pregnant, premenopausal OC users who work in
a company is identified, with a mean SBP of
132.86 mm Hg and a sample standard deviation of
15.34 mm Hg. A sample of twenty-one 35- to
39-year-old non-pregnant, premenopausal non-OC
users is similarly identified, with a mean SBP
of 127.44 mm Hg and a sample standard deviation of
18.23 mm Hg.
- What can be said about the underlying mean
difference in blood pressure between the two
groups?
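One way to approach this question is a two-sample t-test computed from the summary statistics given above. The sketch below assumes equal variances in the two groups (the pooled form of the test); that choice and the function name are assumptions, not something the slide specifies. The critical value 2.052 is the tabulated 97.5th percentile of the t distribution with 27 degrees of freedom.

```python
import math

T_CRIT_27 = 2.052  # tabulated t quantile (0.975) for 27 degrees of freedom


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, se


# OC users versus non-OC users, values from the slide
t, df, se = pooled_t(132.86, 15.34, 8, 127.44, 18.23, 21)
diff = 132.86 - 127.44
ci = (diff - T_CRIT_27 * se, diff + T_CRIT_27 * se)
print("t =", round(t, 2), "df =", df)
print("95% C.I. for the difference:", ci)
```

The t statistic is well below the critical value and the interval includes zero, so these samples do not show a significant difference in mean SBP. The interval is also wide, which reflects the small sample sizes.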
35- Essentials of Writing Biomedical Research Papers.
Mimi Zeiger, 2nd edition
36 Questions
- Dr. Abe E. Sahmoun
- asahmoun_at_medicine.nodak.edu
- Dr. James R. Beal
- jrbeal_at_medicine.nodak.edu