Data Collection and Management for Clinical Research - PowerPoint PPT Presentation

1 / 73
About This Presentation
Title:

Data Collection and Management for Clinical Research

Description:

To enroll patients, paramedics contacted base hospital physicians by radio. ... Keep identifying data (name, SSN, MRN) in a separate table. ... – PowerPoint PPT presentation

Number of Views:2197
Avg rating:3.0/5.0
Slides: 74
Provided by: michae194
Category:

less

Transcript and Presenter's Notes

Title: Data Collection and Management for Clinical Research


1
Data Collection and Management for Clinical
Research
  • Michael A. Kohn, MD, MPP
  • 25 August 2008

2
Clinical Research
  • Choose the study design, and define the study
    population, predictor variables, and outcome
    variables
  • measure these variables and anticipate problems
    with measurement
  • analyze the results
  • Today, we discuss the nitty gritty of
    collecting, storing, updating, and monitoring the
    study measurements.

Private companies that make data management
systems for clinical research understand
clinical research to include only RCTs
preparatory to FDA drug or device approval, not
observational studies.
3
Outline
  • Paper data collection forms and data tables (rows
    and columns)
  • Normalization of tables
  • Online data collection forms
  • Data entry options for coded responses
  • Questionnaires
  • Data management protocol
  • Queries (?)

4
Paper Forms
5
One-Sentence Description of Study
  • The cute acronym study is a design study of
    the association between predictor and outcome
    in study population.

6
PHTSE
  • The PHTSE (Pre-Hospital Treatment of Status
    Epilepticus) Study was a randomized blinded trial
    of lorazepam, diazepam, or placebo in the
    treatment of pre-hospital status epilepticus.
    The primary endpoint was termination of
    convulsions by hospital arrival. To enroll
    patients, paramedics contacted base hospital
    physicians by radio.

Lowenstein DH, et al. Control Clin Trials. 2001
Jun22(3)290-309.
Alldredge BK, et al. N Engl J Med. 2001 Aug
30345(9)631-7.
7
(No Transcript)
8
(No Transcript)
9
Display the data from these 2 data collection
forms in a 2-row data table.
10
Create a 9-field data dictionary for the data
table.
11
The paper data collection forms were completed by
busy base hospital physicians who were called
from the emergency department to a radio room.
What are the advantages and disadvantages of
using an on-screen computer form instead of a
paper form? If you designed the study, which
would you use?
12
Advantages of an on-screen form
  • No need for transcription from paper forms into
    the computer data tables
  • Immediate feedback on invalid entries
  • Programmed skip logic (if seizure stopped during
    prehospital course, computer form prompts for
    time seizure stopped, otherwise this field is
    disabled and skipped)
  • Can be made available via a web-browser at
    multiple sites simultaneously

13
Disadvantages of an on-screen form
  • Hardware requirement -- a computer workstation
  • Some user training required

14
Advantages of a paper form
  • Ease and speed of use
  • Portability
  • Ability to enter unanticipated information or
    unstructured data (notes in the margin, responses
    that were not otherwise considered, etc.)
  • Hardware requirement a pen
  • User training received by all data entry
    personnel in elementary school

15
Disadvantages of a paper form
  • Requires subsequent transcription into the
    computer database
  • No interactive feedback or automated skip logic
  • Data viewing and entry limited to one person in
    one place

16
Which would you choose?
  • Although data entry via on-screen data collection
    forms has many advantages and we recommend it for
    most research studies, in this study it is
    impractical. The simplest, fastest, and most
    user-friendly way to capture data on a
    non-volatile medium is still to use a pen and
    paper.

17
Paper Data Collection Forms
  • Paper forms filled out by an examiner or
    interviewer
  • Paper forms filled out by the patient or subject
  • How do you get the data off of the paper forms
    and into the data table(s)?
  • Study data are stored in one or more data tables
    in which

18
Data Tables
  • Study data are stored in one or more data tables
    in which
  • Rows Records Entities
  • Columns Fields Attributes

19
Rows Records Entities
  • Columns Fields Attributes

20
Jaundice and Infant Feeding Study Cohort study to
determine the association between neonatal
jaundice or feeding disorders and 5-year
neuropsychological scores in children born at
Northern California Kaiser Hospitals.
Newman TB, et al. N Engl J Med. 2006 May
4354(18)1889-900.
21
Infant Jaundice Study (Our fictional version of
JIFee)
Methods Design-Double Cohort study. Setting-Sing
le, urban medical center Subjects-Infants with
neonatal jaundice and randomly selected
non-jaundiced infants Predictor Variable-Presence
or absence of jaundice Outcome Variable- IQ score
(ranging from 55 to 145) at age 5 years Analysis-
?
22
(No Transcript)
23
Where is the Predictor Field?
  • What is the predictor in the Infant Jaundice
    Study?
  • Where is it on the data collection form?

24
Demonstration Creating a Data Table
Label columns and enter rows of data in datasheet
view
25
Demonstration Data Dictionary
  • Table design view
  • field (column) names,
  • data types,
  • definitions,
  • validation rules
  • (More on data types, free-text vs. coded
    responses, later)

26
(No Transcript)
27
Demonstration Data Validation
Disallowed values
Duplicate primary keys
28
Acceptable table showing one set of exam results
per participant. (BabyExamForFigure3)
29
Demonstration Same Table in Excel, Stata
  • Excel
  • Stata
  • Etc
  • Rows Records Entities
  • Columns Fields Attributes

Access and Stata have a special row at the top
for column headings (field names) Excel just
uses the first row.
30
Table of Study Subjects
Table of Study Subjects Row Individual
Infant Columns ID, Name, DOB, Sex, Jaundice If
some infants have more than one exam, what do you
do?
31
Undesirable table showing multiple exam results
per study participant. (BabyExamForFigure4)
32
Demo
  • Find highest IQ Score
  • Find all exams done in April

33
Common Error
  • If you find yourself creating multiple columns
    for the same measurement, e.g., Date1, Score1,
    Date2, Score2, Date3, Score3,
  • Or if your table is more than about 30 columns
    wide,
  • It is time to restructure your table.

34
Undesirable table with participant-specific data
duplicated for each exam. (Note problem with
Helens DOB.) (ExamBabyForFigure5)
35
Demo
  • Find highest IQ Score
  • Find all exams in a particular month
  • What happened to Alejandro, Ryan, Zachary, and
    Jackson?

36
Normalization
If some infants have multiple exams, normalize
the records into two tables, one for subjects and
one for examinations.
37
Data normalized into two tables one (Baby)
with rows comprising subject-specific
information the other (Exam) with rows
comprising exam-specific information. Note that
Helen can only have one birth date. Subjects
with no exams, e.g. Alejandro, still appear in
the database. SubjectID functions as the
primary key in the Baby table and as the
foreign key in the Exam table.
38
Figure 7. Relationships diagram showing the
one-to-many relationship between the table of
subjects (Baby) and the table of measurements
(Exam).
39
For more on normalization and the Relational
Database Model,
Relational Database Management Systems for
Clinical Research, UCSF EPI 218 Winter 2008 (1
unit)
  • http//www.epibiostat.ucsf.edu/courses/schedule/da
    ta_management.html

40
Start with Data Tables or Data Collection Forms?
  • It doesnt matter as long as the process is
    iterative.
  • Can start with the tables and then develop the
    forms, test the forms, find problems, and update
    the tables.
  • Can start with a word-processed form, create the
    tables, test, and update.

41
Before seeking help with data management
  • Search the internet and ask other researchers for
    already developed data collection forms.
  • Draft your data collection form.
  • Test your data collection form with dummy
    subjects and, even better, with real
    (de-identified) study subjects.
  • Enter your test data into a data table with rows
    corresponding to subjects and columns
    corresponding to data elements. (Use Excel,
    Access, Stata, or even Word.)
  • Create or at least think about a data dictionary.
  • Decide who will collect the data, and when/how
    the data will be collected.

42
On-Screen Data Collection Forms
  • Will demonstrate using the race field from the
    Infant Jaundice Study
  • Free text versus coded response
  • Single response (mutually exclusive choices)
    versus all that apply

43
Free Text vs. Coded Responses
  • Same as Open-Ended vs. Closed-Ended Questions
  • Free text responses useful in developing coded
    response options.

44
Mutually Exclusive, Collectively Exhaustive
Response Options
  • One field (column)
  • Can always make responses exhaustive by including
    an Other response
  • Drop down list (combo box) vs. pick list (field
    list) vs. option group

45
Drop-down List (Combo Box)
  • Saves screen real estate
  • Doesnt work on paper forms

(Master form)
46
Combo Box
47
Combo Box
48
Pick List (Field List)
  • Uses up screen real estate
  • Useful on paper forms

(MasterRaceAsFieldList form)
49
Field List
50
Option Group
  • Radio buttons (by convention)
  • Uses up screen real estate

(MasterRaceAsOptionGroup form)
51
Option Group
52
Mutually Exclusive One Field
53
All that apply
  • Multiple fields ( columns)
  • Use check boxes (by convention)

(MasterRaceAsAllThatApply form)
54
All That Apply
55
All that Apply Multiple Fields
56
From Paper Data Forms to Data Table(s)
  • Transcription directly into the table(s)
  • Transcription via an online (screen) form
  • Scanning using OMR software

Best option Dont use paper data collection
forms at all (more on this later).
57
NHAMCS
  • nhamcs100-ED_2007.pdf

58
On-screen vs. paper forms
  • Minimize the extent to which study measurements
    are recorded on paper forms.
  •  
  • Enter data directly into the computer database or
    move data from paper forms into the computer
    database as close to the data collection time as
    possible.
  • When you define a variable in a computer
    database, you specify both its format and its
    domain or range of allowed values. Using these
    format and domain specifications, computer data
    entry forms give immediate feedback about
    improper formats and values that are out of
    range. The best time to receive this feedback is
    when the study subject is still on site.
  • You can always print out a paper copy of the
    screen form or a report of the exam/interview
    results once the data are collected.
  • Examples ATM Machines printed transaction
    record, Gas Stations printed receipt (but
    neither of these is going to be accessed or
    updated by a research assistant so the printout
    may be less important).

59
Questionnaires
  • Special type of data collection form
  • Patient history (interview), not physical exam or
    diagnostic test results
  • Often put a premium on branching logic.

Requires a subject who can talk. Cant
administer a questionnaire to infants or to
intubated/comatose patients.
See DCR 3rd Ed. Chapter 15
60
Clinical research studies consistently under-plan
and under-budget for data management.
  • One FTE research assistant, including benefits,
    will cost your study 100k/year you should be
    willing to spend one tenth as much for consulting
    on your database system.
  • Dont leave data checking, validation, and
    cleaning for the biostatistician
  • a) it will be too late
  • b) biostatisticians do not view data cleaning
    as part of their job.

61
Data Management Protocol
  • General description of database (including
    relationships diagram)
  • Data collection and entry
  • Error checking and data validation
  • Analysis (e.g., export to Stata)
  • Security/confidentiality
  • Back up

62
General Description of Database
  • DBMS, e.g. MS Access XP
  • of dynamic tables
  • of static lookup tables
  • of forms
  • of reports
  • An appendix should include the relationships
    diagram, the table names and descriptions, and
    the field names and descriptions (data
    dictionary).

63
Data Collection and Entry
  • Import baseline data from existing systems
  • Import lab results, scan results (e.g. DEXA),
    holter monitor data, and other digital data.
  • For each form, who will collect the data?
  • Collect onto paper forms and then transcribe?
    Enter directly using screen forms? Scannable
    forms?

64
Error Checking and Validation
  • Database automatically checks data against the
    range of allowed values.
  • Periodic outlier detection. (Outliers still
    within the range of allowed values.)
  • Calculation checks
  • Double data entry?

65
Analysis
  • How will you get the data out of the database?

66
Security/Confidentiality
  • Keep identifying data (name, SSN, MRN) in a
    separate table.
  • Link rest of DB to this table via a Subject ID
    that has no meaning external to the DB.
  • Restrict access to identifying data.
  • Password protect at both OS and application
    levels.
  • Audit entries and updates.

67
Back ups
  • Ask your system person to restore a file
    periodically. This tests both the back-up and
    restore systems.

68
Assignment?
Write a one-page data management section for your
research study protocol.
69
Queries
70
Select Queries
  • Select queries (aka Views) organize, sort,
    filter, and display data.
  • Queries use Standard Query Language (SQL), but
    you dont have to learn it, because of graphical
    query design tools.
  • A query can join data from two or more tables,
    display only selected fields, and filter for
    records that meet certain criteria.

71
Demonstration
  • Age in months and BMI at exam of subjects who
    were examined in January and February of 2010.

72
Select Queries Produce Table-Like Results
  • Note that the result of a select query that joins
    two tables, displays only certain fields, selects
    rows based on special criteria, and calculates
    age and BMI still looks like a table in datasheet
    view.
  • But, remember that it is merely a view of data
    from the underlying tables.

73
Action Queries Change Data
  • Update Query -- changes the values of specific
    fields in existing records
  • Append Query -- adds new records (rows) to a
    table
  • Delete Query -- deletes records from a table
Write a Comment
User Comments (0)
About PowerShow.com