Title: Informatics for Clinical Research
1. Informatics for Clinical Research
2. Clinical Research as an Activity
- Fundamental to translation of basic research into medically useful interventions
- Big business: an estimated $95 billion is spent annually in the U.S. on biomedical research and drug and device testing
- Academic centers lag behind commercial clinical trials organizations in knowledge and skills related to efficient, high-quality clinical research
- Academic center market share of clinical trials is now estimated at 20%, down from 80% in 1990
- Generally inferior performance with respect to error rates, missing data, and timeliness of submission
3. Importance of Informatics to Clinical Research
- Structured observation and structured record keeping are the essence of science
- The primary differentiation between routine clinical care and research is how processes are controlled (i.e., protocol-driven) and how information is managed to make it useful for analysis
4. Classical Data Management Flow for Clinical Research
(Flow diagram; each stage feeds the next)
Scientific Hypotheses
Specific Data Elements Required to Test Hypotheses
Data Acquisition Instruments (forms)
Computer Data Model and Tool Selection to Support Model and Output to Analytical Software
People and Process Development (Who Does What, When and Where)
Documentation: Standard Operating Policies and Procedures
5. Research Data Management Goals
- Create processes and systems that result in research data that are
  - Accurate
  - Complete
  - Timely
  - Verifiable
  - Secure
  - Available for analysis
6. Regulatory Context: Good Clinical Practice Standards
- General and uniform set of principles for conducting clinical research
- Two themes
  - Respecting rights of participants
  - Conducting research so that data are accurate and verifiable
- Required by FDA, but a good (higher) standard for NIH and other sponsored research
7. GCP Standards Address...
- Responsibilities of participating sites
- Responsibilities of coordinating centers for multisite trials
- Quality assurance methods for data
- Audits
- Reporting to regulatory agencies
8. GCP Principles of Data Management
- All data should be independently verifiable
  - Normally done by comparison with locally kept medical records in interventional trials
- Structured approach to record keeping
  - Physical structure: tabbed participant folders with dividers for different classes of information
  - Logical structure: database designs and tracking systems
9. GCP Principles of Data Management
- Research records are maintained separately from healthcare-related records
- Source document: the place where an observation is first recorded
- Source document verification: comparison of Case Report Forms (CRFs) with source documents
  - Corollary: CRFs are not usually considered source documents
10. GCP Standards Example: Paper Case Report Forms
- Follow instructions
- Write legibly
- Originals normally go to the Coordinating Center; copies are kept locally
- No marginalia (literally, nothing outside the box)
- Forms designed so that all variables have a current value (may be a code for Pending, Missing, Missed)
- Correct units of measurement (best included with the value as a separate field)
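The last two points can be sketched as a pre-submission check. This is only an illustration: the status codes, the form layout (one dictionary per variable with `value` and `units`), and the function name `check_form` are hypothetical, not part of any GCP standard.

```python
# Hypothetical status codes standing in for "Pending, Missing, Missed".
STATUS_CODES = {"PENDING", "MISSING", "MISSED"}

def check_form(form, required):
    """List problems on one CRF.

    form:     {variable: {"value": ..., "units": ...}}
    required: {variable: True if the variable needs units, else False}
    """
    problems = []
    for name, needs_units in required.items():
        entry = form.get(name)
        if entry is None or entry.get("value") in (None, ""):
            # Every variable must carry a current value or a status code.
            problems.append(f"{name}: no current value")
        elif (needs_units
              and entry["value"] not in STATUS_CODES
              and not entry.get("units")):
            # Units are stored in their own field next to the value.
            problems.append(f"{name}: units missing")
    return problems

form = {
    "Weight": {"value": 72.5, "units": "kg"},
    "Sodium": {"value": 140, "units": ""},
    "Potassium": {"value": "PENDING", "units": ""},
}
required = {"Weight": True, "Sodium": True, "Potassium": True, "Gender": False}
for problem in check_form(form, required):
    print(problem)
```

A status code such as PENDING passes the check without units, since there is no measured value yet to qualify.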
11. GCP Standards for Case Report Forms, contd
- Proper methods of correction
  - Line through incorrect value (value still visible)
  - Correct value added
  - Correction initialed
  - White-out is always red in an auditor's eyes: no correction fluids or erasures
- Check forms for completeness prior to submission
- Double check and verify ID info on CRF
- Submit on time
12. 21 CFR 11: Electronic Records and Signatures
- Applies (only) to data submitted to FDA in support of drug and device applications
- Addresses issues related to paperless data management systems where there is no source document for verification
- Subpart C relates to digital signatures
- Full compliance requires formal software validation testing and certification
- To date, has paradoxically impeded rather than advanced use of electronic research data management systems
13. General Forms Design Principles
- Have definitions of all data to be collected in hand before starting the study
  - Avoids unnecessary forms revisions that often confuse Clinical Research Associates (CRAs) and participants and create statistical complexities
  - Avoids a "fishing expedition" approach to iterative protocol modification
14. Web browser (thin client) electronic forms for data entry and retrieval
- Strengths
  - Deploy to any location on the Internet
  - Platform independent (sort of: be careful and test all software on all potential clients)
  - No software to install or license on users' machines
- Weaknesses
  - Less efficient (compact interface)
  - Fewer controls available
  - Limited repertoire of widgets (buttons, lists, etc.)
  - Slower
  - Dependent upon Internet connectivity
15. Specialized Software for Clinical Trials
- Registration
- Randomization
- Participant tracking
- Site communications
- Transaction or batch upload of local data to coordinating center
- Websites for protocols, forms, administrative info
16. Specialized Software for Clinical Trials, contd
- Performance measures
  - Site actual vs. projected accrual
  - Data completeness
  - Data accuracy
  - Data timeliness
- Usually displayed as trends over time
- Performance measures should include reference values for performance at all sites combined
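A minimal sketch of two of these measures, with the all-sites reference row appended to the per-site rows. The `SiteReport` fields and function names are hypothetical, and a real system would track these per reporting period to show trends over time.

```python
from dataclasses import dataclass

@dataclass
class SiteReport:
    site: str
    projected_accrual: int  # participants the site planned to enroll to date
    actual_accrual: int     # participants actually enrolled to date
    items_expected: int     # data items due to date
    items_received: int     # data items actually submitted

def measures(report):
    """Accrual ratio and data completeness for one site (or the pooled total).

    Assumes projected_accrual and items_expected are greater than zero.
    """
    return {
        "site": report.site,
        "accrual_ratio": round(report.actual_accrual / report.projected_accrual, 2),
        "completeness": round(report.items_received / report.items_expected, 2),
    }

def with_reference(reports):
    """Per-site measures followed by the all-sites-combined reference row."""
    combined = SiteReport(
        "ALL SITES",
        sum(r.projected_accrual for r in reports),
        sum(r.actual_accrual for r in reports),
        sum(r.items_expected for r in reports),
        sum(r.items_received for r in reports),
    )
    return [measures(r) for r in reports] + [measures(combined)]

reports = [SiteReport("A", 20, 15, 1000, 950), SiteReport("B", 30, 30, 1500, 1200)]
for row in with_reference(reports):
    print(row)
```

Pooling the raw counts, rather than averaging the per-site ratios, keeps large sites from being under-weighted in the reference value.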
17. Data Acquisition Technologies
18. Data Acquisition Technologies: Keyboard Data Entry
- Average keystroke error rates will be 0.1% to 1%, depending upon data type
- Improve accuracy over baseline by
  - Double entry and file comparison (gold standard but inefficient and expensive)
  - Special technologies for referential integrity items (e.g., barcode visit and participant ID)
  - Event-driven auditing and source document verification of scientifically important variables
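To see what such rates mean for a whole form, here is a small worked calculation. It reads the quoted keystroke error rates as percentages (0.1% to 1%) and assumes errors are independent from keystroke to keystroke; both are assumptions, and the 200-keystroke form is a made-up example.

```python
def expected_errors(keystrokes: int, error_rate: float) -> float:
    """Expected number of keying errors in a form of the given length."""
    return keystrokes * error_rate

def prob_any_error(keystrokes: int, error_rate: float) -> float:
    """Probability that a form contains at least one keying error,
    assuming independent errors at a constant per-keystroke rate."""
    return 1.0 - (1.0 - error_rate) ** keystrokes

for rate in (0.001, 0.01):  # 0.1% and 1% per keystroke
    print(f"rate={rate:.3f}  "
          f"expected errors={expected_errors(200, rate):.1f}  "
          f"P(at least one)={prob_any_error(200, rate):.2f}")
```

Under these assumptions, even the low end of the range leaves a noticeable share of 200-keystroke forms with at least one error, which is the motivation for the accuracy measures listed above.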
19. Data Acquisition Technologies: Double Keying
- Common best practice: forms entered by two different data entry operators
- Computer generates difference (diff) file
- Third person (usually data manager with clinical expertise) reviews and resolves differences
- Increases personnel costs by a factor of 2 to 2.5 over single entry plus sample-based auditing
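The diff step can be sketched as follows. The record layout (dictionaries keyed by participant and visit) and the function name `diff_entries` are hypothetical; the sketch only shows the comparison, and the resolution of each discrepancy is still left to the reviewer.

```python
def diff_entries(first, second):
    """Compare two independent keyings of the same forms.

    first, second: {(participant_id, visit_id): {field: value}}
    Returns a list of (record_key, field, value_1, value_2) discrepancies,
    including records or fields present in only one keying (shown as None).
    """
    diffs = []
    for key in sorted(set(first) | set(second)):
        rec1 = first.get(key, {})
        rec2 = second.get(key, {})
        for field in sorted(set(rec1) | set(rec2)):
            v1, v2 = rec1.get(field), rec2.get(field)
            if v1 != v2:
                diffs.append((key, field, v1, v2))
    return diffs

operator_1 = {("P001", 1): {"BPsystolic": 120, "Weight": 72.5},
              ("P002", 1): {"BPsystolic": 135, "Weight": 80.0}}
operator_2 = {("P001", 1): {"BPsystolic": 120, "Weight": 75.2},
              ("P002", 1): {"BPsystolic": 135, "Weight": 80.0}}

for discrepancy in diff_entries(operator_1, operator_2):
    print(discrepancy)
```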
20. Data Acquisition Technologies: Barcoding
- Applications
  - Referential integrity items: identifiers for participant, study, site, protocol, event/visit
  - Physical object tracking: e.g., tissue specimens, freezer inventory management systems
- System-generated barcode labels
  - Various barcode standards; 3-of-9 generally used for scientific applications
  - Produced by TrueType fonts or dedicated barcode printers
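A sketch of preparing label text for a 3-of-9 (Code 39) font. Code 39 encodes digits, uppercase letters, space, and the symbols - . $ / + %, and uses an asterisk as its start/stop character; many Code 39 fonts expect the text wrapped in asterisks, but that convention should be checked against the specific font or printer. The identifier format and the function name are hypothetical.

```python
# The 43 data characters of Code 39; "*" is reserved as the start/stop character.
CODE39_CHARS = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%")

def code39_label_text(identifier: str) -> str:
    """Return text ready to render in a Code 39 font, e.g. '*P0042-V03*'.

    Lowercase letters are folded to uppercase; any other character outside
    the Code 39 set is rejected rather than silently dropped.
    """
    text = identifier.upper()
    bad = sorted(set(text) - CODE39_CHARS)
    if bad:
        raise ValueError(f"not encodable in Code 39: {bad}")
    return f"*{text}*"

print(code39_label_text("p0042-v03"))
```

Rejecting unencodable characters up front matters for referential integrity items: a label that scans as a different identifier is worse than one that fails to print.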
21. Data Acquisition Technologies: Barcoding, contd
- Barcode readers
  - Keyboard wedge: wand or handheld scanner plugged in between keyboard and computer
  - Self-contained scanners with infrared or USB bulk data upload (derived from warehouse inventory systems)
22. Data Acquisition Technologies: Mark-sensing Technologies
- Example: Scantron (www.scantronforms.com)
- Strengths
  - Mature technology
  - Efficient for re-usable form scanning
- Weaknesses
  - Low information density; poor for most biomedical uses
  - Susceptible to frame shift errors by users
  - Requires forms printing
  - Cost effective at level of 100K forms
23. Mark sensing technologies
24. Data Acquisition Technologies: POF (Plain Old Fax)
- Design issues
  - Include signature or initials on faxable forms
- Strengths
  - Widely used surrogate for paper
- Weaknesses
  - Not considered a source document
  - Legibility
  - Requires additional effort to enter data into computable form
25. Data Acquisition Technologies: Fax Optical Character Recognition
- Example: Teleform (www.cardiff.com)
- Strengths
  - Can substitute for data entry staff
  - Includes design, recognition, and verification functionality
  - 90% recognition accuracy, depending upon data type
- Weaknesses
  - Error rates equivalent to single entry, higher than double entry
  - Cost vs. person hours becomes favorable only at large numbers of forms (50-100K)
26. Data Acquisition Technologies: Direct Computer Entry by Participants
- Can use thin client (HTML forms) or thick client, i.e., workstation forms (e.g., MS Access)
- Strengths
  - If well designed, eliminates data entry step
  - Can add multimedia explanations and tutorials
  - Can be more enjoyable for study participants than paper forms
- Weaknesses
  - Requires basic computer skills (mouse +/- keyboard)
  - Requires literacy skills
  - Requires staff assistance and verification
27. Data Acquisition Technologies: Computer to Computer Messaging
- Example: import lab results from lab system directly into research database for study participants
- Strengths
  - If well designed, eliminates data entry step
  - Timeliness
  - Accuracy
- Weaknesses
  - Requires specialized computer programming expertise
  - Requires standards for representing clinical data (most widely used: HL-7)
  - Requires willingness of systems managers at source of data (e.g., medical center Information Services) to allow data connections
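As an illustration of the lab-result example, the sketch below pulls one result out of an HL7 version 2 observation (OBX) segment. It assumes the default HL7 v2 delimiters (| between fields, ^ between components) and a made-up sample segment; a production interface would read the delimiters from the message header and would normally use a maintained HL7 library rather than hand parsing.

```python
def parse_obx(segment: str) -> dict:
    """Extract the commonly used fields from one HL7 v2 OBX segment."""
    fields = segment.split("|")
    if fields[0] != "OBX":
        raise ValueError("not an OBX segment")
    fields += [""] * (12 - len(fields))      # pad so optional fields can be indexed
    identifier = fields[3].split("^")        # OBX-3: code ^ text ^ coding system
    value = fields[5]                        # OBX-5: observation value
    return {
        "code": identifier[0],
        "name": identifier[1] if len(identifier) > 1 else "",
        "value": float(value) if fields[2] == "NM" else value,  # OBX-2: value type
        "units": fields[6],                  # OBX-6: units
        "status": fields[11],                # OBX-11: result status (F = final)
    }

# Made-up sample segment for illustration only.
sample = "OBX|1|NM|2951-2^Sodium^LN||140|mmol/L|136-145|N|||F"
print(parse_obx(sample))
```

The parsed dictionary maps naturally onto a study visit row (value plus units as separate fields), which is where the data entry step is eliminated.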
28. Data Acquisition Technologies: PDAs
- Example: Pendragon software
- Strengths
  - Portable, relatively low cost
  - Nonprogrammer interfaces to MS Access
- Weaknesses
  - Limited screen size and navigation speed
  - Not suitable for text entry
  - Security: lost or stolen PDA
29. Data Archiving and Database Design
30. Commonly used data archiving and analysis software
- Single investigator, simple trial
  - Spreadsheet (MS Excel)
    - Beware of using spreadsheets for HIPAA-regulated data: no audit trail capability
  - Workgroup-capable database management software (MS Access, Filemaker Pro, 4th Dimension, MS Visual FoxPro)
- Data Center, multiple studies
  - Enterprise relational database system
    - Sybase, Oracle, MS SQL Server
  - Dedicated statistical analysis packages
    - SAS, BMDP, SPSS, S-PLUS, JMP
31. Commonly used data archiving and analysis software, contd
- Pharmaceutical companies: multiple drugs, multiple sites, multiple studies, FDA audits
  - Dedicated clinical trials software (e.g., BBN ClinTrials, Oracle Clinical)
32. Sample data model for one-time administration of a survey
Person (Participant) [one]: ParticipantID (primary key), Last_name, First_name, Address, City, State, Zip, Phone, Fax, E-mail, MRN, Birthdate, SSN, Gender, Last_update, Update_by
Study_Data [one]: ParticipantID, Date, Answer1, Answer2, Answer3, Answer4, Last_update, Update_by
Relationship: one Person row to one Study_Data row, linked by ParticipantID
Best practices: store Person table on removable media with physical security, OR store Person encrypted by private key
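A sketch of this one-to-one model in SQLite, using Python's standard `sqlite3` module. The Person table is abbreviated to a few columns, and both tables live in one in-memory database only to keep the example short; following the best practice above, the Person table would in practice be kept in a separate, physically secured or encrypted store.

```python
import sqlite3

SURVEY_SCHEMA = """
CREATE TABLE Person (
    ParticipantID INTEGER PRIMARY KEY,
    Last_name TEXT, First_name TEXT, MRN TEXT, Birthdate TEXT,
    Last_update TEXT, Update_by TEXT
);
CREATE TABLE Study_Data (
    -- Primary key on ParticipantID allows at most one survey row per person.
    ParticipantID INTEGER PRIMARY KEY REFERENCES Person(ParticipantID),
    "Date" TEXT, Answer1 TEXT, Answer2 TEXT, Answer3 TEXT, Answer4 TEXT,
    Last_update TEXT, Update_by TEXT
);
"""

def make_survey_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SURVEY_SCHEMA)
    return conn

conn = make_survey_db()
conn.execute("INSERT INTO Person (ParticipantID, Last_name) VALUES (1, 'Doe')")
conn.execute("INSERT INTO Study_Data (ParticipantID, \"Date\", Answer1) "
             "VALUES (1, '2003-05-01', 'yes')")
try:
    conn.execute("INSERT INTO Study_Data (ParticipantID, Answer1) VALUES (1, 'no')")
except sqlite3.IntegrityError as err:
    print("second survey row rejected:", err)
```

Only ParticipantID appears in Study_Data, so the analysis table carries no direct identifiers.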
33. Simple clinical study with a variable number of identical repeat visits
Person (Participant) [one]: ParticipantID, Last_name, First_name, Address, City, State, Zip, Phone, Fax, E-mail, MRN, Birthdate, SSN, Gender, Last_update, Update_by
Study_Data [many]: ParticipantID, VisitID, VisitDate, BPsystolic, BPdiastolic, Weight, Sodium, Potassium, Chloride, Bicarb, BUN, Creatinine, Last_update, Update_by
Relationship: one Person row to many Study_Data rows, linked by ParticipantID
Note: In best practice, the primary key of Study_Data is the combination of ParticipantID and the study visit (VisitID), which defines a unique protocol event. VisitDate is the calendar date on which that event occurs.
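The composite key described in the note can be sketched in SQLite as follows. The column list is abbreviated, and the in-memory database and sample values are for illustration only.

```python
import sqlite3

VISIT_SCHEMA = """
CREATE TABLE Person (
    ParticipantID INTEGER PRIMARY KEY,
    Last_name TEXT, First_name TEXT
);
CREATE TABLE Study_Data (
    ParticipantID INTEGER NOT NULL REFERENCES Person(ParticipantID),
    VisitID INTEGER NOT NULL,   -- protocol-defined visit number
    VisitDate TEXT,             -- calendar date the visit took place
    BPsystolic INTEGER, BPdiastolic INTEGER, Weight REAL,
    Last_update TEXT, Update_by TEXT,
    PRIMARY KEY (ParticipantID, VisitID)  -- one row per protocol event
);
"""

def make_visit_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(VISIT_SCHEMA)
    return conn

conn = make_visit_db()
conn.execute("INSERT INTO Person VALUES (1, 'Doe', 'Jane')")
conn.execute("INSERT INTO Study_Data (ParticipantID, VisitID, VisitDate, BPsystolic) "
             "VALUES (1, 1, '2003-05-01', 120)")
conn.execute("INSERT INTO Study_Data (ParticipantID, VisitID, VisitDate, BPsystolic) "
             "VALUES (1, 2, '2003-06-02', 118)")
try:
    # Re-entering visit 1 for the same participant violates the composite key.
    conn.execute("INSERT INTO Study_Data (ParticipantID, VisitID, VisitDate) "
                 "VALUES (1, 1, '2003-05-03')")
except sqlite3.IntegrityError as err:
    print("duplicate protocol event rejected:", err)
```

Keying on VisitID rather than VisitDate means a rescheduled visit changes a data value, not the identity of the row.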
34. Clinical study with a baseline evaluation followed by variable number of identical repeat visits
Person (Participant) [one]: ParticipantID, Last_name, First_name, Address, City, State, Zip, Phone, Fax, E-mail, MRN, Birthdate, SSN, Gender, Last_update, Update_by
Baseline [one]: ParticipantID, VisitDate, DataItem1, DataItem2, DataItem3, Last_update, Update_by
Follow_Up [many]: ParticipantID, VisitID, VisitDate, BPsystolic, BPdiastolic, Weight, Sodium, Potassium, Chloride, Bicarb, BUN, Creatinine, Last_update, Update_by
Relationships: one Person row to one Baseline row, and one Person row to many Follow_Up rows, all linked by ParticipantID
35. Data Security
36. Information Security Elements
- Availability - when and where needed
- Authentication - a person or system is who they purport to be (preceded by Identification)
- Access Control - only authorized persons, for authorized uses
- Confidentiality - no unauthorized information disclosure
- Integrity - information content not alterable except under authorized circumstances
- Attribution/non-repudiation - actions taken are reliably traceable
37. Research Records Security, General Principles
- Physical Security
  - Locked file storage for physical files
  - Programmable locks best
  - Change combination on a regular basis (common practice: twice a year)
- Person-identifiable data
  - Keep separate from other study data
  - Consider additional protections such as two-person access requirements
38. Research Records Security, contd
- Electronic Security
  - No workstations viewable from public areas
  - Password-protected login
  - Screensaver timeouts
  - Separate login and password for database access
  - Store demographics data separately and encrypted if feasible
  - Regular backups and offsite backup storage
39. Research Records Security, contd
- Network Security
  - Safest but least useful: disconnect workstations with research data from network
  - Keep all workstations and servers patched with latest security updates
  - Run antivirus software on all machines
  - Consider firewall computer to protect Internet access point, and/or workstation firewall software
41. Security Rule: Basic Concepts
- Applies security principles well established in other industries
- Like the Privacy Rule, affects Covered Entities that create, store, use or disclose Protected Health Information (PHI)
- Unlike the Privacy Rule, affects only PHI in electronic format (not oral or paper-based)
- Like the Privacy Rule, written for health care; research is not the principal focus
- Scalable: burden relative to size and complexity of organization
42. Two types of Rule elements
- Required standards
- Addressable standards
  - Covered Entity (CE) must decide whether the standard is reasonable and appropriate to the local setting, taking into account the cost to implement
  - Can either
    - Implement the standard as published
    - Implement some alternative (and document why)
    - Not implement the standard at all (and document why)
43. Three Categories of Standards
- Administrative Safeguards
  - Policies and procedures to prevent, detect, contain and correct information security violations
- Physical Safeguards
  - IT equipment and media protections
- Technical Safeguards
  - Controls (mostly software) for access, information integrity, audit trails
44. Administrative Safeguards
- Required
  - Risk Analysis
  - Risk Management Plan
  - Sanctions Policy
  - Information System Activity Review (audits)
  - Security Incident Response and Reporting
  - Data Backup Plan
  - Disaster Recovery Plan
  - Emergency Mode Operations
  - Periodic Evaluations of Standards Compliance
45. Physical Safeguards
- Required
  - Workstation Use Analysis
  - Workstation Security
  - Disposal of media
    - Deletion of PHI prior to disposal, or
    - Secure disposal so data are nonrecoverable
  - Media Reuse
    - Deletion of PHI prior to re-use
46. Technical Safeguards
- Required
  - Unique User Identification
    - No shared logins
  - Emergency access procedures
  - Audit controls
    - Logs of who created, edited or viewed PHI
  - Person and/or Entity Authentication
    - No systems without access control
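A minimal sketch of the audit-control idea: every create, edit, or view of a record is logged with the user and a timestamp. The class and method names are hypothetical, the log is an in-memory list for brevity, and authentication is assumed to have happened before `user` is passed in; a real system would write the log to protected, tamper-evident storage.

```python
from datetime import datetime, timezone

class AuditedRecords:
    """Record store that logs who created, edited or viewed each record."""

    def __init__(self):
        self._records = {}
        self.audit_log = []  # entries are appended, never changed or removed

    def _log(self, user, action, record_id):
        if not user:
            # Unique user identification: anonymous access is refused.
            raise ValueError("a user ID is required")
        self.audit_log.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "action": action,
            "record_id": record_id,
        })

    def save(self, user, record_id, data):
        action = "edit" if record_id in self._records else "create"
        self._log(user, action, record_id)
        self._records[record_id] = dict(data)

    def view(self, user, record_id):
        self._log(user, "view", record_id)
        return dict(self._records[record_id])

store = AuditedRecords()
store.save("jsmith", "P001", {"Weight": 72.5})
store.save("mlee", "P001", {"Weight": 75.2})
store.view("jsmith", "P001")
for entry in store.audit_log:
    print(entry["user"], entry["action"], entry["record_id"])
```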
47. Implications for Research
- Avoid HIPAA Security Rule entanglements if possible by
  - Thoughtful definition of Covered Entity with respect to research activities
    - E.g., Vanderbilt is a Hybrid Covered Entity: research is not a covered function except for research that uses or creates medical records
  - Use of de-identified data and/or Limited Data Sets wherever possible
  - Not storing PHI in electronic format in research settings
48. If a research project maintains e-PHI
- Responsible group must designate a Security Officer who has responsibility for implementing HIPAA-compliant policies and procedures for research use of e-PHI
- Must do and document a risk analysis
- Must create a risk management plan based on the risk analysis
- Must create and keep current a HIPAA Security Rule compliance document that includes a description of how the 17 Required elements are met, and decisions regarding Addressable elements
49. Widespread current research practices that don't meet the standard
- Research workgroups that create or use PHI in electronic format but have no written security procedures, policies or training
- Workstations with no login security (e.g., Windows 98)
- Data management and analysis applications used to store PHI that have no ability to generate audit trails
  - E.g., Excel spreadsheets with PHI in them
50. Using the Internet for Clinical Research
51. Internet Functionality for Clinical Research
- E-mail
  - Avoid putting HIPAA PHI in e-mail
- Study participant recruitment
53. Internet Functionality for Clinical Research, contd
- E-mail
  - Avoid putting PHI in e-mail
- Study participant recruitment
- Private FTP site as drop box for study-related file communications
  - Encrypt files if they contain PHI
- Data submission and reporting
- Multi-site coordination and administration
54. Approved Internet Technologies relevant to Clinical Research
[Table not transcribed; only its footnotes survive]
(1) Containing person-identifiable data, i.e., HIPAA PHI
(2) Must be encrypted to HCFA/CMS standard
55. Sample project administration website for multi-center study
56. Putting it All Together: Research Data Management
- An artful selection of physical and electronic management methods
  - Signed informed consent documents
  - Paper forms
  - Regulatory and project management binders
  - Data models and databases
  - Data acquisition and display technologies
  - Communications technologies for project management as well as data management
57. Attributes of Successful Data Management
- Attention to detail
- Explicit structure and process
- Robust designs
  - Anticipate failures, lapses and mistakes
  - Design systems that identify and correct them
- Mechanisms for verification
- Well documented
58. Lessons Learned about Data Management in Clinical Research
- Effective data management is a continuous process, not a point-in-time analysis
- Historically, health care organizations and providers have invested suboptimally in information systems, and this provides an uneven infrastructure for clinical research
- In health care organizations, data management and information systems implementation is 20% technology and 80% sociology (R. Gardner); plan accordingly