Title: Creating a Clinical Data Element Dictionary: A Proposal
1 Creating a Clinical Data Element Dictionary: A Proposal
- CDISC Board of Directors Meeting
- 10-11 Dec 2007
2 Preamble
- CDISC has made progress on many fronts
- There is a CDISC brand
- CDISC has worked on strategies/plans over the years
- Currently a strategy in place
- Operational plans/objectives for 2008 in place
- 2008 budget is in place
- Fundamentally change some of CDISC's approach
3 Preamble
- This is a discussion first and foremost about WHAT we will do.
- What one thing, if done well and consistently, would have the most impact on your business?
- Ken Blanchard, Mission Possible
- What is the pearl of great price?
- If we agree on WHAT, then we can discuss HOW.
4 Motivation for the WHAT
- What standards work is done today: Lilly example
- Lilly Data Element Standards
- \\Rodan\rodan.grp\GCDS_EB_PUBLIC
- 3560 pages in our Dictionary
- 25,000 variables
- It's all PDF (yuk!!!!)
5 Motivation for WHAT
- Link to Analysis Dataset Standards
- http://corpweb.d51.lilly.com/statmath/CoE/ADS/ADS_std.html
- Thousands of pages of documentation in our total ADS specifications
- For each study, CROs get hundreds of pages of requirements that describe the data elements that we want, variable names, valid values, formats, etc.
6 Motivation for WHAT
- Dozens of people at Lilly and CROs communicate using these voluminous documents
- CROs have dozens of people mapping data to company-specific formats, naming conventions, etc.
- The CDISC Business Case is largely predicated on eliminating these activities
- Reduce mapping data from one form to another to transfer or to integrate it.
7 Motivation for WHAT
- http://www.wikihit.org/wiki/index.php/Main_Page
- The Clinical Data Definitions created in WikiHIT are not completely useful for clinical research studies.
- caDSR has some useful elements, but is a bit outdated and not entirely functional for what is needed for clinical research studies.
- Too complex in all its detail
- NCI EVS has some useful elements, but does not have all the information and functionality that is needed by companies involved in clinical research.
- Data definitions do not have all information (e.g. valid values)
8 The Language of Clinical Trials
- It is more important to share a common vocabulary than it is to have agreement on common grammatical rules.
- Content is more important than structure.
- Es ist wichtiger, einen allgemeinen Wortschatz, als zu teilen es Vereinbarung über allgemeine grammatische Richtlinien haben soll.
- A common vocabulary more important for sharing than understanding of typical rules of grammar.
9 CDISC Adoption by Pharma
- Make SDTM more useful, implementable
- Need more specificity
- Need more definitions on variables / data elements
- Standard data elements: this is what FDA wants, what pharma wants, what CROs want
- FDA under pressure to do something quickly?
- CDISC dealing with healthcare, other standards organizations, etc.
- Don't let the perfect hold up the good
- Need more focus from CDISC
10 CDISC Adoption by Pharma
- It's about saving dollars for pharma, CROs and labs by simplifying interchange of data.
- It's about helping companies and FDA integrate data from regulated clinical research studies.
- CDISC Business Case has little to do with healthcare at this point.
11 Motivation for WHAT
- Summary
- There is an enormous unmet need for more content.
- The CDISC Business Case is largely dependent on well-defined data element standards being broadly available.
- Others are playing in this space, but do not meet the needs of pharma clinical research and regulatory submissions.
- CDISC Terminology Program has primarily focused on controlled terminology supporting SDTM, but not the data elements themselves.
12 What Is a Data Element?
- All the pieces of information (i.e. metadata) needed to unambiguously describe a concept
- English dictionary analogy
- Word: desk
- Phonetic spelling: desk
- Part of speech: noun
- Definition: a piece of furniture with a flat top for writing
- could also be thought of as the concept
- Source: Latin, discus
- etc.
13 Data Element
- A Data Element is a unit of data for which definition, identification, representation, and permissible values are specified by means of a set of attributes; it is the smallest unit of data.
- The purpose of a data element definition is to define a data element with words or phrases that describe, explain, or make definite and clear its meaning.
14 Data Elements: Vertical v. Horizontal
Vertical Data Set Structure
- Valid Values for Variable are
- HR, SBP, DBP.
- A controlled terminology
- For each term, provide the metadata to describe it
- Definition, units, valid values, etc.
15 Data Elements: Vertical v. Horizontal
Horizontal Data Set Structure
Each variable has a name (terminology) and a
corresponding set of metadata to describe it
(definitions, units, valid values, etc.)
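As a rough illustration of the two structures described above, the sketch below holds the same observations in a vertical layout (one row per measurement, with the test code stored as a value of a single variable) and pivots them into a horizontal layout (one variable per test). The column names SUBJID, TESTCD and VALUE, the subject IDs, and the numbers are made up for this example; only the codes HR, SBP and DBP come from the slide.

```python
from collections import defaultdict

# Vertical structure: one row per observation; the test code is itself data.
# Column names and values are hypothetical; HR/SBP/DBP are from the slide.
vertical = [
    {"SUBJID": "001", "TESTCD": "HR", "VALUE": 72},
    {"SUBJID": "001", "TESTCD": "SBP", "VALUE": 120},
    {"SUBJID": "001", "TESTCD": "DBP", "VALUE": 80},
    {"SUBJID": "002", "TESTCD": "HR", "VALUE": 65},
    {"SUBJID": "002", "TESTCD": "SBP", "VALUE": 118},
]


def to_horizontal(rows):
    """Pivot vertical rows into one record per subject, one key per test code."""
    by_subject = defaultdict(dict)
    for row in rows:
        by_subject[row["SUBJID"]][row["TESTCD"]] = row["VALUE"]
    return [{"SUBJID": subj, **tests} for subj, tests in sorted(by_subject.items())]


horizontal = to_horizontal(vertical)
for record in horizontal:
    print(record)
```

Either way the same descriptive metadata is needed: in the vertical form it attaches to each term of the controlled terminology, and in the horizontal form it attaches to each variable name.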
16 Clinical Data Element for Pharma
- Variable name (draft)
- Label / concept
- Valid values of the variable itself
- Data type (num, char, date, ...)
- Units
- Key words (e.g. biomarker, osteoporosis, ...) facilitate searches
- Source / reference (as needed)
- SDTM data domain
- Regulatory requirement
- A team needs to define the essential pieces of metadata: rich enough to eliminate ambiguity, but parsimonious enough to be useful, consumable, understandable, and low-burden.
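To make the attribute list above concrete, here is a minimal sketch of one dictionary entry as a Python dataclass. The class name, the field names, and both example entries (including their labels, units, key words and valid values) are illustrative assumptions rather than CDISC definitions; the slide itself leaves the final attribute set for a team to decide.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataElement:
    """One hypothetical dictionary entry; fields mirror the draft list on the slide."""
    variable_name: str
    label: str
    data_type: str                        # e.g. "num", "char", "date"
    valid_values: Optional[list] = None   # None means the values are not enumerated
    units: Optional[str] = None
    key_words: list = field(default_factory=list)
    source: Optional[str] = None
    sdtm_domain: Optional[str] = None
    regulatory_requirement: Optional[str] = None

    def accepts(self, value) -> bool:
        """Check a value against the enumerated valid values, if there are any."""
        return self.valid_values is None or value in self.valid_values


# Illustrative entries only; none of these values are taken from a real standard.
sbp = DataElement(
    variable_name="SBP",
    label="Systolic blood pressure",
    data_type="num",
    units="mmHg",
    key_words=["vital signs", "blood pressure"],
    sdtm_domain="VS",
)
position = DataElement(
    variable_name="POSITION",
    label="Position of subject during measurement",
    data_type="char",
    valid_values=["SITTING", "STANDING", "SUPINE"],
)

print(sbp.accepts(120), position.accepts("SITTING"), position.accepts("UPSIDE DOWN"))
```

Keeping most fields optional reflects the slide's point that the attribute set should stay small: an entry carries only the metadata that removes ambiguity for that element.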
17 Creating a Clinical Data Element Dictionary (CDED)
- Task Force Members
- Steve Ruberg
- Bron Kisler
- Scott Getzin
- Doug Fridsma
- Chris Chute
- Sue Dubman
- Dave Iberson-Hurst
- Cara Willoughby
18 Proposal: WHAT - Unmet Need
- Comprehensive, electronically accessible, organized dictionary of unambiguous data element standards for our industry
- One of the most fundamental problems we all face within our own pharma companies, but even more acutely across the pharma industry/enterprise.
- Consistent with Strategy Themes 2, 5, 6
- THE place where people go for clinical data element standards.
- THE thing for which CDISC is known?!?!
19 Alignment and Focus
- If additional funding can be secured, standards specific to therapeutic areas will become part of the extended CDASH scope.
- CDISC Press Release 33
- 15 May 2007
- KEY QUESTION
- Given the importance of this area and the need to move quickly, should we re-prioritize and divert resources (people and $) to this effort?
20 Alignment and Focus
- FOCUS
- Where do we focus?
- ISO, AHIC, AHRQ, NLMEc, industry architecture, ...
- Initial focus on meeting pharma industry needs
- If others want to piggyback on that effort, that is fine.
- Initial focus on clinical data and clinical trial metadata
- Initial focus on raw/observed data
- There is a lot of territory to conquer within this focus area. Other opportunities (pre-clinical data elements, derived data elements) can be explored in the future.
21 Impact on Other CDISC Teams
- Clinical Data Element Dictionary (CDED)
- Terminology, SDTM, CDASH, LAB and SEND all converge into a common approach focused on the data elements and their exquisite definition
- Reduces need to harmonize CDISC models if they all utilize the same data element definitions
- Harmonization happens on the front end rather than after the fact
- The transport standard for carrying standardized content (ODM, HL7, SAS, other???) can be whatever
- BRIDG work continues as is
- Tightly coordinate standard data elements with BRIDG efforts
22 Creating a Clinical Data Element Dictionary (CDED)
[Diagram. Labels: Content Standards, Transport Standards, Initial Inputs, CDED; CDASH, SDTM, LAB, Protocol, TB?, CV?, Other Existing?; SAS, ODM, HL7; figures 80 and 20.]
23 Proposal
- HOW - Business model
- An open, electronic, peer production environment with appropriate governance
- Like MedDRA, but open and free
- Like Wikipedia, but more governance
- Like LINUX, but more granular and dynamic
- CDISC must adopt a more flexible and rapid development process
24 Clinical Data Element Standards
[Diagram. Labels: Governance, Final, Review, Template, Submission, Anyone.]
- Downloadable (define.xml)
- Searchable text, key words
- Search shows status (submit, review, final)
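The status values named above (submit, review, final) suggest a simple lifecycle. The sketch below assumes, hypothetically, that an entry moves strictly from submit to review to final and that a text or key-word search reports each hit's status. The slides do not spell out the actual transition rules, and every class, method and entry name here is invented for illustration.

```python
from enum import Enum


class Status(Enum):
    SUBMIT = "submit"
    REVIEW = "review"
    FINAL = "final"


# Assumed linear workflow; the real governance rules are not given in the slides.
NEXT_STATUS = {Status.SUBMIT: Status.REVIEW, Status.REVIEW: Status.FINAL}


class Dictionary:
    def __init__(self):
        self._entries = {}  # name -> {"definition", "key_words", "status"}

    def submit(self, name, definition, key_words):
        """Anyone may submit; new entries start in SUBMIT status."""
        self._entries[name] = {
            "definition": definition,
            "key_words": {k.lower() for k in key_words},
            "status": Status.SUBMIT,
        }

    def promote(self, name):
        """Advance an entry one step; FINAL entries cannot be promoted further."""
        entry = self._entries[name]
        if entry["status"] not in NEXT_STATUS:
            raise ValueError(f"{name} is already final")
        entry["status"] = NEXT_STATUS[entry["status"]]
        return entry["status"]

    def search(self, term):
        """Return sorted (name, status) pairs matching the definition text or a key word."""
        term = term.lower()
        return sorted(
            (name, e["status"].value)
            for name, e in self._entries.items()
            if term in e["definition"].lower() or term in e["key_words"]
        )


cded = Dictionary()
cded.submit("SBP", "Systolic blood pressure", ["vital signs"])
cded.submit("HR", "Heart rate", ["vital signs"])
cded.promote("SBP")  # SBP moves from submit to review
print(cded.search("vital signs"))
```

Returning the status with every search hit lets a reader see at a glance whether an element is still a proposal or has passed governance review.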
25 Governance for the CDED
[Organization chart. Labels: Governing Board; 2 Full-Time CDISC Employees; Lead 1, Lead 2, Lead 3 ... Lead k; Team 1, Team 2, Team 3 ... Team k; 6-8 SMEs per team.]
26 Proposal
- WHO - CDISC
- CDISC has the opportunity to assert an even greater leadership role in this arena.
- Leverage CDISC's strengths (Strategy Theme 1)
- Independence
- Consensus building
- Strong pharmaceutical / clinical research expertise
- Global recognition
- Place substantial priority and focus on this effort
- The pearl of great price
27 Proposal
- WHEN - ASAP
- The time is right to charge ahead aggressively
- There is a large, unmet business need
- FDA and others are looking for a content leader
- CDISC has ongoing terminology efforts
- Technology is in place (i.e. wikis)
- Mindset is in place (i.e. people can work virtually)
- Others are advancing on this front and we may be left out
28 Budget
- Transition personnel to this effort
- Continue/finalize ongoing CDASH efforts
- Redirect some Terminology Team efforts
- Need part-time Governance team members
- Contracted for 25% of their time
- SMEs for TA or data domains
- Leverage CROs, software members of CDISC
29 Summary
- There remains a clear need to have unambiguous clinical data element standards (CDES)
- Considerable efforts still spent on exchanging data
- Considerable efforts still spent on integrating data
- Needed across the drug development industry
- Broad set of data domains (safety, efficacy, outcomes, PK, etc.)
- Independent of strategies related to messaging or transport technologies
- Let's act decisively and move quickly.
30 Benefits of Using Documented CDEs
- Facilitates common data collection by defining content and scope.
- Supports semantic data relationships.
- Defines valid values for enumerated data.
- Improves understanding of data.
- Simplifies and documents data analysis.
- Provides historical context for data collections.
- Encourages reuse of existing data structures.
- Facilitates sharing of data across organizational entities.
- Facilitates integration of data across studies.