Beyond the CDISC SDTM V3.1 Model: Statistical - PowerPoint PPT Presentation

About This Presentation
Title:

Beyond the CDISC SDTM V3.1 Model: Statistical

Description:

2004 FDA/Industry Statistics Workshop Washington, DC September 23, 2004 Beyond the CDISC SDTM V3.1 Model: Statistical & Programming Considerations – PowerPoint PPT presentation

Slides: 38

Transcript and Presenter's Notes

1
Beyond the CDISC SDTM V3.1 Model: Statistical &
Programming Considerations
American Statistical Association 2004
FDA/Industry Statistics Workshop
Washington, DC, September 23, 2004
  • William J. Qubeck, IV MS, MBA
  • Electronic Submissions Data Group Leader
  • Global Clinical Data Services, Pfizer Inc.

2
Agenda
  • Model Overview of CDISC SDTM V3.1
  • Programming, Statistical and Submission
    Considerations
  • Cost/Benefits of 3 Implementation Strategies
  • Implications and Summary

3
CDISC SDTM Version 3.1
  • A brief model overview

4
CDISC SDTM Material (www.cdisc.org)
Source of model information: www.cdisc.org

5
SDTM V3.1 Characteristics
  • SDTM applies to all Case Report Tabulation (CRT)
    data across all phases of clinical trials
    development and generally refers to collected
    data
  • V3 added new variables to represent additional
    timing descriptions, flags and descriptive
    attributes
  • All variables must come from the SDTM model; it
    does not allow sponsor-defined variables to be
    added
  • Numerous changes from Version 2 variables and
    labels
  • Removed most, if not all, selection variables
    from domains
  • Added:
    • Study Design (planned versus actual) datasets
    • Special Purpose/Relationship Datasets

6
V3 - Study Data Information Model
  • 3 main types of observations (data domains)
    • Interventions, events, findings, and other
  • Interventions
    • Are related to the therapeutic and experimental
      treatments (expanded to include other things)
  • Events
    • Observations from subjects on adverse reactions
  • Findings
    • Evaluations/examinations to address specific
      questions (when in doubt, it's a finding)

7
SDS V3 Standard Data Structures
  • Interventions: EX, CM, SU
  • Events: AE, DS, MH
  • Findings: IE, LB, VS, SC, PE, EG

8
Standard Model Variables
  • Topic
    • Identifies the focus of the observation
  • Unique identifiers
    • Identify the subject of the observation
  • Timing
    • Describes the start and end of the observation
  • Qualifiers
    • Describe the traits of the observation

9
An Example Observation
Subject 123 had a severe headache starting on
study day 2
  • Unique Subject Identifier: Subject 123
  • Topic: headache
  • Qualifier: severe
  • Timing: starting on study day 2

10
Dataset Structure
Role         Subject Identifier         Topic                                Qualifier           Timing
Var Name     USUBJID                    AETERM                               AESEV               AESTDY
Label        Unique Subject Identifier  Reported Term for the Adverse Event  Severity/Intensity  Study Day of Start of Event
Observation  123                        HEADACHE                             SEVERE              2
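
The dataset structure above can be sketched in Python as a small illustration. The variable names, labels, and values come from the slide; the Variable class and the helper functions are hypothetical names introduced only for this example, not part of any CDISC tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """One SDTM variable in an observation (illustrative structure only)."""
    name: str
    label: str
    role: str      # Identifier, Topic, Qualifier, or Timing
    value: object


# "Subject 123 had a severe headache starting on study day 2"
observation = [
    Variable("USUBJID", "Unique Subject Identifier", "Identifier", "123"),
    Variable("AETERM", "Reported Term for the Adverse Event", "Topic", "HEADACHE"),
    Variable("AESEV", "Severity/Intensity", "Qualifier", "SEVERE"),
    Variable("AESTDY", "Study Day of Start of Event", "Timing", 2),
]


def as_record(variables):
    """Flatten the variables into one dataset row keyed by variable name."""
    return {v.name: v.value for v in variables}


def names_by_role(variables):
    """Group variable names by the role they play in the observation."""
    roles = {}
    for v in variables:
        roles.setdefault(v.role, []).append(v.name)
    return roles


record = as_record(observation)
roles = names_by_role(observation)
```

Here record is the single row that would appear in the AE dataset, and roles shows which variable carries the topic, qualifier, and timing of the observation.
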
11
Core Variables Definition
  • A required variable is any variable that is basic
    to the identification of a data record (i.e.,
    essential key identifiers and a topic variable
    that cannot be null)
  • An expected variable is any variable necessary to
    make a record meaningful in the context of a
    specific domain. The variable should be included,
    but some values may be null
  • Permissible variables should be used as
    appropriate when collected or derived.
  • Any general timing variable not explicitly
    mentioned in a domain model is permissible to be
    included
  • Only qualifier variables specified in a domain
    model are allowed for that domain.
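
A minimal sketch of how the required/expected/permissible distinction might be checked in code. The rules follow the definitions above (required variables cannot be null; expected variables must be present but may be null). The CORE table is an assumption for illustration: only the identifier and topic entries follow directly from the slide, and the designations given to AESTDTC and AESTDY are hypothetical, not copied from the implementation guide.

```python
# Hypothetical core designations, for illustration only.
CORE = {
    "USUBJID": "Required",     # key identifier
    "AETERM": "Required",      # topic variable
    "AESTDTC": "Expected",     # assumed designation
    "AESTDY": "Permissible",   # assumed designation
}


def check_core(records, core):
    """Return a list of problems found against the core designations."""
    problems = []
    for i, rec in enumerate(records):
        for var, level in core.items():
            if level == "Required" and rec.get(var) in (None, ""):
                problems.append(
                    f"record {i}: required variable {var} is missing or null"
                )
            elif level == "Expected" and var not in rec:
                problems.append(
                    f"record {i}: expected variable {var} is not in the dataset"
                )
            # Permissible variables are optional, so nothing is checked.
    return problems


records = [
    # Expected variable present but null: acceptable.
    {"USUBJID": "123", "AETERM": "HEADACHE", "AESTDTC": None},
    # Required topic is empty and the expected variable is absent.
    {"USUBJID": "124", "AETERM": ""},
]
problems = check_core(records, CORE)
```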

12
A Brief Look at the Domain Classes
13
Model Topic Variables & Qualifiers
  • Events Domain Class
    • Topic Variable: --TERM (Reported Term)
    • Approx. 12 qualifiers (e.g., Modified Term,
      Seriousness)
  • Interventions Domain Class
    • Topic Variable: --TRT (Treatment)
    • Approx. 6 qualifiers (e.g., Dose, Unit)
  • Findings Domain Class
    • Topic Variable: --TESTCD (Test Code)
    • Many qualifiers (e.g., Units, Standardized
      Results)
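
As a small illustration of the naming convention, the "--" placeholder is replaced by the two-letter domain code. The sketch below assumes that rule and uses a hypothetical helper name, topic_variable.

```python
# Topic variable suffix for each general domain class, as listed above.
TOPIC_SUFFIX = {
    "Events": "TERM",
    "Interventions": "TRT",
    "Findings": "TESTCD",
}


def topic_variable(domain_code, domain_class):
    """Build the topic variable name for a domain, e.g. AE + TERM."""
    if len(domain_code) != 2:
        raise ValueError("domain codes are two letters, e.g. 'AE'")
    return domain_code.upper() + TOPIC_SUFFIX[domain_class]


ae_topic = topic_variable("AE", "Events")
cm_topic = topic_variable("CM", "Interventions")
vs_topic = topic_variable("VS", "Findings")
```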

14
Example Events Data (MH)
15
Example Findings Data (VS)
16
Creating a New Domain
Superset of Variables
17
Programming, Statistical and Submission
Considerations.
18
PFE CDISC SDTM
  • Pfizer has contributed and continues to
    contribute to CDISC, participated in the FDA
    pilots, and has implemented CDISC Version 2.0
  • We delivered our first CDISC SDTM compliant
    submission in August
    • Submitted 5 protocols of partial data
    • Over 11,000 patients' worth of data
    • Included all CDISC defined domains plus 5
      additional, as well as the define.xml
    • Converted several of the analysis datasets into
      SDTM compliant structures

19
Submission Data Processes
20
Mapping Events & Interventions
Internal Dataset A
Retain the SEQs

21
Lessons learned
  • Mapping was straightforward
  • eSub data documentation was not affected (e.g.,
    define.pdf)
  • Only a few variables were mapped to SUPPQUAL (the
    exception, not the rule)
  • Technical challenges
    • Increased dependencies: SUPPQUAL and CO become
      dependent on all contributing source datasets
      (one-to-many, source to target domain)
    • Several defined internal datasets may map to
      one target domain
    • May need to rethink how XPTs are generated: one
      at a time or in batches
  • No specific statistical considerations
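
The mapping described above might look roughly like the following sketch. It assumes a hypothetical internal adverse event dataset whose column names (event_text, severity, start_day, followup_flag) are invented for this example, as is the QNAM value AEFUFL. The SUPPQUAL variables used (RDOMAIN, IDVAR, IDVARVAL, QNAM, QVAL) are the standard ones, but the function itself is illustrative and is not the process described in the talk.

```python
def map_to_ae(internal_rows, column_map, supp_map, studyid):
    """Map internal rows to AE records, sending extras to SUPPQUAL."""
    ae, suppqual = [], []
    seq_by_subject = {}
    for row in internal_rows:
        usubjid = row["USUBJID"]
        # Assign the sequence number once so it can be retained and
        # reused as the link from SUPPQUAL back to the parent record.
        seq = seq_by_subject.get(usubjid, 0) + 1
        seq_by_subject[usubjid] = seq
        record = {"STUDYID": studyid, "DOMAIN": "AE",
                  "USUBJID": usubjid, "AESEQ": seq}
        for column, value in row.items():
            if column == "USUBJID":
                continue
            if column in column_map:
                record[column_map[column]] = value
            elif column in supp_map:
                suppqual.append({
                    "STUDYID": studyid, "RDOMAIN": "AE", "USUBJID": usubjid,
                    "IDVAR": "AESEQ", "IDVARVAL": str(seq),
                    "QNAM": supp_map[column], "QVAL": str(value),
                })
            else:
                raise ValueError(f"no mapping defined for column {column!r}")
        ae.append(record)
    return ae, suppqual


# Hypothetical internal column names and mappings.
column_map = {"event_text": "AETERM", "severity": "AESEV", "start_day": "AESTDY"}
supp_map = {"followup_flag": "AEFUFL"}
rows = [
    {"USUBJID": "123", "event_text": "HEADACHE", "severity": "SEVERE",
     "start_day": 2, "followup_flag": "Y"},
    {"USUBJID": "123", "event_text": "NAUSEA", "severity": "MILD",
     "start_day": 5, "followup_flag": "N"},
]
ae, suppqual = map_to_ae(rows, column_map, supp_map, studyid="STUDY01")
```

Because each SUPPQUAL record points back to its parent through AESEQ, the supplemental dataset cannot be built until every contributing source dataset has been mapped, which is the dependency noted in the technical challenges.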

22
Lessons learned
  • May need to rethink how you organize your data
    into CDISC SDTM structures
  • For example,

23
Example Exercise
  • Does each item go into the Demographics Domain?

24
Answer: NO
  • Demographics
  • Vital Signs
  • Subject Characteristics
  • Substance Use

25
Mapping Findings
Internal Dataset B (horizontal dataset)
Retain the SEQs

26
Lessons learned
  • Findings data describe the majority of the data
    in a submission
  • More complicated, because the transposed
    information must be retained and should be
    provided in define.xml
  • Statistical programming considerations
    • Data stored in a non-traditional structure
    • The structure is flexible enough to contain both
      collected and analysis data; do you continue to
      keep them separate?
  • eSub data documentation is affected
    • Need to change CRF annotations and provide column
      (variable) and record-level

27
An Example: Vital Signs (VS)
Example Dataset
USUBJID  VISIT  DIABP  SYSBP  BMI   HEIGHT
0001     1      70     110    25.3  55
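
A minimal sketch of transposing the horizontal example above into the vertical Findings structure, with one record per test. VSTESTCD, VSORRES, and VSSEQ are standard VS variable names; the function name transpose_vitals and the use of the internal column names as test codes are assumptions made for this illustration.

```python
def transpose_vitals(rows, id_vars=("USUBJID", "VISIT")):
    """Turn one row per visit into one VS record per test result."""
    records = []
    seq_by_subject = {}
    for row in rows:
        for column, value in row.items():
            if column in id_vars or value is None:
                continue
            # Number the records within each subject.
            seq = seq_by_subject.get(row["USUBJID"], 0) + 1
            seq_by_subject[row["USUBJID"]] = seq
            records.append({
                "DOMAIN": "VS",
                "USUBJID": row["USUBJID"],
                "VSSEQ": seq,
                "VSTESTCD": column,       # former column name becomes the topic
                "VSORRES": str(value),    # result as originally collected
                "VISIT": row["VISIT"],
            })
    return records


horizontal = [
    {"USUBJID": "0001", "VISIT": 1, "DIABP": 70, "SYSBP": 110,
     "BMI": 25.3, "HEIGHT": 55},
]
vertical = transpose_vitals(horizontal)
```

The single horizontal row becomes four vertical records. The former column names now live in VSTESTCD, which is why the documentation has to describe the data at the record level as well as the column level.
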
28
Additional define.pdf/xml Section
29
VS Annotated Page (blankcrf.pdf)
OR VSORRES, where VSTESTCD = XYZ

30
Overall Statistical Programming Considerations
  • Where to implement the data standards?
    • At the end (at XPT generation)
    • During the table production process
    • All the way back to the database
  • Must prioritize what's important
    • Having minimal impact on your internal data
      storage and/or table creation process algorithms
    • Implementing versions quickly (time and resource
      issues)
    • End-game mapping costs
    • Software reuse?

31
Implementation Strategies
32
Benefits/Costs of Mapping to SDTM Post-CSR
  • Benefits
    • New versions have minimal impact on data storage
      and processing
    • Version changes can be quickly implemented
    • Supports early adoption of the standards
  • Costs
    • Mapping costs (for each study and type of data)
    • Could add time to the critical path
    • Data used to produce the outputs (tables,
      listings and graphs) may not match the submitted
      data (e.g., variable names, data structure, and
      records may be placed into different domains);
      this raises questions regarding data exchanges
      for Rapid Response
    • Additional QC steps

33
Benefits/Costs of Mapping to SDTM within CSR
Process
  • Benefits
    • Data used to produce the outputs matches the
      submitted data
    • Previously developed software can be used to
      answer reviewer questions (supports software
      reuse)
    • Additional time does not have to be added to the
      critical path
  • Costs
    • Version changes affect the application of
      algorithms plus output generation software
    • Mapping the data (for each study and type of
      data)
    • Although time is not added to the end,
      additional time is needed to complete the
      mappings
    • Annotated CRFs from the clinical trials database
      do not match the data submitted

34
Benefits/Costs of Mapping to SDTM within Database
(data storage)
  • Benefits
    • The standards would be used throughout the entire
      clinical data storage, processing, and reporting
      processes
    • The extra time needed to implement the standards
      is an up-front cost
    • No additional QC step because no mapping is
      necessary
    • Supports software reuse
    • Facilitates Electronic Data Interchange (cost
      savings)
      • CDISC estimates that the average data transfer
        cost per study is approximately $35k ($122.5M
        annually)
      • Standardizes the exchange between researchers,
        study sponsors, regulatory authorities and the
        applicant

35
Benefits/Costs of Mapping to SDTM within Database
(2)
  • Costs
    • Version changes can have a significant impact
      upon the entire clinical data storage,
      processing, and reporting processes
    • Raises change control and implementation issues
      • Drug development programs may span many
        different versions due to length of time in
        development
      • Software version control and output
        reproducibility
      • How to roll out new versions of the standards?

36
Implications to the Industry
  • All sponsors are facing implementation strategy
    challenges
  • Analysis Datasets should also be provided in
    addition to the SDTM datasets
  • At this point they don't need to conform to V3
  • Will be provided separately (e.g., in a different
    submission directory)
  • Standardized datasets will enable the use of
    standardized review tools and could lead to more
    thorough and efficient reviews (e.g., decreased
    learning curve)

37
Summary
  • There are significant differences between CDISC
    SDS V2 and V3 in terms of scope, design and
    philosophy
  • For more information regarding SDTM Version 3.1,
    see www.cdisc.org
  • Thank you!
  • William_J_Qubeck_at_Groton.Pfizer.com