Beyond the CDISC SDTM V3.1 Model: Statistical - PowerPoint PPT Presentation

About This Presentation

Title:

Beyond the CDISC SDTM V3.1 Model: Statistical

Description:

William J. Qubeck, IV MS, MBA. Electronic Submissions Data Group Leader ... Var. Names. Severity/ Intensity. Study Day of Start of Event. Unique Subject Identifier ... – PowerPoint PPT presentation

Transcript and Presenter's Notes

1
Beyond the CDISC SDTM V3.1 Model: Statistical
Programming Considerations
American Statistical Association 2004
FDA/Industry Statistics Workshop, Washington,
DC, September 23, 2004

William J. Qubeck IV, MS, MBA
Electronic Submissions Data Group Leader
Global Clinical Data Services, Pfizer Inc.

2
Agenda

Model Overview of CDISC SDTM V3.1
Programming, Statistical and Submission
Considerations
Cost/Benefits of 3 Implementation Strategies
Implications and Summary

3
CDISC SDTM Version 3.1

A brief model overview

4
CDISC SDTM Material
Source of model information: www.cdisc.org
5
SDTM V3.1 Characteristics

SDTM applies to all Case Report Tabulation (CRT)
data across all phases of clinical trial
development and generally refers to collected
data
V3 added new variables to represent additional
timing descriptions, flags and descriptive
attributes
All variables must come from the SDTM model, which
does not allow sponsor-defined variables to be added
Numerous changes to Version 2 variables and
labels
Removed most, if not all, selection variables
from domains
Added:
Study Design (planned versus actual) datasets
Special Purpose/Relationship Datasets

6
V3 - Study Data Information Model

3 main types of observations (data domains):
interventions, events, and findings, plus other
Interventions
Are related to the therapeutic and experimental
treatments (expanded to include other things)
Events
Observations from subjects on adverse reactions
Findings
Evaluations/examinations to address specific
questions (when in doubt, it's a finding)

7
SDS V3 Standard Data Structures
Interventions: EX, CM, SU
Events: AE, DS, MH
Findings: IE, LB, VS, SC, PE, EG
8
Standard Model Variables

Topic
Identifies the focus of the observation
Unique identifiers
Identify the subject of the observation
Timing
Describes the start and end of the observation
Qualifiers
Describe the traits of the observation

9
An Example Observation
Subject 123 had a severe headache starting on
study day 2
Unique Subject Identifier: Subject 123
Topic: headache
Qualifier: severe
Timing: starting on study day 2
10
Dataset Structure
Var Name  Role                Label                                Observation
USUBJID   Subject Identifier  Unique Subject Identifier            123
AETERM    Topic               Reported Term for the Adverse Event  HEADACHE
AESEV     Qualifier           Severity/Intensity                   SEVERE
AESTDY    Timing              Study Day of Start of Event          2
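The observation above can be sketched in code to show how each variable carries a role. This is a minimal illustration, not part of the SDTM model itself: the `Variable` class and the `by_role` helper are hypothetical names introduced here, and the role label "Identifier" is shorthand for the slide's "Subject Identifier".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """Hypothetical container pairing a variable name with its role and label."""
    name: str
    role: str
    label: str


# The four variables shown on the slide.
AE_VARIABLES = [
    Variable("USUBJID", "Identifier", "Unique Subject Identifier"),
    Variable("AETERM", "Topic", "Reported Term for the Adverse Event"),
    Variable("AESEV", "Qualifier", "Severity/Intensity"),
    Variable("AESTDY", "Timing", "Study Day of Start of Event"),
]

# "Subject 123 had a severe headache starting on study day 2"
observation = {"USUBJID": "123", "AETERM": "HEADACHE", "AESEV": "SEVERE", "AESTDY": 2}


def by_role(record, variables):
    """Group a record's values by the role of each variable."""
    grouped = {}
    for var in variables:
        grouped.setdefault(var.role, {})[var.name] = record[var.name]
    return grouped


roles = by_role(observation, AE_VARIABLES)
```

Grouping by role makes the sentence structure visible again: one identifier, one topic, and the qualifiers and timing that describe it.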
11
Core Variables Definition

A required variable is any variable that is basic
to the identification of a data record (i.e.,
essential key identifiers and a topic variable);
it cannot be null
An expected variable is any variable necessary to
make a record meaningful in the context of a
specific domain (the variable should be included)
Some values may be null
Permissible variables should be used as
appropriate when collected or derived
Any general timing variable not explicitly
mentioned in a domain model is permissible to be
included
Only qualifier variables specified in a domain
model are allowed for that domain
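The required and expected rules above lend themselves to a simple automated check. The sketch below is one possible reading of those rules, not a validator from CDISC: the `check_core` function is hypothetical, and the core designations assigned to each variable are illustrative only and should not be taken as the model's actual assignments.

```python
# Illustrative core designations (assumed for this example, not taken from the model).
CORE = {
    "USUBJID": "required",
    "AETERM": "required",
    "AESEV": "expected",
    "AESTDY": "permissible",
}


def check_core(records, core):
    """Return (record index, variable, message) for each rule violation.

    required: the variable must be present and non-null.
    expected: the variable must be included, but its value may be null.
    permissible: no check.
    """
    problems = []
    for i, rec in enumerate(records):
        for var, status in core.items():
            if status == "required" and rec.get(var) in (None, ""):
                problems.append((i, var, "required variable is missing or null"))
            elif status == "expected" and var not in rec:
                problems.append((i, var, "expected variable is not included"))
    return problems


records = [
    {"USUBJID": "123", "AETERM": "HEADACHE", "AESEV": "SEVERE"},
    {"USUBJID": "124", "AETERM": None, "AESEV": None},  # null topic; null AESEV is allowed
    {"USUBJID": "125", "AETERM": "NAUSEA"},             # expected AESEV column absent
]

problems = check_core(records, CORE)
```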

12
A Brief Look at the Domain Classes
13
Model Topic Variables and Qualifiers

Events Domain Class
Topic Variable: --TERM (Reported Term)
Approx. 12 qualifiers (e.g., Modified Term,
Seriousness)
Intervention Domain Class
Topic Variable: --TRT (Treatment)
Approx. 6 qualifiers (e.g., Dose, Unit)
Findings Domain Class
Topic Variable: --TESTCD (Test Code)
Many qualifiers (e.g., Units, Standardized Results)
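A short sketch of how the topic variable name follows from the domain class, assuming (as names such as AETERM and VSTESTCD elsewhere in these slides suggest) that the "--" placeholder stands for the two-letter domain code. The lookup tables and the `topic_variable` function are hypothetical helpers, and only a few domains are listed.

```python
# Topic variable suffix per domain class, as listed on the slide.
TOPIC_SUFFIX = {
    "Events": "TERM",
    "Interventions": "TRT",
    "Findings": "TESTCD",
}

# A few domains and their classes (subset for illustration).
DOMAIN_CLASS = {
    "AE": "Events",
    "MH": "Events",
    "CM": "Interventions",
    "EX": "Interventions",
    "VS": "Findings",
    "LB": "Findings",
}


def topic_variable(domain):
    """Build the topic variable name by replacing '--' with the domain code."""
    return domain + TOPIC_SUFFIX[DOMAIN_CLASS[domain]]


names = [topic_variable(d) for d in ("AE", "CM", "VS")]
```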

14
Example Events Data (MH)
15
Example Findings Data (VS)
16
Creating a New Domain
Superset of Variables
17
Programming, Statistical and Submission
Considerations.
18
PFE CDISC SDTM

Pfizer has contributed and continues to contribute
to CDISC, participated in the FDA pilots, and has
implemented CDISC Version 2.0
We delivered our first CDISC SDTM compliant
submission in August
Submitted 5 protocols of partial data
Over 11,000 patients' worth of data
Included all CDISC defined domains plus 5
additional domains, as well as the define.xml
Converted several of the analysis datasets into
SDTM compliant structures

19
Submission Data Processes
20
Mapping Events & Interventions
Internal Dataset A
Retain the SEQ #s
21
Lessons learned

Mapping was straightforward
eSub data documentation was not affected (e.g.,
define.pdf)
Only a few variables were mapped to SUPPQUAL (the
exception, not the rule)
Technical challenges
Increased dependencies: SUPPQUAL and CO become
dependent on all contributing source datasets
One-to-many mappings (source to target domain)
Several defined internal datasets may map to 1
target domain
May need to rethink how XPTs are generated: one at
a time or in batches
No specific statistical considerations
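The SUPPQUAL dependency noted above can be illustrated with a small sketch that splits one internal record into a parent-domain record and supplemental-qualifier records linked back through the retained sequence number. This is an assumption-laden illustration rather than the process used for the submission: `split_supplemental` is a hypothetical helper, `AEXTRA` is an invented non-standard variable, and the SUPPQUAL column names (RDOMAIN, IDVAR, IDVARVAL, QNAM, QVAL) reflect a common reading of the layout that should be checked against the implementation guide.

```python
def split_supplemental(record, domain, standard_vars):
    """Split a record into its parent-domain part and SUPPQUAL-style rows.

    Variables not in standard_vars are moved out of the parent record and
    keyed back to it through the domain's sequence variable (e.g., AESEQ).
    """
    seq_var = domain + "SEQ"
    parent = {k: v for k, v in record.items() if k in standard_vars}
    supp = [
        {
            "RDOMAIN": domain,
            "USUBJID": record["USUBJID"],
            "IDVAR": seq_var,
            "IDVARVAL": str(record[seq_var]),
            "QNAM": name,
            "QVAL": str(value),
        }
        for name, value in record.items()
        if name not in standard_vars
    ]
    return parent, supp


internal = {
    "USUBJID": "123",
    "AESEQ": 1,
    "AETERM": "HEADACHE",
    "AESEV": "SEVERE",
    "AEXTRA": "Y",  # hypothetical variable with no home in the domain model
}
parent, supp = split_supplemental(
    internal, "AE", {"USUBJID", "AESEQ", "AETERM", "AESEV"}
)
```

Because every supplemental row points at a sequence number in its parent domain, the supplemental dataset can only be finalized after all contributing source datasets have been mapped, which is the dependency the slide describes.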

22
Lessons learned

May need to rethink how you organize your data
into CDISC SDTM structures
For example,

23
Example Exercise

Does each item go into the Demographics Domain?

24
Answer: NO
Demographics
Vital Signs
Subject Characteristics
Substance Use
25
Mapping Findings
Internal Dataset B
Horizontal dataset
Retain the SEQ #s
26
Lessons learned

Findings data describe the majority of the data in a
submission
More complicated, because the transposed
information must be retained and should be
provided in define.xml
Statistical programming considerations:
data are stored in a non-traditional structure
The structure is flexible enough to contain both
collected and analysis data: do you continue to
keep them separate?
eSub data documentation is affected
Need to change CRF annotations and provide column
(variable) and record-level

27
An Example: Vital Signs (VS)
Example Dataset
USUBJID  VISIT  DIABP  SYSBP  BMI   HEIGHT
0001     1      70     110    25.3  55
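Mapping this horizontal dataset into a Findings structure means transposing each measurement column into its own record. The sketch below shows one way to do that under stated assumptions: `to_vertical` is a hypothetical helper, the source column names are reused directly as VSTESTCD values, results are stored as text in VSORRES, and VSSEQ is numbered per subject in column order.

```python
# The horizontal example row from the slide.
HORIZONTAL = [
    {"USUBJID": "0001", "VISIT": 1, "DIABP": 70, "SYSBP": 110, "BMI": 25.3, "HEIGHT": 55},
]

KEYS = ("USUBJID", "VISIT")


def to_vertical(rows, keys=KEYS):
    """Transpose one-row-per-visit data into one-record-per-test data."""
    out = []
    next_seq = {}  # running sequence number per subject
    for row in rows:
        subject = row["USUBJID"]
        for name, value in row.items():
            if name in keys:
                continue
            next_seq[subject] = next_seq.get(subject, 0) + 1
            record = {k: row[k] for k in keys}
            record["VSSEQ"] = next_seq[subject]
            record["VSTESTCD"] = name     # the former column name becomes the topic
            record["VSORRES"] = str(value)  # original result, stored as text
            out.append(record)
    return out


vertical = to_vertical(HORIZONTAL)
```

One source row becomes four records, and the original column names survive only as VSTESTCD values, which is why the transposed information has to be documented for reviewers.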
28
Additional define.pdf/xml Section
29
VS Annotated Page (blankcrf.pdf)
OR VSORRES, where VSTESTCD = XYZ
30
Overall Statistical Programming Considerations

Where to implement the data standards?
At the end (at XPT generation)
During the table production process
All the way back to the Database
Must prioritize what's important
Having minimal impact on your internal data
storage and/or table creation process and algorithms
Implementing versions quickly (Time & Resource
Issues)
End-game mapping costs
Software re-use?

31
Implementation Strategies
32
Benefit/Cost of Mapping to SDTM Post-CSR

Benefits
New versions have minimal impact on data storage
and processing
Version changes can be quickly implemented
Supports early adoption of the standards
Costs
Mapping Costs (for each study and type of data)
Could add time to the critical path
Data used to produce the outputs (tables,
listings and graphs) may not match the submitted
data (e.g., variable names, data structure;
records may be placed into different domains)
This raises questions regarding data exchanges for
Rapid Response
Additional QC steps

33
Benefits/Costs of Mapping to SDTM within CSR
Process

Benefits
Data used to produce the outputs matches the
submitted data
Previously developed software can be used to
answer reviewer questions (supports software
reuse)
Additional time does not have to be added to the
critical path
Costs
Version changes affect the application of
algorithms plus output generation software
Mapping the data (for each study and type of
data)
Although time is not added at the end,
additional time is needed to complete the
mappings
Annotated CRFs from the clinical trials database
do not match the data submitted

34
Benefits/Cost of Mapping to SDTM within Database
(data storage)

Benefits
The standards would be throughout the entire
clinical data storage, processing, and reporting
processes
The extra time needed to implement the standards
is an up-front cost
No additional QC step because no mapping is
necessary
Supports software reuse
Facilitates Electronic Data Interchange - cost
savings
CDISC estimates that the average data transfer
cost per study is approximately 35k, or 122.5M
annually
Standardizes the exchange between researchers, study
sponsors, regulatory authorities and the applicant
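The two cost figures quoted above imply a number of studies per year. The arithmetic below assumes that both figures are in the same currency and that the annual figure is simply the per-study cost multiplied by the number of studies; neither assumption is stated in the slide.

```python
cost_per_study = 35_000      # "35k" per study, as quoted
annual_cost = 122_500_000    # "122.5M" annually, as quoted

# Number of studies per year implied by the two figures (assumed relationship).
implied_studies = annual_cost / cost_per_study
```

Under those assumptions the estimate corresponds to roughly 3,500 studies per year.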

35
Benefits/Cost of Mapping to SDTM within Database
(2)

Costs
Version changes can have a significant impact
upon the entire clinical data storage,
processing, and reporting processes
Raises change control and implementation issues
Drug development programs may span many different
versions due to length of time in development
Software version control and output
reproducibility
How to roll out new versions of the standards?

36
Implications to the Industry

All sponsors are facing implementation strategy
challenges
Analysis Datasets should also be provided in
addition to the SDTM datasets
At this point they don't need to conform to V3
Will be provided separately (e.g., in a different
submission directory)
Standardized datasets will enable the use of
standardized review tools and could lead to more
thorough and efficient reviews (e.g., decreased
learning curve)

37
Summary

There are significant differences between CDISC
SDS V2 and V3 in terms of scope, design and
philosophy
For more information regarding SDTM Version 3.1,
see www.cdisc.org
Thank you!
William_J_Qubeck_at_Groton.Pfizer.com
