Data Migration - PowerPoint PPT Presentation

1
Data Migration
  • Massachusetts Biotechnology Council
  • 11-July-2008
  • Brian K. Perry, President
  • BKP Technologies, Inc.

2
Highlights
  • What is Data Migration?
  • Anatomy of Migration Projects
  • Migration Types and Strategies
  • Technical Considerations
  • Validation Considerations

3
What is Data Migration?
  • Data migration is the process of transferring
    data between storage types, formats, or computer
    systems. Data migration is usually performed
    programmatically to achieve an automated
    migration, freeing up human resources from
    tedious tasks.
  • - Wikipedia

4
What is Data Migration?
  • Diagram: CDMS, EDC, Safety and Preclin Data
    sources pass through a Migration ETL Process
    into a Data Warehouse or Datamart for Analytics

5
Anatomy of Migration Projects
  • Planning and Analysis Phase
      • Team Selection and Planning
      • Migration Strategy
      • Analyze Data Sources
  • Execution Phase
      • Validation
      • Migration
      • Data Mapping/Programming
      • Go Live

6
Team Selection and Planning
  • Project Management
  • Clinical Data Management
  • Pre-Clinical
  • Product Safety/Pharmacovigilance
  • Information Technology
  • QA/Validation

7
Analysis of Data Sources
  • CDMS, EDC or Safety System
      • Direct Database Transfer
      • Flat Data Export Files
      • CDISC or E2B XML Exports
      • SAS Datasets
  • Other electronic sources
      • Microsoft Excel Spreadsheets
      • Home-grown databases
  • Paper Documents (Source)
  • Regulatory Submissions (NDA, 3500A, etc.)

8
Migration Types
  • Single Use
      • End of In-House study
      • End of CRO study
      • Migration from legacy system
      • Acquisition/License of compound/product
  • Continuous
      • On-going Studies
      • Safety Data and Post Marketing Data

9
Migration Strategies: Single Use
  • Data Formats
      • Full Database Dump
      • Flat File Exports
      • SAS Datasets
      • Structured Files (XML w/CDISC or E2B)
  • Considerations
      • Cleanliness of data source
      • Static nature of data

10
Migration Strategies: Continuous
  • Data Formats
      • Full Database Dump
      • Structured Files (XML w/CDISC or E2B)
  • Considerations
      • Dynamic nature of data
      • Ability to adapt to changes in source system
      • Validation on-going
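
One way to notice changes in the source system during a continuous migration is to compare each incoming extract's header against the columns the mapping expects. The Python sketch below assumes a comma-separated flat-file extract; the column names and the `schema_drift` helper are hypothetical and only illustrate the idea.

```python
import csv
import io

# Hypothetical columns that the existing data mapping was written against.
EXPECTED_COLUMNS = ["case_id", "event_term", "onset_date"]

def schema_drift(extract_text, expected=EXPECTED_COLUMNS):
    """Compare the header row of a CSV extract with the expected columns."""
    # Read only the first row; an empty extract yields an empty header.
    header = next(csv.reader(io.StringIO(extract_text)), [])
    return {
        "missing": [c for c in expected if c not in header],
        "unexpected": [c for c in header if c not in expected],
    }

# Usage: the source system dropped onset_date and added severity.
extract = "case_id,event_term,severity\n1,Headache,mild\n"
print(schema_drift(extract))
```

A non-empty result would be a signal to revisit the mapping and its validation before the next load.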

11
Migration Strategies: CDISC/E2B
  • Leverages existing CDISC and E2B export
    functionality of CDMS, EDC and Safety systems
  • Data mapping is simplified because the standards
    are defined
  • However, not all data in the source database may
    be present in the CDISC or E2B export
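
The coverage gap in the last point can be checked early by listing source fields that the standard export does not carry. This is a minimal Python sketch; the field names are hypothetical, and a real comparison would start from the actual source data dictionary and export specification.

```python
def unmapped_fields(source_fields, export_fields):
    """Return source fields that the standard export does not carry, sorted."""
    return sorted(set(source_fields) - set(export_fields))

# Hypothetical field names, for illustration only.
source = ["case_id", "event_term", "onset_date", "site_comment"]
export = ["case_id", "event_term", "onset_date"]

# Fields listed here would need a supplementary extract or a database transfer.
print(unmapped_fields(source, export))
```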

12
Migration Strategies: Database Transfer
  • Provides access to all data fields in the source
    and destination systems
  • More complicated mapping than CDISC or E2B
    options
  • May not be an option for single-use migrations
    where the source system is contained at a partner
    company or CRO

13
Migration Strategies: Tools
  • Commercial Data Integration/Manipulation and ETL
    Tools
      • BizTalk Server - Microsoft Corporation
      • Data Junction - Pervasive Software Inc.
      • DataMirror Transformation Server - DataMirror
        Corporation
      • Data Transformation Services (DTS) - Microsoft
        Corporation
      • XML Spy - Altova
  • Open Source Tools
      • Perl
      • PHP

14
Technical Considerations
  • CDMS/Safety System view of Data
      • Optimized for Data Entry, Cleaning, Review and
        Regulatory Submission Preparation
      • Operational and transactional data model
      • Different data models and coded values
  • Data Mart/Data Warehouse view of Data
      • Optimized for data retrieval and analysis
      • Unified data model
      • Unified coding of values
      • Normalized or dimensional model of data

15
Technical Considerations
  • Clinical vs. Safety view of data

16
Technical Considerations
  • Identifying Data Elements
      • Data Fields and Values
      • Derived and Computed Values
      • Coding Dictionaries
          • Events/History: COSTART, WHOART, MedDRA,
            ICD9/10, Custom dictionary
          • Meds and Products: WHODRL, Custom dictionary
      • Metadata
          • Visit Structure
          • Company Products, Studies, Licenses
          • Code Lists

17
Technical Considerations
  • Data Element Issues
      • Data Type Issues
      • Data Field Size Issues
      • CDISC and E2B Compliance Issues
      • Cleanliness and Integrity of Source Data
      • Transformations of Data
      • Coded Values
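
Data type and field size issues are easier to handle when they are detected before loading. The Python sketch below checks one record against a target definition; the field names, the size limit, the date format and the `check_record` helper are all hypothetical assumptions, not part of any particular CDMS or safety system.

```python
from datetime import datetime

# Hypothetical target definition: field -> (kind, maximum length or None).
TARGET_SPEC = {
    "subject_id": (str, 10),
    "weight_kg": (float, None),
    "onset_date": ("date", None),
}

def check_record(record, spec=TARGET_SPEC, date_format="%Y-%m-%d"):
    """Return a list of type and size problems found in one source record."""
    problems = []
    for field, (kind, max_len) in spec.items():
        value = record.get(field)
        if value is None or value == "":
            problems.append(f"{field}: missing")
            continue
        if kind is str:
            if max_len is not None and len(str(value)) > max_len:
                problems.append(f"{field}: longer than {max_len} characters")
        elif kind is float:
            try:
                float(value)
            except (TypeError, ValueError):
                problems.append(f"{field}: not numeric")
        elif kind == "date":
            try:
                datetime.strptime(str(value), date_format)
            except ValueError:
                problems.append(f"{field}: not a {date_format} date")
    return problems

# Usage: a clean record yields no problems.
print(check_record({"subject_id": "S-001", "weight_kg": "72.5",
                    "onset_date": "2008-07-11"}))
```

Running such a check over the whole source extract gives an early inventory of records that need cleaning or a transformation rule.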

18
Technical Considerations
  • Coded Data
      • Events, Medical History and Labs
          • Source data often has a multitude of
            dictionaries (COSTART, WHOART, MedDRA,
            ICD9/10, SNOMED)
          • Issues in maintaining multiple dictionary
            versions
          • Leveraging auto-encoders
      • Products
          • Typically WHODRL
          • Managing company products
          • Leveraging auto-encoders
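
As a rough illustration of auto-encoding, the Python sketch below matches verbatim source terms against a target dictionary after normalizing case and whitespace, and sets aside anything it cannot match for manual coding. The dictionary entries and codes are hypothetical placeholders, not real MedDRA or WHODRL content, and commercial auto-encoders use far richer matching than this exact-match assumption.

```python
def normalize(term):
    """Lower-case a term and collapse runs of whitespace."""
    return " ".join(term.lower().split())

def auto_encode(verbatim_terms, dictionary):
    """Split terms into those matched to a dictionary code and those left uncoded."""
    lookup = {normalize(text): code for text, code in dictionary.items()}
    coded, uncoded = {}, []
    for term in verbatim_terms:
        code = lookup.get(normalize(term))
        if code is None:
            uncoded.append(term)  # left for manual coding
        else:
            coded[term] = code
    return coded, uncoded

# Hypothetical target dictionary with placeholder codes.
TARGET_DICTIONARY = {"Headache": "T0001", "Nausea": "T0002"}

coded, uncoded = auto_encode(["HEADACHE", " nausea ", "feeling off"],
                             TARGET_DICTIONARY)
print(coded)
print(uncoded)
```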

19
Technical Considerations
  • Metadata
      • Visit Structure
      • Code Lists
          • Time Units
          • Dosing Units
          • Weight/Age Units
          • Product data (dose units, frequency,
            formulations, etc.)
          • Lab Codes
          • Causality codes
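
Unit code lists usually need a mapping step so that values arrive in the destination under one convention. The Python sketch below maps a hypothetical source weight-unit code list to kilograms; the source codes and the `weight_to_kg` helper are assumptions, while the pound-to-kilogram factor is the exact definition of the international pound.

```python
LB_TO_KG = 0.45359237  # exact, by definition of the international pound

# Hypothetical source code list mapped to the two units handled here.
WEIGHT_UNIT_CODES = {"kg": "kg", "kgs": "kg", "lb": "lb", "lbs": "lb"}

def weight_to_kg(value, unit_code):
    """Convert a weight to kilograms, rejecting unit codes with no mapping."""
    unit = WEIGHT_UNIT_CODES.get(unit_code.strip().lower())
    if unit is None:
        raise ValueError(f"unmapped weight unit: {unit_code!r}")
    return value if unit == "kg" else value * LB_TO_KG

# Usage: both source conventions end up in kilograms.
print(weight_to_kg(70, "KG"))
print(weight_to_kg(150, "lbs"))
```

Raising an error on an unmapped code, rather than guessing, keeps unknown units visible during the mapping review.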

20
Technical Considerations
  • The Golden Rule of data migration: Garbage In,
    Garbage Out

21
Validation Considerations
  • Validation Strategies
      • Tools and Process
      • Data Verification of Data Samples
  • Key Decision Drivers
      • Validation status of source system/data
      • Whether the migration is single-use or
        continuous

22
Validation Considerations
  • Validation Artifacts
      • User Requirements
      • Technical Specifications/Data Mapping Plan
      • Risk Assessment and Mitigation
      • Migration Master Plan
      • Unit Test Plan and Tests
      • Qualifications
          • Installation Qualification
          • Operational Qualification
          • Performance Qualification (Continuous
            Migrations)
      • Traceability Matrix
      • Final Report

23
Validation Considerations
  • Qualification
      • Installation Qualification (IQ) of Migration
        Tools
      • Operational Qualification (OQ) of
        Mapping/Transforms
      • Performance Qualification (PQ) for Continuous
        Migrations
  • Data Verification
      • Manual sampling and comparison of cases between
        data sources and destination safety system
      • Sample Size
          • ANSI Z1.4 (MIL-STD-105)
          • Sqrt(n) + 1
          • 10%
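
The sample-size and comparison steps can be sketched in Python as follows. The sketch reads the Sqrt(n) entry as the common "square root of n plus one" rule, which is an assumption, and it rounds the square root up to a whole case. The helper names and the record layout (one dictionary per case) are hypothetical.

```python
import math
import random

def sample_size(n):
    """Cases to verify under the sqrt(n) + 1 rule, rounded up and capped at n."""
    if n <= 0:
        return 0
    root = math.isqrt(n)
    if root * root < n:  # round the square root up for non-perfect squares
        root += 1
    return min(n, root + 1)

def pick_sample(case_ids, seed=0):
    """Draw a reproducible random sample of case IDs to verify."""
    ids = sorted(case_ids)
    return random.Random(seed).sample(ids, sample_size(len(ids)))

def compare_case(source_case, migrated_case):
    """Return the fields whose values differ between source and destination."""
    fields = sorted(set(source_case) | set(migrated_case))
    return [f for f in fields if source_case.get(f) != migrated_case.get(f)]

# Usage: 400 migrated cases call for 21 sampled cases under this rule.
print(sample_size(400))
```

Fixing the random seed makes the sample reproducible, so the same cases can be cited in the final report.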

24
Questions and Discussion

25
Contact Information
  • Brian K. Perry
  • President
  • BKP Technologies, Inc.
  • bkp_at_bkptech.com
  • 1.617.964.2100