1
Developing Information Systems for Cancer Research
  • Christopher Flowers, MD, MSc
  • Assistant Professor
  • Medical Director, Oncology Data Center
  • Bone Marrow and Stem Cell Transplant Center
  • Winship Cancer Institute
  • Emory University

2
Health Care Data Integration: Medical Intelligence Applications
3
What Data are available?
  • Patient Genomics
  • Microarrays and Gene Chip
  • Analysis Results
  • Quality Values
  • Hospital Patient Management
  • Patient Demographics
  • Inpatient, Outpatient, Patient Types
  • Location, Physician, Visits
  • Hospital Patient Accounting
  • Financial Data
  • Patient charges
  • Payments and Collections
  • Summarized Financial Visit Data
  • Charge Description

4
What Data are available?
  • Pharmacy
  • Orders, Drugs, Medication
  • Formulary
  • Drug Interactions
  • Costs
  • Medical Records
  • Procedures and Diagnoses (CPT-4, ICD-9)
  • Visit, Abstract
  • Physician
  • Admit Diagnosis, Admit Source and Type
  • RDRG/DRG

5
What Data are available?
  • Clinic Patient Accounting
  • Patient Registration Demographics, Insurance
    (FSC), Employer, Case
  • Provider
  • General Ledger
  • Financial Data Invoices
  • Laboratory Results
  • Lab Orders, General Results and Micro
  • Clinic and Hospital Patients

6
What Data are available?
  • Radiation Oncology
  • Treatment Plans
  • Clinical Trials
  • Studies
  • Patient Demographics
  • Pathology
  • Cancer Registry
  • Patient Demographics and abstract
  • Pathology, Treatment Plans and Discharge Summary
  • Progress Notes, Radiology results, Charges

7
What Data are available?
  • Patient Chart Information
  • Physician Notes
  • Radiology Reports
  • HLA
  • Cancer Anatomic Path
  • Lab Test Results
  • Other (Forms entry)
  • IBMTR/ABMTR Form
  • Acute Myelogenous Form
  • Patient Profile Form
  • Informed Consent

8
Analysis of Search Algorithms for Oncologic
Disease Identification Using GeneSys SI
  • Michael Graiser, PhD (1), Ashley Hilliard (1), Rochelle
    Victor (1),
  • Ragini Kudchadkar, MD (1), Leroy Hill (1), Michael S.
    Keehan, PhD (2),
  • Jonathan Simons, MD (1), Christopher Flowers, MD (1)

(1) Winship Cancer Institute, Department of
Hematology and Oncology, Emory University School
of Medicine, Atlanta, GA (http://www.winshipcancerinstitute.org)
(2) NuTec Health Systems, Atlanta,
GA (email: info@nutechealthsystems.com)

Emory University has a financial interest in
NuTec Health Systems, which designed and built
GeneSys SI. Emory may financially benefit from
this interest if NuTec is successful in marketing
GeneSys SI. This project may produce income for
Emory's charitable purposes and for NuTec's
commercial purposes.
9
Development of GeneSys SI
  • Collaborative effort between Emory's Winship
    Cancer Institute and NuTec Health Systems
  • Web-based query tool and genomic analysis tools
    designed with a team of Emory oncologists and
    research investigators
  • August 2002: 175,000 Emory patients identified
    by cancer diagnosis loaded into GeneSys SI
  • New patients added by individual patient consent
  • Ongoing efforts to add new sources of data
  • Tissue Banking
  • Genomic tools

10
GeneSys SI Modules: Health Care Applications
11
GeneSys SI
12
GeneSys SI Architecture
  • Linked patient-level data
  • Pathology
  • Cancer Registry
  • Laboratory Results
  • Radiology Results
  • Medication utilization
  • Clinical outcomes
  • Genomics

(Architecture diagram components: Microarrays, Enterprise Application Interface, Query Engine, Cancer Registry, Data Warehouse, Legacy Databases, Clinical Trials, Pyxis, Occupational Exposure, Family History, Cancer Epidemiology, Tissue Banking (under construction))
14
Investigator Defined Forms Data
(Architecture diagram components: GeneSys SI, Microarrays, Enterprise Application Interface, Public Databases (Genetic, Protein), Query Engine, Cancer Registry, Data Warehouse, Legacy Databases, Clinical Trials, Pyxis, Occupational Exposure, Family History, Cancer Epidemiology, Tissue Banking (under construction))
15
Database Population
  • GeneSys SI contains information on patients
    who have visited Emory University Hospital,
    Crawford Long Hospital, or The Emory Clinic and
    have received an oncology diagnosis. Benign
    neoplasms are also included.

16
Numbers
  • Total patients: 175,748
  • Newly consented: 551
  • By ICD-9 and ICD-10

17
Data currently available in GeneSys SI
DATA SOURCE                              ENTRY DATE        HISTORY (YEARS)
Emory Data Warehouse
  Hospital administrative (HealthQuest)  September 1995    9
  Clinic administrative (IDX)            September 1994    10
  Medical Records                        1987              17
  Clinical Labs                          January 2001      3
  Hospital Pharmacy                      January 1998      6
  Clinic Pharmacy                        April 2002        2
Cancer Registry
  Emory Hospital                         1977              27
  Crawford Long Hospital                 1981              23
Clinical Trials                          1981              21
Electronic Medical Record
  PowerChart                             1991              13
Radiation Oncology
  The Emory Clinic                       1994              10
  Crawford Long Hospital                 2001              3
Forms
  Informed Consent                       July 2003         1
Genomics                                 TBD               N/A
18
Linked Oncology Database
  • Useful for
  • Retrospective clinical outcomes research
  • Clinical trials planning
  • Cost effectiveness analyses
  • Storage of unique clinical data
  • Linking to public genomic and proteomic
    databases
  • Pharmacogenomics

19
Limitations of linked heterogeneous databases
  • Reliance on patient identifiers such as SSN to
    link
  • data entry errors, missing data, business
    practices
  • Patchwork of different databases not intended for
    research purposes
  • Reliance upon coded outcomes (e.g. ICD-9 codes)
  • frequently assigned by personnel unfamiliar with
    patient, disease, or procedure
  • Multiple sources for the same data
  • diagnosis, treatment, DOB, DOE, other
    demographics

Breitfeld et al. J Clin Epi, 2001. Earle et al.
Med Care, 2002. Verstraeten et al. Expert Rev
Vaccines, 2003.
20
Research Objectives
  • Develop query algorithms to identify pts with a
    histological diagnosis
  • Follicular lymphoma
  • Examine sensitivity and specificity of query
    algorithms
  • Develop query strategies for identifying pts with
    other diseases of interest

22
10 Leading Cancer Sites by Gender, US, 2005
Men: 710,040
  • 33% Prostate
  • 13% Lung & bronchus
  • 11% Colon & rectum
  • 7% Urinary bladder
  • 5% Melanoma of skin
  • 4% Non-Hodgkin's lymphoma
  • 3% Leukemia
  • 3% Kidney
  • 3% Oral cavity
  • 2% Pancreas
  • 17% All other sites

Women: 662,870
  • 32% Breast
  • 12% Lung & bronchus
  • 11% Colon & rectum
  • 6% Uterine corpus
  • 4% Non-Hodgkin's lymphoma
  • 4% Melanoma of skin
  • 3% Ovary
  • 3% Thyroid
  • 2% Urinary bladder
  • 2% Pancreas
  • 20% All other sites

Excludes basal and squamous cell skin cancers
and in situ carcinomas except urinary
bladder. American Cancer Society, 2005.
23
Lymph Node
(Diagram labels: Secondary Follicle, Afferent Lymphatic Vessel, Mantle Zone, Marginal Zone, Germinal Center, Primary Follicle, Postcapillary Venule, Subcapsular Sinus, Artery, Cortex, Medullary Cord, Medulla, Medullary Sinus, Efferent Lymphatic Vessel)
Courtesy of Thomas Grogan, MD.
24
WHO NHL Classification
  • B-cell
  • Precursor B-cell neoplasms
  • B-acute lymphoblastic leukemia (B-ALL)
  • Lymphoblastic lymphoma (LBL)
  • Peripheral B-cell neoplasms
  • B-cell chronic lymphocytic leukemia/small
    lymphocytic lymphoma
  • B-cell prolymphocytic leukemia
  • Lymphoplasmacytic lymphoma/immunocytoma
  • Mantle cell lymphoma
  • Follicular lymphoma
  • Extranodal marginal zone B-cell lymphoma of MALT
    type
  • Nodal marginal zone B-cell lymphoma
  • Splenic marginal zone lymphoma
  • Hairy cell leukemia
  • Plasmacytoma/plasma cell myeloma
  • Diffuse large B-cell lymphoma
  • Burkitt's lymphoma
  • T-cell/NK-cell
  • Precursor T-cell neoplasm
  • Precursor T-acute lymphoblastic leukemia (T-ALL)
  • Lymphoblastic lymphoma (LBL)
  • Peripheral T-cell/NK-cell neoplasms
  • T-cell chronic lymphocytic leukemia/prolymphocytic
    leukemia
  • T-cell granular lymphocytic leukemia
  • Mycosis fungoides/Sézary syndrome
  • Peripheral T-cell lymphoma not otherwise
    characterized
  • Hepatosplenic gamma/delta T-cell lymphoma
  • Angioimmunoblastic T-cell lymphoma
  • Extranodal T-/NK-cell lymphoma, nasal type
  • Enteropathy-type intestinal T-cell lymphoma
  • Adult T-cell lymphoma/leukemia (HTLV1)
  • Anaplastic large cell lymphoma, primary systemic
    type
  • Anaplastic large cell lymphoma, primary
    cutaneous type
  • Aggressive NK-cell leukemia

Fisher et al. In: DeVita et al, eds. Cancer:
Principles and Practice of Oncology.
2005:1967. Jaffe et al, eds. World Health
Organization Classification of Tumours. 2001.
26
Methods
  • Selected disease for initial query algorithm
    study (follicular lymphoma - FL)
  • Developed and ran queries for FL using all
    available sources for diagnosis
  • Clinic and hospital ICD-9 codes, Cancer Registry
    histology codes, medical record text reports
    (chart, pathology)
  • Verified diagnosis for each patient
  • pathology reports
  • other chart reports
  • For each query calculated specificity and
    sensitivity

27
GeneSys SI queries to find follicular lymphoma
patients
QUERY  SOURCE                              CRITERIA
QC     Cancer Registry NHL patients        NHL between 1985-2002
Q1     Cancer Registry, histology (ICD-O)  9690, 9691, 9695, 9698
Q2     Text search - pathology reports     "follicular" near "lymphoma"
Q3     Text search - pathology reports     "follicular lymphoma"
Q4     Text search - all medical records   "follicular" near "lymphoma"
Q5     Text search - all medical records   "follicular lymphoma"
Q6     Clinic ICD-9 diagnosis codes        202.0, 202.00, 202.01, 202.02, 202.03, 202.04, 202.05, 202.06, 202.07, 202.08
Q7     Hospital ICD-9 diagnosis codes      (same ICD-9 codes)
Q8     Query 2 or 6                        (criteria for query 2 OR 6)
Q9     Query 4 or 6                        (criteria for query 4 OR 6)
Q10    Query 1 or 2                        (criteria for query 1 OR 2)
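
The difference between the two text-search criteria (exact phrase vs. "near") can be illustrated with a minimal Python sketch. The slides do not say how the GeneSys SI "near" operator is defined, so the five-word window (max_gap), the function names, and the report snippets below are all assumptions made for illustration.

```python
import re

def phrase_match(text: str) -> bool:
    """Exact-phrase search: 'follicular' immediately followed by 'lymphoma'."""
    return re.search(r"\bfollicular\s+lymphoma\b", text, re.IGNORECASE) is not None

def near_match(text: str, max_gap: int = 5) -> bool:
    """Proximity search: both words within max_gap word positions, in either order.

    max_gap is an assumed window; the real operator's definition is not given.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    follicular_at = [i for i, w in enumerate(words) if w == "follicular"]
    lymphoma_at = [i for i, w in enumerate(words) if w == "lymphoma"]
    return any(abs(i - j) <= max_gap for i in follicular_at for j in lymphoma_at)

# Hypothetical report snippets, written for illustration only.
reports = {
    "A": "Lymph node biopsy: follicular lymphoma, grade 1.",
    "B": "Malignant lymphoma, follicular center cell type.",
    "C": "Follicular hyperplasia; no evidence of lymphoma.",
}

for report_id, text in reports.items():
    print(report_id, "phrase:", phrase_match(text), "near:", near_match(text))
```

In this sketch the "near" search picks up report B, which the exact phrase misses, but it also matches report C, a negated finding. That is one plausible reason a proximity search would trade specificity for sensitivity.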
28
Patients found with follicular lymphoma
queries
QUERY  SOURCE                                 PATIENTS (RESULTS)
QC     Cancer Registry NHL patients           425
Q1     Cancer Registry, histology (ICD-O)     242
Q2     Text search 1 - pathology reports      406
Q3     Text search 2 - pathology reports      126
Q4     Text search 1 - all medical records    531
Q5     Text search 2 - all medical records    193
Q6     Clinic ICD-9 codes                     901
Q7     Hospital ICD-9 codes                   288
Q8     Query 2 or 6                           1137
Q9     Query 4 or 6                           1233
Q10    Query 1 or 2                           498
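
Because Q8, Q9, and Q10 are OR combinations, the overlap between their component queries follows from inclusion-exclusion. The sketch below assumes every count in the table is a number of distinct patients; the function name is illustrative.

```python
def overlap(n_a: int, n_b: int, n_union: int) -> int:
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return n_a + n_b - n_union

# Patient counts copied from the table above.
counts = {"Q1": 242, "Q2": 406, "Q4": 531, "Q6": 901}
combined = {
    "Q8": ("Q2", "Q6", 1137),
    "Q9": ("Q4", "Q6", 1233),
    "Q10": ("Q1", "Q2", 498),
}

for name, (a, b, n_union) in combined.items():
    shared = overlap(counts[a], counts[b], n_union)
    print(f"{name}: {a} and {b} share {shared} patients")
```

Under that assumption, only 170 of the 901 patients found by clinic ICD-9 codes (Q6) were also found by the pathology text search (Q2).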
29
Schematic Diagram of Query Outcomes
30
Schematic Diagram of Query Outcomes
Follicular Lymphoma
Other Diagnosis
n = 1520
31
Schematic Diagram of Query Outcomes
Follicular Lymphoma
Other Diagnosis
n = 1520
Q1
32
RESULTS: Analysis of follicular lymphoma cases
Each cell lists path verified / chart verified / total verified
(purple / red / white on the original slide).
Query  Patients  True Pos      False Pos     True Neg       False Neg
Q1     242       151/44/195    23/24/47      765/303/1068   145/65/210
Q2     406       269/19/288    102/16/118    686/311/997    27/90/117
Q3     126       96/6/102      21/3/24       767/324/1091   200/103/303
Q4     531       279/94/373    131/27/158    657/300/957    17/15/32
Q5     193       123/36/159    28/6/34       760/321/1081   173/73/246
Q6     901       143/35/178    490/233/723   298/94/392     153/74/227
Q7     288       106/31/137    101/50/151    687/277/964    190/78/268
Q8     1137      280/43/323    569/245/814   219/82/301     16/66/82
Q9     1233      286/102/388   591/254/845   197/73/270     10/7/17
Q10    498       285/52/337    123/38/161    665/289/954    11/57/68
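
Sensitivity and specificity follow directly from these counts. The sketch below uses the Q1 row, reading its digits as path verified / chart verified / total verified triples (true positives 151/44/195, false positives 23/24/47, true negatives 765/303/1068, false negatives 145/65/210). The function and variable names are illustrative.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of true follicular lymphoma cases that the query found."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Share of patients with another diagnosis that the query excluded."""
    return tn / (tn + fp)

# Q1 (Cancer Registry histology codes), counts read from the results table.
q1_path = {"tp": 151, "fp": 23, "tn": 765, "fn": 145}   # pathology-verified
q1_all = {"tp": 195, "fp": 47, "tn": 1068, "fn": 210}   # path + chart verified

for label, c in (("path", q1_path), ("all notes", q1_all)):
    sens = round(100 * sensitivity(c["tp"], c["fn"]))
    spec = round(100 * specificity(c["tn"], c["fp"]))
    print(f"Q1 {label}: sensitivity {sens}%, specificity {spec}%")
```

The four pathology-verified counts sum to 1084 and the four total counts sum to 1520, matching the populations quoted for the sensitivity and specificity calculations.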
33
Results: Algorithm Sensitivity & Specificity
Query  Cases Identified  Sens. Path (%)  Spec. Path (%)  Sens. All Notes (%)  Spec. All Notes (%)
Q1     195               51              97              48                   96
Q2     288               91              87              71                   89
Q3     102               32              97              25                   98
Q4     373               94              83              92                   86
Q5     159               42              96              39                   97
Q6     178               48              38              44                   35
Q7     137               36              87              34                   86
Q8     323               95              28              80                   27
Q9     388               97              25              96                   24
Q10    337               96              84              48                   86
For sensitivity and specificity calculations,
numbers of true and false negatives were based on
the total population of patients unique to
these queries (1520 pts; 1084 pts w/ path) and not
the entire patient population in GeneSys SI.
37
ROC Plot for Search Algorithms
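
The plot itself did not survive in the transcript, but its points can be recovered from the sensitivity and specificity table: each query contributes one point at (100 - specificity, sensitivity). The sketch below uses the pathology-verified percentages. The dictionary and function names are illustrative, and whether the original plot used the pathology-verified or all-notes figures is not stated.

```python
# (sensitivity %, specificity %) for pathology-verified cases, from the results table.
path_results = {
    "Q1": (51, 97), "Q2": (91, 87), "Q3": (32, 97), "Q4": (94, 83),
    "Q5": (42, 96), "Q6": (48, 38), "Q7": (36, 87), "Q8": (95, 28),
    "Q9": (97, 25), "Q10": (96, 84),
}

def roc_point(sens_pct: int, spec_pct: int) -> tuple[int, int]:
    """Return (false positive rate %, true positive rate %) for one query."""
    return (100 - spec_pct, sens_pct)

# Sort by false positive rate so the points read left to right on the plot.
points = sorted((roc_point(*values), query) for query, values in path_results.items())

for (fpr, tpr), query in points:
    print(f"{query}: FPR {fpr}%, TPR {tpr}%")
```

Points toward the upper left (low false positive rate, high true positive rate) correspond to the better-performing queries.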
38
Conclusions
  • Highest Sensitivity
  • Free Text search w/ near algorithm
  • Combination queries
  • Highest Specificity
  • Cancer Registry code, Free Text query "follicular
    lymphoma"
  • Limiting search to pathology reports improves
    specificity
  • Best Overall Performance
  • Free Text query "follicular lymphoma" +/- Cancer
    Registry code

39
Future Directions
  • Use query results for outcomes research on FL
    (n = 405)
  • Test query algorithms for
  • other Non-Hodgkin's lymphoma
  • Breast ca., prostate ca., colorectal ca.
  • Develop and test query algorithms for treatments
    and outcomes
  • Modify the query engine and interface to automate
    algorithms

40
Winship Cancer Institute: Oncology Informatics
  • Leroy Hill
  • Michael Graiser, PhD
  • Rochelle Victor
  • Ragini Kudchadkar, MD
  • Susan Moore MD, MPH
  • Bonita Feinstein RN
  • Ashley Hilliard
  • James Yang
  • John Tumeh
  • Simone Parker

41
Potential Projects
  • Cancer Outcomes Research
  • Genomic Discovery / Pharmacogenomics
  • Clinical Trials Support
  • Medical Informatics

42
Cancer Outcomes Research
  • Examining Treatment Strategies and Outcomes for
    Fludarabine Refractory CLL
  • The influence of Comorbidity on Outcome in
    patients undergoing Allogeneic Transplantation
  • Other Cancer Treatments
  • Examining Treatment Strategies and Outcomes for
    Relapsed Follicular Lymphoma
  • Management of Squamous Cell Cancer of the Anus
    (Reducing Surgical Morbidity)
  • Examining Regimen-Related Toxicity

43
Pharmacogenomics
  • Provide utilization data for cost-effectiveness
    studies
  • Provide resources to support observational
    studies and clinical trials in pharmacogenomics
  • Resource for developing algorithms for pattern
    recognition

44
Clinical Trials Support
  • Screening algorithms for identifying patients
    eligible for clinical trials
  • Identify populations that would permit clinical
    trial investigation
  • Data resource for monitoring trial outcomes
  • Regimen-related toxicity
  • Treatment Response
  • Survival

45
Medical Informatics
  • Advanced database search algorithms
  • Pattern Recognition
  • Neural Networks
  • Bayesian Networks
  • Hierarchical Statistical Models

46
caCORE
Biomedical Objects
Common Data Elements
Enterprise Vocabulary
47
Common Data Elements (CDEs)
  • Data descriptors or metadata for cancer
    research
  • Precisely defining the questions and answers
  • What question are you asking, exactly?
  • What are the possible answers, and what do they
    mean?
  • Ongoing projects covering various domains
  • Clinical Trials
  • Imaging
  • Biomarkers
  • Genomics
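
The idea of a common data element, a precisely worded question paired with an enumerated set of answers and their meanings, can be sketched as a small data structure. This is an illustration only: the class, field names, and example element are hypothetical and are not taken from caCORE or any CDE repository.

```python
from dataclasses import dataclass

@dataclass
class CommonDataElement:
    """A question plus its permissible answers (hypothetical structure)."""
    name: str
    question: str
    permissible_values: dict[str, str]  # answer code -> meaning

    def is_valid(self, answer: str) -> bool:
        """Accept only answers that have a defined meaning."""
        return answer in self.permissible_values

# Hypothetical element, invented for this example.
histology_confirmed = CommonDataElement(
    name="histology_confirmed",
    question="Was the diagnosis confirmed by a pathology report?",
    permissible_values={"Y": "Yes", "N": "No", "U": "Unknown"},
)

print(histology_confirmed.is_valid("Y"))      # accepted: defined code
print(histology_confirmed.is_valid("maybe"))  # rejected: no defined meaning
```

Restricting answers to coded values with stated meanings is what lets data collected under the same element be pooled across studies.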

48
caBIO Overview
  • Software industry design paradigms
  • Unified Modeling Language (UML) representations
    of biomedical objects
  • Java 2 Enterprise Edition n-tier system
    architecture
  • Broad coverage of biomedicine (but not
    comprehensive yet)
  • Genomics
  • Gene expression
  • Model systems for cancer
  • Human clinical trials
  • Data on-tap via application programming
    interfaces

50
Cancer Clinical Database Application System: Web
Form Generation
51
Web form input fields for Cancer Chemotherapy
52
Configurable column attributes for the Cancer
Chemotherapy form