Title: caBIG IVIWS SIG: Imaging Vocabularies
1. caBIG IVIWS SIG: Imaging Vocabularies and Common Data Elements Breakout Overview
- Curtis P. Langlotz, MD, PhD
- University of Pennsylvania
- Daniel L. Rubin, MD, MS
- Stanford University
- Mike Keller, PhD
- Booz Allen Hamilton
2. NCI Informatics Long Range Planning, circa 1999
The CII
3. Importance of Common Data Collection Methods, circa 1999
- Serve as building blocks for the CII
- Allow pooling of data and comparison of results among clinical trials
- Facilitate enrollment of patients in clinical trials
- Avoid redundant data collection (enter-once, use-many principle)
- Automate and expedite administration of clinical trials
4. Medical Vocabularies: Completeness for Radiology
Langlotz and Caldwell, J Digit Imaging 15(1S):201, 2002
5. What is RadLex?
- 26 participating organizations
- 9 committees
- 92 radiologist participants
- 5,308 anatomic concepts
10-30 percent of these concepts are not found in
SNOMED-CT
6. Motivations for Common Imaging Terminology
- Automatic indexing and retrieval of teaching files
- Point-and-click structured reporting systems
- Comparison or unification of disparate research databases
- Reference datasets for cancer imaging research
- Standardized image mark-up and annotation tools
- Common vocabulary and data elements for cancer imaging
7. Fundamentals of Imaging Terminology and Ontology
- Daniel L. Rubin, MD, MS
- Stanford University
8. Terminologies
- A constrained list of terms
- Usually shown as a list or taxonomy
- Usually few attributes (e.g., ID code, synonyms)
- Usually 1 or no relationships
  - No relations → list of terms
  - 1 relation → taxonomy
- Use
  - Coding
  - Indexing
  - Simple search
------Diseases-----
003. @ OTHER SALMONELLA INFECTIONS
003.0 SALMONELLA GASTROENTERITIS
003.2 @ LOCALIZED SALMONELLA INFECTIONS
003.20 LOCALIZED SALMONELLA INFECTION, UNSPECIFIED
003.21 SALMONELLA MENINGITIS
003.29 OTHER LOCALIZED SALMONELLA INFECTIONS
------Procedures-----
01. @ INCISION AND EXCISION OF SKULL, BRAIN,...
01.0 @ CRANIAL PUNCTURE
01.01 CISTERNAL PUNCTURE
01.09 OTHER CRANIAL PUNCTURE
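A minimal Python sketch of the "1 relation → taxonomy" idea, using the disease codes listed above. Deriving the parent from the code prefix is an assumption made for this illustration, and the names `TERMS`, `parent`, and `search` are hypothetical rather than part of any coding system's tooling.

```python
# Toy taxonomy built from the disease entries listed above.
# The single relation is "parent", derived here by trimming the code
# (an assumption made for this sketch, not a rule stated in the slides).
TERMS = {
    "003": "OTHER SALMONELLA INFECTIONS",
    "003.0": "SALMONELLA GASTROENTERITIS",
    "003.2": "LOCALIZED SALMONELLA INFECTIONS",
    "003.20": "LOCALIZED SALMONELLA INFECTION, UNSPECIFIED",
    "003.21": "SALMONELLA MENINGITIS",
    "003.29": "OTHER LOCALIZED SALMONELLA INFECTIONS",
}


def parent(code):
    """Return the nearest listed ancestor of `code`, or None at the top."""
    candidate = code[:-1].rstrip(".")
    while candidate:
        if candidate in TERMS:
            return candidate
        candidate = candidate[:-1].rstrip(".")
    return None


def search(text):
    """Simple search: codes whose label contains the query (case-insensitive)."""
    return sorted(code for code, label in TERMS.items() if text.upper() in label)
```

With only one relation available, the structure supports coding, indexing, and simple search, but nothing beyond walking up the hierarchy.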
9. What is an ontology?
- Similar to terminologies, specifying concepts (entities) and attributes
- Also specifies multiple relationships among concepts
- Permits rich knowledge representation
- Supports complex inference
- Use
  - Coding, indexing, and retrieval (like terminologies)
  - Reasoning and intelligent applications
  - Information integration
  - Semantic Web
10. Anatomy ontology: explicit representation of knowledge in various relationships
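As a rough illustration of how multiple relationship types support inference, the Python sketch below stores concepts as (subject, relation, object) triples and follows one relation transitively. The concepts, relation names, and helper functions are placeholders chosen for this example; they are not taken from RadLex or any published anatomy ontology.

```python
from collections import defaultdict

# (subject, relation, object) triples. Concepts and relations are
# illustrative placeholders, not drawn from a real ontology.
TRIPLES = [
    ("right upper lobe", "part_of", "right lung"),
    ("right lung", "is_a", "lung"),
    ("right lung", "part_of", "thorax"),
    ("lung", "is_a", "organ"),
]


def build_index(triples):
    """Index objects by (subject, relation) for quick lookup."""
    index = defaultdict(set)
    for subj, rel, obj in triples:
        index[(subj, rel)].add(obj)
    return index


def closure(index, concept, relation):
    """All concepts reachable from `concept` by following `relation` repeatedly."""
    seen, stack = set(), [concept]
    while stack:
        for nxt in index.get((stack.pop(), relation), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


INDEX = build_index(TRIPLES)
```

Here `closure(INDEX, "right upper lobe", "part_of")` yields both "right lung" and "thorax", a conclusion that a flat term list or a single-relation taxonomy could not express.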
12. Vocabulary/CDE Strategy
- Vocabularies and Metadata: Metadata for Images; Terminologies and CDEs
- Formats and Tools: Metadata storage formats; Image Annotation; NLP
- Applications: Queries and Analysis
13. Vocabulary/CDE Strategy
- Metadata and Terminology
  - Define image metadata useful to collect for cancer research
  - Develop an image mark-up standard and associated open source and free annotation creation and display tools
  - Determine vocabularies and ontologies to populate the metadata
- Formats and Tools
  - Define formats for associating data and metadata with images
  - Identify/develop tools for annotating images
  - Develop/reuse NLP methods to extract metadata from text
- Testbed/applications using Vocabulary/CDE (tools and methods to use metadata to support cancer research)
  - Retrieve cases based on terminology-based queries and image annotations (e.g., trends in tumor size, image features)
  - Use ontology annotations on images to combine image data with clinical and molecular data
14. Vocabularies and Common Data Elements: Proposed Work Items 1 and 2
- Curtis P. Langlotz, MD, PhD
- University of Pennsylvania
15. Proposed Work Items
- Create caDSR-compatible CDEs from standard imaging vocabulary terms
- Cancer imaging research playbook: devices, procedures, and protocols
- Using terminology/ontology to mark up or annotate images
- Evaluate natural language processing (NLP) tools for prose image metadata (e.g., radiology reports)
16. ACRIN
- American College of Radiology Imaging Network
- NCI-funded imaging clinical trial cooperative group
- Dozens of trials funded, including some very high profile trials (DMIST, NLST)
- Tens of thousands of subjects
- Case report forms containing hundreds of potential CDEs
17. Data Collection CDE Example
- Please describe the margins of the mass
  - Smooth
  - Lobulated
  - Irregular
  - Spiculated
  - Obscured
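One way to picture a common data element is as a question paired with a closed set of permissible values. The Python sketch below encodes the example above that way; the class and field names are hypothetical and do not reflect the actual caDSR data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonDataElement:
    """Hypothetical CDE: a question plus its permissible values."""
    question: str
    permissible_values: tuple

    def validate(self, answer):
        """Return the canonical value, or raise ValueError if not permitted."""
        for value in self.permissible_values:
            if value.lower() == answer.strip().lower():
                return value
        raise ValueError(f"{answer!r} is not a permissible value")


MASS_MARGINS = CommonDataElement(
    question="Please describe the margins of the mass",
    permissible_values=("Smooth", "Lobulated", "Irregular", "Spiculated", "Obscured"),
)
```

Because every case report form records one of the same five values, answers collected in different trials can be pooled and compared directly.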
19. The Playbook for Imaging in Cancer Research
- Cancer Research Imaging Procedures and Protocols
- An ontology of the imaging devices, procedures, and protocols that are used for experimental cancer imaging
  - (e.g., 7T 18-cm horizontal bore; 4.7T 33-cm bore magnet operating at 200 MHz for 1-H imaging experiments)
- Common, vendor-independent language to describe experimental imaging instruments
21. Vocabularies and Common Data Elements: Proposed Work Items 3 and 4, and Summary
- Daniel L. Rubin, MD, MS
- Stanford University
22. Formats and Tools
- Metadata Storage Formats
  - Need to define a format to associate instantiations of metadata (annotations) with images
- Image Annotation (mark-up)
  - Need tools that annotate images and follow the metadata standards adopted by caBIG
- NLP
  - Goal: access free text to allow correlative research with images
  - Medium: radiology/pathology reports, published literature
  - Uses: indexing/retrieval, information extraction
23. Metadata and Terminology
- Metadata
  - Determine requirements for metadata
  - Interview cancer researchers (NCI-funded Cooperative Clinical Trial Therapy Groups, ACRIN, industry) about image access/analysis needs
  - Review prior image-based cancer trials
  - Inventory other image metadata standards efforts (DICOM, HL7, commercial systems)
  - Consider analogy to MIAME (microarray experiments): the minimal information necessary to describe a medical image
  - Identify PHI data fields to help other applications anonymize data
24. Image Annotation
- Inventory existing tools for annotating images
- Create custom tools for associating metadata with images
  - Image annotation tool
  - Structured data acquisition tool that is part of the clinical trial data collection process, or integrates with existing clinical trial tools
25. Natural Language Processing
- Determine requirements for NLP
  - E.g., extract entities and relations from radiology reports, map to ontologies, etc.
- Inventory existing NLP tools
  - caTIES, MedLEE, Ricky Taira tools, MetaMap, and open source
- Select or develop NLP tools to fulfill requirements
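To make the "extract entities and map to ontologies" requirement concrete, here is a deliberately naive Python sketch that matches words in a report against a small lexicon of controlled terms. It is not how caTIES, MedLEE, or MetaMap work; the lexicon, the function name, and the sample sentence are all invented for illustration.

```python
import re

# Hypothetical mini-lexicon: surface forms mapped to a controlled term.
LEXICON = {
    "spiculated": "Spiculated",
    "spiculation": "Spiculated",
    "lobulated": "Lobulated",
    "irregular": "Irregular",
}


def extract_terms(report):
    """Return (controlled term, character offset) pairs found in the text."""
    hits = []
    for match in re.finditer(r"[A-Za-z]+", report):
        term = LEXICON.get(match.group().lower())
        if term:
            hits.append((term, match.start()))
    return hits
```

A sketch like this ignores negation ("no spiculation"), multi-word terms, and relations between entities, which is the kind of gap an evaluation of existing NLP tools would need to measure.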
26. Overall Mission Motivating the Breakout Session
- Extract meaning from imaging data to improve outcomes for patients with cancer or pre-cancer
- Support correlative imaging science
- Clinical trials are conducted by Cancer Centers, Consortia, and Cooperative Groups
- Need to structure imaging content of such trials
- Transmit the pertinent imaging data and metadata together with clinical trials data to an archive maintained by the NCI
- Need query and data mining capability to determine trends and patterns in imaging data across clinical trials