Title: Data Mining Applications In Healthcare
1. Data Mining Applications In Healthcare
TEPR 2004, May 21, 2004
V. Juggy Jagannathan, VP of Research
juggy_at_medquist.com
2. Introduction
Goals of today's presentation
- Provide an overview of the technologies that are relevant to the development and deployment of data mining solutions in healthcare
- Allow participants to evaluate where the technology is useful
4. Topic Outline
- Data mining
- Uses
- Algorithms
- Technology
- Applications in healthcare
5. Data Mining Uses
- Understand and characterize
  - Clustering
  - Summarization
  - Association Rules
  - Sequence Discovery
- Extrapolate and forecast
  - Classification
  - Regression
  - Time-Series
6. Data Mining Algorithms
- Clustering
  - Hierarchical
  - Partitioned
  - Genetic
- Association
  - Apriori Algorithm
  - If-Then rules
- Classification
  - Statistical
  - K-nearest neighbors
  - Decision trees
    - ID3
    - C4.5
  - Neural Networks (Self Organizing Maps)
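To make the Association entries above concrete, here is a minimal sketch of level-wise Apriori mining followed by the confidence of one If-Then rule. The transactions are invented for illustration, and the function names (`apriori`, `rule_confidence`) are assumptions of this sketch, not part of the presentation.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset that meets min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


def rule_confidence(frequent, antecedent, consequent):
    """Confidence of the rule: IF antecedent THEN consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Invented example data: each set is the list of codes on one patient record.
transactions = [
    {"diabetes", "hypertension", "statin"},
    {"diabetes", "hypertension"},
    {"diabetes", "statin"},
    {"hypertension", "statin"},
    {"diabetes", "hypertension", "statin"},
]
frequent = apriori(transactions, min_support=0.5)
confidence = rule_confidence(frequent, {"diabetes"}, {"hypertension"})
```

With this data the three single items and the three pairs are frequent, the triple is not, and the rule "IF diabetes THEN hypertension" has a confidence of about 0.75.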
7. Technology
Technology solutions
Data Mining Infrastructure Technologies
- Database Technologies
- On-Line Analytical Processing (OLAP)
- Visualization Technologies
- Data scrubbing technologies
- Natural Language Processing (NLP)
8. Database Technologies
- Data warehouse vs. Data mart
- Relational technologies
  - Oracle
  - Microsoft
- XML-databases
  - Raining Data
9. On-Line Analytical Processing
- Analyze multi-dimensional data
- N-dimensional data cubes
- Operations
  - Roll-up
  - Drill-down
  - Slice and dice
  - Pivot
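The OLAP operations listed above can be illustrated on a tiny in-memory cube. This is a rough sketch in plain Python rather than an OLAP product; the fact table, dimension names, and helper functions are all invented for the example.

```python
from collections import defaultdict

# Hypothetical fact table: (year, quarter, department, diagnosis, cost).
DIMS = ("year", "quarter", "department", "diagnosis")
facts = [
    (2003, "Q1", "Cardiology", "CHF", 1200.0),
    (2003, "Q1", "Cardiology", "AMI", 3000.0),
    (2003, "Q2", "Oncology", "Lung cancer", 5000.0),
    (2003, "Q2", "Cardiology", "CHF", 800.0),
    (2004, "Q1", "Oncology", "Lung cancer", 4500.0),
]


def aggregate(rows, dims):
    """Sum the cost measure, grouped by the named dimensions."""
    idx = [DIMS.index(d) for d in dims]
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def dice(rows, **conditions):
    """Keep rows matching every dimension=value condition (one condition is a slice)."""
    return [
        r for r in rows
        if all(r[DIMS.index(d)] == v for d, v in conditions.items())
    ]


# Drill-down: view totals at the finer (year, quarter) level.
by_quarter = aggregate(facts, ("year", "quarter"))

# Roll-up: collapse quarters into years.
by_year = aggregate(facts, ("year",))

# Slice: fix one dimension. Dice: fix several.
cardiology = dice(facts, department="Cardiology")
cardiology_q1 = dice(facts, department="Cardiology", quarter="Q1")

# Pivot: departments as rows, years as columns.
pivot = defaultdict(dict)
for (dept, year), total in aggregate(facts, ("department", "year")).items():
    pivot[dept][year] = total
```

Roll-up and drill-down are the same aggregation run at coarser or finer granularity, which is why a single `aggregate` helper covers both here.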
10. Visualization
- 2D/3D Charts
- Topographic displays
- Cluster displays
- Histograms
- Scatter plots
- Advanced visualization (genomic data patterns)
- http://www.ncbi.nlm.nih.gov/Tools/
11. Data Scrubbing
- Data cleansing
- Filling in missing data
- In healthcare, there is a strong need for
de-identification to protect privacy
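As one simple illustration of filling in missing data, the sketch below replaces missing values in a numeric field with the mean of the observed values. Mean imputation is only one of many possible strategies and is chosen here for brevity; the field name and records are invented.

```python
from statistics import mean


def fill_missing(records, field):
    """Return copies of records with None in `field` replaced by the observed mean."""
    observed = [r[field] for r in records if r[field] is not None]
    fill = mean(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]


# Invented example records with one missing blood pressure reading.
visits = [
    {"patient": "A", "systolic_bp": 120},
    {"patient": "B", "systolic_bp": None},
    {"patient": "C", "systolic_bp": 140},
]
filled = fill_missing(visits, "systolic_bp")
```

The original records are left unchanged, so the scrubbed and raw versions can be compared afterwards.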
12. De-Identification of Medical Records
- Names
- all elements of a street address, city, county, precinct, zip code, and their equivalent geocodes, except for the initial three digits of a zip code for areas that contain over 20,000 people
- all elements of dates (except year) for dates directly related to the individual (e.g., birth date, admission/discharge dates, date of death), and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older
- telephone numbers
- fax numbers
- e-mail addresses
- social security numbers
- medical record numbers
- health plan beneficiary numbers
- account numbers
- certificate/license numbers
- license plate numbers, vehicle identifiers and serial numbers
- device identifiers and serial numbers
- URL addresses
- Internet Protocol (IP) address numbers
- biometric identifiers, including finger and voice prints
- full face photographic images and comparable images
- any other unique identifying number except as created by IHS to re-identify information.
Source: Policy and Procedures for De-Identification of Protected Health Information and Subsequent Re-Identification, 45 CFR 164.514(a)-(c), posted by IHS (Indian Health Service)
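A few of the rules above (zip code truncation, year-only dates, and the age 90 or older category) can be sketched in code. This is an illustrative fragment only, not a compliant de-identification tool: it covers a handful of the listed identifiers, and the field names, the `POPULOUS_ZIP3` lookup, and the sample record are all hypothetical.

```python
from datetime import date

# Hypothetical lookup of 3-digit zip prefixes whose areas hold over 20,000
# people. A real system would derive this from census data.
POPULOUS_ZIP3 = {"265"}

# Hypothetical field names for identifiers that are dropped outright.
DROP_FIELDS = {"name", "street", "phone", "fax", "email", "ssn", "mrn"}


def deidentify(record):
    """Return a copy of record with some listed identifiers removed or coarsened."""
    out = {k: v for k, v in record.items() if k not in DROP_FIELDS}

    # Zip code: keep only the first three digits, and only for populous areas.
    zip3 = record["zip"][:3]
    out["zip"] = zip3 if zip3 in POPULOUS_ZIP3 else None

    # Ages over 89 are aggregated into one category; the birth year is
    # withheld for them because it would reveal the age.
    over_89 = record["age"] > 89
    out["age"] = "90+" if over_89 else record["age"]
    out["birth_year"] = None if over_89 else out["birth_date"].year
    del out["birth_date"]

    # Other dates tied to the individual keep the year only.
    out["admission_year"] = out.pop("admission_date").year
    return out


# Invented sample record.
patient = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "26505",
    "age": 93,
    "birth_date": date(1911, 2, 3),
    "admission_date": date(2004, 5, 1),
    "diagnosis": "CHF",
}
clean = deidentify(patient)
```

Free-text fields such as dictated notes would need separate handling, since identifiers can appear anywhere in the narrative.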
13. Natural Language Processing
- NLP Uses
  - translation, summarization, information extraction, document retrieval or categorization
- NLP Approaches
  - Clustering, Classification, Linguistic analysis, knowledge-based analysis
- NLP Companies in health care
  - A-Life
  - Language and Computing
14. Applications in Healthcare
- Safety and quality
- Clinical Research
- Financial
- Public Health
15. To Err Is Human: IOM Report
- Characterization
  - JCAHO Core Measures
  - CMS Quality measures starter set
  - Improves patient care: reactive response
- Prediction
  - Identifying cases that can result in bad clinical outcomes and raising appropriate alarms
  - Impacts patient care: proactive response
16. Quality Measures Initial Set
Source: http://www.cms.hhs.gov/quality/hospital/overview.pdf
17. Safety and Quality
- University of Mississippi Medical Center
  - Data Warehouse Technologies to understand Medication Errors (funded by AHRQ)
  - Anonymous report data collection
  - Data mining technologies
  - Use of Neural networks and associative rule inference
18. Clinical Research: Clinical Trials
- Pharmacy and medical claims data
- Drug efficacy and clinical trials, for example: how effective is a particular drug regimen?
- Protein structure analysis
- Genomic data mining
- Diagnostic Imaging data research
19. The bottom line on cost
- General Utilization review: does the care provided meet accepted clinical and cost guidelines?
- Drug Utilization review
- Outlier analysis: exceptions to treatment, analyzing treatments which cost more or less than normal.
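One common way to flag treatments that cost more or less than normal is the interquartile range (IQR) rule, sketched below. The presentation does not say which outlier method was intended, so the IQR rule, the multiplier `k=1.5`, and the case costs are all assumptions of this example.

```python
from statistics import quantiles


def cost_outliers(costs, k=1.5):
    """Split case IDs into (low, high) outliers using the IQR rule."""
    q1, _, q3 = quantiles(list(costs.values()), n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    low = sorted(case for case, c in costs.items() if c < low_fence)
    high = sorted(case for case, c in costs.items() if c > high_fence)
    return low, high


# Invented treatment costs for one procedure, keyed by case ID.
costs = {
    "case01": 4000, "case02": 4200, "case03": 3900, "case04": 4100,
    "case05": 4050, "case06": 9800, "case07": 1200, "case08": 4150,
    "case09": 3950, "case10": 4000,
}
low, high = cost_outliers(costs)
```

Here `case07` is flagged as unusually cheap and `case06` as unusually expensive. Quartile-based fences are used instead of mean and standard deviation because a single extreme cost inflates the standard deviation and can hide other outliers.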
20. Data mining in public health
- Syndromic surveillance
- Bio-terrorism detection
- Communicable disease reporting (Centers for Disease Control (CDC))
- DAWN (Drug Abuse Warning Network)
- Food and Drug Administration (FDA) reporting of adverse drug events.
Example effort: AEGIS
21. Conclusion
- Classification
- Clustering
- Association rules
- Database
- OLAP
- Visualization
- Scrubbing
- NLP
- Data mining
- Uses
- Algorithms
- Technology
- Applications in healthcare
- Safety and Quality
- Clinical Research
- Financial
- Public Health
22. Conclusion
Technology solutions
juggy_at_medquist.com