Mining Patterns and Linkages in Medical Data

Description:

Example: ulcer == dysplasia, cold == cough. Shortest Path Discovery Problem. ... dysplasia. cancer. helicobacter. Rule Evaluation: ...

Slides: 2
Provided by: projecti
Transcript and Presenter's Notes

Mining Patterns and Linkages in Medical Data
Department of Computer Science and Engineering
College of Engineering
Samah Fodeh and Pang-Ning Tan
  • Health care data contains a wealth of information
    that can be utilized to help improve the quality
    of patient care and to enhance the capability for
    public health surveillance.
  • Challenges
  • 1. Large database size (100 GB).
    • Need efficient mining algorithms.
  • 2. Trend in data.
    • Few records prior to 2001.
  • 3. Evaluation and validation of data mining results.
    • Large number of patterns, some of which may be
      spurious.
    • Need a principled approach for evaluating and
      interpreting patterns.
  • 4. Noise in data.
    • Need a robust algorithm.

Overall Framework
  • Querying Domain Knowledge
  • Domain knowledge is obtained from the PubMed
    database provided by the National Library of
    Medicine.
  • Challenges
    • Not all articles may be relevant to the queried term.
    • Cost of querying.
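
A minimal sketch of how such a literature query might be issued and its cost contained. It assumes the NCBI E-utilities ESearch endpoint and its JSON response layout. The helper names (build_query_url, parse_hit_count, CachedCounter) and the fetch callable are hypothetical and not taken from the slides. The cache addresses the querying-cost challenge by fetching each distinct term set only once.

```python
import json
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (assumed here as the way to reach PubMed).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query_url(terms):
    """Build an ESearch URL matching PubMed articles that mention all terms."""
    params = {
        "db": "pubmed",
        "term": " AND ".join(terms),
        "retmode": "json",
    }
    return ESEARCH_URL + "?" + urlencode(params)


def parse_hit_count(response_text):
    """Extract the number of matching articles from an ESearch JSON response."""
    return int(json.loads(response_text)["esearchresult"]["count"])


class CachedCounter:
    """Memoize hit counts so each distinct term set is fetched only once."""

    def __init__(self, fetch):
        self.fetch = fetch  # hypothetical callable: URL -> response text
        self.cache = {}

    def count(self, terms):
        # Normalize case and order so equivalent queries share one cache entry.
        key = frozenset(t.lower() for t in terms)
        if key not in self.cache:
            url = build_query_url(sorted(key))
            self.cache[key] = parse_hit_count(self.fetch(url))
        return self.cache[key]
```

A hit count says nothing about whether the matching articles are actually relevant to the queried term, so the relevance challenge above remains open in this sketch.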

Experimental Results
  • Methodology
  • Preprocess the data.
  • Apply an efficient frequent pattern mining
    algorithm called PGMiner.
  • Extract association rules from the frequent
    patterns.
  • For each rule:
    • Query the PubMed database to find articles about
      the concepts.
    • Use the query results to evaluate and interpret
      the rule.
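
The mining and rule-extraction steps above can be sketched as follows. PGMiner itself is not described in the slides, so this sketch swaps in a brute-force level-wise search that is only suitable for toy data. The function names, thresholds, and the four sample records are made up for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support.

    Brute-force level-wise search: a stand-in for PGMiner, not PGMiner itself.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(cset <= t for t in transactions) / n
            if support >= min_support:
                frequent[cset] = support
                found = True
        if not found:
            break  # no frequent set of this size, so none of a larger size
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples.

    confidence(A -> B) = support(A and B) / support(A)
    """
    rules = []
    for itemset, support in frequent.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                a = frozenset(antecedent)
                confidence = support / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, confidence))
    return rules


# Made-up patient records, one set of coded concepts per record.
records = [
    {"ulcer", "helicobacter"},
    {"ulcer", "helicobacter", "bleeding"},
    {"ulcer", "bleeding"},
    {"cold", "cough"},
]
freq = frequent_itemsets(records, min_support=0.5)
rules = association_rules(freq, min_confidence=0.9)
```

Every subset of a frequent itemset is itself frequent, which is why the lookup frequent[a] in association_rules always succeeds.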

[Figure: concept terms shown on the slide: helicobacter, lesions, gastric, bleeding, carcinoma, dysplasia, therapy, peptic, congenital]
  • Rule Evaluation
  • To evaluate the rule, compare the confidence
    measure computed from the medical database (C1) to
    the estimated confidence obtained from the medical
    literature (C2).
  • A rule is considered interesting if
  • The purpose is to uncover previously unknown
    relationships.
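
A minimal sketch of the comparison between C1 and C2. The slides do not say how C2 is estimated, so this assumes it is the share of articles about the antecedent that also mention the consequent. The interestingness criterion is not preserved in the transcript either, so the sketch only reports the two values and their gap. Function names and numbers are hypothetical.

```python
def literature_confidence(hits_both, hits_antecedent):
    """Estimate C2 from article counts (an assumed estimator, not the slides').

    hits_both:       articles mentioning antecedent and consequent together
    hits_antecedent: articles mentioning the antecedent
    """
    if hits_antecedent == 0:
        return None  # no literature on the antecedent, so C2 is undefined
    return hits_both / hits_antecedent


def compare_rule(c1, hits_both, hits_antecedent):
    """Report C1, C2 and their difference for one association rule."""
    c2 = literature_confidence(hits_both, hits_antecedent)
    gap = None if c2 is None else c1 - c2
    return {"C1": c1, "C2": c2, "gap": gap}


# Made-up numbers: the rule holds in 75% of matching database records, while
# 50 of 200 articles about the antecedent also mention the consequent.
report = compare_rule(c1=0.75, hits_both=50, hits_antecedent=200)
```

A large positive gap would mean the rule is much stronger in the patient data than the literature suggests, which fits the stated purpose of surfacing previously unknown relationships.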

[Figure: concept terms shown on the slide: ulcer, healing, gastrointestinal, cancer, infection, cancer, cells, pylori]
  • Summary
  • Combine data mining with domain knowledge
    automatically extracted from medical literature
    to detect unexpected relationships and to aid the
    interpretation of discovered rules.