The Neighborhood Auditing Tool - PowerPoint PPT Presentation

1
The Neighborhood Auditing Tool
  • James Geller
  • Michael Halper
  • Yehoshua Perl
  • C. Paul Morrey

2
Research Paper
  • C.P. Morrey, J. Geller, M. Halper, Y. Perl. The
    Neighborhood Auditing Tool: A hybrid interface
    for auditing the UMLS. J Biomed Inform,
    42(3):468-89, 2009.

3
Overview
  • Goals of an Auditor's Tool for the UMLS
  • Principles of Auditing with Neighborhoods
  • The Idea of a Hybrid Display
  • Current State of the NAT: Serving the Auditor
  • Presentation of NAT Features
  • Live Audit Session
  • Planned State of the NAT: Guiding the Auditor
  • Conclusions
  • Future Work

4
Auditing the UMLS
  • About 150 source vocabularies
  • It is natural that inconsistencies will appear
  • Over 2.1 million concepts and nearly 9.7 million
    terms
  • Two level structure consisting of the Semantic
    Network and the Metathesaurus

UMLS Metathesaurus version 2009AA
5
Previous Work on Auditing
  • H. Gu, Y. Perl, J. Geller, M. Halper, L. Liu, and
    J.J. Cimino. Representing the UMLS as an
    Object-oriented Database: Modeling Issues and
    Advantages. J Am Med Inform Assoc, 7(1):66-80,
    2000.
  • J. Geller, H. Gu, Y. Perl, and M. Halper.
    Semantic refinement and error correction in large
    terminological knowledge bases. Data & Knowledge
    Engineering, 45(1):1-32, 2003.
  • J.J. Cimino, H. Min, and Y. Perl. Consistency
    across the hierarchies of the UMLS Semantic
    Network and Metathesaurus. J Biomed Inform,
    36(6):450-461, 2003.
  • H. Gu, Y. Perl, G. Elhanan, H. Min, L. Zhang, Y.
    Peng. Auditing concept categorizations in the
    UMLS. Artif Intell Med, 31(1):29-44, 2004.
  • Y. Chen, Y. Perl, J. Geller, and J.J. Cimino.
    Analysis of a study of the users, uses, and
    future agenda of the UMLS. J Am Med Inform
    Assoc, 14(2):221-231, 2007.

6
Previous Work on Auditing (cont'd)
  • H. Gu, G. Hripcsak, Y. Chen, C.P. Morrey, G.
    Elhanan, J.J. Cimino, J. Geller, and Y. Perl.
    Evaluation of a UMLS auditing process of semantic
    type assignments. In J.M. Teich, J. Suermondt,
    and G. Hripcsak, editors, Proc AMIA Symp, pages
    294-298, Chicago IL, Nov. 2007.
  • Y. Chen, H. Gu, Y. Perl, J. Geller, M. Halper.
    Structural group auditing of a UMLS semantic
    type's extent. J Biomed Inform, 42(1):41-52,
    2009.
  • L. Chen, C.P. Morrey, H. Gu, M. Halper, Y. Perl.
    Modeling multi-typed structurally viewed
    chemicals with the UMLS Refined Semantic Network.
    J Am Med Inform Assoc, 16(1):116-31, 2009.
  • Y. Chen, H. Gu, Y. Perl, J. Geller. Structural
    group-based auditing of missing hierarchical
    relationships in UMLS. J Biomed Inform,
    42(3):452-67, 2009.
  • Y. Chen, H. Gu, Y. Perl, M. Halper, and J. Xu.
    Expanding the extent of a UMLS Semantic Type via
    Group Neighborhood Auditing. J Am Med Inform
    Assoc, accepted for publication.

7
How we did it before the NAT: Provide Info as Paper Form
CPT: C1081844 Antonospora locustae
SRC: NCBI
STY: T004 Fungus, T009 Invertebrate
DEF:
SYN: Antonospora locustae, Nosema locustae
PAR: Antonospora (STY: Invertebrate)
CHD:
Data shown for this concept is from the UMLS
Metathesaurus version 2006AC
8
Auditing Results: Also Paper Form
  • (C1081844) Antonospora locustae
  • STY: Fungus, Invertebrate
  • No errors
  • Semantic Type Error: Fungus
  • Semantic Type Error: Invertebrate
  • Add Semantic Type: ______________________
  • Ambiguity
  • Other error: _____________________________
  • Comments: _____________________________
    ______________________________________

9
Goals of an Auditor's Tool for the UMLS
  • Display relevant information to the auditor.
  • Do not overwhelm the auditor with too much
    information.
  • Help the auditor focus on areas most likely to
    contain errors.
  • Algorithms suggest likely erroneous concepts
  • Concepts are reviewed in a neighborhood display

10
Principles of Auditing with Neighborhoods
  • Several years of experience: Auditing is to a
    large degree a local activity.
  • Concepts have two kinds of knowledge elements:
  • Textual Knowledge Elements: Preferred term, CUI,
    synonyms, LUI, definition, sources, semantic
    types
  • Contextual Knowledge Elements: Neighbors
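
The two kinds of knowledge elements could be modeled roughly as follows. This is a minimal sketch, not the NAT's actual data model: the class and field names (`TextualElements`, `Concept`, `neighbors`) are illustrative assumptions, and the sample values are taken from the Antonospora locustae example in this presentation.

```python
from dataclasses import dataclass, field


@dataclass
class TextualElements:
    """Knowledge elements that describe the concept itself."""
    preferred_term: str
    cui: str
    synonyms: list[str] = field(default_factory=list)
    luis: list[str] = field(default_factory=list)
    definition: str = ""
    sources: list[str] = field(default_factory=list)
    semantic_types: list[str] = field(default_factory=list)


@dataclass
class Concept:
    """A concept: textual elements plus contextual elements (neighbors)."""
    textual: TextualElements
    # Contextual knowledge: relationship label -> names of neighboring concepts.
    neighbors: dict[str, list[str]] = field(default_factory=dict)


# Values below come from the paper-form slide; the structure is hypothetical.
focus = Concept(
    textual=TextualElements(
        preferred_term="Antonospora locustae",
        cui="C1081844",
        synonyms=["Nosema locustae"],
        sources=["NCBI"],
        semantic_types=["Fungus", "Invertebrate"],
    ),
    neighbors={"PAR": ["Antonospora"], "CHD": []},
)
```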

11
Neighborhoods
  • Focus concept: The concept presently under review
  • Immediate neighborhood: The set of concepts
    reachable from the focus concept by stepping one
    relationship (up, down, lateral, etc.)
  • Extended neighborhood: Includes parents of
    parents (grandparents), children of children
    (grandchildren) and siblings. No lateral chains.
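
The definitions above can be sketched as set operations over three lookup tables. The dictionaries `parents`, `children`, and `lateral`, and the single-letter concept names, are hypothetical; the sketch also assumes that the extended neighborhood contains the immediate one.

```python
def immediate_neighborhood(focus, parents, children, lateral):
    """Concepts reachable from `focus` in exactly one relationship step."""
    return (set(parents.get(focus, ()))
            | set(children.get(focus, ()))
            | set(lateral.get(focus, ())))


def extended_neighborhood(focus, parents, children, lateral):
    """Immediate neighborhood plus grandparents, grandchildren and siblings.

    Lateral relationships are followed for one step only (no lateral chains).
    """
    result = immediate_neighborhood(focus, parents, children, lateral)
    for p in parents.get(focus, ()):
        result |= set(parents.get(p, ()))    # grandparents
        result |= set(children.get(p, ()))   # siblings: other children of p
    for c in children.get(focus, ()):
        result |= set(children.get(c, ()))   # grandchildren
    result.discard(focus)                    # the focus is not its own sibling
    return result


# Toy hierarchy with hypothetical concept names.
parents = {"F": ["P"], "S": ["P"], "P": ["G"], "C": ["F"], "GC": ["C"]}
children = {"G": ["P"], "P": ["F", "S"], "F": ["C"], "C": ["GC"]}
lateral = {"F": ["L"], "L": ["L2"]}

print(sorted(immediate_neighborhood("F", parents, children, lateral)))
# ['C', 'L', 'P']
print(sorted(extended_neighborhood("F", parents, children, lateral)))
# ['C', 'G', 'GC', 'L', 'P', 'S']
```

Note that `L2` never appears: it is only reachable through a chain of two lateral steps.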

12
References about Neighborhood
  • M.S. Tuttle, D.D. Sherertz, N.E. Olson, M.S.
    Erlbaum, W.D. Sperzel, and L.F. Fuller, et al.
    Using META-1, the first version of the UMLS
    Metathesaurus. In Proc 14th Annu Symp Comput Appl
    Med Care, pages 131-135, Washington, D.C., 1990.
  • S.J. Nelson, M.S. Tuttle, W.G. Cole, D.D.
    Sherertz, W.D. Sperzel, M.S. Erlbaum, L.L.
    Fuller, N.E. Olson. From meaning to term:
    semantic locality in the UMLS Metathesaurus. In
    Proc Annu Symp Comput Appl Med Care, pages
    209-213, Washington, D.C., 1991.

13
Immediate Neighborhood
14
Extended Neighborhood
15
Up-Extended and Down-Extended Neighborhood
  • An up-extended neighborhood includes grandparents
    and the immediate neighborhood.
  • A down-extended neighborhood includes
    grandchildren and the immediate neighborhood.
  • Give the auditor everything s/he needs, but no
    more.
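
A minimal sketch of the two one-sided variants, assuming the immediate neighborhood has already been computed and is passed in as a set. The function names, lookup tables, and concept names are hypothetical.

```python
def up_extended(focus, parents, immediate):
    """Immediate neighborhood plus grandparents (parents of parents)."""
    grandparents = {g for p in parents.get(focus, ()) for g in parents.get(p, ())}
    return set(immediate) | grandparents


def down_extended(focus, children, immediate):
    """Immediate neighborhood plus grandchildren (children of children)."""
    grandchildren = {g for c in children.get(focus, ()) for g in children.get(c, ())}
    return set(immediate) | grandchildren


# Toy data with hypothetical concept names.
parents = {"F": ["P"], "P": ["G"]}
children = {"F": ["C"], "C": ["GC"]}
immediate = {"P", "C", "L"}

print(sorted(up_extended("F", parents, immediate)))     # ['C', 'G', 'L', 'P']
print(sorted(down_extended("F", children, immediate)))  # ['C', 'GC', 'L', 'P']
```

Each variant adds only one extra level in one direction, so the auditor sees less than in the full extended neighborhood.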

16
Semantic Type Neighborhood
  • If we provide the semantic types for every
    concept, those also form a neighborhood.
  • It is important to keep the information of which
    semantic types are assigned to which concepts.
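
One way to keep the assignment information is to store the semantic types per concept instead of merging them into one set. The sketch below is an assumption about how this could be done, with a hypothetical function name; the two assignments are those shown on the paper-form slide.

```python
def semantic_type_neighborhood(neighborhood, sty_of):
    """Map each concept in the neighborhood to its own semantic types.

    Returning a per-concept mapping (rather than one merged set) keeps the
    information of which type is assigned to which concept.
    """
    return {concept: sorted(sty_of.get(concept, ())) for concept in neighborhood}


sty_of = {
    "Antonospora locustae": {"Fungus", "Invertebrate"},
    "Antonospora": {"Invertebrate"},
}
view = semantic_type_neighborhood(["Antonospora locustae", "Antonospora"], sty_of)

# The merged set of types is still available, but only as a derived value.
all_types = sorted({t for types in view.values() for t in types})
```

Here the view shows directly that the focus concept carries Fungus while its parent does not.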

17
The Idea of a Hybrid Display
  • Diagrams are wonderful as long as they fit on
    one screen.
  • Indented text is wonderful as long as there are
    no or very few multiple parents.
  • But the UMLS does not fit onto one screen and
    there are many cases of multiple parents.

18
What makes a diagram wonderful?
  • You can follow parent/child paths with your eyes.
  • You can get a feeling for everything a concept is
    connected to with one look.
  • You can see multiple parents and multiple paths
    with one look.
  • You can see global features (short and bushy
    versus tall and sparse, or (gasp!) tall and
    bushy).

19
What makes indented text wonderful?
  • Indentation expresses parenthood compactly and
    elegantly.
  • There are no lines crossing.
  • You don't need a layout algorithm.
  • There is a linear order in which to study text.

20
The Idea of a Hybrid Display (cont.)
  • Keep the best features of text and the best
    features of diagrams.
  • Maintain relative positions between the focus
    concept and its children, parents, etc.
  • Eliminate clutter of arrows.

21
A Hybrid Diagram/Form Display of a Neighborhood
[Figure: layout of the hybrid display, with panels labeled Parents, Synonyms, Relationships, Focus Concept, and Children]
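
The hybrid idea can be illustrated with a plain-text renderer that relies on position instead of arrows. This is only a sketch: the function `render_hybrid` is hypothetical, and placing synonyms to the left and relationships to the right of the focus concept is an assumption, since the slide transcript gives the panel names but not their coordinates.

```python
def render_hybrid(focus, parents, children, synonyms, relationships, width=24):
    """Return a plain-text hybrid view: parents above, children below,
    synonyms and relationships beside the focus concept. No arrows are
    drawn; relative position alone conveys the hierarchy."""
    def row(left, middle, right):
        return f"{left:<{width}}|{middle:^{width}}|{right:<{width}}".rstrip()

    lines = [row("", name, "") for name in parents]
    height = max(len(synonyms), len(relationships), 1)
    for i in range(height):
        left = synonyms[i] if i < len(synonyms) else ""
        right = relationships[i] if i < len(relationships) else ""
        middle = f"[{focus}]" if i == 0 else ""  # focus appears once, bracketed
        lines.append(row(left, middle, right))
    lines += [row("", name, "") for name in children]
    return "\n".join(lines)


print(render_hybrid(
    focus="Antonospora locustae",
    parents=["Antonospora"],
    children=[],
    synonyms=["Nosema locustae"],
    relationships=[],
))
```

Multiple parents simply become multiple rows above the focus concept, which is where pure indented text runs into trouble.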
22
Desirable Information Beyond Neighborhoods
  • Concept definition for Focus Concept
  • Sources for concepts and relationships
  • Assigned Semantic Types of concepts
  • Definitions of relevant Semantic Types
  • Global view of the Semantic Network, in two forms:
  • Indented (better for wide branches)
  • Graphical (better for almost everything else)

23
Current State of the NAT Serving the Auditor
  • The Neighborhood Auditing Tool has been
    implemented to fully support display of
    neighborhoods.
  • Navigation to adjacent neighboring concepts is an
    easy click.
  • Additional features listed before have been
    implemented.

24
Demonstration of NAT Features
  • Neighborhood
  • Grandparents and grandchildren
  • Synonyms
  • Relationships: Concept, Sibling, Term
  • Focus concept definition
  • Sources: Concepts, Relationships
  • Display CUIs
  • Semantic Type display
  • Semantic Type definition
  • Semantic Network (indented)
  • Semantic Network (diagram)
  • Navigation
  • Search (full, partial)
  • Viewing History
  • Choice of release
  • Choice of sources

25
Audit Example: A Cycle of Three Concepts
  • An SQL query found three concepts that
    participate in a PAR/CHD cycle.
  • We follow an auditor's review of this cycle.
  • O. Bodenreider. Circular hierarchical
    relationships in the UMLS: etiology, diagnosis,
    treatment, complications and prevention. Proc
    AMIA Symp, pages 57-61, 2001.

26
The Cycle of Three Concepts
27
Recommended Modeling
28
Audit Example: Semantic Types
  • An algorithm determined that the concept
    Antonospora locustae was likely assigned
    incorrect semantic types.
  • We follow an auditor's review of this concept.

29
Preliminary Evaluation Study with NAT
  • Compare paper-based auditing and NAT-based
    auditing.
  • Counterbalanced groups.
  • Recall improves with NAT use. Auditors seem
    willing to investigate more concepts.
  • Precision stays the same. Auditors' mental
    process does not improve.

30
Conclusions
  • Preliminary study showed that people are more
    successful finding errors with NAT than with
    paper sources.
  • Recall improved with the NAT, precision did not.
  • NAT seems to nicely complement use of the UMLSKS.

31
Future Work
  • Integration of algorithms for developing audit
    sets with NAT.
  • Recording and reporting auditor recommendations.
  • Facilitate team auditing where several auditors
    review the same sample.
  • Managing and reporting work flow of auditor teams.

32
Thank you!
The Neighborhood Auditing Tool is available
online at http://nat.njit.edu
34
Preliminary Evaluation Study
          Errors              Recall              Precision           F
Auditor   with NAT  w/o NAT   with NAT  w/o NAT   with NAT  w/o NAT   with NAT  w/o NAT
1         57        45        0.97      0.82      0.53      0.51      0.86      0.63
2         22        20        0.43      0.35      0.55      0.55      0.48      0.43
3         39        34        0.64      0.58      0.46      0.53      0.54      0.55
4         56        44        0.55      0.54      0.30      0.34      0.39      0.42
Avg.      44        36        0.65      0.57      0.46      0.48      0.57      0.51
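
For readers who want to relate the three measures, here is a small sketch. The slides do not state the formula used for F, so the balanced F-measure (harmonic mean of precision and recall) is an assumption, and the function names and the counts in the first two calls are hypothetical. The last two calls use Auditor 2's precision and recall from the table.

```python
def precision(correct_reports, all_reports):
    """Fraction of the reported errors that are real errors."""
    return correct_reports / all_reports if all_reports else 0.0


def recall(correct_reports, all_real_errors):
    """Fraction of the real errors that were reported."""
    return correct_reports / all_real_errors if all_real_errors else 0.0


def f_measure(p, r):
    """Balanced F-measure: harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts: 8 correct reports out of 10, with 16 real errors.
print(precision(8, 10))  # 0.8
print(recall(8, 16))     # 0.5

# Auditor 2, precision and recall as listed in the table.
print(round(f_measure(0.55, 0.43), 2))  # with NAT -> 0.48
print(round(f_measure(0.55, 0.35), 2))  # w/o NAT  -> 0.43
```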
35
Improved Recall
  • The auditor finds it easy to search for more
    errors in the neighborhood of the suspicious
    concept.
  • With better recall and the same precision you
    still find more errors.

36
Semantic Types Example
  • The concept Antonospora locustae was selected for
    audit by an algorithm that found it was the only
    concept assigned to the intersection Fungus
    Invertebrate in the UMLS 2007AA.
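
The selection step described above might look like the following sketch, which groups concepts by their exact combination of semantic types and reports multi-type combinations with very few concepts. It is not the authors' algorithm: the function name and threshold are assumptions, and every concept except Antonospora locustae is a made-up placeholder.

```python
from collections import defaultdict


def small_intersections(sty_of, max_extent=1):
    """Group concepts by their exact set of semantic types and return the
    multi-type combinations (intersections) having at most `max_extent`
    concepts, which makes those concepts candidates for review."""
    extent = defaultdict(list)
    for concept, types in sty_of.items():
        extent[frozenset(types)].append(concept)
    return {
        tuple(sorted(combo)): sorted(concepts)
        for combo, concepts in extent.items()
        if len(combo) > 1 and len(concepts) <= max_extent
    }


# Toy assignments; only the first entry is taken from the slides.
sty_of = {
    "Antonospora locustae": {"Fungus", "Invertebrate"},
    "Hypothetical fungus A": {"Fungus"},
    "Hypothetical fungus B": {"Fungus"},
    "Hypothetical invertebrate": {"Invertebrate"},
}
print(small_intersections(sty_of))
# {('Fungus', 'Invertebrate'): ['Antonospora locustae']}
```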

51
NAT Features Demonstration
52
Neighborhood
74
Cycle Example
  • An SQL query provided us with a list of concepts
    in the Metathesaurus that participate in cycles
    of length three.
  • One of these cycles exists among the concepts
    Bipolar Disorder, Mood Disorders, and Affective
    Disorders, Psychotic.
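
A query of this kind can be sketched with an in-memory SQLite database. The table `par(child, parent)` is a hypothetical simplification, not the Metathesaurus schema, and the direction of the three edges below is assumed purely for illustration; the slides name the concepts but not which one is the parent of which.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE par (child TEXT, parent TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO par VALUES (?, ?)",
    [
        # Edge directions are assumed for illustration only.
        ("Bipolar Disorder", "Mood Disorders"),
        ("Mood Disorders", "Affective Disorders, Psychotic"),
        ("Affective Disorders, Psychotic", "Bipolar Disorder"),
        # An ordinary, acyclic edge to a made-up parent.
        ("Mood Disorders", "Hypothetical Parent"),
    ],
)

# Follow child -> parent three times and require a return to the start.
# The WHERE clause keeps one rotation per cycle (smallest name first)
# and rejects degenerate cases where two of the three nodes coincide.
CYCLES_OF_THREE = """
SELECT a.child, b.child, c.child
FROM par AS a
JOIN par AS b ON a.parent = b.child
JOIN par AS c ON b.parent = c.child AND c.parent = a.child
WHERE a.child < b.child AND a.child < c.child AND b.child <> c.child
"""
cycles = conn.execute(CYCLES_OF_THREE).fetchall()
print(cycles)
# [('Affective Disorders, Psychotic', 'Bipolar Disorder', 'Mood Disorders')]
```

Each row lists the three concepts in the order in which the parent links are followed.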
