Title: Parkinson
1. Parkinson's Disease Ontology
2. Outline
- Use Case
- Parkinson's Disease
- Seed Ontology
- Design Issues
- Extending the seed ontology
- Next Steps
3. Use Case: Parkinson's Disease
- Description of Parkinson's disease from different
perspectives
- Systems Physiology View
- Cellular and Molecular Biologist View
- Clinical Researcher View
- Clinical Guideline Formulator View
- Clinical Decision Support Implementer View
- Primary Care Clinician View
- Neurologist View
- Identify the information needs of the stakeholders
identified above
- Available at
- http://esw.w3.org/topic/HCLS/ParkinsonUseCase
- Developed by
- Don Doherty
- Ken Kawamato
4. Use Case: Systems Physiology View
- What chemicals (neurotransmitters) are used by
each circuit element (neuron) to communicate with
the next element (neuron)? What responses do they
elicit in the neurons?
5. Use Case: Cellular and Molecular Biologist View
- What proteins are implicated in Parkinson's
disease? How are protein expression patterns,
protein processing, folding, regulation,
transport, protein-protein interactions, protein
degradation, etc. affected?
6. Use Case: Clinical Researcher View
- Can a certain diagnostic test (e.g., a blood test
for a biomarker or an imaging study) provide an
approach to diagnosing Parkinson's disease that
is superior to or can complement existing
diagnostic approaches?
7. Use Case: Clinical Guideline Formulator View
- What have been the results of clinical trials
that have evaluated the benefits and costs
associated with diagnostic or therapeutic
interventions for Parkinson's disease?
8. Use Case: Clinical Decision Support Implementer
View
- Which clinical guideline(s) should be used as the
basis for implementing the CDS functionality?
9. Use Case: Primary Care Clinician View
- If a patient is not currently diagnosed with
Parkinson's disease, do the patient's current
symptoms indicate the need for a referral to a
neurologist for further evaluation? If so, what
are the referral criteria?
10. Use Case: Neurologist View
- What is the differential diagnosis for this
patient given his/her symptoms, signs, and
diagnostic test results?
11. First Phase
- Focus on the Cellular and Molecular Biologist
View
- Develop a Parkinson's Disease Ontology based on
that view
- Refine it iteratively
- Augment it with other views later
12. Parkinson's Disease Revisited
Studies identifying genes involved with
Parkinson's disease are rapidly outpacing the
cell biological studies which would reveal how
these gene products are part of the disease
process in Parkinson's disease. The alpha
synuclein and Parkin genes are two examples.
The discovery that genetic mutations in the
alpha synuclein gene could cause Parkinson's
disease in families has opened new avenues of
research in the Parkinson's disease field. When
it was also discovered that synuclein was a
major component of Lewy bodies, the pathological
hallmark of Parkinson's disease in the brain, it
became clear that synuclein may be important in
the pathogenesis of sporadic Parkinson's disease
as well as rare cases of familial Parkinson's
disease. More recently, further evidence for the
intrinsic involvement of synuclein in Parkinson's
disease pathogenesis was shown by the finding
that the synuclein gene may be triplicated or
duplicated in familial Parkinson's disease,
suggesting that simple overexpression of the wild
type protein is sufficient to cause disease.
Since the discovery of synuclein, studies of
genetic linkages, specific genes, and their
encoded proteins have been ongoing in the
Parkinson's disease research field, transforming
what had once been thought of as a purely
environmental disease into one of the most
complex multigenetic diseases of the brain.
Mutations in the Parkin gene cause early onset
Parkinson's disease, and the parkin protein has
been identified as an E3 ligase, suggesting a
role for the proteasomal pathway of protein
degradation in Parkinson's disease. DJ-1 and
PINK-1 are proteins related to mitochondrial
function in neurons, providing an interesting
genetic parallel to mitochondrial toxin studies
that suggest disruptions in cellular energetics
and oxidative metabolism are primarily
responsible for Parkinson's disease. Other genes,
such as UCHL-1, tau, and the glucocerebrosidase
gene, may be genetic risk factors, and their
potential role in the sporadic Parkinson's
disease population remains unknown. Mutations in
LRRK2, which encodes a protein called
dardarin, are the most recently discovered
genetic cause of Parkinson's disease, and LRRK2
mutations are likely to be the largest cause of
familial Parkinson's disease identified thus
far. Dardarin is a large complex protein, which
has a variety of structural moieties that could
be participating in more than a dozen different
cellular pathways in neurons. Because the
cellular pathways that lead to Parkinson's
disease are not fully understood, it is currently
unknown how, or whether, any of these pathways
intersect in Parkinson's disease pathogenesis.
13. Step 1: Identify concepts and subsumption
hierarchies
14. Step 2: Identify relationships
15. Step 3: Look at Information Queries
What cell signaling pathways are implicated in
the pathogenesis of Parkinson's disease? In
which cells? What proteins are involved in
which pathways?
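Queries of this kind can be read as pattern matches over subject-predicate-object statements. The following is a minimal sketch in plain Python, not the seed ontology itself: the relation names `involved_in` and `implicated_in`, the class-like names, and the handful of facts are illustrative assumptions loosely based on the proteins mentioned on the previous slide. The "in which cells?" part of the query is not modeled here.

```python
# Toy triple store. Relation names and triples are illustrative assumptions.
TRIPLES = {
    ("Parkin", "involved_in", "ProteasomalPathway"),
    ("DJ-1", "involved_in", "MitochondrialFunction"),
    ("PINK-1", "involved_in", "MitochondrialFunction"),
    ("ProteasomalPathway", "implicated_in", "ParkinsonsDisease"),
    ("MitochondrialFunction", "implicated_in", "ParkinsonsDisease"),
}


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def proteins_by_pathway(disease):
    """Map each pathway implicated in `disease` to the proteins involved in it."""
    result = {}
    for pathway, _, _ in match(p="implicated_in", o=disease):
        result[pathway] = [s for s, _, _ in match(p="involved_in", o=pathway)]
    return result


answer = proteins_by_pathway("ParkinsonsDisease")
```

Each query is a join of two patterns, which is why the relationships identified in Step 2 determine which questions the ontology can answer.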
16. Design Issues: Modeling
- Modeling as relationships vs classes
- E.g., LRRK2 transcribed_into Dardarin, vs
- Define a class called Transcription as follows
- Transcription
- has_gene LRRK2
- has_protein Dardarin
- Modeling a disease as a dynamic process as
opposed to a static class
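The two modeling options can be contrasted in a small sketch. It uses the LRRK2/dardarin pair stated earlier in the deck. The `evidence` slot is a hypothetical addition, included only to show what the class-based form can carry that a binary relationship cannot.

```python
from dataclasses import dataclass

# Option A: a binary relationship is a single (subject, predicate, object) triple.
relationship = ("LRRK2", "transcribed_into", "Dardarin")


# Option B: the process becomes a class, so each occurrence can carry
# additional attributes.
@dataclass(frozen=True)
class Transcription:
    has_gene: str
    has_protein: str
    evidence: str = ""  # hypothetical extra slot, not in the slides


event = Transcription(
    has_gene="LRRK2",
    has_protein="Dardarin",
    evidence="hypothetical study id",
)


def to_relationship(t: Transcription):
    """Collapse the class-based model to the binary relationship (drops `evidence`)."""
    return (t.has_gene, "transcribed_into", t.has_protein)
```

The class-based form can always be projected down to the relationship, but not the reverse, so the choice depends on whether anything needs to be said about the transcription itself.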
17. Design Issues: Instance vs Subclass
- A generic/specific relationship can be modeled
using either instance-of or subclass-of, e.g.
- Parkinson's Disease subclass-of Disease vs
Parkinson's Disease instance-of Disease
- UCHL-1 subclass-of Gene vs UCHL-1 instance-of
Gene
- Synuclein subclass-of Protein vs Synuclein
instance-of Protein
- What is the performance impact of these
relationships?
- Instance-of involves ABox reasoning
- Subclass-of involves TBox reasoning
- Is one more scalable than the other?
- What is the impact on expressivity?
- Can more knowledge be represented using one
over the other?
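A rough sketch of the difference, assuming a single-parent hierarchy and a hypothetical individual `patient42_condition`: subclass-of links live in the TBox and are followed transitively, while instance-of links live in the ABox and pick up the TBox consequences of the asserted class.

```python
# TBox: subclass axioms (child -> parent). Single inheritance is assumed
# here to keep the sketch short.
SUBCLASS_OF = {"ParkinsonsDisease": "Disease", "Synuclein": "Protein"}

# ABox: individual -> asserted class. The individual is hypothetical.
INSTANCE_OF = {"patient42_condition": "ParkinsonsDisease"}


def superclasses(cls):
    """TBox reasoning: walk subclass-of links upward (transitive closure)."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def types_of(individual):
    """ABox reasoning: the asserted type plus everything the TBox entails."""
    asserted = INSTANCE_OF[individual]
    return [asserted] + superclasses(asserted)
```

One consequence for expressivity: if Parkinson's Disease is modeled as a class, individual cases can still be attached below it, whereas modeling it as an instance leaves no level beneath it for such cases.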
18. Design Issue: Granularity
- At what level of specificity should relationships
be represented in the ontology?
- AllelicVariant causes Disease, vs
- LRRK2Variant causes Parkinson's Disease
- At what level of genericity should relationships
be represented in the ontology?
- LewyBody hallmark_of Parkinson's Disease, vs
- AnatomicalEntity hallmark_of Disease
19. Design Issue: Uncertainty
- "The discovery that genetic mutations in the
alpha synuclein gene could cause Parkinson's
disease in families"
- The OWL/RDF metamodels do not support expressing
this uncertainty ("could cause").
- What could be ways of expressing it?
- Using reification in RDF?
- Introducing new relationships in OWL?
- What impact would this have on
- Data Integration?
- Reasoning?
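The reification option can be sketched with plain tuples. RDF's reification vocabulary (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`) describes a statement as a node of its own, to which further properties can be attached. The `ex:` names and the `certainty` and `scope` annotations below are hypothetical and serve only as illustration.

```python
import itertools

_ids = itertools.count(1)


def reify(subject, predicate, obj, **annotations):
    """Describe a statement with RDF's reification vocabulary plus annotations.

    Returns triples about a new statement node. The described triple itself
    is not asserted, so the graph does not claim it as fact.
    """
    node = f"_:stmt{next(_ids)}"
    triples = [
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", subject),
        (node, "rdf:predicate", predicate),
        (node, "rdf:object", obj),
    ]
    # Annotation property names (e.g. ex:certainty) are hypothetical.
    triples += [(node, f"ex:{key}", value) for key, value in annotations.items()]
    return triples


graph = reify(
    "ex:AlphaSynucleinMutation", "ex:causes", "ex:ParkinsonsDisease",
    certainty="possible", scope="familial",
)
```

The cost is visible in the output: one uncertain claim becomes six triples, and a query for `ex:causes` no longer finds it directly, which bears on both the data integration and the reasoning questions.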
20. Design Issue: Domain/Range Polymorphism
- What are the semantics of multiple domains and
ranges?
- Property associated_with
- domain Pathway
- domain Protein
- range Cell
- range Biomarker
- Are RDF/OWL semantics good enough for us?
- Do we need to remodel relationships to avoid this?
- Different types of polymorphic relationships
- Sub-type polymorphism
- Ad-hoc polymorphism
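Under RDFS semantics, multiple domain (or range) declarations on one property are read conjunctively: every subject of the property is inferred to belong to all declared domains. The sketch below applies the two RDFS entailment rules for domain and range to the `associated_with` example. The individuals `Parkin` and `DopaminergicNeuron` are hypothetical.

```python
# Domain and range declarations from the slide's example.
DOMAINS = {"associated_with": ["Pathway", "Protein"]}
RANGES = {"associated_with": ["Cell", "Biomarker"]}


def inferred_types(triples):
    """Apply the RDFS domain rule (rdfs2) and range rule (rdfs3) to each triple."""
    types = {}
    for s, p, o in triples:
        types.setdefault(s, set()).update(DOMAINS.get(p, []))
        types.setdefault(o, set()).update(RANGES.get(p, []))
    return types


# Intended reading: "a protein is associated with a cell".
types = inferred_types([("Parkin", "associated_with", "DopaminergicNeuron")])
```

Here `Parkin` comes out as both a Pathway and a Protein, which is probably not what was meant. If the two classes were declared disjoint in OWL, a reasoner would report an inconsistency. Two common ways to remodel are a union class as the single domain, or separate sub-properties per domain/range pair.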
21. Design Issue: Default Values
- How do we handle default values of OWL properties?
- Example
- Default function of proteasomal pathway is
protein degradation
- What is the impact of default values on
biomedical data integration? Reasoning?
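One way to picture the problem is to handle defaults outside the ontology, in application code. The sketch below assumes a class-level default table and two hypothetical individuals. An explicit assertion overrides the default.

```python
# Class-level default taken from the slide's example.
CLASS_DEFAULTS = {"ProteasomalPathway": {"function": "ProteinDegradation"}}

# Explicit assertions about individuals. Names and values are hypothetical.
ASSERTED = {
    "pathway_in_study_A": {"type": "ProteasomalPathway"},
    "pathway_in_study_B": {"type": "ProteasomalPathway", "function": "Unknown"},
}


def value_of(individual, prop):
    """Return the asserted value if present, else the class-level default."""
    facts = ASSERTED[individual]
    if prop in facts:
        return facts[prop]
    return CLASS_DEFAULTS.get(facts["type"], {}).get(prop)
```

Adding an assertion can change a previously derived answer, so this lookup is non-monotonic. OWL entailment is monotonic, which is why defaults cannot simply be stated as ordinary axioms. When integrating data, the merged result can also depend on which source supplied an explicit value.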
22. Design Issue: Ontology Inclusion
- Cross-linking to other ontologies such as GO,
Neuronames, etc.
- If we link to a class or property in another
ontology
- Should we include associated subclasses?
- Should we include associated properties?
- Should we include associated axioms?
- What if this leads to inconsistencies?
- Cycles
- Contradictions
- How does this impact data integration or
reasoning?
- Can we get by with shallow inclusion?
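The trade-off between shallow and deeper inclusion can be sketched as a bounded traversal of the external hierarchy. The `ext:` class names below are a made-up fragment, not taken from GO or NeuroNames.

```python
# Hypothetical fragment of an external ontology: class -> direct subclasses.
EXTERNAL_SUBCLASSES = {
    "ext:ProteinDegradation": ["ext:ProteasomalDegradation"],
    "ext:ProteasomalDegradation": ["ext:UbiquitinDependentDegradation"],
    # A cycle, as might result from an error in the imported source.
    "ext:UbiquitinDependentDegradation": ["ext:ProteinDegradation"],
}


def include(root, max_depth):
    """Collect `root` plus its subclasses down to `max_depth` levels.

    max_depth=0 is shallow inclusion (the referenced class only).
    The `seen` set guards against cycles in the imported hierarchy.
    """
    seen, frontier = {root}, [root]
    for _ in range(max_depth):
        next_frontier = []
        for cls in frontier:
            for child in EXTERNAL_SUBCLASSES.get(cls, []):
                if child not in seen:
                    seen.add(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return seen
```

With `max_depth=0` only the linked class is pulled in. Larger depths bring in more of the external hierarchy, along with any cycles or contradictions it contains.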
23. Ontology Modularization
- Mutually disjoint trees with cross-cutting
properties, axioms, etc.
- Proposed by Alan Rector
- Example: different hierarchies/lattices for
- Studies (e.g., publication in PubMed)
- Biomedical knowledge referenced in those studies
(e.g., association between a gene and a disease)
24. Design Issue: Higher Order Relationships
- Example
- Association between a Gene and a Disease
mentioned in a study
25. Creation of Best Practices
- Design issues have been the subject of
investigation in the Knowledge Engineering and
Medical Informatics communities
- Different approaches to resolving these issues will
be appropriate in the context of different use
cases.
- Goal
- Propose various alternatives in the context of
use cases proposed in HCLSIG
26. Extending the Seed Ontology
- Identify concepts and properties to include from
- Gene Ontology
- Neuro Names
- Decide the level of inclusion
27. Extending the Seed Ontology
- Look at statements from research articles to
extend the ontology
- Example
- Aggresomes formed by alpha-synuclein and
synphilin-1 are cytoprotective.
- Create a new property called formedBy
- domain(formedBy) Aggresome
- range(formedBy) Protein
- subClassOf(
intersectionOf(Aggresome, Restriction(formedBy,
someValuesFrom(intersectionOf(alpha-synuclein,
synphilin-1)))),
Restriction(function, hasValue(cytoprotective))
)
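The effect of this axiom on instance data can be approximated as a simple rule. The sketch below makes two assumptions that should be kept in mind. First, it reads the statement as "an aggresome with at least one alpha-synuclein part and at least one synphilin-1 part", which is one interpretation of the sentence and differs from a single filler that belongs to both protein classes at once. Second, it is a closed-world check over asserted data, not OWL's open-world entailment. All individual names are hypothetical.

```python
# Individuals with asserted types. All names are hypothetical.
TYPES = {
    "agg1": {"Aggresome"},
    "agg2": {"Aggresome"},
    "p1": {"alpha-synuclein"},
    "p2": {"synphilin-1"},
}

# formedBy links between aggresomes and proteins.
FORMED_BY = {"agg1": ["p1", "p2"], "agg2": ["p1"]}


def inferred_function(individual):
    """Return 'cytoprotective' when the rule's conditions are met, else None."""
    if "Aggresome" not in TYPES.get(individual, set()):
        return None
    part_types = set()
    for part in FORMED_BY.get(individual, []):
        part_types |= TYPES.get(part, set())
    if {"alpha-synuclein", "synphilin-1"} <= part_types:
        return "cytoprotective"
    return None
```

Here `agg1` is formed by both proteins and receives the function, while `agg2` has only alpha-synuclein asserted and does not.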
28. Next Steps
- Apply this ontology to demonstrate the Parkinson's
Disease Use Case
- Focus of the BIONT BIORDF Collaborative F2F