Title: SWAN/SIOC: Aligning Scientific Discourse Representation and Social Semantics
1SWAN/SIOC: Aligning Scientific Discourse Representation and Social Semantics
- Alexandre Passant1, Paolo Ciccarese2,3, John G. Breslin4, Tim Clark2,3
- 1 DERI, NUI Galway, Ireland
- 2 Massachusetts General Hospital, Boston, USA
- 3 Harvard Medical School, Boston, USA
- 4 School of Engineering and Informatics, NUI Galway, Ireland
2Motivation
- To provide a complete RDF-based model of online activities and scientific argumentation in neuromedicine
- Combining Web 2.0 shared knowledge using SIOC and formal scientific data (hypotheses, claims, dialogue, evidence, publications, etc.) via SWAN
- To make (both formal and informal) discourse concepts and relationships more accessible to computation
- So that they can be better navigated, compared and understood both across and within domains
3How is this achieved?
- An alignment of ontologies was performed to provide a complete framework for modelling activities in scientific communities
- SWAN objects were integrated into the SIOC Types module
- SWAN was reused to model argumentative discussions
- External models such as SCOT and MOAT were reused for tagging
- SCF is being updated so that it can create data according to this model
4Collaborative websites are like data silos
Source: Pidgin Technologies, www.pidgintech.com
5Many isolated communities of users and their data
6Need ways to connect these islands
7Allowing users to easily move from one to another
8Enabling users to easily bring their data with
them
9Types of data silos (scientific and social)
- Collaborative websites used by scientific researchers in various domains
- SWAN/SCF is being used to connect these
- Social websites used by people collaborating or communicating through the Web 2.0 platform
- SIOC is being used to connect these
- SWAN/SIOC connects both sets of data silos together: not just their structures, but also what is embedded within their content
10SWAN (Semantic Web Applications in Neuromedicine)
- An ontology of scientific discourse (Ciccarese et al. 2008)
- A participatory knowledge base of hypotheses, claims, evidence and concepts in biomedicine, with the first instance in the domain of Alzheimer's disease (AD)
- Currently being integrated with the SCF (Science Collaboration Framework) toolkit for biomedical web communities
- http://swan.mindinformatics.org/
11What does SWAN consist of?
- A formal structure to record and present scientific discourse
- Tools for scientists to manage, access and share knowledge
- Tools for discovering conflicts, gaps and missing evidence
- An information bridge to promote collaboration
- A community process built upon the Alzforum site
12Main concepts and relationships in the SWAN
ontology
13Modules in the SWAN ontology
14A typical hypothesis
15Contributions from leading researchers
Inventory of ideas
Mechanisms of disease
Key research topics
Contribute content
16Scientist view: toxic protein fragments believed responsible for AD; key information, gaps and conflicts
17Browsing evidence and inconsistencies
18A researcher-supported effort
- Dozens of etiopathological AD models annotated by SWAN curators in collaboration with leading researchers
- Content reviewed before release by over twenty senior AD researchers
- Software features reviewed before release by over thirty senior AD researchers
- Extensive feedback incorporated into SWAN, such that this is a community tool (in line with Web 2.0 principles)
19Semantically-Interlinked Online Communities (SIOC)
- An effort from DERI, NUI Galway to discover how we can create / establish ontologies on the Semantic Web
- Goal of the SIOC ontology is to address interoperability issues on the (Social) Web
- http://sioc-project.org/
- SIOC has been adopted in a framework of 50 applications or modules deployed on over 400 sites
- Various domains: Web 2.0, enterprise information integration, HCLS, e-government
21The steps taken
- Develop an ontology of terms for representing rich data from the Social Web
- Create a food chain for producing, collecting and consuming SIOC data
- As well as dissemination via papers about SIOC, provide docs and examples at sioc-project.org
- SIOC aims to enrich the Web infrastructure
- During the next upgrade cycle, gigabytes of semantically-enriched community data become available!
22Some of the SIOC core ontology classes and
properties
23Some examples of where SIOC is already in use (about 50 applications / modules)
24Creating a Social Semantic Web of
previously-disconnected social data silos
25Also integrating scientific data silos in a
semantic scientific collaboration framework
- Enabling researchers to
- Collect data
- Draw conclusions
- Gather information
- Create/modify hypotheses
- Perform experiments
- But with the benefit of cross-community and
cross-domain experiences and results
26Mappings between SWAN and SIOC at http://rdfs.org/sioc/swan in OWL-DL
27Mappings between SWAN and SIOC classes
- Subclasses of sioc:Item
- swanscidis:DiscourseElement
- swanscidis:ResearchStatement
- swanscidis:ResearchQuestion
- swanscidis:ResearchComment
- swancit:Citation
- swancit:JournalArticle
- Other mappings
- sioc:Post > swancit:WebArticle, swancit:WebNews
- sioc:Comment > swancit:WebComment
- swanscidis is the Scientific Discourse module, which provides a set of classes and properties to represent discourse elements
- swancit is the Citations module, which aims to model the various citation elements that occur in scientific publishing
28Mappings between SWAN and SIOC properties
- Subtypes of sioc:related_to
- swandisrel:agreesWith / swandisrel:disagreesWith
- swandisrel:alternativeTo
- swandisrel:arisesFrom
- swandisrel:cites
- swandisrel:consistentWith / swandisrel:inconsistentWith
- swandisrel:discusses
- swandisrel:inResponseTo
- swandisrel:motivatedBy
- swandisrel:refersTo
- swandisrel is the Scientific Discourse Relationships module, which collects some of the relationships used for modelling discourse
- May also use sioc:Item dcterms:hasPart swanscidis:DiscourseElement, for example, to represent that a particular hypothesis is part of a blog post
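As a minimal sketch of what these sub-property mappings offer a SIOC consumer, the Python below adds a sioc:related_to statement for every statement whose predicate is one of the mapped SWAN relationships. The triples, the ex: identifiers and the helper name materialise_related_to are hypothetical; only the property names are taken from the slide, and this is not how SWAN or SIOC tools necessarily implement it.

```python
# Properties listed on the slide as sub-properties of sioc:related_to.
SUBPROPERTIES_OF_RELATED_TO = {
    "swandisrel:agreesWith",
    "swandisrel:disagreesWith",
    "swandisrel:alternativeTo",
    "swandisrel:arisesFrom",
    "swandisrel:cites",
    "swandisrel:consistentWith",
    "swandisrel:inconsistentWith",
    "swandisrel:discusses",
    "swandisrel:inResponseTo",
    "swandisrel:motivatedBy",
    "swandisrel:refersTo",
}


def materialise_related_to(triples):
    """Return the input triples plus one sioc:related_to triple for every
    statement whose predicate is a mapped SWAN discourse relationship."""
    inferred = {
        (s, "sioc:related_to", o)
        for (s, p, o) in triples
        if p in SUBPROPERTIES_OF_RELATED_TO
    }
    return set(triples) | inferred


# Hypothetical data: a claim disagrees with a hypothesis that is part of a blog post.
swan_data = {
    ("ex:claim1", "swandisrel:disagreesWith", "ex:hypothesis1"),
    ("ex:blogPost1", "dcterms:hasPart", "ex:hypothesis1"),
}

expanded = materialise_related_to(swan_data)
```

Note that dcterms:hasPart is left untouched here, since the slide does not list it as a sub-property of sioc:related_to.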
29Mappings redundancy
- Redundant mappings
- Can be entailed thanks to the transitivity of rdfs:subClassOf / rdfs:subPropertyOf
- e.g. swancit:JournalArticle rdfs:subClassOf sioc:Item can be inferred from swancit:JournalArticle rdfs:subClassOf swancit:Citation and swancit:Citation rdfs:subClassOf sioc:Item
- However
- SIOC applications generally do not support such chained entailments
- Need to address lightweight inference
- Therefore we provide direct rdfs:subClassOf mappings
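The redundancy argument can be illustrated with a small sketch, assuming the class hierarchy is held as plain (subclass, superclass) pairs. The function name subclass_closure is hypothetical; the two asserted pairs are the ones from the example on this slide. The closure shows which mapping a reasoner would entail and which a lightweight consumer only sees if it is stated directly.

```python
def subclass_closure(direct_edges):
    """Compute all (subclass, superclass) pairs entailed by the transitivity
    of rdfs:subClassOf, given only the directly asserted pairs."""
    closure = set(direct_edges)
    changed = True
    while changed:
        changed = False
        # Join every pair (a, b) with every pair (b, d) to derive (a, d).
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# The two statements from the slide's example.
asserted = {
    ("swancit:JournalArticle", "swancit:Citation"),
    ("swancit:Citation", "sioc:Item"),
}

entailed = subclass_closure(asserted)

# Pairs that are only available through chained entailment; these are the
# ones the alignment also states directly for applications without a reasoner.
redundant = entailed - asserted
```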
30Querying mappings
PREFIX sioc: <http://rdfs.org/sioc/ns#>
SELECT DISTINCT ?s ?o WHERE {
  ?s sioc:related_to ?o .
  ?s a sioc:Item .
  ?o a sioc:Item .
}
- Simple query to identify relatedness between items
- Applying a SIOC query over SWAN data
- SPARQL / Pellet, files loaded at runtime in memory
- Experiment with both simple mappings (including transitive closure) and full mappings
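To make the query's behaviour concrete, here is a rough Python emulation of its graph pattern over an in-memory set of triples. It assumes the direct mappings have already been applied to the data (so sioc:Item types and sioc:related_to links are present explicitly); the ex: resources and the function name related_items are hypothetical, and this stands in for neither a SPARQL engine nor Pellet.

```python
def related_items(triples):
    """Emulate the slide's query: distinct (?s, ?o) pairs where
    ?s sioc:related_to ?o and both ?s and ?o are typed sioc:Item."""
    items = {s for (s, p, o) in triples if p == "rdf:type" and o == "sioc:Item"}
    pairs = {
        (s, o)
        for (s, p, o) in triples
        if p == "sioc:related_to" and s in items and o in items
    }
    return sorted(pairs)


# Hypothetical SWAN data after the direct mappings have been applied.
data = {
    ("ex:claim1", "rdf:type", "swanscidis:ResearchStatement"),
    ("ex:claim1", "rdf:type", "sioc:Item"),
    ("ex:hypothesis1", "rdf:type", "sioc:Item"),
    ("ex:claim1", "swandisrel:agreesWith", "ex:hypothesis1"),
    ("ex:claim1", "sioc:related_to", "ex:hypothesis1"),
    # Not returned: the object is never typed as sioc:Item.
    ("ex:claim1", "sioc:related_to", "ex:untyped1"),
}

result = related_items(data)
```

Without the mapped sioc:Item types and sioc:related_to links in the data, the same function returns an empty list, which is the situation a SIOC application faces when it cannot perform the entailment itself.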
31W3C HCLS Interest Group notes published
- http://www.w3.org/TR/hcls-sioc/
- http://www.w3.org/TR/hcls-swan/
- http://www.w3.org/TR/hcls-swansioc/
32RDFa support in Drupal 7 for SSW data
33Exposing scientific results to search
- Yahoo! SearchMonkey and Google Rich Snippets
- Highlights the structured data embedded in web pages
- Google developers have indicated that scholarly publications marked up with Rich Snippets will also be picked up and appropriately indexed by Google Scholar
34Acknowledgements
- We would like to thank Science Foundation Ireland for their support under grant SFI/08/CE/I1380 (Líon 2)
- We would also like to thank an anonymous foundation for a generous gift in support of this work
- Thanks to members of the W3C HCLSIG, in particular
- Susie Stephens
- Scott Marshall
- Eric Prud'hommeaux