Title: Research Problems in Semantic Web Search


1
Research Problems in Semantic Web Search
____________________________
  • Varish Mulwad

2
Agenda
____________________________
  • Introduction
  • Swoogle
  • Swoogle's Competition
  • Sindice
  • Semantic Web Search Engine (SWSE)
  • Watson
  • Falcon
  • Research Problems and Issues with Swoogle
  • References

3
Introduction
____________________________
Web
Your Agent
Dr. Finin's FOAF Profile
Possible because the data is in a machine-understandable
form such as RDF or OWL. But how will the agent find all
this data? Search engines?
4
Introduction
____________________________
Traditional Search Engine Results
Semantic Web Search Engine Results
5
Swoogle
____________________________
  • Swoogle is a crawler-based indexing and retrieval
    system for the Semantic Web
  • Swoogle crawls and discovers documents written in
    RDF and OWL
  • Swoogle classifies a Semantic Web Document (SWD)
    as one of:
      • Semantic Web Ontology (SWO): defines new terms
      • Semantic Web Database (SWDB): makes assertions
        about individuals
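
The SWO/SWDB distinction above can be sketched as a simple ratio test. This is only an illustrative heuristic, not Swoogle's actual classifier: the function name `classify_swd`, the 0.5 threshold, and the `example.org` URIs are assumptions made for the example.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Resources typed with one of these classes count as term definitions.
DEFINING_CLASSES = {
    "http://www.w3.org/2000/01/rdf-schema#Class",
    "http://www.w3.org/2002/07/owl#Class",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#Property",
    "http://www.w3.org/2002/07/owl#ObjectProperty",
    "http://www.w3.org/2002/07/owl#DatatypeProperty",
}


def classify_swd(triples, threshold=0.5):
    """Return 'SWO' if most typed resources define terms, else 'SWDB'.

    The threshold is an assumption chosen for illustration.
    """
    defined, individuals = set(), set()
    for subject, predicate, obj in triples:
        if predicate != RDF_TYPE:
            continue
        if obj in DEFINING_CLASSES:
            defined.add(subject)
        else:
            individuals.add(subject)
    individuals -= defined  # a resource that defines a term is not an individual
    total = len(defined) + len(individuals)
    if total == 0:
        return "SWDB"
    return "SWO" if len(defined) / total >= threshold else "SWDB"


# Hypothetical documents: one defines terms, one describes a person.
ontology_doc = [
    ("http://example.org/onto#Person", RDF_TYPE, "http://www.w3.org/2002/07/owl#Class"),
    ("http://example.org/onto#name", RDF_TYPE, "http://www.w3.org/2002/07/owl#DatatypeProperty"),
]
profile_doc = [
    ("http://example.org/people#alice", RDF_TYPE, "http://example.org/onto#Person"),
    ("http://example.org/people#alice", "http://example.org/onto#name", "Alice"),
]

print(classify_swd(ontology_doc))
print(classify_swd(profile_doc))
```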

6
Swoogle
____________________________
  • SWOOGLE DEMO

7
Swoogle Architecture
____________________________
8
Swoogle Architecture
____________________________
  • SWD Discovery Component
      • Google crawler using the Google web service
          • File types with extensions .rdf, .owl, .n3
          • Google returns at most 1,000 results per query
      • A focused crawler
          • Crawls documents within a given website
          • Extension and focus constraints
      • A Swoogle crawler
          • Jena-based crawler
          • Explores semantic links between SWDs
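
The extension and focus constraints of the focused crawler might look roughly like the filter below. It is a sketch under assumptions: the function name `passes_constraints` and the URLs are hypothetical, and "focus" is taken to mean "same host as the site being crawled".

```python
from urllib.parse import urlparse

# Extensions named on the slide as likely Semantic Web Documents.
SWD_EXTENSIONS = (".rdf", ".owl", ".n3")


def passes_constraints(url, focus_host):
    """Keep a URL only if it stays on the focus site and looks like an SWD."""
    parsed = urlparse(url)
    same_site = parsed.netloc.lower() == focus_host.lower()
    has_extension = parsed.path.lower().endswith(SWD_EXTENSIONS)
    return same_site and has_extension


candidates = [
    "http://example.org/people/foaf.rdf",
    "http://example.org/index.html",
    "http://other.example.com/onto.owl",
]
kept = [url for url in candidates if passes_constraints(url, "example.org")]
print(kept)
```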

9
Swoogle Architecture
____________________________
  • Metadata Creation
      • Basic metadata
          • Encoding: RDF/XML, N-Triples, N3
          • Language: RDF, RDFS, OWL, DAML+OIL
          • OWL species: OWL Lite, OWL DL, OWL Full
      • Relations among SWDs
          • Reference relationships among SWDs
          • Inter-ontology relationships
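
One way to derive the "language" part of the basic metadata is to look at which vocabulary namespaces the terms of a document come from. The sketch below assumes triples are already parsed into tuples of strings; the function name `languages_used` is hypothetical and this is not necessarily how Swoogle computes it.

```python
# Standard namespace URIs for the languages listed on the slide.
LANGUAGE_NAMESPACES = {
    "RDF": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "RDFS": "http://www.w3.org/2000/01/rdf-schema#",
    "OWL": "http://www.w3.org/2002/07/owl#",
    "DAML+OIL": "http://www.daml.org/2001/03/daml+oil#",
}


def languages_used(triples):
    """Return the set of language names whose namespace appears in any term."""
    found = set()
    for triple in triples:
        for term in triple:
            for name, namespace in LANGUAGE_NAMESPACES.items():
                if term.startswith(namespace):
                    found.add(name)
    return found


doc = [
    (
        "http://example.org/onto#Person",
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
        "http://www.w3.org/2002/07/owl#Class",
    ),
]
print(sorted(languages_used(doc)))
```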

10
Swoogle Architecture
____________________________
  • Data analysis component
      • Classification of SWD as SWO or SWDB
      • Compute rank of SWD
  • Web-based interface
      • Human user interface: http://swoogle.umbc.edu
      • Web services using a REST interface
      • Agent service
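
The slide does not say how the rank of an SWD is computed. As a stand-in, the sketch below runs a generic PageRank-style power iteration over the reference links between documents; it should not be read as Swoogle's own ranking algorithm. The function name, damping factor, and document names are assumptions.

```python
def rank_documents(links, damping=0.85, iterations=50):
    """Generic PageRank over a dict mapping each document to the documents it references."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    if not nodes:
        return {}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by documents with no outgoing references is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Two hypothetical instance documents both reference one ontology.
links = {
    "a.rdf": ["onto.owl"],
    "b.rdf": ["onto.owl"],
    "onto.owl": [],
}
ranks = rank_documents(links)
print(max(ranks, key=ranks.get))
```

In this toy graph the ontology that both documents reference ends up with the highest score, which is the intuition behind link-based ranking of SWDs.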

11
Sindice
____________________________
  • Created at the Digital Enterprise Research Institute
    (DERI)
  • Key features of Sindice include:
      • Sindice collects SWDs and indexes them on
        resource URIs, Inverse Functional
        Properties (IFPs) and keywords
      • Sindice uses the Hadoop parallel architecture

12
Sindice
____________________________
  • Inverse Functional Property (IFP): an OWL property
    whose value uniquely identifies its subject
    (e.g., foaf:mbox)
  • Sindice uses three indexes:
      • URI index
      • IFP index
      • Keyword index
  • Benefit: faster retrieval of data
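
A toy version of the three indexes could be kept as three in-memory maps from a key to the set of documents that mention it. The class name `TripleIndex`, its methods, and the sample data are all hypothetical; a production index such as Sindice's is stored and queried very differently.

```python
from collections import defaultdict

FOAF_MBOX = "http://xmlns.com/foaf/0.1/mbox"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"


def looks_like_uri(value):
    """Crude literal/URI test, good enough for this sketch."""
    return value.startswith(("http://", "https://", "mailto:"))


class TripleIndex:
    def __init__(self, ifp_properties):
        self.ifp_properties = set(ifp_properties)
        self.uri_index = defaultdict(set)      # resource URI -> documents
        self.ifp_index = defaultdict(set)      # (IFP, value) -> documents
        self.keyword_index = defaultdict(set)  # lowercase word -> documents

    def add_document(self, doc_url, triples):
        for subject, predicate, obj in triples:
            self.uri_index[subject].add(doc_url)
            if looks_like_uri(obj):
                self.uri_index[obj].add(doc_url)
            else:
                for word in obj.lower().split():
                    self.keyword_index[word].add(doc_url)
            if predicate in self.ifp_properties:
                self.ifp_index[(predicate, obj)].add(doc_url)


index = TripleIndex(ifp_properties=[FOAF_MBOX])
index.add_document("http://example.org/a.rdf", [
    ("http://example.org/a#me", FOAF_NAME, "Alice Example"),
    ("http://example.org/a#me", FOAF_MBOX, "mailto:alice@example.org"),
])
index.add_document("http://example.org/b.rdf", [
    ("http://example.org/b#alice", FOAF_MBOX, "mailto:alice@example.org"),
])

# Both documents describe the same person, found through the shared IFP value.
print(sorted(index.ifp_index[(FOAF_MBOX, "mailto:alice@example.org")]))
```

The IFP lookup is what lets two documents that use different URIs for the same person be retrieved together.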

13
Sindice
____________________________
  • The Hadoop architecture is used in the following
    manner:
      • Sindice employs Hadoop/Nutch to distribute
        crawling jobs across multiple machines
      • Collected data is stored in HBase, a distributed
        column-oriented store
      • Large datasets are handled efficiently across the
        cluster using a MapReduce implementation
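
The MapReduce pattern mentioned above can be illustrated without a cluster. The sketch below emulates the map, shuffle, and reduce phases in plain Python to group documents by the subject URIs they mention; it does not use the Hadoop API, and every function name in it is made up for the example.

```python
from collections import defaultdict


def map_phase(doc_url, triples):
    """Emit one (subject URI, document URL) pair per triple."""
    for subject, _predicate, _obj in triples:
        yield subject, doc_url


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse the values for one key into a sorted, de-duplicated list."""
    return key, sorted(set(values))


def build_uri_index(corpus):
    pairs = (
        pair
        for doc_url, triples in corpus.items()
        for pair in map_phase(doc_url, triples)
    )
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


corpus = {
    "a.rdf": [("ex:alice", "ex:knows", "ex:bob"), ("ex:alice", "ex:name", "Alice")],
    "b.rdf": [("ex:alice", "ex:age", "30"), ("ex:bob", "ex:name", "Bob")],
}
print(build_uri_index(corpus))
```

On a real cluster the map and reduce calls would run on different machines over partitions of the data; only the shape of the computation is shown here.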

14
Sindice
____________________________
  • SINDICE DEMO

15
SWSE
____________________________
  • The Semantic Web Search Engine (SWSE) was also
    created at the Digital Enterprise Research
    Institute (DERI)
  • SWSE uses MultiCrawler, a pipelined
    architecture for crawling
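
A pipelined crawler passes each document through a fixed sequence of stages. The sketch below chains three hypothetical stages (fetch, detect, index) with Python generators and a fake in-memory "web"; the stage names and the RDF detection test are assumptions, and a real pipeline would run the stages concurrently rather than lazily in one thread.

```python
# Hypothetical stand-in for the network, so the example runs offline.
FAKE_WEB = {
    "http://example.org/foaf.rdf": "<rdf:RDF>...</rdf:RDF>",
    "http://example.org/index.html": "<html></html>",
}


def fetch_stage(urls, fetcher):
    """Stage 1: retrieve the body of each URL."""
    for url in urls:
        yield url, fetcher(url)


def detect_stage(pages):
    """Stage 2: pass on only bodies that look like RDF/XML."""
    for url, body in pages:
        if "<rdf:RDF" in body:
            yield url, body


def index_stage(documents):
    """Stage 3: store what survived the earlier stages."""
    index = {}
    for url, body in documents:
        index[url] = body
    return index


def run_pipeline(urls, fetcher):
    return index_stage(detect_stage(fetch_stage(urls, fetcher)))


result = run_pipeline(list(FAKE_WEB), lambda url: FAKE_WEB.get(url, ""))
print(list(result))
```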

16
Watson
____________________________
  • Created at the Knowledge Media Institute (KMi) at
    the Open University, UK
  • Major design principles:
      • Considers explicit and implicit relations between
        ontologies
      • Ranks ontologies with a focus on quality over
        popularity

17
Watson
____________________________
  • WATSON DEMO

18
Falcon
____________________________
  • Falcon is a Semantic Web search engine created at
    the Institute of Web Science in China
  • Falcon allows keyword-based queries on:
      • Objects
      • Concepts
      • Documents
  • Falcon performs class subsumption reasoning
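
Class subsumption reasoning means that a query for a class also returns instances of its subclasses. A minimal sketch, assuming the class hierarchy is given as (subclass, superclass) pairs; the class and object names are invented, and this is not Falcon's implementation.

```python
from collections import defaultdict


def subclasses_of(cls, subclass_of):
    """Return cls plus every class transitively declared as its subclass."""
    children = defaultdict(set)
    for child, parent in subclass_of:
        children[parent].add(child)
    result, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for child in children[current]:
            if child not in result:
                result.add(child)
                stack.append(child)
    return result


def objects_of_class(cls, subclass_of, types):
    """Objects typed with cls or any class it subsumes."""
    wanted = subclasses_of(cls, subclass_of)
    return sorted(obj for obj, obj_class in types if obj_class in wanted)


subclass_of = [("Professor", "Person"), ("FullProfessor", "Professor")]
types = [
    ("alice", "FullProfessor"),
    ("bob", "Person"),
    ("paper1", "Document"),
]
print(objects_of_class("Person", subclass_of, types))
```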

19
Falcon
____________________________
  • FALCON DEMO

20
Summary
____________________________
  • Swoogle
  • Others
      • Sindice
          • Indexes on URI, IFP, keywords
          • Use of Hadoop architecture
      • SWSE
          • Pipelined architecture for crawling
      • Watson
          • Implicit relations between SWDs
      • Falcon
          • Class subsumption reasoning
          • Keyword-based search
          • Searches ontologies and instance data

21
Issues
____________________________
  • Crawling
      • Swoogle's crawler runs as a single thread
        on one machine
      • This limits the number of SWDs discovered and revisited
  • Possible solutions
      • Use of the Hadoop architecture
      • Use of Grub
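
To see why a single-threaded crawler is limiting, compare it with even a small thread pool on one machine. The sketch below is neither of the two solutions proposed on the slide (Hadoop or Grub); it only shows the simpler step of fetching several URLs concurrently. The `fake_fetch` function stands in for a real HTTP request and is an assumption of the example.

```python
from concurrent.futures import ThreadPoolExecutor


def crawl_concurrently(urls, fetch, max_workers=8):
    """Fetch all URLs with a pool of worker threads and return {url: body}."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        bodies = pool.map(fetch, urls)  # results come back in input order
        return dict(zip(urls, bodies))


def fake_fetch(url):
    """Hypothetical fetcher; a real crawler would issue an HTTP request here."""
    return "body of " + url


pages = crawl_concurrently(
    ["http://example.org/a.rdf", "http://example.org/b.owl"],
    fake_fetch,
    max_workers=2,
)
print(pages)
```

Threads help while the crawler is waiting on the network, but they still share one machine's bandwidth and storage, which is why the slide looks to distributed approaches.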

22
Other Issues
____________________________
  • Crawling large structured datasets like DBpedia
  • More reasoning
  • More services

23
References
____________________________
  • Li Ding et al., "Swoogle: A Search and Metadata
    Engine for the Semantic Web", Proceedings of the
    Thirteenth ACM Conference on Information and
    Knowledge Management, November 2004.
  • P. Mika, G. Tummarello, "Web Semantics in the
    Clouds", IEEE Intelligent Systems, 23(5),
    September 2008.
  • E. Oren, R. Delbru, M. Catasta, R. Cyganiak, H.
    Stenzhorn, G. Tummarello, "Sindice.com: A
    document-oriented lookup index for open linked
    data", International Journal of Metadata,
    Semantics and Ontologies, 3(1), 2008.
  • Mathieu d'Aquin et al., "Watson: A Gateway for
    the Semantic Web", Poster session of the European
    Semantic Web Conference (ESWC), 2007.
  • Gong Cheng, Weiyi Ge, Honghan Wu, Yuzhong Qu,
    "Searching Semantic Web Objects Based on Class
    Hierarchies", WWW 2008 Workshop on Linked Data
    on the Web, 2008.

24
Questions?
____________________________