Title: Two-Level Semantic Annotation Model
1. Two-Level Semantic Annotation Model
- BYU Spring Conference 2007
- Yihong Ding
Sponsored by NSF
2. Semantic Web
- Content is represented in
- explicit,
- shared,
- conceptualizations.
- Consequence
- Machines can understand meaning
- Machines can derive implicit logics
3. Semantic Annotation Enables the Semantic Web
- Metadata links data in a web page to defined concepts in an ontology
- Annotated data becomes machine-processable
4. Sample Annotation
<rdf:Description rdf:about="webpageCarIns13">
  <carMileage>116000</carMileage>
  <carPrice>3450</carPrice>
  <carMake>FORD</carMake>
  <carYear>1986</carYear>
  .....
</rdf:Description>
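A minimal sketch of how an annotation like the sample could be generated, using Python's standard xml.etree.ElementTree. The rdf namespace URI is the standard W3C one; the car namespace URI and the annotate helper are hypothetical and do not come from the slides.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CAR_NS = "http://example.org/car-ontology#"  # hypothetical ontology namespace

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("car", CAR_NS)


def annotate(instance_id, values):
    """Build an rdf:Description that ties extracted values to ontology concepts."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": instance_id}
    )
    for concept, value in values.items():
        # One child element per ontology concept, holding the page value.
        ET.SubElement(desc, f"{{{CAR_NS}}}{concept}").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = annotate(
    "webpageCarIns13",
    {"carMileage": "116000", "carPrice": "3450", "carMake": "FORD", "carYear": "1986"},
)
print(xml_text)
```

Each extracted value becomes a child element named after its concept, so a consumer that knows the ontology can process the values without parsing the page again.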
5. Two Contradictory Annotation Methods
- Pure ontology-based annotation
- Layout-driven annotation
6. Ontology-based Annotation
- Using defined data-recognition patterns in ontologies to directly annotate web content
- Examples in extraction ontologies
- External Representation
- Price \d\d?\d?\d.\d\d
- Contextual Representation
- Context phrases (left, right), e.g. \?
- Context keywords, e.g. price obo neg(\.otiable)
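The combination of a value pattern with context phrases and keywords can be sketched as follows. The patterns on the slide are only partly legible, so the regular expressions here are illustrative stand-ins, and the names PRICE_VALUE, LEFT_CONTEXT, RIGHT_CONTEXT and find_prices are hypothetical.

```python
import re

# External representation: what a price value looks like (assumed pattern).
PRICE_VALUE = re.compile(r"(?<![\d.])\d{1,4}(?:\.\d\d)?(?!\d)")
# Contextual representation: phrases expected just left or right of a price.
LEFT_CONTEXT = re.compile(r"(?:\$|price)\s*$", re.IGNORECASE)
RIGHT_CONTEXT = re.compile(r"^\s*(?:obo|neg(?:\.|otiable))", re.IGNORECASE)


def find_prices(text, window=10):
    """Return value matches that are confirmed by a nearby context phrase."""
    hits = []
    for m in PRICE_VALUE.finditer(text):
        left = text[max(0, m.start() - window):m.start()]
        right = text[m.end():m.end() + window + 5]
        if LEFT_CONTEXT.search(left) or RIGHT_CONTEXT.match(right):
            hits.append(m.group())
    return hits


ad = "1986 FORD, 116000 miles, price $3450 obo"
print(find_prices(ad))
```

In this sketch the year 1986 fits the value pattern but is rejected because no price context surrounds it, which is the job of the contextual representation.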
7. Ontology-based Annotation
- Pros
- Resilient to varying domains and page layouts
- No mapping requirements between recognition patterns and ontology definitions
- Cons
- Accuracy depends on how good the knowledge base is
- Creation of recognition patterns
- Execution speed
8. Layout-Driven Annotation
- Using defined page layout patterns to annotate web content
- Example (screen-scraper)
<tr>
<td colspan="2" class="pageHeading" valign="top"> <h1>You've Got Mail</h1> </td>
</tr>
<tr>
<td align="center" valign="top" class="smallText" rowspan="2">
<script language="javascript" type="text/javascript"><!--
document.write('<a href="javascript:popupWindow(\'http://www.screen-scraper.com/shop/index.php?main_page=popup_image&amp;pID=7\')"><img src="images/dvd/youve_got_mail.gif" border="0" alt="You\'ve Got Mail" title=" You\'ve Got Mail " width="100" height="80" hspace="5" vspace="5" /><br />larger image<\/a>')
//--></script>
<noscript><a href="http://www.screen-scraper.com/shop/index.php?main_page=images/dvd/youve_got_mail.gif" target="_blank"><img src="images/dvd/youve_got_mail.gif" border="0" alt="You've Got Mail" title=" You've Got Mail " width="100" height="80" hspace="5" vspace="5" /><br /> larger image</a></noscript>
</td>
<td class="main" align="center" valign="top">Model DVD-YGEM</td>
</tr>
<tr> <td class="main" align="center"></td> </tr>
<tr> <td align="center" class="pageHeading">34.99</td> <td class="main" align="center">Shipping Weight 7.00 lbs.</td> </tr>
<tr> <td>&nbsp;</td> <td class="main" align="center">10 Units in Stock</td> </tr>
<tr> <td class="main" align="center">Manufactured by Warner</td> <td align="center">
<table border="0" width="150px" cellspacing="2" cellpadding="2">
<tr> <td align="center" class="cartBox">&nbsp;Quantity
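A layout pattern for a page like the screen-scraper example might look like the sketch below. It is an assumption about how such a pattern could be written, not screen-scraper's actual syntax; PRICE_CELL, LAYOUT_PATTERNS and annotate_by_layout are hypothetical names.

```python
import re

# Hypothetical layout pattern: the price is whatever text sits in a centered
# table cell with class "pageHeading". It describes page structure only and
# knows nothing about what a price looks like.
PRICE_CELL = re.compile(r'<td align="center" class="pageHeading">([^<]*)</td>')

# Hand-maintained mapping from ontology concepts to layout patterns.
LAYOUT_PATTERNS = {"Price": PRICE_CELL}


def annotate_by_layout(html):
    """Apply every layout pattern and return {concept: extracted text}."""
    result = {}
    for concept, pattern in LAYOUT_PATTERNS.items():
        m = pattern.search(html)
        if m:
            result[concept] = m.group(1).strip()
    return result


page = (
    '<tr> <td colspan="2" class="pageHeading" valign="top"> '
    "<h1>You've Got Mail</h1> </td> </tr>"
    '<tr> <td align="center" class="pageHeading">34.99</td> '
    '<td class="main" align="center">10 Units in Stock</td> </tr>'
)
print(annotate_by_layout(page))
```

The pattern is a single fast match, but a page that renders the price in any other markup returns nothing until someone rewrites the pattern.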
9. Layout-Driven Annotation
- Pros
- More accurate
- Faster
- No knowledge base required
- Cons
- Layout-pattern generation
- Layout-pattern regeneration
- Layout-pattern maintenance
- Mappings between layout patterns and ontology definitions
10. Observation
- These two annotation methods are complementary.
11. Two-Layer Annotation Model
[Diagram; labels as extracted: Massive Annotation Process, Structural Annotator, Document, Sample Annotation Process, Conceptual Annotator using ontology-based IE wrapper]
12. [HTML source of the screen-scraper sample page, identical to the example on slide 8]
Data-recognition pattern (Price): Price \?\d\d?\d?\d.\d\d
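One way the two layers could cooperate is sketched below: a conceptual annotator uses an ontology-style value pattern on a sample page to discover which cell holds the price, then hands the surrounding markup to a structural annotator as a layout pattern for the remaining pages. The slides do not specify this mechanism; the value pattern and the functions learn_layout_pattern and annotate_pages are assumptions for illustration.

```python
import re

# Conceptual layer: ontology-style value pattern for Price (assumed stand-in).
PRICE_VALUE = re.compile(r"\$?\d{1,4}\.\d\d")
# A table cell whose content is plain text: (opening tag, text).
CELL = re.compile(r"(<td[^>]*>)([^<]*)</td>")


def learn_layout_pattern(sample_html):
    """Conceptual annotator: find the cell whose whole text is a price and
    turn its opening tag into a layout pattern."""
    for tag, text in CELL.findall(sample_html):
        if PRICE_VALUE.fullmatch(text.strip()):
            return re.compile(re.escape(tag) + r"([^<]*)</td>")
    return None


def annotate_pages(layout_pattern, pages):
    """Structural annotator: apply the learned layout pattern to many pages."""
    prices = []
    for html in pages:
        m = layout_pattern.search(html)
        prices.append(m.group(1).strip() if m else None)
    return prices


sample = (
    '<td class="main" align="center">Shipping Weight 7.00 lbs.</td>'
    '<td align="center" class="pageHeading">34.99</td>'
)
others = [
    '<td align="center" class="pageHeading">19.95</td>',
    '<td align="center" class="pageHeading">7.50</td>',
    '<td class="main" align="center">10 Units in Stock</td>',
]
pattern = learn_layout_pattern(sample)
print(annotate_pages(pattern, others))
```

Because the layout pattern is derived from a cell that the ontology already recognized as a Price, no hand-written mapping between the pattern and the ontology is needed in this sketch.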
13. Two-Layer Annotation Model: What Do We Gain?
- All the benefits of the two annotation methods
- Resilient
- Mapping problem eliminated
- Knowledge-base problems mitigated
- Maintenance problem mitigated
- More accurate
- Faster
- Even more
- Automatic augmentation of knowledge base
- Key to large-scale annotation for the web
14. Conclusion
- Here, "contradictory" turns out to be a synonym of "complementary"
- The two-level annotation model solves a traditional problem in semantic annotation, and it is a key to large-scale annotation for the web