Title: Tabular Topic Maps
1 Tabular Topic Maps
- Digital Camera Accessories TTM at Kodak
- Content Versioning
- Ontology Validation
- Tools
2 Need for the project
- Complexity was increasing
- The number of products offered for sale.
- The number of languages the product information
was presented in.
- Granularity of information content variation
across selling regions around the world.
- The relationships among various products.
- All of these requirements needed to be met with
no increase in labor content.
- Excel should be used as a familiar and ubiquitous
company-wide UI.
- Spreadsheets should be human readable. The learning
curve should be minimal.
- Spreadsheets should be sufficiently encapsulated
and defined so that different departments can
work on their own tasks independently.
- There should be an easy way to add new types of
relationships without changing existing
data and processing flow.
- Information should be mergeable.
- There should be a defined process that turns
merged knowledge into a website.
3 Progress
- Started with the workbook design that was already
in place.
- Optimized it and gradually moved to Tabular Topic
Maps, keeping the UI design user friendly, the
syntax neutral, and the design hospitable to new
types of relationships.
- Sheets are independent and yet provide a clear
interface for managing common sets of
interconnected information objects.
- Requires one to think and organize content in
terms of topics (products, regions, languages),
vocabularies, associations, roles and contexts.
- The project started with a simple requirement to
provide typical selling information relationships
about Kodak's digital camera accessories and has
triumphantly developed into a large content
management challenge.
- Currently multilingual information and
relationships between digital cameras,
accessories and selling regions yield over
300,000 unique combinations on the Kodak.com
website.
4 Full Cycle
5 Workflow
6 Why Controlled Vocabularies
- Challenge
- Different teams work on various pieces of
information independently and in parallel.
- How can the results of their work be merged?
- Agreements that allow unambiguous reference to
topics of discourse.
- Should agree on the controlled vocabularies
- Should agree on the naming convention within
controlled vocabularies
- Thereafter topics can be unambiguously identified
by a unique combination of
- Controlled vocabulary
- Unique name in the scope of that vocabulary
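The merge question above can be illustrated with a minimal sketch, assuming each team's sheet is reduced to rows of (vocabulary, name, property, value). The row layout, the function name `merge_sheets`, and the sample rows are hypothetical and are not taken from the Kodak workbooks.

```python
from collections import defaultdict


def merge_sheets(*sheets):
    """Merge rows from independently edited sheets.

    Each row is (vocabulary, name, property, value). The pair
    (vocabulary, name) is the agreed, unambiguous topic key.
    """
    topics = defaultdict(dict)
    for sheet in sheets:
        for vocabulary, name, prop, value in sheet:
            topics[(vocabulary, name)][prop] = value
    return dict(topics)


# Hypothetical rows written by two teams about the same topic.
product_team = [("category", "memory", "label", "Memory")]
marketing_team = [("category", "memory", "region", "US")]

merged = merge_sheets(product_team, marketing_team)
```

Because both teams use the same vocabulary and name, their rows land on one topic without either team seeing the other's sheet.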
7 Why Ontology
- An upper vocabulary should exist that covers
- Known controlled vocabularies
- Common classes of topics, topic occurrences
(objects), relationships between topics
(associations) and role types played by topics in
associations.
- Constraints
- There should exist common sense constraints
imposed on topic characteristics in various
contexts.
- There should exist a computer-understandable
schema of constraints and correlations between
topic characteristics.
- An agreement about possible relationships between
products, accessories, OS, countries, languages
and other topics, and about roles that various
types of topics can play.
- TTM Ontology: rules for creating Subject
Indicators and constraints.
- Building Domain Ontology: modeling common
ground between parties.
8 Subject Identifiers and URN Notation in Excel
urn:kodak:lang:de
urn:Kodak:category:memory
urn:kodak:object:productinformation
urn:kodak:ekn:EKN006544
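A minimal sketch of how such identifiers could be split into their parts, assuming the colon-separated reading `urn:<namespace>:<vocabulary>:<name>`. The names `SubjectId`, `parse_urn` and `format_urn` are hypothetical helpers, and the case-insensitive handling of the namespace is an assumption rather than a documented TTM rule.

```python
from typing import NamedTuple


class SubjectId(NamedTuple):
    namespace: str   # e.g. "kodak"
    vocabulary: str  # controlled vocabulary, e.g. "lang"
    name: str        # unique name within that vocabulary, e.g. "de"


def parse_urn(urn: str) -> SubjectId:
    """Split an assumed urn:<namespace>:<vocabulary>:<name> identifier."""
    parts = urn.strip().split(":", 3)
    if len(parts) != 4 or parts[0].lower() != "urn" or not all(parts[1:]):
        raise ValueError(f"unexpected subject identifier: {urn!r}")
    # Assumption: the namespace part is compared case-insensitively.
    return SubjectId(parts[1].lower(), parts[2], parts[3])


def format_urn(sid: SubjectId) -> str:
    """Rebuild the identifier string from its parts."""
    return f"urn:{sid.namespace}:{sid.vocabulary}:{sid.name}"


lang = parse_urn("urn:kodak:lang:de")
```

Keeping the vocabulary and the name as separate fields mirrors the agreement that a topic is identified by a controlled vocabulary plus a unique name within it.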
9 Subject Identifiers and URN Notation in Content
Files
- Metadata attribute on the content container uses
the same URN notation to indicate the topic that
it belongs to.
- ...3877" urn="urn:kodak:category:gear"
- Gear
- Keep your camera safe and secure when
traveling with Kodak camera bags. Try our cables
for connecting to your television or computer,
too.
- Wherever you go, keep your digital camera
safe and secure with a Kodak camera bag. We have
three convenient sizes: large, medium and small
for every photographic need. Kodak also offers
a wide range of cables for connecting your camera
to computers or televisions.
- Spider creates a topic. This matches the URN
used in the workbook, allowing the XSL that
creates output to connect the content as
needed.
- xlink:href="urn:kodak:category:gear"/
- xlink:href="urn:kodak:object:content"/
- ...k:lang:en"/
10 Using Scalable Vector Graphics to Display
Relationships
Accessories availability by region.
11 Handling Content Version Changes in the Topic Map
- Requirement
- Minimize the number of generated files to be
pushed to production.
- Use XSLT.
- Solution
- Compute the difference between the old and the
new topic maps.
- Base the difference computation on the underlying
ontology.
- Use diff results to drive file generation for new
and modified topics and delete instructions for
deleted topics.
- Require every topic to have at least one subject
identifier.
- Processing
- Use subject identifiers to find new and deleted
subjects.
- Compare topic base names (scope).
- Compare topic occurrences (scope, type).
- Find new and deleted associations (scope, primary
players).
- Compute new and deleted association role players.
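The processing steps above can be sketched in Python for illustration. The slides call for XSLT, so this is a swapped-in language, not the project's implementation. The data layout (topics keyed by subject identifier, names and occurrences as sets of tuples, associations as hashable tuples), the function name `diff_topic_maps` and the sample topics are all assumptions.

```python
def diff_topic_maps(old_topics, new_topics, old_assocs=(), new_assocs=()):
    """Return which topics need regenerated files and which need deleting.

    Topics: {subject_id: {"names": {(scope, name)},
                          "occurrences": {(scope, type, value)}}}
    Associations: (type, scope, frozenset({(role, player_id), ...}))
    """
    old_ids, new_ids = set(old_topics), set(new_topics)
    added = new_ids - old_ids        # new subjects
    deleted = old_ids - new_ids      # deleted subjects
    kept = old_ids & new_ids
    # A kept topic is modified if its names or occurrences differ.
    modified = {
        sid for sid in kept
        if old_topics[sid]["names"] != new_topics[sid]["names"]
        or old_topics[sid]["occurrences"] != new_topics[sid]["occurrences"]
    }
    # New or deleted associations touch every kept topic that plays a role.
    for _type, _scope, players in set(old_assocs) ^ set(new_assocs):
        for _role, player in players:
            if player in kept:
                modified.add(player)
    return {"generate": sorted(added | modified), "delete": sorted(deleted)}


# Hypothetical old and new topic maps.
old = {
    "urn:kodak:category:gear": {"names": {("en", "Gear")}, "occurrences": set()},
    "urn:kodak:category:memory": {"names": {("en", "Memory")}, "occurrences": set()},
}
new = {
    "urn:kodak:category:gear": {"names": {("en", "Gear"), ("de", "Gear-DE")}, "occurrences": set()},
    "urn:kodak:category:power": {"names": {("en", "Power")}, "occurrences": set()},
}
plan = diff_topic_maps(old, new)
```

Only the topics listed under `generate` would need new output files, which is what keeps the production push small.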
12 Practical Ontologies for Tabular Topic Maps
- Inter-team slip-ups.
- Duplicated information
- Contradictory information
- Cross-team miscommunications.
- Examples
- A new product was added by one team, but the
marketing team did not add it to the
recommended/compatible associations.
- A new language was added, but translators or
content creators did not realize it or lagged
behind.
13 Semantic Web Glasses for Topic Maps and RDF
http://www.cogx.com/swglasses
14 Processing
- XSL transforms the topic map from its XML
representation to an RDF representation.
- OWL Validator generates an error report based on
the OWL ontology referenced as topic and
characteristics classes.
Diagram: XML Topic Map -> XSLT Translator -> RDF
Topic Map -> OWL Validator -> Error Report
15 Validating Topic Maps with DAML tools
16 Contextual Ontologies for Topic Maps
- Constraints are contextual.
- A set of classes cannot be constrained without a
context.
- The same topic can be an instance of
- A nice little boy in the context of his mother.
- A troublemaker in the context of teacher X.
- A set of classes can be constrained when the
context is specified.
- A constraint can exist that in the scope of
teacher X all students are one of
- Valedictorian,
- Perfect student (teacher's pet in the scope of the
rest of the class),
- Bright mind,
- Average student,
- Troublemaker.
- No explicit notion of context/scope in OWL
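The teacher X example can be sketched as a scope-keyed constraint table. The sketch assumes that "one of" means exactly one class from the list per student and that scopes without an entry are unconstrained. The identifiers `ALLOWED_BY_SCOPE`, `check_scoped_classes` and the class and scope labels are hypothetical.

```python
# Classes allowed for a student in the scope of teacher X (from the slide).
ALLOWED_BY_SCOPE = {
    "teacher-x": {
        "valedictorian", "perfect-student", "bright-mind",
        "average-student", "trouble-maker",
    },
}


def check_scoped_classes(topic_id, classes_by_scope, allowed_by_scope):
    """Report scopes where a topic does not match exactly one allowed class."""
    violations = []
    for scope in sorted(classes_by_scope):
        allowed = allowed_by_scope.get(scope)
        if allowed is None:
            continue  # no constraint is defined for this context
        matches = classes_by_scope[scope] & allowed
        if len(matches) != 1:
            violations.append((topic_id, scope, sorted(classes_by_scope[scope])))
    return violations


# The same topic carries different classes in different contexts.
boy = {"mother": {"nice-little-boy"}, "teacher-x": {"trouble-maker"}}
ok = check_scoped_classes("pupil-1", boy, ALLOWED_BY_SCOPE)

misfiled = {"teacher-x": {"nice-little-boy"}}
bad = check_scoped_classes("pupil-2", misfiled, ALLOWED_BY_SCOPE)
```

The class "nice-little-boy" is valid in the mother's scope only because no constraint is registered there, and it fails in the teacher's scope where the set of classes is closed.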
17 Modeling Ontology
- Find the contexts of the information to be
modeled.
- Language, geographical, etc.
- Find and model common ground between involved
parties.
- Example: A legal family may be constrained to
- Contain 1 man and 1 woman in the context of
pre-200x USA,
- Contain any 2 spouses in the context of post-200x
USA,
- Contain 1 man and 1 or more women in some other
parts of the world.
- Example: There should be text labels in every
language context.
- Contexts may be expressed directly using scopes,
or indirectly via topic characteristics patterns.
- Outcome
- Track such inconsistencies as missing or
duplicate occurrences of required types, invalid
associations between topics, etc.
- Some of the errors that we are able to catch are
very tedious to find and normally show up only
upon reviewing a corpus of generated web pages.
Automated error tracking provides a higher level
of content quality for the real-time page
creation process.
18 Coherent Contextual Characteristics Constants. C4
Expressions
One set of topic characteristics constrains other
characteristics of a topic in a given context.
- Ad hoc class description.
- Class definition
- A topic is an instance of an ad hoc class if it
matches this pattern.
- Set of constraints
- An instance of this ad hoc class should also
match all patterns from this set of constraints.
- The class definition and the set of constraints
contain CCX topic queries.
- CCX topic queries may reference ad hoc classes
defined elsewhere.
19 Processing
- XSL templates turn C4 expressions into a
validating stylesheet.
- The validating stylesheet applied to a topic map
generates a report displaying warnings and errors.
Diagram: Topic Map Constraints and Topic Map ->
XSLT Validator -> Error Report
20 C4 Expression example (Kodak).
- .../1.0/psi1.xtmat-class-instance"
- ...1.xtmrole-instance"/
- ...psi1.xtmrole-class"/
- ...egory"/
- cardinality="1"
- cardinality="1"
21 Topic Map Ontology and Validation
- A sample of a few violations found in the
operational topic map for the accessories site.
Note: we have both missing components and
duplicate components.
22 C4-Expression Example 1.
- Example
- Pattern: a class of all topics that have US
passports
- Restrictions: must have one and only one SSN
type="urn:us:ssn:num" cardinality="1"/
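The pattern and restriction pair in this example can be sketched as a small check, assuming topics are dictionaries with a list of (type, value) occurrences. The occurrence type `urn:us:passport:num`, the reading of the SSN type as `urn:us:ssn:num`, and the function names are assumptions made for illustration. The real validator is generated as an XSLT stylesheet.

```python
def count_occurrences(topic, occurrence_type):
    """Count a topic's occurrences of the given type."""
    return sum(1 for occ_type, _value in topic["occurrences"] if occ_type == occurrence_type)


def validate(topics, pattern_type, restricted_type, cardinality=1):
    """Pattern: topics with at least one `pattern_type` occurrence.
    Restriction: such topics must have exactly `cardinality`
    occurrences of `restricted_type`.
    Returns (topic id, found count) for every violation.
    """
    errors = []
    for topic in topics:
        if count_occurrences(topic, pattern_type) == 0:
            continue  # not an instance of the ad hoc class
        found = count_occurrences(topic, restricted_type)
        if found != cardinality:
            errors.append((topic["id"], found))
    return errors


PASSPORT = "urn:us:passport:num"  # hypothetical occurrence type
SSN = "urn:us:ssn:num"            # assumed reading of the slide's type

# Hypothetical topics.
topics = [
    {"id": "alice", "occurrences": [(PASSPORT, "P1"), (SSN, "S1")]},
    {"id": "bob", "occurrences": [(PASSPORT, "P2")]},
    {"id": "carol", "occurrences": [(PASSPORT, "P3"), (SSN, "S3"), (SSN, "S4")]},
    {"id": "dave", "occurrences": []},
]
errors = validate(topics, PASSPORT, SSN)
```

A count of 0 corresponds to a missing component and a count above 1 to a duplicate component, the two kinds of violation the validation report distinguishes.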
23 C4-Expression 2a
- A class of all topics that graduated from XYZ
- Topic that plays the
"http://www.cogx.com/psi.xtm/role-student" role
in an association of class
"http://www.cogx.com/psi.xtm/at-graduates" where
the "http://www.cogx.com/psi.xtm/role-college"
role is played by urn:us:college:xyz
- Must have at least one web page
at-graduates role
"http://www.cogx.com/psi.xtm/role-student"
...com/psi.xtm/role-college"/ ref="urn:us:college:xyz"/ onAssociation
...eb-page minCardinality="1"
24 CCX Example 2b
- A class of all topics that graduated from XYZ
- Must have a web page in every language taught at
College XYZ
each="x" in="xyz-taught-langs"
...web-page" cardinality="1" ref="x"/
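The "each x in xyz-taught-langs" restriction can be sketched as a loop over a language set, assuming occurrences are (type, language scope, value) tuples and that the `ref` attribute names the language scope. The function name and the sample data are hypothetical.

```python
from collections import Counter


def web_page_violations(topic, taught_languages, occurrence_type="web-page"):
    """For each taught language, require exactly one web page occurrence.

    Returns {language: found count} for every language where the
    count is not exactly 1.
    """
    counts = Counter(
        scope for occ_type, scope, _value in topic["occurrences"]
        if occ_type == occurrence_type
    )
    return {lang: counts[lang] for lang in taught_languages if counts[lang] != 1}


# Hypothetical graduate topic and language list.
graduate = {
    "id": "grad-1",
    "occurrences": [
        ("web-page", "en", "page-en"),
        ("web-page", "de", "page-de-1"),
        ("web-page", "de", "page-de-2"),
    ],
}
violations = web_page_violations(graduate, ["en", "de", "fr"])
```

Because the language list is itself data, adding a language to it makes every graduate without a page in that language show up in the report with no change to the constraint.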