Title: CONCEPT MODELING
1. CONCEPT MODELING
A Proposed Hybrid Approach to Patent Modeling
Đorđe Popović, Ognjen Šćekić, Veljko Milutinović
2. What is concept modeling?
- A way of modeling reality
- Identifying concepts
- Identifying relations among concepts
- Organizing the concepts in a knowledge base, allowing an "intelligent" way to search and process this data.
- Why do we need concept modeling? To make electronic resources not only machine-processable, but also machine-understandable!
3. Challenges
- How to create a model that has a uniform structure, and is powerful enough to capture the essence of any concept?
- How should these models be linked into an efficient structure?
- How can we bridge the gap between natural language and a machine-processable model?
4. 7 Ws: PROs and CONs
[Diagram: a concept at the center, surrounded by the 7 Ws: Which, When, What, Why, Where, Who, (W)How]
- Ultimate goal: from a specific to a general model!
- WHAT associations provide general facts about any concept.
5. Why start with patents?
- Described by a very formal, structured language: claims.
- Each patent is a novel concept.
- The definition of one patent is usually based on another one.
6. Structure of a Patent Document
- General info about the patent (can be used for Which, When, Why, Where, Who and How)
- Description: not well structured
- References to related patents
- Claims: primary target for What
- Abstract of the patent
7. Conceptual Indexing (1)
- What is conceptual indexing?
  - New technique for organizing information to support subsequent access that can dramatically improve your ability to find the information you need, with less hassle and with better results. (William A. Woods)
- Conceptual indexing combines techniques of
  - Knowledge representation
  - Natural language processing
  - Classical techniques for indexing words and phrases
- Bridges the gap between natural language and a machine-processable model.
8. Conceptual Indexing (2)
- Conceptual indexing technology is a combination of
  - Concept extractor
    - Identifies phrases to be indexed.
  - Concept assimilator
    - Analyzes a concept phrase to determine its place in the conceptual taxonomy.
  - Conceptual retrieval system
    - Uses the conceptual taxonomy to make connections between requested and indexed items.
Figure 1: Main components of a conceptual indexer
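The three components can be pictured with a toy pipeline. This is only a minimal sketch under strong assumptions: the taxonomy is a small hand-written, acyclic child-to-parent table, the extractor does plain substring matching, and all names (`TAXONOMY`, `extract_concepts`, `ancestors`, `build_index`, `retrieve`) as well as the sample documents are made up for illustration. It is not the indexer described by Woods.

```python
from collections import defaultdict

# Hypothetical, hand-written taxonomy: child concept -> parent concept.
# Assumed to be acyclic.
TAXONOMY = {
    "synthetic grass": "playing surface",
    "natural grass": "playing surface",
    "playing surface": "surface",
}


def extract_concepts(text, known_phrases):
    """Concept extractor (simplified): known phrases that occur in the text."""
    lowered = text.lower()
    return [phrase for phrase in known_phrases if phrase in lowered]


def ancestors(concept, taxonomy):
    """Concept assimilator (simplified): concepts that subsume `concept`."""
    chain = []
    while concept in taxonomy:
        concept = taxonomy[concept]
        chain.append(concept)
    return chain


def build_index(documents, taxonomy):
    """Index each document under its extracted concepts and their ancestors."""
    known = set(taxonomy) | set(taxonomy.values())
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for concept in extract_concepts(text, known):
            index[concept].add(doc_id)
            for parent in ancestors(concept, taxonomy):
                index[parent].add(doc_id)
    return index


def retrieve(index, query):
    """Conceptual retrieval: documents indexed under the requested concept."""
    return sorted(index.get(query.lower(), set()))


# Invented sample documents.
documents = {
    "doc-a": "A sport field covered with synthetic grass.",
    "doc-b": "Maintenance of natural grass in stadiums.",
    "doc-c": "A method for polishing a metal surface.",
}
index = build_index(documents, TAXONOMY)
```

A request for "playing surface" finds doc-a and doc-b even though neither text contains that phrase, because the taxonomy places both kinds of grass under it.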
9. Hybrid Approach: Indices and RDF/OWL
- Conceptual indices
- RDF/OWL
- Motivation: Use the advantages of one approach to eliminate the drawbacks of the other.
10. Conceptual Indices vs. RDF/OWL
                 | Conceptual indices                                             | RDF/OWL ontologies
-----------------|----------------------------------------------------------------|----------------------------
Major advantages | Linear-complexity structures                                   | Very expressive and precise
                 | Provide basic subsumption relations                            | Based on first-order logic
                 | Provide built-in knowledge on low-level concepts               | Supported by W3C
Major drawbacks  | Cannot establish explicit relations among high-level concepts  | Great complexity
                 | Cannot create precise models                                   |
11. Why not use ontologies alone?
- If we want to use an ontology, we have 2 choices:
  - Use an existing, well-established ontology that might not suit our needs.
  - Create a new ontology which does suit our needs.
    - We can create several different ontologies, depending on how we want to capture the information.
    - Problems arise when we want to merge ontologies.
- This approach works fine within a closed community with specific needs:
  - There already exists a well-defined basic ontology structure.
  - Community members have a good knowledge of how to model new concepts in terms of the existing ones.
12. Why not use indices alone?
- For example, let us take the simplest possible definition, for a bird:
  - bird (1): a creature with wings and feathers that lays eggs and can usually fly.
- Our index might then contain the following associations: creature, wings, feathers, eggs, fly.
- A conceptual index does not offer the possibility to state the fact that some birds do not fly!
(1) Word definition taken from Longman Dictionary of Contemporary English, 3rd edition, 1995.
13. Hybrid Approach (1)
- An index of associations represents a simple model, similar to what humans have in mind when they first think of a bird.
- Having enough associations, one can create a model with a considerable degree of accuracy.
- RDF/OWL statements provide a means for expressing additional (but very important) information (e.g. there are birds that cannot fly!).
- We believe this is good enough for most applications.
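The bird example can be sketched as an index of associations plus a handful of explicit statements. This is a hedged illustration, not the authors' implementation: the statements are plain (subject, predicate, object) tuples in the spirit of RDF, and the predicate names `subClassOf` and `cannot`, the `penguin` concept, and both helper functions are assumptions made for this example.

```python
# Index of associations for "bird" (terms from the dictionary definition).
associations = {"bird": {"creature", "wings", "feathers", "eggs", "fly"}}

# Explicit statements as (subject, predicate, object) triples.
# The predicate names "subClassOf" and "cannot" are hypothetical.
statements = {
    ("penguin", "subClassOf", "bird"),
    ("penguin", "cannot", "fly"),
}


def superclasses(concept):
    """All concepts reachable from `concept` through subClassOf statements."""
    found, stack = set(), [concept]
    while stack:
        current = stack.pop()
        for subject, predicate, obj in statements:
            if subject == current and predicate == "subClassOf" and obj not in found:
                found.add(obj)
                stack.append(obj)
    return found


def effective_associations(concept):
    """Own and inherited associations, minus explicitly stated exceptions."""
    terms = set(associations.get(concept, set()))
    for parent in superclasses(concept):
        terms |= associations.get(parent, set())
    exceptions = {
        obj for subject, predicate, obj in statements
        if subject == concept and predicate == "cannot"
    }
    return terms - exceptions
```

The index alone would associate every bird with "fly"; the single `cannot` statement is what lets the model express that a penguin does not.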
14. Hybrid Approach (2)
- It is important to keep track of how many times a term is mentioned, because it affects its descriptive power.
- Example:
  - U.S. Patent 6,989,179, "Synthetic grass sport surfaces", claims section:
    - 1. synthetic grass: 10
    - 2. playing surface: 9
    - ...
- These terms represent the essence of what is being described!
15. Hybrid Approach (3)
- However, this is only because we know what synthetic grass and playing surface are!
- → At some level, we need to have some intrinsic, built-in knowledge base of basic concepts!
- All the other concepts can then be described in terms of these basic concepts.
- Solution: Conceptual indexers are equipped with a knowledge base of basic terms.
16. Patent Model: Conceptual Index
- A patent's Claims section is scanned and processed by a conceptual indexer.
- The result is a descriptive index, associated with the patent (its size is approx. 1-5% of the full text).
- This index can be seen as an ordered list of the patent's WHAT associations (terms, phrases, sentence fragments).
- An entry in the descriptive index contains a low-level concept and the number of its occurrences.
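A descriptive index of this kind can be approximated by counting how often known basic terms occur in the claims text. The sketch below assumes a tiny hypothetical knowledge base (`BASIC_TERMS`) and a made-up claims fragment; a real conceptual indexer would also handle morphology, synonyms and phrase structure, which this example ignores.

```python
import re
from collections import Counter

# Hypothetical knowledge base of basic terms (assumption for this sketch).
BASIC_TERMS = ("synthetic grass", "playing surface", "infill")


def descriptive_index(claims_text, basic_terms=BASIC_TERMS):
    """Return (term, occurrences) pairs, most frequent first."""
    # Normalize case and whitespace so phrases split across lines still match.
    text = re.sub(r"\s+", " ", claims_text.lower())
    counts = Counter()
    for term in basic_terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        occurrences = len(re.findall(pattern, text))
        if occurrences:
            counts[term] = occurrences
    # Order by descending count, then alphabetically for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up claims fragment, not taken from any real patent.
claims = """
1. A synthetic grass playing surface comprising a backing and an infill layer.
2. The playing surface of claim 1, wherein the synthetic grass has ribbons.
3. The synthetic grass of claim 2.
"""
index = descriptive_index(claims)
```

Keeping the entries ordered by count puts the terms with the most descriptive power at the top of the list.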
17. Patent Model: RDF/OWL
- For a different application, a different RDF/OWL model needs to be devised.
- For describing patents, this model could be used to capture explicitly stated information:
  - Patent number and other numbers (→ WHICH)
  - Inventor, examiner, attorney, ... (→ WHO)
  - Date when the patent was filed (→ WHEN)
  - Explicit references to similar patents (→ WHICH)
  - etc.
- Each W can have multiple sub-categories that are application-specific!
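One way to picture such an application-specific structure is a set of statements whose properties are each assigned to a W. The sketch below uses plain Python tuples instead of an RDF library. The property names, the `patent:EXAMPLE-...` identifiers and every value are invented placeholders, not a real vocabulary or real patent data.

```python
from collections import defaultdict

# Hypothetical property names, each mapped to the W it answers.
W_OF_PROPERTY = {
    "patentNumber": "WHICH",
    "references": "WHICH",
    "inventor": "WHO",
    "examiner": "WHO",
    "filingDate": "WHEN",
}

# (subject, property, value) statements; all values are invented placeholders.
statements = [
    ("patent:EXAMPLE-1", "patentNumber", "EXAMPLE-1"),
    ("patent:EXAMPLE-1", "inventor", "Jane Doe"),
    ("patent:EXAMPLE-1", "examiner", "John Roe"),
    ("patent:EXAMPLE-1", "filingDate", "2000-01-31"),
    ("patent:EXAMPLE-1", "references", "patent:EXAMPLE-0"),
]


def group_by_w(subject, all_statements):
    """Group the statements about `subject` by the W their property answers."""
    grouped = defaultdict(list)
    for stmt_subject, prop, value in all_statements:
        if stmt_subject == subject:
            # Properties without a known W are kept under "UNCLASSIFIED".
            grouped[W_OF_PROPERTY.get(prop, "UNCLASSIFIED")].append((prop, value))
    return dict(grouped)


model = group_by_w("patent:EXAMPLE-1", statements)
```

Sub-categories of a W (for example inventor versus examiner under WHO) appear here simply as different properties mapped to the same W.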
18. Patent Model: Creation
Figure 2: Creation of a patent model. The Claims section is processed by the conceptual indexer to produce an index associated with the patent. Additional information about the concept is captured by RDF/OWL statements, in a predefined, application-specific structure.
19. Patent Model: Result
Figure 3: Patent model. WHAT associations are contained in a descriptive index. Other Ws are expressed through RDF/OWL statements.
20. Patent Model: Big Picture
- Descriptive indices are re-processed by the conceptual indexer to form the system index.
- Each entry in the system index retains links to the descriptive indices it originates from, and vice versa.
- This structure allows us to
  - Perform quick searches of the existing patents
  - Add/remove patents easily
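The two-way links between the system index and the descriptive indices resemble an inverted index that also keeps the per-patent term lists. The class below is a minimal sketch under that assumption. `SystemIndex`, its method names and the patent identifiers `P1` and `P2` are hypothetical, and the re-processing step is reduced to copying terms.

```python
from collections import defaultdict


class SystemIndex:
    """Toy system index with links in both directions."""

    def __init__(self):
        self.patents_by_term = defaultdict(set)  # term -> patent ids
        self.descriptive = {}                    # patent id -> {term: count}

    def add_patent(self, patent_id, descriptive_index):
        """Register a patent and link each of its terms back to it."""
        self.descriptive[patent_id] = dict(descriptive_index)
        for term in descriptive_index:
            self.patents_by_term[term].add(patent_id)

    def remove_patent(self, patent_id):
        """Drop a patent, following its own term list to clean up the links."""
        for term in self.descriptive.pop(patent_id, {}):
            self.patents_by_term[term].discard(patent_id)
            if not self.patents_by_term[term]:
                del self.patents_by_term[term]

    def search(self, term):
        """Patents whose descriptive index contains the term."""
        return sorted(self.patents_by_term.get(term, set()))


system = SystemIndex()
system.add_patent("P1", {"synthetic grass": 10, "playing surface": 9})
system.add_patent("P2", {"playing surface": 4, "drainage": 2})
```

Because each patent keeps its own term list, removal only touches the entries that patent contributed to, without scanning the whole system index.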
21. Figure 4: Top-level scheme
22. Patent Model: Patent Relations
- Two ways of establishing relations among patents:
  - Via RDF/OWL statements, using automated reasoners
    - → Problem: Referential integrity and consistency
  - Via the system index (implicit links)
    - → Problem: Inexact, based on probability
23. Patent Model: Implicit Links (1)
- Descriptions of similar concepts (patents) usually make frequent use of similar or even the same terms.
- By determining overlapping terms, we create dynamic, implicit links among similar concepts.
- The number of such implicit links can be used to express similarity among concepts.
- The algorithm for determining the similarity needs to be tweaked empirically.
24. Patent Model: Implicit Links (2)
- For example: When describing two different vaccines, we would probably make frequent use of terms like vaccine, inactivated antigens, immune response, etc.
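Since the slides leave the similarity algorithm open, the sketch below picks one simple candidate as an assumption: the overlapping terms are the implicit links, and a weighted Jaccard score over the term counts expresses similarity. The function names and all counts are invented for illustration, and any real measure would need the empirical tuning mentioned earlier.

```python
def shared_terms(index_a, index_b):
    """Implicit links: terms that occur in both descriptive indices."""
    return set(index_a) & set(index_b)


def similarity(index_a, index_b):
    """Weighted Jaccard score: sum of minimum counts over sum of maximum counts."""
    terms = set(index_a) | set(index_b)
    overlap = sum(min(index_a.get(t, 0), index_b.get(t, 0)) for t in terms)
    total = sum(max(index_a.get(t, 0), index_b.get(t, 0)) for t in terms)
    return overlap / total if total else 0.0


# Invented descriptive indices (term -> number of occurrences).
vaccine_a = {"vaccine": 8, "inactivated antigens": 3, "immune response": 4}
vaccine_b = {"vaccine": 6, "immune response": 2, "adjuvant": 2}
turf = {"synthetic grass": 10, "playing surface": 9}
```

With these numbers the two vaccine indices share two terms and score 8/17, while either vaccine index compared with the turf index scores 0.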
25. Advantages and Drawbacks
- Advantages
  - Reduced complexity (a great reduction of direct links between concepts)
  - Fast search and retrieval (as the result of using indices)
  - Scalability
- Drawbacks
  - Use of indices implies loss of precision
26. Conclusion
- Our idea is still in the first stage of development.
- Its key advantages are its general applicability and reduced complexity.
- Further research is needed to explore the quality and feasibility of the proposed solution.
- However, we expect that the combination of OWL/RDF structures and indices might produce a satisfactory performance/exactness ratio.
27. References
- W. A. Woods, L. A. Bookman, A. Houston, R. J. Kuhns, P. Martin, S. Green, "Linguistic Knowledge Can Improve Information Retrieval", Proc. of the Applied Natural Language Processing Conference (ANLP-2000), Seattle, 2000.
- O. Scekic, P. Bojic, "An Overview of OWL and its Role in Semantic Web Architecture", YU-INFO 06, Kopaonik, Serbia & Montenegro, 2006.
- Boris V. Dobrov, Natalia V. Loukachevitch, Tatyana N. Yudina, "Conceptual Indexing Using Thematic Representation of Texts", Scientific Research Computer Center of Moscow State University, Moscow, 1998.
- S. Omerovic, D. Savic, S. Tomazic, "A Survey of Concept Modeling", Faculty of Electrical Engineering, University of Ljubljana, Slovenia (to appear).
- William A. Woods, "Conceptual Indexing: A Better Way to Organize Knowledge", Technical report, Sun Microsystems Laboratories, 1998.
- http://www.uspto.gov (U.S. Patent Office)
28. CONCEPT MODELING
A Proposed Hybrid Approach to Patent Modeling
Đorđe Popović (popajce_at_ptt.yu), Ognjen Šćekić (ogi_at_cg.yu), Veljko Milutinović (vm_at_etf.bg.ac.yu)
Thank you!