Title: BISC Program
1 Fuzzy Conceptual Search and Matching Tool for
Intelligent Knowledge Management and Discovery in
the Internet
BISC Seminar
Masoud Nikravesh
BISC Program, EECS Department, CS Division, UC Berkeley
Thursday, September 6, 2001
2 Content
- Introduction and Experts' Viewpoints
- Fuzzy Logic and the Internet
- Fuzzy Conceptual Matching
4 Database vs. Internet
Database        Internet
Distributed     Distributed
Controlled      Autonomous
Query (QL)      Browse (Search)
Precise         Fuzzy/Imprecise
Structured      Unstructured
5 How Big is the Internet?
- Text only: it is just big!
- Text and images: it is very big!
- Homepages and databases connected to pages: it could be huge!
[Figure labels: Planetary Sciences; Earth Sciences; Genome; DNA, Genetic; Geophysical Data (Oil Industry); Satellite Images (NASA)]
6 Search Engine and Queries: Challenges and Road
Ahead
- Deductive Capabilities
- Customization and Specialization
- Metadata and Profiling
- Semantic Web
- Imprecise-Querying
- Automatic Parallelism via Database Technology
and Approximate Reasoning
- Ontology
- Ambiguity Resolution through Clarification Dialog
- Definition/Meaning Specificity
- User Friendly
- Multimedia
- Databases
- Interaction
7 Internet and Academia: Challenges and Road Ahead
- Ambiguity and Conceptual and Ontology (Tom)
- Aggregation and Imprecision Query (Ronald Yager) (F. Gomide)
- Perception, Emotion, and Intelligent Behavior (Elie Sanchez)
- Content-Based (S. Mitra)
- Escape from Vector Space; Deductive Capabilities (Lotfi A. Zadeh)
- Imprecise-Querying (Lotfi A. Zadeh, Berners-Lee)
- Ambiguity Resolution through Clarification Dialog (Masoud Nikravesh and Lotfi A. Zadeh)
- Precisiated Natural Languages (PNL) (Lotfi A. Zadeh)
FLINT 2001
8 Soft Computing: Past, Present, Future
NEW DIRECTIONS IN ENHANCING THE POWER OF THE
INTERNET
Fuzzy Logic and the Internet: Challenges and Solutions
- Computing with Words (CW)
- Computational Theory of Perception (CTP)
- Precisiated Natural Languages (PNL)
9 One of the problems that Internet users
face today is finding the desired information
correctly and effectively in an environment where
the available information, the repositories of
information, the indexing, and the tools are all dynamic.
(Nov 2001, FLINT)
This issue is critically significant for
problems that center on search and deduction in
large, unstructured knowledge bases.
(August 2001, FLINT)
- to enhance the power of the Internet,
especially in the realms of knowledge
organization, search and deductive synthesis...
- solutions which make the Internet more
efficient and more user-friendly.
(Lotfi A. Zadeh, August 2001)
10 Internet and Industry: Challenges and Road Ahead
- XML -> Semantic Web, Workflow, Mobile E-Commerce,
CRM and Resource Allocation
- Intent, Ambiguity Resolution, Interaction
- Reliability, Decision Support and Monitoring
- Personalization and Navigation
- Decision Support and Resource Allocation
- Document Soul, Contextual Categorization, Approximate Reasoning
- Imprecise Query
FLINT 2001
11 Content
- Introduction and Experts' Viewpoints
- Fuzzy Logic and the Internet
- Fuzzy Conceptual Matching
12 Fuzzy Logic and the Internet
- The web environment is, for the most part,
unstructured and imprecise. To deal with
information in the web environment, what is needed
is a logic that supports modes of reasoning which
are approximate rather than exact. While searches
may retrieve thousands of hits, finding
decision-relevant and query-relevant information
in an imprecise environment is a challenging
problem, which has to be addressed.
- Another, less obvious, challenge is deduction in an
unstructured and imprecise environment given the
huge stream of complex information.
13 Structure of conventional technique and the
problem related to perception
15 Internet and Vast Amount of Information
- Central tasks for most search engines are:
  - Query or user information request
    - Do what I mean and not what I say!
  - Model for the Internet, Web representation
    - Web page collection, documents, text, images,
      music, etc.
  - Ranking or matching function
    - Degree of relevance, recall, precision,
      similarity, etc.
16 Similarity / Measures of Association
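Only the title of this slide survives in the transcript, so the specific measures it listed are not known. As a hedged illustration, the sketch below implements four set-based association measures that are common in the information retrieval literature (Dice, Jaccard, cosine, and overlap). Treating a query and a document as plain sets of terms, and the sample terms themselves, are assumptions made for the example.

```python
import math

def dice(a, b):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    return 2 * len(a & b) / (len(a) + len(b)) if (a or b) else 0.0

def jaccard(a, b):
    """Jaccard coefficient: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cosine(a, b):
    """Cosine coefficient for sets: |A ∩ B| / sqrt(|A| * |B|)."""
    return len(a & b) / math.sqrt(len(a) * len(b)) if (a and b) else 0.0

def overlap(a, b):
    """Overlap coefficient: |A ∩ B| / min(|A|, |B|)."""
    return len(a & b) / min(len(a), len(b)) if (a and b) else 0.0

# Hypothetical query and document term sets.
query_terms = {"fuzzy", "search", "engine"}
doc_terms = {"fuzzy", "logic", "search", "internet"}

for measure in (dice, jaccard, cosine, overlap):
    print(f"{measure.__name__}: {measure(query_terms, doc_terms):.3f}")
```

All four measures grow with the number of shared terms and differ only in how they normalize for set size.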
17 Similarity / Precision and Recall
                 Relevant    Non-Relevant
Retrieved        A ∩ B       ¬A ∩ B         B
Not Retrieved    A ∩ ¬B      ¬A ∩ ¬B        ¬B
                 A           ¬A             N
N = number of documents
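The slide gives the contingency table but not the formulas. Under the usual reading, assumed here, A is the set of relevant documents and B is the set of retrieved documents, so that precision is |A ∩ B| / |B| and recall is |A ∩ B| / |A|. A minimal sketch with hypothetical document identifiers:

```python
def precision_recall(relevant, retrieved):
    """Return (precision, recall) for a relevant set A and a retrieved set B."""
    hits = len(relevant & retrieved)                         # |A ∩ B|
    precision = hits / len(retrieved) if retrieved else 0.0  # |A ∩ B| / |B|
    recall = hits / len(relevant) if relevant else 0.0       # |A ∩ B| / |A|
    return precision, recall

# Hypothetical document identifiers.
relevant = {"d1", "d2", "d3", "d4"}   # set A
retrieved = {"d3", "d4", "d5"}        # set B

p, r = precision_recall(relevant, retrieved)
print(f"precision={p:.3f} recall={r:.3f}")
```

Retrieving more documents can only keep recall the same or raise it, but it usually lowers precision, which is the trade-off the following slides plot.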
18 Relationship Between Precision and Recall
19 Fuzzy Logic and the Internet
20 Relationship Between Precision and Recall
[Figure: plot of Precision against Recall, with annotations "Due to aggregation operator" and "Due to intelligence"]
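21 Fuzzy Similarity and Fuzzy Subsethood Measures
Suppose the fuzzy sets to be measured are A and B, with membership functions $\mu_A(x)$ and $\mu_B(x)$.
Similarity: $E(A, B) \triangleq \mathrm{degree}(A = B) = \dfrac{|A \cap B|}{|A \cup B|} = \dfrac{|A \cap B|}{|A| + |B| - |A \cap B|}$
Subsethood: $S(A, B) \triangleq \mathrm{degree}(A \subseteq B) = \dfrac{|A \cap B|}{|A|}$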
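The two measures on this slide can be sketched numerically. The slide does not say how intersection, union, and cardinality are computed, so this example assumes the most common choices: min for intersection, max for union, and the sigma-count (sum of membership degrees) for cardinality. The sets A and B are made-up values.

```python
def sigma_count(fs):
    """Cardinality of a fuzzy set as the sum of its membership degrees (assumed)."""
    return sum(fs.values())

def fuzzy_intersection(a, b):
    """Pointwise min of the two membership functions (assumed t-norm)."""
    return {x: min(a.get(x, 0.0), b.get(x, 0.0)) for x in set(a) | set(b)}

def fuzzy_union(a, b):
    """Pointwise max of the two membership functions (assumed s-norm)."""
    return {x: max(a.get(x, 0.0), b.get(x, 0.0)) for x in set(a) | set(b)}

def similarity(a, b):
    """E(A, B) = |A ∩ B| / |A ∪ B|; two empty sets are treated as equal."""
    union = sigma_count(fuzzy_union(a, b))
    return sigma_count(fuzzy_intersection(a, b)) / union if union else 1.0

def subsethood(a, b):
    """S(A, B) = |A ∩ B| / |A|; the empty set is a subset of everything."""
    size = sigma_count(a)
    return sigma_count(fuzzy_intersection(a, b)) / size if size else 1.0

# Hypothetical fuzzy sets over the elements x1..x3.
A = {"x1": 0.2, "x2": 0.8, "x3": 1.0}
B = {"x1": 0.5, "x2": 0.4, "x3": 1.0}

print(f"E(A, B) = {similarity(A, B):.3f}")
print(f"S(A, B) = {subsethood(A, B):.3f}")
print(f"S(B, A) = {subsethood(B, A):.3f}")
```

Similarity is symmetric while subsethood is not, which is why S(A, B) and S(B, A) are printed separately.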
22 Content
- Introduction and Experts' Viewpoints
- Fuzzy Logic and the Internet
- Fuzzy Conceptual Matching
23 Fuzzy Conceptual Matching and Human Mental Model
[Figure: stages Scanning, Modeling, Suggestion, Decision; a cluster of "subject" nodes around a central "Concept"]
24 Fuzzy Conceptual Matching and Human Mental Model
[Figure: stages Scanning, Modeling, Suggestion, Decision; a cluster of "subject" nodes around a central "Concept"; labels Ambiguity, Imprecise, Fuzzy, Imprecise, Precise]
25 Fuzzy Conceptual Matching and Human Mental Model
[Figure: inputs Clarification Dialog, User Profile, Ontology, Context; stages Scanning, Modeling, Suggestion, Decision; interleaved "Concept" and "Words" nodes; labels Ambiguity, Imprecise, Fuzzy, Imprecise, Precise]
26 Documents Similarity: Search Personalization and User
Profiling
Often it is hard to find the
right term, and in some cases the term does
not exist. The User Profile is automatically
constructed from a text document collection and can
be used for query refinement, for providing
suggestions, and for ranking information based
on the pre-existing user profile.
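The slide does not describe how the profile is built or applied. As one plausible sketch, the code below assumes the profile is a bag of term frequencies accumulated from documents the user has already read, and that candidate documents are ranked by cosine similarity to it. The function names and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter

def term_vector(text):
    """Lowercase the text and count its alphabetic terms."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(u, v):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

def build_profile(documents):
    """Accumulate term counts over the user's document collection."""
    profile = Counter()
    for doc in documents:
        profile.update(term_vector(doc))
    return profile

def rank(profile, candidates):
    """Order candidate documents by similarity to the profile, best first."""
    return sorted(candidates,
                  key=lambda doc: cosine(profile, term_vector(doc)),
                  reverse=True)

# Hypothetical reading history and candidate search results.
history = ["fuzzy logic for web search",
           "fuzzy sets and approximate reasoning"]
candidates = ["cooking recipes for pasta",
              "approximate reasoning with fuzzy logic",
              "web search engines"]

profile = build_profile(history)
ranked = rank(profile, candidates)
for doc in ranked:
    print(doc)
```

A real system would at least remove stop words and weight terms (for example with TF-IDF); raw counts are used here only to keep the sketch short.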
27 Terms Similarity: Automated Ontology Generation
and Automated Indexing
The ontology is
automatically constructed from a text document
collection and can be used for query refinement.
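The construction method is not given on the slide. A very rough stand-in, assumed here purely for illustration, is to call two terms related when they tend to occur in the same documents, scoring each pair with the Jaccard coefficient of their document sets. The highest-scoring neighbours of a query term can then be offered as refinements. The corpus and function names are hypothetical.

```python
import re
from collections import defaultdict

def build_postings(documents):
    """Map each term to the set of indices of the documents that contain it."""
    postings = defaultdict(set)
    for i, doc in enumerate(documents):
        for term in set(re.findall(r"[a-z]+", doc.lower())):
            postings[term].add(i)
    return postings

def related_terms(term, postings, top_n=3):
    """Return up to top_n (term, score) pairs ranked by co-occurrence."""
    base = postings.get(term, set())
    scores = {}
    for other, docs in postings.items():
        if other == term:
            continue
        union = base | docs
        score = len(base & docs) / len(union) if union else 0.0
        if score > 0:
            scores[other] = score
    # Sort by descending score, then alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]

# Hypothetical document collection.
documents = [
    "fuzzy logic search",
    "fuzzy logic control",
    "search engine ranking",
    "fuzzy search engine",
]

postings = build_postings(documents)
suggestions = related_terms("fuzzy", postings)
for term, score in suggestions:
    print(f"{term}: {score:.2f}")
```

This yields a flat list of associated terms, not a hierarchy; building an actual ontology would require additional structure that the slide does not specify.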
28 Search Engine and Ambiguity
Search engines often return a large list of
irrelevant results due to the ambiguity of search
query terms. To resolve this problem, one can use
the following approaches:
1) User/client side: select a very specific (unique) term.
2) System/server side: offer alternate query
terms so that users can refine the query.
29 Search Engine and Deduction
- Sources of ambiguity
  - Definition/Meaning
    - What is the largest building?
    - What is the meaning of "largest"?
  - Specificity
    - Where is the GM headquarters?
    - What level of specificity is required?
- Needed: clarification dialog
31 Information Overload
[Figure: query cycle between User and Server (Query, Search Result, Ranking); search results plotted against Number of Terms, ranging from "too much" to "too little"]
Too Many Cycles: How many cycles are needed before the user is satisfied?
The user query, or the information that has been sent to the server, needs to be changed until the user is satisfied.
32 Example
- How to Find a Naked Human?
  - Textual Information
    - Vocabulary/Ontology
  - Images
    - Color of Skin
    - Texture Properties
  - Rule-Based Grouping of Human Body by
    - Geometry Constraint
      - Structure (Individual Parts)
      - Relationship (Parts)
    - Other Constraint
      - Color and Texture
      - Relationship (Color, Texture, Parts)
[Figure labels: Range of Shading; Skin Region]
34 Fuzzy Conceptual Matching and Human Mental Model
[Figure: inputs Clarification Dialog, User Profile, Ontology, Context; stages Scanning, Modeling, Suggestion, Decision; a cluster of "subject" nodes around a central "Concept"; labels Ambiguity, Imprecise, Fuzzy, Imprecise, Precise]
35 An intelligent model that can mine the Internet to
conceptually match and rank homepages based on
predefined linguistic formulations and rules
defined by experts, or based on a set of known
homepages. The Fuzzy Conceptual Matching (FCM) model will be used for
intelligent information and knowledge retrieval
through conceptual matching of both text and
images (here defined as "Concept"). The FCM can
also be used for constructing a fuzzy ontology, or
terms related to the context of the query and
search, to resolve ambiguity. This model can
be used to calculate conceptually the degree of
match to the object or query.
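The slides do not define how the degree of match is calculated. One hedged way to picture it: represent the concept as a fuzzy set of terms whose weights an expert supplies, represent each page as a fuzzy set of terms with memberships derived from normalized term frequency, and take the fuzzy subsethood of the concept in the page as the degree of match. All of these modelling choices, and the names and data below, are assumptions for illustration and not the FCM model itself.

```python
import re
from collections import Counter

def page_memberships(text):
    """Membership of each term = its count divided by the largest count."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    peak = max(counts.values(), default=0)
    return {t: c / peak for t, c in counts.items()} if peak else {}

def degree_of_match(concept, page):
    """Subsethood of the concept in the page: sum(min) / sum(concept)."""
    total = sum(concept.values())
    if total == 0:
        return 0.0
    shared = sum(min(w, page.get(t, 0.0)) for t, w in concept.items())
    return shared / total

# Hypothetical expert-defined concept and candidate pages.
concept = {"fuzzy": 1.0, "search": 0.8, "ontology": 0.5}
pages = {
    "page_a": "fuzzy search and fuzzy ontology for search",
    "page_b": "search engine optimization tips",
    "page_c": "gardening in spring",
}

scores = {name: degree_of_match(concept, page_memberships(text))
          for name, text in pages.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
```

Because the score is a degree in [0, 1] and not a yes/no match, pages that cover only part of the concept still receive a partial rank.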
36 [Figure, Landauer (1995): two plots. First plot: Correct Queries (%) against Logical Reasoning Ability (Percentile), comparing SQL with a New Query Language. Second plot: Correct Documents Found per Hour against Verbal Fluency (Percentile), comparing Natural Language Search with a New Text Search Method.]
37 Search Engine and Deduction
Unlike classical logic, fuzzy logic is
concerned, in the main, with modes of reasoning
which are approximate rather than exact. In the
Internet, almost everything, especially in the
realm of search, is approximate in nature.
Putting these two facts together, an intriguing
thought emerges: in time, fuzzy logic may replace
classical logic as what may be called the
brainware of the Internet. In my view, among
the many ways in which fuzzy logic may be
employed, there are two that stand out in
importance. The first is search. Another, and
less obvious, is deduction in an unstructured and
imprecise environment. Existing search engines
have zero deductive capability. To add a
deductive capability to a search engine, the use
of fuzzy logic is not an option - it is a
necessity.
(Lotfi A. Zadeh, November 2000, BISC-FLINT)
38 LARGEST BUILDING IN THE WORLD FOR DISCOVERY OF NEW
DRUGS
Largest Building Industry Search Engine
Korea's Largest Building
Largest Building Management
39 largest building in the world designed for the
discovery of new drugs
40 Largest Building Industry Search Engine
41 Search Engine and Deduction
- Sources of ambiguity
  - Specificity
    - Where is the GM headquarters?
    - What level of specificity is required?
47 Search Engine and Deduction
- Sources of ambiguity
  - What is the largest building in the world?
  - Definition/Meaning
    - What is the meaning of "largest"?
  - Specificity
    - What level of specificity is required?