Jean-Charles LAMIREL, Jieh HSIANG - PowerPoint PPT Presentation

About This Presentation
Title:

Jean-Charles LAMIREL, Jieh HSIANG

Description:

Biological-like models for intelligent information management ... Images of Armstrong moonwalk, July 69 ... Managing different kinds of queries for discovery ... – PowerPoint PPT presentation

Slides: 54
Provided by: Rlam3

Transcript and Presenter's Notes

Title: Jean-Charles LAMIREL, Jieh HSIANG


1
Using a Background Neural Model in a Digital
Library
  • Jean-Charles LAMIREL, Jieh HSIANG
  • Liu WJ

LORIA, Nancy, France
2
The CORTEX team
  • Research areas: Biological-like models for
    intelligent information management
  • Applications
  • Autonomous robotics and on-board intelligence
  • Numerical classification (vs. symbolic)
  • Information retrieval and discovery

3
The CORTEX information retrieval and discovery
activity
  • Main themes of research
  • Interface for personalized access to information
  • Intelligent multimedia data mining
  • Web - Documentary database interaction
  • Collaborations
  • ORPAILLEUR INRIA team, INIST, La Villette, NSC
    Taiwan, industry...
  • European projects: SCHOLNET, EISCTES

4
Some examples of application
  • Adaptive environment for assistance to
    investigation on the Web
  • Multi-topographic navigation: MultiSOM
  • For multimedia data mining
  • For data mining on full text (patents)
  • Numerical-symbolic collaboration

5
Presentation summary
  • Introduction
  • Basic set of functionalities for information
    discovery
  • Limitations of the classical methods for
    information discovery
  • The MultiSOM model: Butterfly application
  • Basic behaviour
  • Extensions
  • Management of textual information

lamirel_at_loria.fr
6
Basic set of functionalities for information
discovery
  • Synthetic view of the studied domain
  • Distribution of the thematic indicators of the
    domain
  • Highlighting of regularities / weak signals
  • Management of several types of synthesis
  • Interactivity
  • Dynamic data mixture / type of need
  • Choice of meta-orientation of investigation
  • Setting of the granularity level of the analysis
  • Multimedia

7
Managing different kinds of queries for discovery
  • Exploratory (no goal): "What are the contents
    of the database?"
  • Thematic (general orientation): "Images of
    space conquest"
  • Connotative (hidden goal, indirect research):
    "Impressive images on human technology"
  • Precise: "Images of Armstrong moonwalk, July 69"

8
Limitations of the classical methods for
information discovery
  • Overall view of the studied domain
  • Noise
  • Complex interpretation (hidden information)
  • Local views necessarily independent
  • Weak signals difficult to highlight
  • No interactivity
  • Passive classification
  • Predefined ways to access information

9
Neural methods for information cartography
  • Topographic learning (SOM)
  • classification
  • projection
  • Multi-viewpoint modelling capabilities
    (MultiSOM)
  • Intuitive self-organization of information
  • Active maps (IR Navigation)
  • Low human intervention during construction
  • Multimedia capabilities

10
Butterfly museum application
  • Different kinds of query
  • Query by keywords
  • Query by example
  • Different kinds of criteria
  • Colour (automatic)
  • Shape (manual)
  • Texture (manual)
  • Problems
  • Hand-made classifications
  • Combination of results coming from different
    criteria

Yellow: very strong, Red: not, Edge: strong,
Spot: middle, ...
11
Butterfly application automation
[Diagram: Butterfly application; viewpoint classifications; global and/or cross-viewpoint classifications; combination of results; user interface; validation of insertion or classification recalculation]
12
Basic topographic map building
  • Data description
  • Document (image) index vector, e.g. vector of
    characteristics
  • Weighting of the characteristics' modalities (very
    strong = 1, ...)
  • Optional IDF weighting (weak signals detection)
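
The data description above can be sketched as follows. This is a minimal illustration, not the authors' implementation: only "very strong = 1" comes from the slide, so the other modality weights, the `index_vector` and `idf_weights` helpers, and the three sample descriptions are assumptions made for the example. The IDF formula is the usual log(N / document frequency).

```python
import math

# Hypothetical modality scale; only "very strong = 1" is given on the slide.
MODALITY_WEIGHT = {"not": 0.0, "middle": 0.5, "strong": 0.75, "very strong": 1.0}


def index_vector(description, characteristics):
    """Turn {characteristic: modality} into a weight vector in a fixed order."""
    return [MODALITY_WEIGHT[description.get(c, "not")] for c in characteristics]


def idf_weights(vectors):
    """IDF per characteristic: log(N / number of documents where it is present)."""
    n_docs = len(vectors)
    weights = []
    for j in range(len(vectors[0])):
        df = sum(1 for v in vectors if v[j] > 0)
        weights.append(math.log(n_docs / df) if df else 0.0)
    return weights


characteristics = ["yellow", "red", "edge", "spot"]
docs = [
    {"yellow": "very strong", "red": "not", "edge": "strong", "spot": "middle"},
    {"yellow": "strong", "edge": "middle"},          # invented sample
    {"yellow": "middle", "red": "very strong"},      # invented sample
]
vectors = [index_vector(d, characteristics) for d in docs]
idf = idf_weights(vectors)
# Rare characteristics (weak signals) get a larger weight than common ones.
weighted = [[w * f for w, f in zip(v, idf)] for v in vectors]
```

In this toy data "yellow" appears in every description, so its IDF weight is zero, while the rarer "red" and "spot" are emphasized.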

13
Basic topographic map building
  • Map predefined parameters settings
  • Number of neurons
  • Structure, e.g. 2D grid with square neighbourhood
  • Competitive learning

14
Selection of the winning neuron
Influence on the neighbourhood
Competitive learning
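
One step of competitive learning on a 2D grid with a square neighbourhood might look like the sketch below. It follows the standard Kohonen SOM update; the learning rate, the radius, the grid size and the helper names (`sq_dist`, `winner`, `train_step`) are assumptions chosen for illustration, not values from the presentation.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def winner(weights, x):
    """Grid position of the neuron whose weight vector is closest to x."""
    return min(weights, key=lambda pos: sq_dist(weights[pos], x))


def train_step(weights, x, lr=0.5, radius=1):
    """Move the winning neuron and its square neighbourhood towards x."""
    wr, wc = winner(weights, x)
    for (r, c), w in weights.items():
        # Chebyshev distance <= radius defines a square neighbourhood.
        if max(abs(r - wr), abs(c - wc)) <= radius:
            weights[(r, c)] = [wi + lr * (xi - wi) for wi, xi in zip(w, x)]
    return (wr, wc)


random.seed(0)
rows, cols, dim = 3, 3, 4
weights = {(r, c): [random.random() for _ in range(dim)]
           for r in range(rows) for c in range(cols)}

x = [1.0, 0.0, 0.75, 0.5]          # one document index vector
pos = winner(weights, x)
d_before = sq_dist(weights[pos], x)
train_step(weights, x)
d_after = sq_dist(weights[pos], x)  # the winner is now closer to x
```

In practice the learning rate and radius usually decrease over the training iterations; they are kept constant here for brevity.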
15
Map labelization and zoning
  • Map labelization
  • Based on the best components of the profiles
  • Class or member-oriented
  • One single method is not sufficient
  • => Gives an overview of the detected themes
  • Map zoning
  • Based on the SOM topographic properties
  • Based on the best components of the class
    profiles
  • => Gives an overview of the weights of the
    themes
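
Labelling a neuron from the best components of its profile can be illustrated with a few lines. This is a hedged sketch of the class-oriented variant only; the `label_neuron` helper, the `top` cut-off and the sample profile are assumptions.

```python
def label_neuron(profile, characteristics, top=2):
    """Return the names of the `top` highest-weighted components of a profile."""
    ranked = sorted(zip(profile, characteristics), key=lambda p: -p[0])
    # Components with zero weight are never used as labels.
    return [name for weight, name in ranked[:top] if weight > 0]


characteristics = ["yellow", "red", "edge", "spot"]
profile = [0.9, 0.0, 0.7, 0.2]      # invented neuron profile
labels = label_neuron(profile, characteristics)
```

A member-oriented variant would instead rank the characteristics of the documents attached to the neuron.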

16
(No Transcript)
17
The MultiSOM model
18
Map on-line generalization
  • Goal
  • Synthesize the map contents by decreasing the
    number of neurons (classes)
  • Constraints
  • Preserve the map topographic properties
  • No classification re-computation
  • Method
  • Exploitation of the neighbourhood relations on
    the map
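
A possible reading of this method, given here only as an assumption, is that each neuron of the generalized map averages the profiles of a 2x2 block of neighbouring neurons of the original map, so an n x n grid becomes an (n-1) x (n-1) grid without any re-training. The `generalize` function below is a hypothetical sketch of that idea.

```python
def generalize(grid):
    """Build a (rows-1) x (cols-1) map; each new neuron averages a 2x2 block."""
    rows, cols = len(grid), len(grid[0])
    new_grid = []
    for r in range(rows - 1):
        new_row = []
        for c in range(cols - 1):
            block = [grid[r][c], grid[r][c + 1],
                     grid[r + 1][c], grid[r + 1][c + 1]]
            # Component-wise mean of the four neighbouring profiles.
            new_row.append([sum(vals) / 4 for vals in zip(*block)])
        new_grid.append(new_row)
    return new_grid


# Invented 3x3 map with one-component profiles.
grid = [
    [[0.0], [2.0], [4.0]],
    [[2.0], [4.0], [6.0]],
    [[4.0], [6.0], [8.0]],
]
smaller = generalize(grid)
```

Because neighbouring blocks overlap, adjacent neurons of the smaller map share source neurons, which is one way to keep the topographic ordering.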

19
Map on-line generalization
20
(No Transcript)
21
Semantic viewpoints
  • Subspace of the description space
  • Can be a field, a subset of keywords, ...
  • Possible overlapping sets
  • Concurrent or complementary viewpoints
  • => Examples: indexer keywords, title keywords,
    authors, ..., visual characteristics, sounds
  • => Butterflies: colour, shape, texture, ...
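
A viewpoint as a subspace of the description space can be sketched by restricting a description to a subset of its characteristics. The assignment of characteristics to viewpoints in `VIEWPOINTS` is purely hypothetical, as are the weights of the sample butterfly.

```python
# Hypothetical mapping of viewpoints to subsets of the description space.
VIEWPOINTS = {
    "colour": ["yellow", "red"],
    "shape": ["edge"],
    "texture": ["spot"],
}


def viewpoint_vector(description, viewpoint):
    """Project a full description onto the characteristics of one viewpoint."""
    return [description.get(f, 0.0) for f in VIEWPOINTS[viewpoint]]


butterfly = {"yellow": 1.0, "red": 0.0, "edge": 0.75, "spot": 0.5}
by_viewpoint = {name: viewpoint_vector(butterfly, name) for name in VIEWPOINTS}
```

Each projected vector would then feed its own map, one map per viewpoint.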

22
Inter-map communication
  • Goal
  • Cope with the limitations of a global map
  • Allow communication between viewpoints
  • Constraints
  • Interpretable behaviour
  • Method
  • Re-projected data: transmitter neurons
  • Two steps
  • 1) Activation of a source map (directly or
    through a query)
  • 2) Transmission to target maps

23
Inter-map communication
24
Inter-map communication
  • A function
  • Two modes
  • Possibilistic (weak thematic relations over
    viewpoints) 
  • Probabilistic (measure of the themes' similarities)

=> g: class belonging degree
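
The transmission formula itself is not in the transcript, so the sketch below rests on an assumption: the activity of a target neuron is computed from the documents it shares with the activated source neurons, using the maximum of the contributions in possibilistic mode and their mean in probabilistic mode, each contribution being weighted by the belonging degree g. All names and sample data are hypothetical.

```python
def transmit(source_activity, source_class_of, target_members,
             mode="possibilistic"):
    """
    source_activity: {source_neuron: activity in [0, 1]}
    source_class_of: {document: (source_neuron, belonging degree g)}
    target_members:  {target_neuron: [documents]}
    """
    result = {}
    for neuron, docs in target_members.items():
        contributions = []
        for d in docs:
            src, g = source_class_of[d]
            contributions.append(source_activity.get(src, 0.0) * g)
        if not contributions:
            result[neuron] = 0.0
        elif mode == "possibilistic":
            result[neuron] = max(contributions)   # one shared document suffices
        else:
            result[neuron] = sum(contributions) / len(contributions)
    return result


source_activity = {"s1": 1.0}                     # only s1 is activated
source_class_of = {"d1": ("s1", 1.0), "d2": ("s1", 0.5), "d3": ("s2", 1.0)}
target_members = {"t1": ["d1", "d3"], "t2": ["d2"], "t3": ["d3"]}

possibilistic = transmit(source_activity, source_class_of, target_members)
probabilistic = transmit(source_activity, source_class_of, target_members,
                         mode="probabilistic")
```

With this toy data, target neuron `t1` is fully activated in possibilistic mode but only half activated in probabilistic mode, since one of its two documents comes from an inactive source neuron.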
25
Activity coherency
STRONG FOCALIZATION
WEAK FOCALIZATION
26
Inter-map communication
BUTTERFLIES
27
Compliance with IR operations
Question: Are there butterflies with spots AND
veins?
Response: NO
Response: YES
28
Remaining problems (to be solved)
  • Validation of the automatic classification
    results by the experts
  • Testing of different results merging methods
  • Test the use of prototype features in
    classification
  • Realization of a Web interface for the maps
  • Compare map built-in result combination mechanism
    with external combination mechanism
  • Test map capabilities for the help in adding new
    individuals
  • Introduce textual data and combine it with images

29
(No Transcript)
30
Experimentation on patents (texts)
  • Goal: Intelligent technological survey
  • Full text analysis of the patents
  • Domain of oil engineering
  • Provide answers to questions like
  • 1. What are the relationships between
    patentees?
  • 2. On which specific technology does a patentee
    work? What are the advantages of this specific
    technology? For which use?

31
Basic experimental protocol
DILIB reformatting
Patents database
Patents in XML format, structured by viewpoints
Nominal groups extraction
Validated multi-indexes
Interactive maps for analysis
MicroNOMAD MultiSOM
32
Nominal groups extraction
  • 1) Lexicographic analysis (compound terms)
  • 2) Normalization
  • Ex: "oil fabrication" and "oil engineering"
    => "oil engineering"
  • Results
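
The normalization step can be pictured as mapping variant compound terms to one canonical form. The sketch below assumes a hand-built lookup table; the only pair taken from the slide is "oil fabrication" to "oil engineering", and the `normalize` helper is hypothetical.

```python
# Variant -> canonical form; only this pair appears in the presentation.
NORMALIZATION = {"oil fabrication": "oil engineering"}


def normalize(terms):
    """Lower-case the terms, map variants to canonical forms, drop duplicates."""
    return sorted({NORMALIZATION.get(t.lower(), t.lower()) for t in terms})


extracted = ["Oil fabrication", "oil engineering", "friction coefficient"]
index_terms = normalize(extracted)
```

After normalization both variants contribute to the same index entry, which avoids splitting one theme across two dimensions of the description space.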

33
Patents reindexing
  • Selected viewpoints: title, use, advantages and
    patentees

34
Example of dynamic analysis
DYNAMIC DEDUCTION: Patentee "TONEN CORP." is a
specialist in lubrication of the "automatic
transmission". It mainly produces oils based on
"organo-molybdenum compound", which have the
specific property of having a "friction
coefficient stable over a wide range of
temperature"
35
Classical methods (AK-means)
CLASSES MAP
36
Conclusion
  • Different viewpoints yield complementary results
  • Ex: Indexer keywords: closed themes; title
    keywords: open themes, ...
  • Detection of indexation inconsistencies
  • Projection of thematic pertinence of a query
  • Bilateral synergy: images <=> textual information
  • Very rich and flexible inter-map communication
    mechanism
  • Cross analysis between viewpoints, dynamics
  • No limitation regarding viewpoints type and
    number

37
Perspectives
  • Sophisticated 2D mapping, 3D mapping
  • Pure image mosaic navigation
  • Automation of communication between
    viewpoints
  • Interaction with Galois lattice: map zoning and
    generalization, rule mapping, lattice entry
    points selection
  • Applications
  • La Villette: interactive browsing through museum
    collection, setting up of exhibitions
  • INIST: Cartography of the Web (EISCTES EEC
    Project)

38
3) Combining Symbolic and Numeric Techniques for
DL Contents Classification and Analysis
  • Jean-Charles LAMIREL,
  • Yannick TOUSSAINT (Orpailleur)

39
Introduction
  • Combining numerical and symbolic methods
  • MicroNOMAD: Self-Organizing Maps (SOM)
  • Basic SOM topographic properties
  • MicroNOMAD multi-map communication process
  • Lattice
  • Formal properties and symbolic deduction
  • Hierarchical structure and inheritance of
    properties
  • Study of projection of SOM over lattice
  • Making explicit formal properties on the map
  • Map intelligent zoning and labelization

40
Galois lattice
  • Symbolic hierarchical method ({i1, i2}, {p1, p2,
    p3})
  • Partial order defined by the subsumption relation
    over the set of formal concepts
  • (I1, P1) ≤ (I2, P2) ⇔ I1 ⊆ I2,
  • (I1, P1) ≤ (I2, P2) ⇔ P1 ⊇ P2,
  • ∀ I1, I2 there is a unique meet and join.
  • Inheritance of properties
  • Extraction of association rules
  • Search Engine → Web, IR

41
I = {i1, i2, i3, i4}, P = {AI, Robots, Search
Engine, Web, IR}
i1: {Web, IR}; i2: {Web, IR}; i3: {Web, IR,
Search Engine}; i4: {AI, Robots}
({i1, i2, i3, i4}, ∅)
({i1, i2, i3}, {Web, IR})
({i4}, {AI, Robots})
({i3}, {Search Engine, Web, IR})
(∅, {AI, Robots, Search Engine, Web, IR})
R1: Search Engine → Web, IR
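
The formal concepts of this small context can be enumerated by brute force, as in the sketch below. It uses the item descriptions given on the slide (i1 to i4) and a naive closure over all object subsets, which is only practical for tiny contexts. The function names are assumptions, and this is not the lattice construction algorithm used by the authors.

```python
from itertools import combinations

# Formal context from the slide: item -> set of properties.
context = {
    "i1": {"Web", "IR"},
    "i2": {"Web", "IR"},
    "i3": {"Web", "IR", "Search Engine"},
    "i4": {"AI", "Robots"},
}
all_props = set().union(*context.values())


def extent(props):
    """All items that have every property in `props`."""
    return frozenset(o for o, p in context.items() if set(props) <= p)


def intent(objs):
    """All properties shared by every item in `objs`."""
    if not objs:
        return frozenset(all_props)
    return frozenset(set.intersection(*(context[o] for o in objs)))


def concepts():
    """Enumerate (extent, intent) pairs by closing every subset of items."""
    found = set()
    objects = sorted(context)
    for k in range(len(objects) + 1):
        for objs in combinations(objects, k):
            i = intent(objs)
            found.add((extent(i), i))
    return found


def rule_holds(premise, conclusion):
    """Association rule premise -> conclusion holds with no counter-example."""
    return extent(premise) <= extent(conclusion)


lattice = concepts()
r1 = rule_holds({"Search Engine"}, {"Web", "IR"})
```

The rule R1 holds because every item described by "Search Engine" (here only i3) is also described by "Web" and "IR"; the converse rule does not hold.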
42
Complementarity of approaches
  • Kohonen SOM
  • Complex weighting scheme
  • Difficulty for precise interpretation
  • Good illustrative power (topographic structure)
  • Good synthesis capabilities
  • Non linearity
  • Lattice
  • High number of classes
  • Memory and time consuming
  • Hierarchical structure
  • Rule extraction
  • Incrementality

43
3-step methodology
Projection
Grouping
Agglomeration
44
Conclusion
  • Cosine method seems to be the best of those tested
  • Good accuracy
  • Well-balanced agglomeration
  • Agglomeration preserves closed areas on SOM
  • Other projection and agglomeration methods have
    to be tested
  • Preservation of partial order and inheritance
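
The transcript does not detail the cosine method, so the following is only one plausible reading of the projection step: each lattice concept is encoded as a binary vector over the properties and assigned to the SOM neuron whose profile has the highest cosine similarity with it. The neuron profiles and the helper names are invented for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors (0.0 if either one is null)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def project_concept(concept_intent, properties, neuron_profiles):
    """Return the neuron whose profile is most similar to the concept intent."""
    vec = [1.0 if p in concept_intent else 0.0 for p in properties]
    return max(neuron_profiles, key=lambda n: cosine(vec, neuron_profiles[n]))


properties = ["AI", "Robots", "Search Engine", "Web", "IR"]
neuron_profiles = {                      # invented profiles of two neurons
    (0, 0): [0.9, 0.8, 0.0, 0.0, 0.0],
    (0, 1): [0.0, 0.0, 0.2, 0.9, 0.9],
}
web_neuron = project_concept({"Web", "IR"}, properties, neuron_profiles)
ai_neuron = project_concept({"AI", "Robots"}, properties, neuron_profiles)
```

Concepts projected onto the same neuron could then be grouped, and neighbouring groups agglomerated, following the three steps named in the methodology.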

45
Perspectives
  • Evaluation on a large corpus, with experts
  • Rule management
  • Class quality evaluation
  • Class labelization
  • Deduction validation on communicating maps
    (lattice extensions)
  • Implementation of an operational prototype

46
Other approaches
  • Multi-classifier cooperation (PhD)
  • SVM
  • Stigmergy
  • Genetic
  • Neural maps
  • On-line learning of user's behaviour,
    intelligent relevance feedback

47
Annexes
  • Topographic inconsistencies
  • Area computation
  • Inter-map communication
  • Activity coherency

48
Topographic inconsistencies
NO INCONSISTENCIES
WEAK INCONSISTENCIES
STRONG INCONSISTENCIES
49
Topographic inconsistencies
GLOBAL
STRONG
Neuron neighbourhood
50
Area computation
WHILE
SO AS
IN
DO
END DO
51
Inter-map communication
52
Viewpoint oriented Patents Analysis
  • Selected viewpoints: title, use, advantages and
    patentees

53
Themes "extending oil life" and "black sludge
control" are strongly linked together because
they are neighbours on the map
"Black sludge" appearance has a negative
impact on the "friction coefficient of oil"
MAP OF VIEWPOINT ADVANTAGES