MICCE - PowerPoint PPT Presentation

About This Presentation
Title:

MICCE

Description:

Flexible architecture for discovering and annotating 'concepts' in unstructured ... Things which appear in similar contexts are semantically similar (even if their ...

Provided by: agi5
Learn more at: http://agiri.org

Transcript and Presenter's Notes

Title: MICCE


1
MICCE
  • Mutual Information Contextual Concept Extractor

Dr. Deborah Duong, Virginia Tech Applied Research Laboratory for National and Homeland Security, dduong_at_vt.edu
Mike Ross, Science Applications International Corp., Integrated Intelligence Solutions Operation, michael.s.ross_at_saic.com
2
What is MICCE?
  • Flexible architecture for discovering and
    annotating concepts in unstructured data (based
    on the work of Lin and Pantel), in development at SAIC
  • Inspired by cognitive principles of perception
  • Applicable to symbol grounding

3
Main Idea
  • Things which appear in similar contexts are
    semantically similar (even if their surface
    features differ)

4
Current Results
  • Applied to 100,000 Reuters News articles from
    1996/1997
  • upturn, decline, slowdown, improvement
  • guilder, crown, penny, franc
  • Bill Clinton, Oscar Luigi Scalfaro, Jacques
    Chirac, Nelson Mandela, Boris Yeltsin,
    Aleksander Kwasniewski, Hosni Mubarak
  • learn, feel, believe, notice, felt, know,
    guess, think
  • slide, dip, gain
  • review, evaluate, study, examine
  • sorry, comfortable, ashamed
  • northeast, northwest, southwest, southeast
  • contingent, pursuant, conditioned, subject,
    conditional
  • certainly, obviously, simply, clearly
  • recently, yesterday

5
CBC Context Feature
  • Text is ingested as sequences of word-stems
  • Parsed by error-prone dependency parser
  • Multiple context types (more can be added)
  • Compute Mutual Information between every word and
    every context.
  • Example: "The cook ate the salad with the onions."
  • Context features of COOK (weight in parentheses):
  • -- EAT (1.0)
  • -- SALAD (1.0)
  • -- ONION (1.0)
  • --subj-- EAT (1.0)
  • --subj-- EAT --obj-- SALAD (1.0)
  • --subj-- EAT --with-- ONION (0.5)
  • --subj-- EAT --obj-- SALAD --with-- ONION (0.5)

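The mutual-information step on this slide can be sketched as pointwise mutual information (PMI) over weighted word/context counts. This is a minimal sketch, not the MICCE implementation: the slides do not give the exact formula, so the PMI definition, the `pmi_table` function, the context strings, and the toy observations are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def pmi_table(observations):
    """Compute PMI for every observed (word, context) pair.

    observations: iterable of (word, context, weight) triples, where weight
    is the fractional count of the context (e.g. 0.5 for an uncertain parse).
    """
    joint = defaultdict(float)
    word_total = defaultdict(float)
    ctx_total = defaultdict(float)
    total = 0.0
    for word, ctx, weight in observations:
        joint[(word, ctx)] += weight
        word_total[word] += weight
        ctx_total[ctx] += weight
        total += weight
    # PMI = log( P(word, ctx) / (P(word) * P(ctx)) )
    return {
        (word, ctx): math.log((count * total) / (word_total[word] * ctx_total[ctx]))
        for (word, ctx), count in joint.items()
    }

# Hypothetical toy data; the context strings are an illustrative encoding.
observations = [
    ("cook", "subj:eat", 1.0),
    ("cook", "subj:eat|with:onion", 0.5),
    ("chef", "subj:eat", 1.0),
    ("price", "subj:fall", 1.0),
]
pmi = pmi_table(observations)
```

A context shared by many words (here `subj:eat`) scores lower than a context that singles out one word (here `subj:fall`), which is why mutual information rather than raw frequency is used to weight features.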
6
CBC Similarity Distance

7
CBC Similarity Distance
  • Clusters are sets of words which appear in
    similar contexts.
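One common way to measure "appearing in similar contexts" is cosine similarity between the words' context-feature vectors. The sketch below assumes that measure; the slides do not state which similarity function MICCE uses, and the feature names and weights are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v[k] for k, x in u.items() if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical mutual-information weights per context feature.
vectors = {
    "guilder": {"obj:devalue": 2.0, "mod:dutch": 1.5, "subj:fall": 1.0},
    "franc": {"obj:devalue": 1.8, "mod:french": 1.4, "subj:fall": 1.2},
    "examine": {"obj:report": 2.2, "subj:committee": 1.1},
}
sim_currency = cosine(vectors["guilder"], vectors["franc"])
sim_unrelated = cosine(vectors["guilder"], vectors["examine"])
```

Words that share heavily weighted contexts score close to 1, and words with no contexts in common score 0, so a clustering step can group the former.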


8
CBC Decomposition
  • Clusters act as basis vectors
  • Example: "plant" decomposes into two clusters:
  • warehouse, factory, facility
  • shrub, flower, bush
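The "plant" example can be sketched as a greedy decomposition: assign the word to its most similar cluster, remove the features that cluster explains, and repeat on what is left. This follows the general idea of sense discovery in clustering-by-committee work, but the `decompose` function, the threshold, the cluster names, and all weights below are assumptions for illustration, not the MICCE code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v[k] for k, x in u.items() if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def decompose(word_vec, centroids, threshold=0.2):
    """Return the clusters (in order of assignment) that explain word_vec."""
    residual = dict(word_vec)
    remaining = dict(centroids)
    senses = []
    while residual and remaining:
        name, score = max(
            ((n, cosine(residual, c)) for n, c in remaining.items()),
            key=lambda pair: pair[1],
        )
        if score < threshold:
            break
        senses.append(name)
        centroid = remaining.pop(name)
        # Drop the features already explained by the chosen cluster.
        residual = {k: x for k, x in residual.items() if k not in centroid}
    return senses

# Hypothetical cluster centroids and word vector.
centroids = {
    "building": {"obj:build": 1.0, "mod:manufacturing": 1.0, "obj:close": 0.8},
    "vegetation": {"obj:water": 1.0, "mod:flowering": 1.0, "obj:grow": 0.8},
    "currency": {"obj:devalue": 1.0, "subj:fall": 1.0},
}
plant = {"obj:build": 0.9, "mod:manufacturing": 1.2, "obj:water": 0.7, "obj:grow": 0.6}
senses = decompose(plant, centroids)
```

Removing the explained features before the next pass is what lets a weaker second sense surface instead of being drowned out by the dominant one.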
9
Feedback? (we hope). Mixing top-down and
bottom-up processing
  • Example: "The cook ate the salad with the onions."
  • Contexts of COOK, with each context word generalized to its cluster:
  • -- EAT, CONSUME, DINE
  • -- SALAD, SOUP, SANDWICH, LETTUCE
  • -- ONION, PICKLE, TOMATO, GARLIC

Underlying data may then be reinterpreted:
with(eat, onion) vs. with(salad, onion)
Concepts are formed from unstructured data
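The feedback idea can be sketched as re-weighting an ambiguous attachment once clusters exist: each candidate reading is scored by how often its cluster-level pattern has been seen. Everything here is hypothetical, including the `reweight_attachments` function, the cluster labels, the counts, and the additive smoothing; the slides present feedback as a hoped-for direction, not a finished mechanism.

```python
def reweight_attachments(candidates, cluster_of, cluster_pair_counts, smoothing=1.0):
    """Return normalized weights for alternative readings of one relation.

    candidates: list of (head, dependent) pairs, e.g. with(eat, onion)
    versus with(salad, onion).
    """
    scores = []
    for head, dependent in candidates:
        key = (cluster_of[head], cluster_of[dependent])
        # Smoothing keeps unseen cluster pairs from getting zero weight.
        scores.append(cluster_pair_counts.get(key, 0.0) + smoothing)
    total = sum(scores)
    return [score / total for score in scores]

# Hypothetical cluster assignments and cluster-level "with" counts.
cluster_of = {"eat": "CONSUME", "salad": "FOOD", "onion": "INGREDIENT"}
cluster_pair_counts = {
    ("FOOD", "INGREDIENT"): 8.0,
    ("CONSUME", "INGREDIENT"): 1.0,
}
weights = reweight_attachments(
    [("eat", "onion"), ("salad", "onion")], cluster_of, cluster_pair_counts
)
```

With these toy counts the reading with(salad, onion) receives most of the weight, replacing an even split between the two readings.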
10
Statistics vs. Cognitive/Neural Models of Perception
  • Statistics: similarity, mutual information,
    vector decomposition, trimming vectors
  • Cognitive/neural models: neuron co-firing,
    novelty/expectedness, hierarchical neural layers,
    forgetting

11
Symbol Grounding
  • In MICCE, concepts are essentially abstract
    descriptions of relationships between
    environmental symbols (percepts).
  • In an AI, these descriptions could be reinforced
    by additional perceptual data, or by homomorphism
    between symbolic structure and conceptual
    structure

Conceptual structure: Cook --agent-- Eat-Event --patient-- Salad
Symbolic structure: cook --subj-- eat --obj-- salad
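The homomorphism between the two structures can be checked mechanically: a mapping of nodes and relation labels is structure-preserving if every edge in the symbolic structure lands on an edge in the conceptual structure. The sketch below assumes a labeled-edge representation; the `is_homomorphism` function and both mappings are illustrative, not part of MICCE.

```python
def is_homomorphism(node_map, label_map, source_edges, target_edges):
    """True if every labeled source edge maps onto an existing target edge."""
    target = set(target_edges)
    return all(
        (node_map[a], label_map[label], node_map[b]) in target
        for a, label, b in source_edges
    )

# Edges are (node, relation, node) triples read left to right from the slide.
symbolic = [("cook", "subj", "eat"), ("eat", "obj", "salad")]
conceptual = [("Cook", "agent", "Eat-Event"), ("Eat-Event", "patient", "Salad")]

# Hypothetical mappings from symbols to concepts and from syntax to roles.
node_map = {"cook": "Cook", "eat": "Eat-Event", "salad": "Salad"}
label_map = {"subj": "agent", "obj": "patient"}

preserved = is_homomorphism(node_map, label_map, symbolic, conceptual)
swapped = is_homomorphism(
    node_map, {"subj": "patient", "obj": "agent"}, symbolic, conceptual
)
```

Here `preserved` is true and `swapped` is false, so only the mapping that sends subjects to agents and objects to patients keeps the structure intact.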