Title: Thesis Committee Meeting
1Thesis Committee Meeting
- Co-supervisor: Edie Rasmussen, Director, SLAIS
- Co-supervisor: Richard Rosenberg, Prof. Em., CS
- David Poole, Prof., CS
- Tamara Munzner, Asst. Prof., CS
20 April 2006
2goals of our meeting
- During the meeting ---
- Review work done
- Decide on work necessary for completion
- Following the meeting ---
- Draw up a document summarizing the plan, to be
approved by all members.
3time line
- According to FoGS ---
- Two committee meetings
- First today to decide on the plan
- Second in December, to review work done and approve write-up
- Degree completed in Spring 2007
4FoGS suggestion
- Given that: It was Mr. Huggett's belief that the work done in his first years of study would constitute in some way a portion of his doctoral thesis.
- 1/3 of thesis: Code development (MemoPlex)
- 1/3 of thesis: The large paper (Principia)
- 1/3 of thesis: to be determined in this meeting
- Thesis should be written in manuscript style.
5The Manuscript Thesis
- Purpose
- To gain writing experience in a format used by researchers in a field of study.
- To ensure timely publication.
- Format
- Constructed around one or more related manuscripts previously published or being prepared for publication.
- An introductory chapter sets the context for the work.
- The concluding chapter relates the manuscript chapters to each other, and proposes directions for future research.
7The Thesis Plan
8TOC
- Part 1 Overview
- Area, motivation, and background
- Part 2 Work Done
- Items of significant work
- Part 3 To Do
- Possible next steps
9Part 1
10what I do
- Information management using associative spreading-activation networks
- Why
- It offers faster and more intuitive access to relevant information
- Why you should care
- It could improve how you access personal information
11dynamic associative networks
- No hierarchies, and link lengths simply indicate
relatedness.
Collins and Loftus, 1975
12P-MAK: the general framework
- Principles of Mnemonic Associative Knowledge
- mnemonic: Relating to or intended to assist the memory
- knowledge: Information organized by human goals
- A set of principles describing how knowledge is constructed
- Knowledge only makes sense wrt human characteristics
- If mechanized, also depends on properties of machines
- Knowledge depends on relations between objects
13what is an object?
- An object is a discrete piece of meaningful information, such as a document, image, or piece of music.
- An object is defined by discrete attributes
- An object has an activation level reflecting its utility.
- Objects may be linked by various types of relation
- P-MAK focuses on the document domain, where an object could be a book, chapter, passage, or article.
- In documents, the object attributes are keywords.
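A minimal sketch of how such an object might be represented in code. The class name `MemoryObject`, its field names, and the example items are hypothetical illustrations, not taken from P-MAK or MemoPlex.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryObject:
    """Hypothetical object: a discrete item described by keyword attributes."""
    name: str                                  # e.g. a document title
    attributes: frozenset = frozenset()        # keywords, in the document domain
    activation: float = 0.0                    # current estimate of utility
    links: dict = field(default_factory=dict)  # relation type -> {neighbour name: weight}

    def link(self, relation, other, weight):
        """Record a weighted, typed relation to another object."""
        self.links.setdefault(relation, {})[other.name] = weight


# Illustrative objects only; the keywords are made up for the example.
paper = MemoryObject("paper A", frozenset({"semantic", "priming", "network"}))
book = MemoryObject("book B", frozenset({"memory", "network", "activation"}))
paper.link("semantic", book, weight=1.0)
```

Keeping links grouped by relation type leaves room for the several types of relation an object may take part in.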
14P-MAK's 3 types of relations
- Semantic
- Objects are linked if they share attributes
- Co-usage
- Objects are linked if they co-occur consistently
- Situational
- Objects are indexed by the time and circumstances in which they occur
- Other ...
15objects + relations = network
- Retrieval from long-term memory, using spreading
activation along weighted links.
unit object element attribute
Anderson, 1983
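Retrieval by spreading activation along weighted links can be sketched as follows. This is a generic textbook-style version under stated assumptions, not the P-MAK or MemoPlex algorithm: the function name, the `decay` and `threshold` parameters, and the step limit are all hypothetical.

```python
def spread_activation(links, sources, decay=0.5, threshold=0.05, max_steps=3):
    """Spread activation outward from source nodes along weighted links.

    links:   {node: {neighbour: weight in [0, 1]}}
    sources: {node: initial activation}
    """
    activation = dict(sources)
    frontier = dict(sources)
    for _ in range(max_steps):
        next_frontier = {}
        for node, energy in frontier.items():
            for neighbour, weight in links.get(node, {}).items():
                passed = energy * weight * decay
                if passed < threshold:
                    continue  # too weak to propagate further
                next_frontier[neighbour] = next_frontier.get(neighbour, 0.0) + passed
        if not next_frontier:
            break
        for node, energy in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + energy
        frontier = next_frontier
    return activation


# Toy network: "a" is the cue; activation reaches "d" through "b".
toy_links = {"a": {"b": 1.0, "c": 0.4}, "b": {"d": 0.8}}
result = spread_activation(toy_links, {"a": 1.0})
```

Nodes can then be ranked by their final activation; the step limit and threshold keep the spread bounded when the network contains cycles.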
16P-MAK's semantic network
- For retrieving similar objects
- Two key processes
- Defining: objects are identified and put in nodes, and a classifier is used to extract descriptive attributes
- Relating: nodes are linked if they share keywords. More shared keywords make stronger links
- Semantic links are static
- Semantic knowledge is cumulative and permanent
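The relating step above can be illustrated with a short sketch. Using the raw count of shared keywords as the link weight is an assumption made for the example; the function name `semantic_links` and the sample documents are hypothetical.

```python
from itertools import combinations


def semantic_links(keywords_by_node):
    """Link every pair of nodes that shares at least one keyword.

    The weight is simply the number of shared keywords (an assumed weighting).
    """
    links = {}
    for a, b in combinations(sorted(keywords_by_node), 2):
        shared = keywords_by_node[a] & keywords_by_node[b]
        if shared:
            links[(a, b)] = len(shared)
    return links


docs = {
    "doc1": {"memory", "network", "retrieval"},
    "doc2": {"network", "retrieval", "graph"},
    "doc3": {"driving", "gps"},
}
doc_links = semantic_links(docs)
```

Here `doc1` and `doc2` share two keywords and are linked with weight 2, while `doc3` shares none and stays unlinked.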
17P-MAK's co-usage network
- For retrieving objects that are typically used together
- Assumed that there is some invisible (semantic) relation between them
- Persistence: links are dynamic
- Links grow stronger the more objects are used together
- Relations that are not stimulated fade and are forgotten
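A toy sketch of dynamic co-usage links, in which joint use strengthens a link and lack of use makes it fade until it is dropped. The additive gain, multiplicative decay, and forgetting floor are hypothetical choices for illustration; the actual P-MAK update rule is not given in these slides.

```python
from itertools import combinations


class CoUsageNetwork:
    """Toy co-usage network: links strengthen with joint use and fade otherwise."""

    def __init__(self, gain=1.0, decay=0.5, floor=0.1):
        self.gain = gain    # weight added each time two objects are used together
        self.decay = decay  # factor applied to every link not used in a session
        self.floor = floor  # links weaker than this are forgotten
        self.weights = {}   # {frozenset({a, b}): weight}

    def observe(self, used_together):
        """Update all links after one session in which these objects co-occurred."""
        stimulated = {frozenset(p) for p in combinations(set(used_together), 2)}
        for pair in list(self.weights):
            if pair not in stimulated:
                self.weights[pair] *= self.decay
                if self.weights[pair] < self.floor:
                    del self.weights[pair]  # forgotten
        for pair in stimulated:
            self.weights[pair] = self.weights.get(pair, 0.0) + self.gain


net = CoUsageNetwork(floor=0.2)
net.observe(["a", "b"])
net.observe(["a", "b"])      # the a-b link is reinforced
for _ in range(4):
    net.observe(["c", "d"])  # a-b is never stimulated again and fades away
```

After the last session the a-b link has decayed below the floor and is gone, while c-d has grown with each use.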
18forgetting is a good thing
- Forgetting is vital in a dynamic system, for ---
- Preventing available information from becoming overwhelming
- Focusing attention on items that have proven importance
- Allowing thematic drift, to stay up to date
19P-MAK's situational network
- For retrieving objects that relate to particular times or circumstances
- Serves a predictive function by retrieving objects when appropriate cues re-occur
- Few computational models of such episodic memory exist (e.g. Laird, Miikkulainen)
- Relations strengthen or fade depending on support
- P-MAK defines ---
- Temporal indexing for temporal events (w/ timers)
- Environmental indexing for spatial events (w/ sensors)
20Temporal Indexing
- Answers
- What usually happens at time t ?
- When is event e likely to occur?
- As events are observed, they are linked to index nodes that contain a conjunction of temporal units.
- Index nodes represent temporal patterns.
- The more common a pattern, the greater its link and node weights.
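The two questions above can be sketched with a small index in which each node is a conjunction of temporal units. Choosing "weekday AND hour" as the conjunction, using raw observation counts as weights, and the class name `TemporalIndex` are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


class TemporalIndex:
    """Toy temporal index: each index node is a conjunction of temporal units."""

    def __init__(self):
        # {(weekday, hour): {event: observation count}}
        self.nodes = defaultdict(lambda: defaultdict(int))

    @staticmethod
    def key(when):
        """Index node for a timestamp (assumed conjunction: weekday AND hour)."""
        return (WEEKDAYS[when.weekday()], when.hour)

    def observe(self, event, when):
        """Link an observed event to its index node, strengthening the link."""
        self.nodes[self.key(when)][event] += 1

    def usual_events(self, when):
        """What usually happens at time t? Events ranked by link weight."""
        counts = self.nodes.get(self.key(when), {})
        return sorted(counts, key=counts.get, reverse=True)

    def likely_times(self, event):
        """When is event e likely to occur? Index nodes ranked by link weight."""
        hits = {k: c[event] for k, c in self.nodes.items() if event in c}
        return sorted(hits, key=hits.get, reverse=True)


index = TemporalIndex()
index.observe("commute", datetime(2006, 4, 17, 8, 15))  # a Monday morning
index.observe("commute", datetime(2006, 4, 24, 8, 5))   # the following Monday
index.observe("coffee", datetime(2006, 4, 24, 8, 40))
```

Querying a later Monday at 08:00 with `index.usual_events(...)` ranks "commute" ahead of "coffee", because the commute pattern has more support.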
21Temporal Indexing
- Events e0 and e1 are indexed by temporal nodes,
and activate objects n0 and n1.
22Environmental Indexing
- Answers
- Under what conditions does event e occur?
- What events are associated with sensor s?
23project goals
- To build information management systems and knowledge structures that are ---
- Simple
- Efficient
- Extensible
- Inspectable
- Human-centred
24Links to cognitive science
- Discrete objects
- Symbolic networks
- Spread of activation
- Co-usage nets: Hebbian learning
- Situational nets: episodic memory
25Why model an IM system on human memory?
- Human memory is clearly effective at representing the statistical regularities of the environment.
- It is much studied and well understood.
- It provides users with a familiar mental model.
- Such systems could operate as cognitive
prostheses by extending human memory.
26Why use networks for knowledge representation?
- they can be built and edited ad hoc (cf. structured DBs)
- they are ideal for sparse domains (esp. semantic)
- they are easily depicted (cf. vector methods)
- they allow graph-theoretic analyses (clustering, arities, small worlds, etc.)
- Associative networks in particular are good for ---
- a human-readable representation (cf. PDP, LSA)
- finding related items quickly
- searching through browsing (navigation)
27Prior art
- A short list of work in this area
- The Memory Extender (Jones 86)
- A Spreading Activation Model for IR (Preece, 81)
- On the Use of Spreading Activation Methods in Automatic Information Retrieval (Salton & Buckley 88)
- IR by Constrained Spreading Activation in SemNets (Cohen & Kjeldsen 87)
- But none is a good fit.
- Using assoc-nets for information retrieval is rare --- but not because it has been proven ineffective; it just doesn't seem to have caught on, compared with neural networks.
28Part 2
29Two main pieces of work
- Principia: A paper that describes the basis of information management using associative spreading-activation networks
- All project products can be described wrt Principia
- MemoPlex: An implementation of an associative information management system, based on a specification written in (Hoos, 2001).
30(1) MemoPlex: an IMS
- Based on an unpublished white paper (Hoos, 2001)
- Information management system using spread of activation for information retrieval
- Items are linked through an associative network
- One link type, initially weighted for similarity
- Node and link strengths decay if not used
31The Plex system
- Starting with a large corpus of documents ---
- Documents are represented by nodes.
- A classifier (here, tf-idf) is used to extract keywords for each document.
- Documents are linked if they share keywords.
- Link strength depends on the number of shared keys.
- Produces a multi-dimensional semantic network, where nodes (documents) are most strongly linked to their most similar peers.
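The keyword-extraction step above can be sketched with a plain tf-idf scorer. The whitespace tokenisation, the textbook weighting tf * log(N / df), the `top_k` cut-off, and the sample corpus are assumptions for illustration; the Plex engine's actual classifier settings are not described here.

```python
import math
from collections import Counter


def tfidf_keywords(corpus, top_k=3):
    """Return the top_k highest-scoring tf-idf terms for each document.

    corpus: {doc_id: text}
    """
    tokens = {doc: text.lower().split() for doc, text in corpus.items()}
    n_docs = len(tokens)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for words in tokens.values() for term in set(words))
    keywords = {}
    for doc, words in tokens.items():
        tf = Counter(words)
        scores = {t: (c / len(words)) * math.log(n_docs / doc_freq[t]) for t, c in tf.items()}
        ranked = sorted(scores, key=lambda t: (-scores[t], t))  # ties broken alphabetically
        keywords[doc] = set(ranked[:top_k])
    return keywords


corpus = {
    "d1": "spreading activation network memory",
    "d2": "spreading activation retrieval memory",
    "d3": "gps driving logs memory",
}
keys = tfidf_keywords(corpus)
shared_d1_d2 = keys["d1"] & keys["d2"]  # number of shared keys gives the link strength
```

A term such as "memory", which occurs in every document, scores zero and never becomes a keyword, so it cannot create spurious links between unrelated documents.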
33MemoPlex evaluation
- Pro ---
- Code base is robust and incrementally improved
- Provides essential network utilities and GUI components for further study
- Can be used as a diagnostic tool
- Con ---
- Theoretical basis is weak
- Semantic and co-usage relations are entangled
- Interface is confusing
- Many irrelevant features (e.g. is also a Web applet)
- Untested
34...but from this starting point...
35(2) AutoPlex
- The first application of spreading activation to spatio-temporal (ST) problems
- Implements and tests ST indexing in the context of automobile-driving behaviour.
- Reads ST data from automotive GPS logs.
- Introduces the idea of ST node aggregation.
37The Plan
- Write up AutoPlex as a systems paper, likely to
be submitted to a conference.
38(3) The Tempora ST paper
- Formalizes and extends lessons learned in the AutoPlex project.
- Introduces the Temporal Subsumption Graph (TSG).
- Formalizes the process of node aggregation.
39Temporal Subsumption Graph (TSG)
- The TSG defines a hierarchy of units; some are aggregable into longer periods (e.g. the definitions of morning and spring).
40Temporal Aggregation (Perfect)
- When an event is evenly represented by an
aggregable level of the TSG.
As Friday is added, Weekday is substituted.
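The Friday/Weekday example can be sketched as below. The TSG fragment, the reading of "evenly represented" as equal observation counts within a tolerance, and the function name `aggregate` are assumptions for illustration, not the formal definition from the Tempora paper.

```python
# Hypothetical fragment of a Temporal Subsumption Graph: parent unit -> child units.
TSG = {
    "Weekday": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
    "Weekend": ["Saturday", "Sunday"],
}


def aggregate(support, tsg=TSG, tolerance=0):
    """Replace a complete, evenly supported set of child units by their parent.

    support: {unit: observation count} for one event.
    """
    result = dict(support)
    for parent, children in tsg.items():
        counts = [result.get(child, 0) for child in children]
        if min(counts) > 0 and max(counts) - min(counts) <= tolerance:
            for child in children:
                del result[child]
            result[parent] = sum(counts)
    return result


mon_to_thu = {"Monday": 3, "Tuesday": 3, "Wednesday": 3, "Thursday": 3}
before = aggregate(mon_to_thu)                    # Friday missing: nothing changes
after = aggregate({**mon_to_thu, "Friday": 3})    # Friday added: Weekday is substituted
```

With Friday missing the four day-level units are kept as they are; once Friday has the same support, the five units collapse into a single Weekday unit.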
41Temporal Aggregation (Partial)
- In the absence of perfect support, an event is represented by a well-formed formula (wff).
Given consistently even support, a wff is substituted.
42The Plan
- Perform experiments on TSGs and aggregation, using
- synthesized event series
- user event logs
- Add results from experiments to the paper.
43(4) The Search & Browse study
- In collaboration w/ Joel Lanir
- Examines whether a similarity network can aid retrieval tasks in large document corpora.
- Corpora of 2000 items consist of recent NYT articles, and Reuters articles from 1987.
- Uses Google Desktop for searching, and the Plex engine for building and navigating the similarity networks.
44The Plan
- Experiment recently finished on 24 subjects (6 per cell).
- Log parser is written.
- Initial indications suggest that there is a positive effect.
- Analysis to begin shortly.
- To be written up as a conference paper.
45Part 3
46How it all fits together
[Diagram: P-MAK and its relation types (semantic, co-usage, situational), shown with test products (MemoPlex application, from Hoos, 2001; sb experiment; AutoPlex application) and papers (Principia, journal; Temporal, conference; two conference papers)]
47Possible next steps
- Principia
- review and submit to the journal Minds and Machines
- AutoPlex
- write up for systems conference or journal
- Tempora
- perform benchmark tests
- write up for conference
- SB study
- perform analysis
- write up for conference
48CONCLUSION
49The benefits of P-MAK
- Unsupervised, real-time learning and unlearning.
- Improves and maintains accuracy over time.
- Captures statistical regularities in the environment, as well as one-time events.
- Can be used to flag outlier events.
- Can be used to mimic and support human memory
reliably.
50Applications include
- User modeling
- Recommender systems
- Social filtering (e.g. in intranets and libraries)
- Situational awareness and decision support
- Robotic episodic memory
- Memory prosthesis (esp. wrt elder care)
51In sum: where we stand
- A principled approach
- A reusable code base
- A wide area of theory and application
- 1 journal paper near submission
- 3 conference papers in the works
52Thank you