Title: Improving intelligent assistants for desktop activities
1. Improving intelligent assistants for desktop activities
- Simone Stumpf, Margaret Burnett, Thomas Dietterich
- Oregon State University, School of Electrical Engineering and Computer Science
2. Overview
- Background
- Activity switching problems
- How to improve activity prediction
- Reducing interruptions
- Improving accuracy
- Conclusion
3. Background: the TaskTracer system
- Intelligent PIM system
- The user organizes everyday life into different activities, each with a set of resources (e.g., teach CS534, the IUI-07 paper)
- How it works:
  - The user indicates the current activity
  - TaskTracer tracks events (file open, etc.)
  - TaskTracer automatically associates resources with the current activity
  - TaskTracer provides useful information-finding services through intelligent assistants
4. Example TaskTracer services
- TaskExplorer: presents a list of resources for each activity, for easier access
- FolderPredictor: predicts the location of resources useful for the current activity
5. Activity switching problems
[Figure: a user switching among resources from two activities (AAAI web page, AAAI PPT, IL local folder, IL netw, IL DOC), annotated with the physical cost of switching (mouse clicks, keypresses) and the cognitive cost (deciding to switch)]
- To provide its services, TaskTracer assumes that users switch activities explicitly, so the data is not too noisy
- TaskPredictor assists by predicting the activity, based on resource use
6. TaskPredictor
- Window-document segment (WDS): an unbroken time period in which the window in focus is showing a single document
- Assumptions:
  - A prediction is only necessary when the WDS changes
  - A prediction is only made if the predictor is confident enough
- Shen et al., IUI 2006
- Source of features: words in window titles, file pathnames, website URLs, (document content)
- Hybrid approach: Naïve Bayes and SVM (sketched below)
- Accuracy: 80% at 10% coverage
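As a minimal sketch of the behavior these bullets describe, the following hypothetical Python predictor only speaks up when confident. The feature handling, the 0.8 threshold, and the split of roles (Naïve Bayes as the confidence gate, SVM as the final classifier) are illustrative assumptions, not details taken from Shen et al.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

class HybridTaskPredictor:
    """Confidence-gated activity predictor (illustrative sketch).

    Features are words from window titles, file pathnames, and
    website URLs of the current window-document segment (WDS).
    """

    def __init__(self, threshold=0.8):        # threshold is illustrative
        self.threshold = threshold
        self.vectorizer = CountVectorizer()
        self.gate = MultinomialNB()            # estimates P(activity | WDS text)
        self.classifier = LinearSVC()          # makes the final activity decision

    def fit(self, wds_texts, activities):
        X = self.vectorizer.fit_transform(wds_texts)
        self.gate.fit(X, activities)
        self.classifier.fit(X, activities)
        return self

    def predict(self, wds_text):
        """Return a predicted activity, or None to stay silent."""
        X = self.vectorizer.transform([wds_text])
        if self.gate.predict_proba(X).max() < self.threshold:
            return None                        # not confident enough: no prediction
        return self.classifier.predict(X)[0]
```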
7. Reducing interruptions
8. Problems in activity prediction
- Physical cost to interact (mouse clicks, keypresses)
- Cognitive cost to interact (deciding to switch)
- The number of potential notifications is still high
- Wait to see whether the user stays on the WDS, to reduce the number of notifications (see the sketch below)
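A simple way to picture the wait-and-see rule is a dwell-time gate: hold each prediction back until the user has remained on the same WDS for some minimum time. This is a sketch of the idea only; the dwell threshold and the API are invented.

```python
import time

class NotificationGate:
    """Suppress notifications for short-lived WDSs (illustrative sketch)."""

    def __init__(self, min_dwell_seconds=10.0):   # threshold is invented
        self.min_dwell = min_dwell_seconds
        self.since = None

    def on_wds_change(self, wds_id):
        """Restart the clock whenever the focused window/document changes."""
        self.current_wds = wds_id
        self.since = time.monotonic()

    def should_notify(self):
        """Notify only once the user has dwelt on the same WDS long enough."""
        if self.since is None:
            return False
        return time.monotonic() - self.since >= self.min_dwell
```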
9. Activity boundaries
[Figure: timeline of one unit of work, "Prepare IL paper": download latest version, open document, edit document, save document, upload latest version]
- Iqbal et al., CHI 2005, 2006
- Interruption costs are lower at boundaries
- Costs are high within a unit
- So what happens if the user does stay on the WDS?
10. Reducing interruptions
- Move from single-window prediction to multiple-window prediction (Shen et al., IJCAI 2007)
- Identify the user costs of making a prediction
- Determine opportunities intelligently
- Trade off user cost against benefit (see the sketch after this list)
- Make predictions at boundaries, then commit changes on user feedback
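One way to make the cost/benefit trade-off concrete is an expected-utility test with a lower interruption cost at activity boundaries (following Iqbal et al.). The function and all the numbers below are illustrative assumptions, not values from the papers.

```python
def should_interrupt(p_correct, benefit, cost_within, cost_boundary, at_boundary):
    """Interrupt only if the expected benefit of a correct activity
    switch outweighs the interruption cost, which is lower at an
    activity boundary than within a unit of work."""
    cost = cost_boundary if at_boundary else cost_within
    return p_correct * benefit > cost

# An 85%-confident prediction worth 10 units of saved effort fires
# at a boundary (cost 3) but stays silent mid-task (cost 12).
print(should_interrupt(0.85, 10, cost_within=12, cost_boundary=3, at_boundary=True))   # True
print(should_interrupt(0.85, 10, cost_within=12, cost_boundary=3, at_boundary=False))  # False
```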
11. Improving accuracy
12. Why improve accuracy?
- 100% accuracy is rare: TaskPredictor and other predictors may make wrong predictions
- Feedback is limited: only labels
- Users know more; can we harness it?
- How can learning systems explain their reasoning to the user?
- What is the user's feedback to the learning system?
- (Stumpf et al., IUI 2007)
13. Pre-study: explanation generation
[Figure: three explanation paradigms, each generated from a trained classifier: Rule-based (from Ripper), Keyword-based (from Naïve Bayes), and Similarity-based]
- Enron farmer-d dataset
- 122 emails, 4 folders (Bankrupt, Enron News, Personal, Resume)
- Explanations were concrete, and simplified but faithful
14. Classification
- Standard Weka implementations (an approximate scikit-learn re-creation is sketched below)
- Stratified 5-fold cross-validation
- Stop-word removal and stemming
- Features: email sender, set of email recipients, words in Subject and Body
- Ripper generates an ordered set of rules
- NB learns weights on words
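For readers without Weka, here is a rough scikit-learn equivalent of the Naïve Bayes arm of this setup. The toy emails are invented stand-ins for the Enron data, scikit-learn ships no Ripper, and its CountVectorizer does stop-word removal but not stemming, so this is only an approximation of the study's pipeline.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented stand-ins for the Enron farmer-d emails: sender, recipients,
# and Subject/Body words flattened into one token stream per email.
emails = [
    "from_jdoe to_farmer subject resume experience qualifications attached",
    "from_chairman to_all subject policy changes company announcement",
    "from_mom to_farmer subject dinner sunday houston",
    "from_counsel to_farmer subject bankruptcy filing creditors",
] * 5
folders = ["Resume", "Enron News", "Personal", "Bankrupt"] * 5

pipeline = make_pipeline(
    CountVectorizer(stop_words="english"),  # stop words removed; no stemming here
    MultinomialNB(),                        # the study's NB; Ripper has no sklearn analogue
)
scores = cross_val_score(
    pipeline, emails, folders,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
print(f"mean accuracy: {scores.mean():.2f}")
```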
15. Rule-based
16. Keyword-based
- The 5 words in the email having the highest positive weight
- The 5 words in the email having the most negative weight
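Given the Naïve Bayes model and vectorizer from the sketch above, the word selection could look like the following. The weight definition (a word's log-probability in the predicted folder minus its mean log-probability elsewhere) is one plausible reading of the slide, not necessarily the paper's exact formula.

```python
import numpy as np

def keyword_explanation(nb, vectorizer, email_vector, folder_idx, k=5):
    """Pick the k words in this email weighing most for, and the k
    weighing most against, the predicted folder (illustrative sketch).
    `email_vector` is the vectorizer's sparse output for one email."""
    words = np.asarray(vectorizer.get_feature_names_out())
    present = email_vector.toarray().ravel() > 0       # words in this email
    log_probs = nb.feature_log_prob_                   # shape: (folders, vocabulary)
    other = np.delete(log_probs, folder_idx, axis=0).mean(axis=0)
    weight = (log_probs[folder_idx] - other)[present]  # assumed weight definition
    order = np.argsort(weight)
    email_words = words[present]
    return list(email_words[order[-k:]][::-1]), list(email_words[order[:k]])
```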
17. Similarity-based
- Shows the training email whose removal from the training set would most decrease the prediction
- Up to 5 words in both emails having the highest weights
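The most influential training email can be found by brute force, retraining once per left-out email, which is cheap at 122 emails. A sketch, assuming the scikit-learn stand-ins above; "most decrease the prediction" is read here as the drop in the predicted folder's probability.

```python
import numpy as np
from sklearn.base import clone

def most_influential_email(nb, X_train, y_train, x_new, folder_idx):
    """Index of the training email whose removal most decreases the
    predicted folder's probability (brute-force leave-one-out sketch)."""
    y_train = np.asarray(y_train)
    full = clone(nb).fit(X_train, y_train).predict_proba(x_new)[0, folder_idx]
    drops = []
    for i in range(X_train.shape[0]):
        keep = np.arange(X_train.shape[0]) != i
        # Assumes every folder keeps at least one email after removal,
        # so the class ordering in predict_proba does not change.
        p = clone(nb).fit(X_train[keep], y_train[keep]).predict_proba(x_new)[0, folder_idx]
        drops.append(full - p)
    return int(np.argmax(drops))
```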
18. Within-subject study design
[Figure: study schedule: three 15-minute sessions, one per explanation paradigm]
19. Giving feedback
- Participants were asked to provide feedback to improve the predictions
- No restrictions on the form of feedback
20. Responses to explanations
- Negative comments (20)
  - "Those are arbitrary words."
- Confusion (8)
  - "I don't understand why there is a second email."
- Positive comments (19)
  - "The Resume rules are good."
- Understanding (17)
  - "I see why it used Houston as negative."
- Correcting or suggesting changes (32)
  - "Different words could have been found in common, like Agreement, Ken Lay."
21. Understanding explanations
- Rule-based understood best, then Keyword-based
- Serious problems with Similarity-based
- Factors:
  - General idea of the algorithm: "I guess it went in here because it was similar to another email I had already put in that folder."
  - The Keyword-based explanations' negative keyword list: "I guess I really don't understand what it's doing here. If those words weren't in the message?"
  - Word choices and their topical appropriateness: "Day, soon, and listed are incredibly arbitrary keywords."
22. Preferring explanations
- The preference trend follows understanding
- Factors:
  - Perceived reasoning soundness and accuracy: "I think this is a really good filter."
  - Clear communication of reasoning: "I like this because it shows relationships between other messages in the same folder rather than just spitting out a bunch of rules with no reason behind it."
  - Informal wording: "This is funny... (laughs)... This seems more personable. Seems like a narration rather than just straight rules. It's almost like a conversation."
23. The user explains back
- Select different features (53): "It should put email in Enron News if it has the keywords changes and policy."
- Adjust weights (12): "The second set of words should be given more importance."
- Parse/extract in a different way (10): "I think that it should look for typos in the punctuation for indicators toward Personal."
- Employ feature combinations (5): "I think it would be better if it recognized a last and a first name together."
- Use relational features (4): "This message should be in EnronNews since it is from the chairman of the company."
24. Underlying knowledge sources
- Commonsense (36): "Qualifications would seem like a really good Resume word, I wonder why that's not down here."
- English (30): "Does the computer know the difference between resumé and resume?"
- Domain (15): "Different words could have been found in common, like Ken Lay."
25. Current work
- More than 50% of the suggestions could be easily incorporated
- New algorithms to handle changes to weights and keywords (one simple scheme is sketched below):
  - User feedback as constraints on the MLE of the parameters
  - Co-training
- Investigating the effects on accuracy using the study data:
  - Constraints: not hurting, but not much improvement either
  - The co-training approach is better
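As a crude stand-in for the constrained-MLE idea, feedback like "word X matters for folder Y" can be folded into a trained Naïve Bayes model as extra pseudo-counts. This sketch pokes at scikit-learn internals and the boost value is invented; the actual algorithms treat feedback as constraints during parameter estimation.

```python
import numpy as np

def apply_keyword_feedback(nb, vectorizer, word, folder, boost=5.0):
    """Fold the feedback "`word` matters for `folder`" into a fitted
    MultinomialNB by adding pseudo-counts (illustrative sketch only)."""
    w = vectorizer.vocabulary_[word]
    c = list(nb.classes_).index(folder)
    nb.feature_count_[c, w] += boost          # boost value is invented
    # Re-derive log P(word | folder) from the adjusted, smoothed counts.
    counts = nb.feature_count_ + nb.alpha
    nb.feature_log_prob_ = np.log(counts / counts.sum(axis=1, keepdims=True))
    return nb
```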
26. Conclusion
- User costs important
- Higher accuracy
- Timing of prediction notifications
- Usefulness of predictions
- Explanations of why a prediction was made