Title: Interactive Object Recognition Using
1Interactive Object Recognition Using Proprioceptive Feedback
Taylor Bergquist, Connor Schenck, Ugonna Ohiri, Jivko Sinapov, Shane Griffith and Alexander Stoytchev
Developmental Robotics Lab, Iowa State University, Ames, IA, U.S.A.
E-mail: {knexer, cschenck, ucohiri, jsinapov, shaneg, alexs} _at_ iastate.edu
2What is Proprioception?
- "It is the sense that indicates whether the body is moving with required effort, as well as where the various parts of the body are located in relation to each other." (Wikipedia)
3The Importance of Proprioception
Empty
Full
4The Importance of Proprioception
Hard
Soft
5Exploratory Behaviors in Children
6Lifting: Weight, Gravity, Effort
http://www.subflux.com/blog/index.php?cat=8
7Shaking: Weight, Inertia, Contents
8Dropping: Gravity and Physics
9Pushing: Physics and Objects
10Crush: Compliance, Flexibility
11Five Exploratory Behaviors
Lift
Crush
Shake
Push
Drop
12Robot Platform
13Objects Used in the Experiments
- 50 household objects
- Different materials: metal, paper, plastic, wood, etc.
- Some objects have contents inside of them (e.g., pill bottle)
- All are graspable by the Barrett Hand
15Experiment Scale
- 50 objects
- 5 exploratory behaviors
- 10 repetitions
- 50 × 5 × 10 = 2500 joint torque records
16Torque Data Preprocessing
- Joint torque data was sampled and recorded at 500 Hz using the robot's API
- The raw data was filtered to remove outliers
Joints J1 to J7
17Feature Extraction
- Joint torque data (500 samples/second in $\mathbb{R}^7$)
- Some way is needed to compress the data
- Discretize to obtain a sequence $P_i$ of tokens from a finite alphabet
Joints J1 to J7 → SOM
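The discretization step above can be sketched as a nearest-node lookup: each 7-dimensional torque sample is replaced by the index of the closest SOM node, and the indices form the token sequence. This is a minimal sketch, not the authors' implementation; the names `nearest_node` and `discretize`, the Euclidean distance, and the toy two-node codebook are all assumptions made for illustration.

```python
import math

def nearest_node(sample, codebook):
    """Index of the codebook vector closest to `sample` (Euclidean distance, assumed)."""
    return min(range(len(codebook)), key=lambda j: math.dist(sample, codebook[j]))

def discretize(record, codebook):
    """Map a torque record (a list of 7-D samples) to a sequence of node indices."""
    return [nearest_node(sample, codebook) for sample in record]

# Hypothetical codebook with two nodes; a trained SOM would supply the real one.
codebook = [[0.0] * 7, [1.0] * 7]
record = [[0.1] * 7, [0.9] * 7, [0.8] * 7]
tokens = discretize(record, codebook)
print(tokens)
```

The size of the token alphabet equals the number of SOM nodes, so a larger map gives a finer but sparser discretization.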
18Training the Self-Organizing Map
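As a rough illustration of how a self-organizing map can be fitted to sampled joint torques, the sketch below uses the standard online update: pick a sample, find its best-matching unit, and pull that unit and its grid neighbors toward the sample. The grid size, linear decay schedules, Gaussian neighborhood, and the function name `train_som` are assumptions; the slides do not state the settings actually used.

```python
import math
import random

def train_som(samples, rows, cols, iters=1000, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM; returns one weight vector per grid node (row-major)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # Initialise each node from a randomly chosen training sample (assumed scheme).
    weights = [list(rng.choice(samples)) for _ in range(rows * cols)]
    for t in range(iters):
        frac = t / iters
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood radius shrinks, floored at 0.5
        x = rng.choice(samples)
        # Best-matching unit: the node whose weights are closest to the sample.
        bmu = min(range(rows * cols), key=lambda j: math.dist(x, weights[j]))
        br, bc = divmod(bmu, cols)
        for j, w in enumerate(weights):
            r, c = divmod(j, cols)
            grid_d2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighborhood on the grid
            for k in range(dim):
                w[k] += lr * h * (x[k] - w[k])
    return weights

# Toy data: two clusters of 7-D "torque" samples.
samples = [[0.0] * 7 for _ in range(20)] + [[1.0] * 7 for _ in range(20)]
som = train_som(samples, rows=2, cols=2, iters=200)
```

Because every update moves a node part of the way toward a training sample, the node weights always stay inside the range spanned by the data.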
19Problem Formulation
Proprioceptive sequence
Object Recognition Model
20Predictive Models
- k-NN and global alignment
  - Emphasizes temporal structure of the proprioceptive sequences
- Multinomial Naïve Bayes and n-gram
  - Emphasizes distributional structure of the sequences
21k-NN
- k-NN: memory-based learning algorithm
- Example with k = 3: the test point (?) has 2 red neighbors and 1 blue neighbor. Therefore, Pr(red) = 2/3 and Pr(blue) = 1/3
- Uses Needleman-Wunsch global alignment as a measure of similarity between two sequences
  - Navarro, G., "A guided tour to approximate string matching," ACM Computing Surveys, v.33, 2001
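The combination of k-NN with global alignment can be sketched as follows: Needleman-Wunsch scores how well two token sequences align end to end, and the k training sequences with the highest score vote on the label. The scoring values (match +1, mismatch -1, gap -1), the function names, and the toy sequences are assumptions for illustration; the slides do not give the parameters used in the experiments.

```python
from collections import Counter

def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score between sequences a and b."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]

def knn_predict(query, training, k=3):
    """training: list of (sequence, label). Returns {label: fraction of the k nearest}."""
    ranked = sorted(training, key=lambda item: nw_score(query, item[0]), reverse=True)
    top = ranked[:k]
    votes = Counter(label for _, label in top)
    return {label: n / len(top) for label, n in votes.items()}

# Toy token sequences (hypothetical SOM node indices).
training = [
    ([1, 2, 3, 4], "red"),
    ([1, 2, 3, 5], "red"),
    ([1, 2, 9, 9], "blue"),
    ([9, 9, 9, 9], "blue"),
]
probs = knn_predict([1, 2, 3, 4], training, k=3)
print(probs)
```

With these toy sequences the three nearest neighbors are two "red" and one "blue", matching the k = 3 example on the slide.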
22Multinomial Naïve Bayes (N-grams)
Sample Sequence
2-grams
1-grams
23Multinomial Naïve Bayes (2-grams)
Sample Sequence
2-grams
where $n(w_t, d_i)$ is the number of occurrences of word $w_t \in V$ in the feature vector $d_i$.
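To make the n-gram model concrete, here is a minimal sketch of a multinomial Naïve Bayes classifier in which the "words" are n-grams of SOM tokens and the feature vector holds their counts. The class name `NGramNB`, the Laplace smoothing constant `alpha`, and the toy sequences are assumptions; the exact estimator on the slide did not survive extraction, so this follows the standard textbook formulation rather than the authors' code.

```python
import math
from collections import Counter, defaultdict

def ngrams(seq, n):
    """All contiguous n-grams of a token sequence, as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

class NGramNB:
    def __init__(self, n=2, alpha=1.0):
        self.n, self.alpha = n, alpha
        self.counts = defaultdict(Counter)  # label -> n-gram counts over its training data
        self.docs = Counter()               # label -> number of training sequences
        self.vocab = set()                  # all n-grams seen in training

    def fit(self, sequences, labels):
        for seq, label in zip(sequences, labels):
            grams = ngrams(seq, self.n)
            self.counts[label].update(grams)
            self.docs[label] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, seq):
        """Unnormalised log posterior for each label (class prior + smoothed likelihood)."""
        total_docs = sum(self.docs.values())
        query = Counter(ngrams(seq, self.n))
        scores = {}
        for label in self.docs:
            denom = sum(self.counts[label].values()) + self.alpha * len(self.vocab)
            score = math.log(self.docs[label] / total_docs)
            for gram, n_occurrences in query.items():
                # n_occurrences plays the role of n(w_t, d_i) for the query sequence.
                score += n_occurrences * math.log((self.counts[label][gram] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)

model = NGramNB(n=2).fit([[1, 1, 2, 1, 1, 2], [3, 4, 3, 4, 3]], ["a", "b"])
print(model.predict([1, 1, 2]), model.predict([3, 4, 3]))
```

Because only n-gram counts are used, two sequences with the same local patterns in a different order receive similar scores, which is the distributional emphasis noted earlier.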
25Evaluation
- Ten-fold cross validation
- 2500 total interactions, 250 per fold
- Evaluated on recognition accuracy, where
  $\text{Accuracy} = \frac{\text{correct predictions}}{\text{total predictions}} \times 100$
- Chance accuracy = 1/50 = 2%
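The evaluation arithmetic can be checked with a short sketch. The accuracy and chance formulas follow the slide; the interleaved fold assignment in `fold_indices` is an assumption, since the slides do not say how interactions were distributed across the ten folds.

```python
def accuracy(predicted, actual):
    """Percentage of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

def chance_accuracy(n_objects):
    """Accuracy (in percent) of guessing uniformly among n_objects."""
    return 100.0 / n_objects

def fold_indices(n_items, n_folds=10):
    """Yield (train, test) index lists; item i goes to fold i % n_folds (assumed split)."""
    for f in range(n_folds):
        test = [i for i in range(n_items) if i % n_folds == f]
        train = [i for i in range(n_items) if i % n_folds != f]
        yield train, test

folds = list(fold_indices(2500))
print(len(folds), len(folds[0][1]), chance_accuracy(50))
```

With 2500 interactions and ten folds, each model is trained on 2250 interactions and tested on the remaining 250.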
26Recognition from a Single Behavior
27Recognition from Multiple Behaviors
- How to combine predictions from multiple behaviors?
  - Assume that all behaviors are equally useful
  - Weight behaviors according to their accuracy
28Object Recognition Results
- What happens when the robot uses information
from multiple interactions with the same object?
29Multimodal Recognition
Grid: Behaviors (Single, Multiple) × Modalities (Single, Multiple)
30Multimodal Recognition
Grid: Behaviors (Single, Multiple) × Modalities (Single, Multiple)
- This paper (proprioception only)
31Multimodal Recognition
Grid: Behaviors (Single, Multiple) × Modalities (Single, Multiple)
- Follow-up paper (proprioception + audio): Interactive Object Recognition Using Proprioceptive and Auditory Feedback. Submitted to IEEE Robotics and Automation Magazine (under review).
32Audio Data (for the same dataset)
- Audio data was recorded during data collection
and transformed into spectrograms
Raw Sound
Discrete Fourier Transform
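The step from raw sound to spectrogram can be illustrated with a small pure-Python sketch: the signal is cut into overlapping frames and the magnitude of the discrete Fourier transform is computed for each frame. The frame length, hop size, and function names are assumptions, and the direct O(n²) DFT is used only to keep the example dependency-free; a practical pipeline would use an FFT routine.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0..n/2 for one real-valued frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def spectrogram(signal, frame_len=64, hop=32):
    """List of per-frame magnitude spectra (time along the outer list)."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    return [dft_magnitudes(signal[s:s + frame_len]) for s in starts]

# Toy signal: a sine wave with exactly 8 cycles per 64-sample frame.
signal = [math.sin(2 * math.pi * 8 * t / 64) for t in range(256)]
spec = spectrogram(signal)
peak_bin = max(range(len(spec[0])), key=spec[0].__getitem__)
print(len(spec), len(spec[0]), peak_bin)
```

Each row of the result is one frequency distribution, which is the kind of vector the auditory SOM would be trained on.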
33Multimodal Training (Two SOMs)
Training a self-organizing map (SOM) using
sampled joint torques
Training an SOM using sampled frequency
distributions
34Multimodal Feature Extraction
Discretization of joint-torque records using a trained SOM: the result is the sequence of activated SOM nodes over the duration of the interaction
Discretization of the DFT of a sound using a trained SOM: the result is the sequence of activated SOM nodes over the duration of the sound
35Multimodal Recognition
Audio sequence
Proprioception sequence
Proprioceptive Recognition Model
Auditory Recognition Model
Weighted Combination
36Multimodal Recognition Results
37Related Work
- Audition
  - Kubus, Kröger, and Wahl, 2007 (3 objects)
  - Richmond and Pai, 2000 (4 objects)
  - Torres-Jara, Natale, and Fitzpatrick, 2005 (4 objects)
  - Sinapov, Weimer, Stoytchev, ICRA 2009 (36 objects)
- Proprioception
  - Natale, Metta, and Sandini, 2004 (7 objects)
38Accuracy vs. Number of Objects
39Multiple Modalities Multiple Behaviors
40Conclusions and Future Work
- Conclusions
  - Robots can and should use proprioception as a source of information about the world
  - Better results can be obtained by combining multiple interactions and multiple modalities
- Future Work
  - More complex behaviors and more objects
  - Integrate proprioception with more modalities (vision, haptics, etc.)
41Take Home Message
- More objects: lower recognition accuracy
- More behaviors: higher recognition accuracy
- More modalities: higher recognition accuracy
42Thank you
Any questions?
43THE END
44Object Recognition Results
- The dotted lines show the best and worst case
for the weighted and unweighted combination.
45Recognition from Multiple Behaviors
- How to combine predictions from multiple behaviors?
- Previous work (Sinapov, Weimer, Stoytchev, ICRA 2009)
  - Assume that all behaviors are equally useful
  - Choose the object $O_i$ that maximizes the combination, over all behaviors $B$, of $P(O_i \mid P_B)$, where $B$ is an exploratory behavior performed on the object and $P_B$ is the proprioceptive sequence it produced
- This assumption fails to hold for this work
  - Weight behaviors according to accuracy instead
  - Choose $O_i$ to maximize the combination, over all behaviors $B$, of $P(O_i \mid P_B)$ weighted by $w_B$, where $w_B$ is the estimated reliability of the model given a sequence from behavior $B$
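A minimal sketch of the two combination schemes is given below. The combination operator on the slide did not survive extraction, so the weighted sum used here is an assumption (a weighted product is another possibility); the function name `combine_behaviors`, the probabilities, and the weights are hypothetical values chosen only to show how weighting can change the decision.

```python
def combine_behaviors(predictions, weights=None):
    """predictions: {behavior: {object: P(object | sequence from that behavior)}}.

    Returns (best_object, scores). With weights=None every behavior counts equally;
    otherwise each behavior's estimate is scaled by its reliability weight.
    The weighted sum is an assumed combination rule, not confirmed by the slides.
    """
    objects = set().union(*(p.keys() for p in predictions.values()))
    scores = {}
    for obj in objects:
        scores[obj] = sum(
            (weights[b] if weights else 1.0) * p.get(obj, 0.0)
            for b, p in predictions.items()
        )
    return max(scores, key=scores.get), scores

# Hypothetical per-behavior estimates for two candidate objects.
predictions = {
    "lift": {"cup": 0.6, "box": 0.4},
    "shake": {"cup": 0.3, "box": 0.7},
}
unweighted_best, _ = combine_behaviors(predictions)
weighted_best, weighted_scores = combine_behaviors(predictions, {"lift": 0.9, "shake": 0.2})
print(unweighted_best, weighted_best)
```

In this toy case the unweighted rule picks "box", while giving more weight to the more reliable "lift" model switches the decision to "cup".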