Title: Multimodal Input for Meeting Browsing and Retrieval Interfaces: Preliminary Findings
Agnes Lisowska, Susan Armstrong
ISSCO/TIM/ETI, University of Geneva
IM2.HMI
The Problem
- Many meeting-centered projects have resulted in databases of meeting data - but
- How can a real-world user best exploit this data?
Mouse-keyboard vs. Multimodal Input
- The Web offers similar media (video, pictures, text, sound), and we are used to manipulating them with keyboard and mouse - but
- the multimedia meeting domain is novel
- interesting information is found across media in the database - so
- multimodal interaction could be the most efficient way to exploit this cross-media information
The Archivus System
- Designed based on
- a user requirements study
- the data and annotations available in the IM2 project
- Flexibly multimodal
- the system can be studied with minimal a priori assumptions about interaction modalities
- Input
- pointing: mouse, touchscreen
- language: voice, keyboard
- freeform questions allowed, but not a QA system
- Output
- text, graphics, video, audio
The Archivus Interface
Experiment Scenario
- Scenario
- the user is a new employee who must do some fact finding and checking for their boss
- Task
- answer a series of short-answer questions ("Who attended all of the meetings?") and true/false questions ("The budget was 1000 CHF")
- 21 questions in total
- the ordering of questions is varied (4 different tasks)
- alternated starting with true/false or short-answer questions
- done in the lab, not in the field
Experiment Methodology: Wizard of Oz
- What it is
- the user interacts with what they think is a fully functioning system, but a human is actually controlling the system and processing the (language) input
- Why
- allows experimenting with natural language input without having to implement speech recognition (SR) and NLP
- Data collected
- video and audio
- the user's face (reaction to the system)
- the user's input devices
- the user's screen
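The Wizard of Oz loop described above can be sketched as follows. All class and method names here are hypothetical, invented purely to illustrate the idea: language input is routed to a human "wizard" instead of a real SR/NLP pipeline, while the user believes they are using a fully functioning system.

```python
# Minimal Wizard-of-Oz sketch (hypothetical names): natural-language
# input is forwarded to a human operator who simulates the
# language-understanding component before it is implemented.

class Wizard:
    """Human operator standing in for the SR/NLP modules."""

    def interpret(self, utterance: str) -> dict:
        # In a real session the wizard reads the utterance and triggers
        # the matching system action; here we fake a single mapping.
        if "attended" in utterance:
            return {"action": "search", "field": "participants"}
        return {"action": "unknown"}


class Frontend:
    """What the user believes is a fully functioning system."""

    def __init__(self, wizard: Wizard):
        self.wizard = wizard
        self.log = []  # every interaction is recorded for later analysis

    def handle_input(self, modality: str, content: str) -> dict:
        self.log.append((modality, content))
        if modality in ("voice", "keyboard"):  # language input -> wizard
            return self.wizard.interpret(content)
        return {"action": "click", "target": content}  # pointing input


frontend = Frontend(Wizard())
result = frontend.handle_input("voice", "Who attended all of the meetings?")
print(result)  # {'action': 'search', 'field': 'participants'}
```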
Experiment Environment
- User's room
- PC with speakers
- wireless mouse, keyboard
- touchscreen
- 2 cameras
- recording equipment
- Wizard's room
- NL processing simulation
- view of the user
- view of the user's screen
Procedure
- Pre-experiment questionnaire (demographic information), consent form
- Read scenario description and software manual
- Phase 1: 20 minutes
- subset of modalities
- 11 questions (5 true/false, 6 short answer)
- Phase 2: 20 minutes
- all modalities
- 10 questions (5 true/false, 5 short answer)
- Post-experiment questionnaire and interview (time permitting)
Experiment
- Participants
- 24 in total: 11 female, 13 male
- mostly non-native English speakers
- different levels of computer experience
- 4 modalities used
- mouse (M), voice (V), keyboard (K), and touchscreen (T)
- 8 Phase 1 conditions
- M, T, V, MK, VK, TVK, MVK, MTVK
- Experiment was conducted between-subjects, with 3 subjects per condition
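The between-subjects design above (8 Phase 1 conditions, 3 participants each, 24 participants total) can be sketched as a simple assignment table. The participant labels are illustrative only:

```python
# Sketch of the between-subjects assignment: 8 Phase 1 modality
# conditions, 3 participants per condition, 24 participants total.
import itertools
from collections import Counter

conditions = ["M", "T", "V", "MK", "VK", "TVK", "MVK", "MTVK"]
participants = [f"P{i:02d}" for i in range(1, 25)]  # P01 .. P24

# Cycle through the conditions so each receives exactly 3 participants.
assignment = {p: c for p, c in zip(participants, itertools.cycle(conditions))}

counts = Counter(assignment.values())
print(counts["MTVK"])  # 3
```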
What We Were Looking At
- Task completion
- Which modalities result in the most success?
- Learning effect
- Does learning with a novel modality encourage its use later on?
- Number of interactions
- Are users equally active with functionally equivalent modalities?
Task Completion
- Expectation
- mouse-keyboard would be most efficient
- Result
- mouse-keyboard was on par with voice-only, TVK, and MVK
- mouse-only, all modalities (MTVK), and touchscreen-only were best

Table 1. All answers found in Phase 1
Task Completion
- In the mouse-only and touchscreen-only conditions the user can only make correct moves
- this is not the case when voice interaction is involved
- Touchscreen-only was worse than mouse-only
- lower pointing accuracy with the touchscreen
- blocking effect due to unfamiliarity with the touchscreen
- similar results for MVK and TVK
- Combining voice with other modalities does add value to the interaction
Learning Effect
- Expectation: use of novel modalities in Phase 1 increases the likelihood of their use in Phase 2

Table 2. Number of interactions in Phase 2

- More voice use in Phase 2 of the mouse-only condition than of the voice-only condition
- If given familiar modalities in Phase 1, users are more likely to explore new modalities in Phase 2
Learning Effect
- The lack of a learning effect could be caused by
- an unconscious need to feel comfortable with the system and input modalities at early stages of interaction
- Comfort can manifest in two ways
- with the system itself (same for all users)
- knowing what the graphics represent, what type of information is available, and where it can be found
- with the interaction methods (differs across conditions)
- knowing what input modalities are available
- The system is slower with voice
Number of Interactions: Pointing Modalities
- Only functionally equivalent modalities were compared
- Mouse vs. touchscreen
- users were more active with the mouse than with the touchscreen
- similarly for MVK and TVK
- Comfort and/or blocking effects are factors
- users quickly learn strategies with mouse and touchscreen

Table 3a. Modality interactions per condition and phase
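Per-modality interaction counts like those summarized in Table 3a can be tallied from interaction logs. A minimal sketch, assuming (hypothetically) that each logged event is a (phase, modality) pair:

```python
# Tally interactions per (phase, modality) pair from a hypothetical
# interaction log, as needed for comparisons such as mouse vs.
# touchscreen activity across the two experiment phases.
from collections import Counter

log = [
    (1, "mouse"), (1, "mouse"), (1, "touchscreen"),
    (2, "mouse"), (2, "voice"), (2, "voice"), (2, "keyboard"),
]

counts = Counter(log)  # keys are (phase, modality) pairs
print(counts[(2, "voice")])  # 2
```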
Number of Interactions: Voice and Keyboard
- A novel input modality on its own is easier to learn than a novel modality plus the less frequently used half of the traditional mouse-keyboard paradigm

Table 3b. Modality interactions per condition and phase

- Keyboard use increases when the mouse becomes available - but
- in Phase 2 of the MK condition, voice is used almost twice as much as the keyboard, despite the continued high use of mouse input

Table 4. Voice vs. keyboard interactions in Phase 1 and 2 of the VK condition
Conclusions and Future Work
- Encouraging results
- users can be encouraged to use voice
- particularly in combination with other, more familiar modalities
- the blocking effect can be reduced
- especially if all modalities are available at once
- Results achieved despite
- a high learning curve
- a small number of participants
- New experiments planned with
- a new version of the system and WOz environment, a tablet PC, a tutorial, and more users (10 per condition)