CAMEO: Year 1 Progress and Year 2 Goals - PowerPoint PPT Presentation

About This Presentation

Title:

CAMEO: Year 1 Progress and Year 2 Goals

Description:

Title: PowerPoint Presentation Author: WSE Created Date: 5/7/2002 1:59:17 PM Document presentation format: On-screen Show Other titles: Times New Roman Courier New ... – PowerPoint PPT presentation

Slides: 26

Provided by: WSE9142

Learn more at: http://www.cs.cmu.edu

Transcript and Presenter's Notes

Title: CAMEO: Year 1 Progress and Year 2 Goals

1
CAMEO: Year 1 Progress and Year 2 Goals

Manuela Veloso, Takeo Kanade,
Fernando de la Torre, Paul Rybski, Brett Browning,
Raju Patil, Carlos Vallespi, Betsy Ricker

2
CAMEO Internals
3
CAMEO's Connection to other CALO Agents
CAMEO is an example of a physical event capture
system. Systems such as these transmit state
information about people to the CALO timeline
server.
Individualized CALO agents can access this
information to obtain updates about their
individual users.
4
Inferring Meeting State with CAMEO: Overview

CAMEO observes the activities of people in a meeting.
Raw visual motion is segmented into discrete
actions.
High-level meeting state is inferred from the
aggregate actions of the group.

5
Training CAMEO to Recognize Human Actions
6
Action Recognition
Person action sequences are represented as a
simple finite state machine.
Person Action State Machine
State transitions are encoded in a dynamic
Bayesian network which infers the current person
state as a function of observed human activity
and previous state.
Dynamic Bayesian Network
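
The state machine and Bayesian update described on this slide can be sketched as a discrete forward filter. This is a minimal illustration, not CAMEO's implementation: the four state names are an assumed reading of the plot labels (Sit, Standing, Stand, Sitting), and the observation symbols (`still`, `up`, `down`) and every probability below are made-up placeholders. In CAMEO the parameters are learned from recorded meeting data.

```python
# Hypothetical person states: two stable postures and two transitions.
STATES = ["sit", "standing_up", "stand", "sitting_down"]

# Assumed P(next state | previous state); values are illustrative only.
TRANSITION = {
    "sit": {"sit": 0.9, "standing_up": 0.1},
    "standing_up": {"standing_up": 0.5, "stand": 0.5},
    "stand": {"stand": 0.9, "sitting_down": 0.1},
    "sitting_down": {"sitting_down": 0.5, "sit": 0.5},
}

# Assumed P(observed vertical motion | state); values are illustrative only.
EMISSION = {
    "sit": {"still": 0.8, "up": 0.1, "down": 0.1},
    "standing_up": {"still": 0.1, "up": 0.8, "down": 0.1},
    "stand": {"still": 0.8, "up": 0.1, "down": 0.1},
    "sitting_down": {"still": 0.1, "up": 0.1, "down": 0.8},
}


def filter_step(belief, observation):
    """One update: predict from the previous belief, then weight by the observation."""
    predicted = {state: 0.0 for state in STATES}
    for previous, p_previous in belief.items():
        for following, p_transition in TRANSITION[previous].items():
            predicted[following] += p_previous * p_transition
    weighted = {s: predicted[s] * EMISSION[s][observation] for s in STATES}
    total = sum(weighted.values())
    return {s: value / total for s, value in weighted.items()}


def classify(observations, initial_state="sit"):
    """Return the most probable state after each observation."""
    belief = {s: 1.0 if s == initial_state else 0.0 for s in STATES}
    labels = []
    for observation in observations:
        belief = filter_step(belief, observation)
        labels.append(max(belief, key=belief.get))
    return labels
```

Calling `classify(["still", "up", "up", "still"])` yields one label per observation. Because the current belief depends on the previous belief as well as the new observation, a single noisy motion reading does not immediately flip the inferred state.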
7
Classification of Person State in a Meeting
Example of person state classification: here,
the states of a person are correctly classified
by the Bayesian network. The parameters of the
activity data are learned from previously recorded
meeting data.
[Figure: person state (Standing, Stand, Sitting, Sit) plotted against time in seconds]
8
Classification of the Meeting State
Global meeting state is defined by the aggregate
activities of every person attending the meeting.
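
As a rough sketch of how aggregate person states could map to a global meeting state, the function below applies two hand-written rules. The rules, the state names, and the `(posture, is_speaking)` representation are all assumptions chosen for illustration; the slides do not specify how CAMEO combines individual states.

```python
def infer_meeting_state(people):
    """Map per-person states to a global meeting state.

    `people` maps a name to a (posture, is_speaking) pair.
    The rules below are illustrative assumptions, not CAMEO's model.
    """
    if not people:
        return "No Meeting"
    standing_speakers = [
        name
        for name, (posture, is_speaking) in people.items()
        if posture == "stand" and is_speaking
    ]
    # Assumed rule: exactly one person standing and speaking means a presentation.
    if len(standing_speakers) == 1:
        return "Presentation"
    return "General Discussion"
```

For example, `infer_meeting_state({"Jim": ("stand", True), "Wendy": ("sit", False)})` returns `"Presentation"` under these rules.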
9
Generating Meeting Summary

Meeting event log becomes summary
Low and high-level events can be organized into a
hierarchy
Meeting can be viewed at any requested level of
detail from summary to captured video (and
eventually audio)

2004-02-03 Project Status Report
13:04:05 Meeting Start
13:12:12 General Discussion
13:19:45 Presentation
13:24:23 General Discussion
13:29:29 Meeting End

10
Generating Meeting Summary

2004-02-03 Project Status Report
13:04:05 Meeting Start
13:12:12 General Discussion
13:19:45 Presentation
    13:19:45 Jim stands
    13:19:50 Jim walks to podium
    13:20:00 Jim speaks
    13:22:04 Unknown speaks
    13:22:45 Jim speaks
    13:30:23 Wendy stands
    13:30:37 Wendy walks to podium
    13:30:42 Wendy speaks
    13:33:04 Wendy sits down
    13:33:04 Jim speaks
    13:38:50 Jim sits down
13:40:23 General Discussion
13:50:29 Meeting End
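
One way to picture the event hierarchy is to attach a detail level to each logged event and filter on it when rendering. The sketch below uses a subset of the events from the log above. The `Event` class, the two-level scheme, and reading the timestamps as HH:MM:SS are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Event:
    when: time
    label: str
    level: int  # 0 = meeting-level event, 1 = individual person action


# A subset of the slide's log, with assumed detail levels.
LOG = [
    Event(time(13, 4, 5), "Meeting Start", 0),
    Event(time(13, 12, 12), "General Discussion", 0),
    Event(time(13, 19, 45), "Presentation", 0),
    Event(time(13, 19, 45), "Jim stands", 1),
    Event(time(13, 19, 50), "Jim walks to podium", 1),
    Event(time(13, 20, 0), "Jim speaks", 1),
    Event(time(13, 40, 23), "General Discussion", 0),
    Event(time(13, 50, 29), "Meeting End", 0),
]


def render(log, max_level):
    """Return the log lines at or above the requested level of detail."""
    lines = []
    for event in log:
        if event.level <= max_level:
            indent = "    " * event.level
            stamp = event.when.strftime("%H:%M:%S")
            lines.append(f"{indent}{stamp} {event.label}")
    return lines
```

`render(LOG, 0)` gives the short summary and `render(LOG, 1)` adds the individual actions, indented under the meeting-level events they follow.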

11
Protecting Individuals: Privacy Issues

Recognition is voluntary. CAMEO only recognizes
people it has registered.
We can digitally process video logs so that faces
are distorted or represented only as shapes.

Raw video with tracking information
Stored video log after privacy filtering
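
A simple form of face distortion is pixelation: each small block inside the tracked face box is replaced by its average intensity. The sketch below is an assumption about how such a filter might look, not the filter CAMEO uses. It operates on a grayscale image stored as a list of rows, and the `box` format is hypothetical.

```python
def pixelate(image, box, block=8):
    """Return a copy of `image` with the region `box` replaced by block averages.

    `image` is a list of rows of grayscale values; `box` is
    (top, left, bottom, right) with exclusive bottom and right bounds.
    """
    top, left, bottom, right = box
    out = [row[:] for row in image]  # leave the raw frame untouched
    for y0 in range(top, bottom, block):
        for x0 in range(left, right, block):
            ys = range(y0, min(y0 + block, bottom))
            xs = range(x0, min(x0 + block, right))
            mean = sum(image[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out
```

Only the filtered copy would be written to the stored log, so the raw frame is needed only while tracking runs.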
12
Some ways CALO Agents could use CAMEO Data

What meetings happened when?
Who was at the meeting?
Who was sitting, standing, or speaking?
Where were people looking?
Who was talking?
What were people doing?
Who was pointing at what?
What happened during the formal presentation?

What happened during the general discussion?
What is a general/detailed summary of the
meeting?
What did person 'x' contribute to the meeting?
How to replay a meeting from a specific point in
time?
How to replay specific parts of the meeting?

13
Some ways CALO Agents could use CAMEO Data

What meetings happened when?
When a meeting starts, CAMEO can post an event to
the timeline server indicating the start time of
the meeting. By querying the timeline server for
events with the appropriate tag, CALO agents could
determine when the various meetings started and
obtain other information about them, such as what
each meeting was about.
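
The post-and-query pattern can be sketched with an in-memory stand-in for the timeline server. The class names, the `meeting_start` tag, and the `details` field are hypothetical; the slides do not describe the real timeline server interface.

```python
from dataclasses import dataclass, field


@dataclass
class TimelineEvent:
    timestamp: str  # ISO 8601 string, so lexical order matches time order
    tag: str
    details: dict = field(default_factory=dict)


class TimelineServer:
    """In-memory stand-in for the CALO timeline server (hypothetical API)."""

    def __init__(self):
        self._events = []

    def post(self, event):
        self._events.append(event)

    def query(self, tag):
        """Return all events carrying `tag`, oldest first."""
        matches = [e for e in self._events if e.tag == tag]
        return sorted(matches, key=lambda e: e.timestamp)


server = TimelineServer()
server.post(TimelineEvent("2004-02-03T13:04:05", "meeting_start",
                          {"topic": "Project Status Report"}))
server.post(TimelineEvent("2004-02-03T13:19:45", "presentation_start"))
starts = server.query("meeting_start")
```

Here `starts` holds one event whose `details` carry the meeting topic, which is the kind of extra information an agent could read alongside the start time.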

14
Some ways CALO Agents could use CAMEO Data

Who was at the meeting?
Face recognition is required. This can be done
by applying various kinds of image matching
algorithms (SVD, template matching, etc.) to
measure how close a given face is to each entry in
a database of saved faces. Such a database must be
available to work from.
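
Template matching against a face database reduces to a nearest-neighbour search with a rejection threshold. The sketch below is a toy version under stated assumptions: faces are flattened pixel vectors of equal length, the distance is a plain sum of squared differences, and `max_distance` is a hypothetical tuning parameter. Returning `"Unknown"` when nothing is close enough mirrors the idea that only registered people are recognized.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length pixel vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def identify(face, database, max_distance):
    """Return the name of the closest saved face, or "Unknown" if none is close."""
    best_name, best_distance = "Unknown", max_distance
    for name, template in database.items():
        distance = ssd(face, template)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name


# Tiny made-up templates standing in for registered faces.
DATABASE = {"Jim": [10, 200, 30, 40], "Wendy": [200, 20, 180, 90]}
```

With these templates, `identify([12, 198, 33, 41], DATABASE, max_distance=500)` matches Jim, while a vector far from both templates is reported as `"Unknown"`.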

15
Some ways CALO Agents could use CAMEO Data

Who was sitting, standing, or speaking?
By tracking the positions of people as they move
around, we should be able to tell who is sitting
and who is standing. Depending on how animated
the faces are in that state, we should also be
able to tell who is speaking by how much they're
bobbing around.

16
Some ways CALO Agents could use CAMEO Data

Where were people looking?
In order to determine where people are looking, a
profile face detector is needed. In this case,
we should be able to tell which direction they're
looking and correlate this with the other faces
in the image to figure out where in the image
people are likely to be looking.

17
Some ways CALO Agents could use CAMEO Data

Who was talking?
Besides tracking face movements, audio data
could be recorded by instrumenting CAMEO or
the meeting attendees with microphones (i.e., Alex
Rudnicky). With multiple microphones in the
room, sound localization techniques would be
required.
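
One common sound localization approach, given here only as an illustrative assumption since the slides do not name a technique, is to estimate the time difference of arrival between two microphones by cross-correlation and convert it to a bearing. The function names, the far-field geometry, and the default speed of sound are choices made for this sketch.

```python
import math


def best_lag(a, b, max_lag):
    """Return the lag in samples at which signal `b` best aligns with `a`.

    A positive result means the sound reached microphone `b` later.
    """
    def score(lag):
        return sum(
            a[i] * b[i + lag]
            for i in range(len(a))
            if 0 <= i + lag < len(b)
        )
    return max(range(-max_lag, max_lag + 1), key=score)


def bearing_degrees(lag, sample_rate, mic_spacing, speed_of_sound=343.0):
    """Far-field angle of arrival relative to broadside (0 = straight ahead)."""
    ratio = speed_of_sound * (lag / sample_rate) / mic_spacing
    ratio = max(-1.0, min(1.0, ratio))  # guard against noise pushing |ratio| > 1
    return math.degrees(math.asin(ratio))
```

Combined with the tracked positions of faces, a bearing like this could help decide which attendee was talking.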

18
Some ways CALO Agents could use CAMEO Data

What were people doing?
Besides the relative positions of people's bodies
in the room, more detailed information could be
obtained with a full-body tracker. Including
information about the room itself, such as what
else is in the room (tables, whiteboards, or
chairs), would let CAMEO report more detailed
information.

19
Some ways CALO Agents could use CAMEO Data

Who was pointing at what?
We need to have even more detailed full-body
tracking. By tracking arms and arm positions
with a stereo camera (i.e., Trevor Darrell), we
should be able to figure out where the person is
pointing. By putting a stereo head on a panning
mount, a lot of information about the environment
could be obtained very easily. Even by extending
the 2D tracker so that it identifies arms as
being attached to bodies, we might be able to get
this information. However, this works only
as long as the person is pointing in a direction
perpendicular to CAMEO. Having two CAMEOs would
be a good way to solve this problem.

20
Some ways CALO Agents could use CAMEO Data

What happened during the formal presentation?
Information has to be collated and merged in such
a way that the speaker is identified, and
information regarding the speech and PowerPoint
presentation is processed (CALO-MMD group).

21
Some ways CALO Agents could use CAMEO Data

What happened during the general discussion?
Information has to be collated and merged in such
a way that the speakers are identified, and
information regarding the speech is processed
(CALO-MMD group).

22
Some ways CALO Agents could use CAMEO Data

What is a general/detailed summary of the
meeting?
Given a state machine that describes the most
common things in a meeting, we
could cluster the individual events into larger
states which indicate the various sections of the
meeting based on a generic agenda (intro, formal
presentation, questions, open discussion,
wrap-up), or even a specific agenda that is
provided to CAMEO ahead of time. People print
out agendas and often bring them to formal
meetings so that everyone can follow along.
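
Once each moment of the meeting carries a section label, clustering into larger states amounts to collapsing consecutive identical labels into one section. The sketch below assumes the labels have already been assigned (for instance by a state machine over the generic agenda); the `(timestamp, state)` sample format is hypothetical.

```python
from itertools import groupby


def segment(samples):
    """Collapse (timestamp, state) samples into (start, end, state) sections."""
    sections = []
    for state, run in groupby(samples, key=lambda sample: sample[1]):
        run = list(run)
        sections.append((run[0][0], run[-1][0], state))
    return sections


# Made-up samples; timestamps are minutes since the meeting started.
SAMPLES = [
    (0, "intro"),
    (1, "intro"),
    (2, "formal presentation"),
    (3, "formal presentation"),
    (4, "open discussion"),
]
```

`segment(SAMPLES)` returns one tuple per agenda section in the order the sections occurred, and a section that recurs later in the meeting gets its own entry.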

23
Some ways CALO Agents could use CAMEO Data

What did person 'x' contribute to the meeting?
Tracking an individual person's speech and
gestures allows the events posted to the timeline
server to be gathered/clustered into a
personalized kind of state machine that can be
viewed at a very fine level of detail
(individual gestures and actions) or as a
high-level description such as "person x didn't
talk very much".
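
A per-person summary can be built by grouping logged events by person and counting actions. The sketch below uses person events like those in the meeting log shown earlier in the deck; the `(person, action)` tuple format, the `few_turns` threshold, and the wording of the descriptions are assumptions for illustration.

```python
from collections import Counter

# Person-level events in log order, as (person, action) pairs.
EVENTS = [
    ("Jim", "stands"), ("Jim", "speaks"), ("Unknown", "speaks"),
    ("Jim", "speaks"), ("Wendy", "stands"), ("Wendy", "speaks"),
    ("Wendy", "sits down"), ("Jim", "speaks"), ("Jim", "sits down"),
]


def contributions(events):
    """Count how often each person performed each action."""
    summary = {}
    for person, action in events:
        summary.setdefault(person, Counter())[action] += 1
    return summary


def describe(person, summary, few_turns=1):
    """Return a high-level description based on the number of speaking turns."""
    turns = summary.get(person, Counter())["speaks"]
    if turns <= few_turns:
        return f"{person} didn't talk very much"
    return f"{person} spoke {turns} times"
```

The `summary` dictionary is the detailed view, while `describe` produces the one-line, high-level view from the same data.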

24
Some ways CALO Agents could use CAMEO Data

How to replay a meeting from a specific point in
time?
The raw movie files are available. Once the
individual person events are classified, the
timestamps can be extracted from the timeline
server and the video can be replayed from that
point.
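
Turning an event timestamp into a replay position is a subtraction followed by a frame-rate conversion. The sketch below assumes the recording's start time is known and that the video has a constant frame rate; the function name and the default of 30 frames per second are hypothetical.

```python
from datetime import datetime


def replay_offset(recording_start, event_time, frames_per_second=30):
    """Return (seconds, frame_index) of an event relative to the start of the video."""
    seconds = (event_time - recording_start).total_seconds()
    if seconds < 0:
        raise ValueError("event precedes the start of the recording")
    return seconds, int(seconds * frames_per_second)


start = datetime(2004, 2, 3, 13, 4, 5)
presentation = datetime(2004, 2, 3, 13, 19, 45)
offset_seconds, frame = replay_offset(start, presentation)
```

A video player would then seek to `offset_seconds` (or to `frame`) and play from there.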

25
Some ways CALO Agents could use CAMEO Data

How to replay specific parts of the meeting,
i.e., introductions, discussion after the
presentation, wrap up?
We need to create a probabilistic meeting
ontology that we can use to parse and tag the
meeting, identifying parts of the meeting with
different probabilities. We can learn a model
of different types of meetings by
learning the probabilistic parameters of an
ontology, or the Bayesian dependencies that link
meeting type, people, and meeting purpose to the
format of the meeting.
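
As one small piece of what learning the probabilistic parameters could look like, the sketch below estimates section-to-section transition probabilities from previously labelled meetings by counting. The section names and the first-order (previous section only) model are assumptions; the ontology proposed on the slide would also condition on meeting type, attendees, and purpose.

```python
from collections import Counter, defaultdict


def learn_transitions(meetings):
    """Estimate P(next section | current section) by counting labelled meetings."""
    counts = defaultdict(Counter)
    for sections in meetings:
        for current, following in zip(sections, sections[1:]):
            counts[current][following] += 1
    return {
        current: {
            following: n / sum(counter.values())
            for following, n in counter.items()
        }
        for current, counter in counts.items()
    }


# Two made-up labelled meetings.
MEETINGS = [
    ["intro", "presentation", "discussion", "wrap-up"],
    ["intro", "discussion", "wrap-up"],
]
MODEL = learn_transitions(MEETINGS)
```

With these two meetings, `MODEL["intro"]` gives equal weight to a presentation or a discussion coming next; a tagger could use such probabilities to rank candidate section labels for a new meeting.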