Dialogue, Speech and Images: The Companions Project Data Set

Learn more at: http://www.lrec-conf.org

1
Dialogue, Speech and Images: The Companions
Project Data Set
  • Yorick Wilks, David Benyon, Christopher Brewster,
    Pavel Ircing, and Oli Mival
  • http://www.companions-project.org

2
Companions Project
  • 4-year, FP6, EU Project
  • 14 partner sites (academic, commercial, research)
  • Research in Multimodal interfaces
  • Machine learning applied to dialogue systems
  • Emotions and ECAs
  • Dialogue and planning for mobile devices
  • Two prototypes/demonstrators:
      • Senior Companion
      • Health and Fitness Companion

3
A Multiplicity of Companions
  • Two major prototypes:
      • A Health and Fitness Companion (HFC): task
        driven, focussed, domain specific
      • A Senior Companion (SC): open domain, mixed
        initiative, building a life narrative via photos
  • Other Companions:
      • A mobile version of the HFC
      • A home/cookery focused version of the HFC
      • An SC for the Czech language

4
The need for Dialogue Corpora
  • General paucity of dialogue corpora
  • The SC is open domain (because photos can be
    about anything), aimed at the elderly, and we
    cannot assume dialogue structures transfer
  • Key idea: use the initial prototype to generate
    more data
  • Initial data collection was more limited and based
    on Wizard of Oz (WoZ) methodology; this is what
    this talk is about.

5
Specifications
  • Modified WoZ
  • Emphasis on naturally occurring dialogues relevant
    to domain
  • People asked to reminisce about photos
      • Initially: random public-domain photos
      • Proper scenario: photos of personal importance
  • Photos primarily of people and events: friends
    and relatives, weddings, holidays, etc.
  • We assumed the WoZ knew how many people were in
    the photos (because we assumed image processing
    technology could tell the system)

6
Specifications (2)
  • WoZ instructed to use a standard set of questions,
    such as:
      • What is the name of the person in the
        picture?
      • Where is this picture taken?
      • What is the relationship between the people?
  • But the interviewer is not limited to these
  • User is encouraged to express feelings, memories,
    and associations

7
Data Collection Setup (English)
  • Data collected with two setups:
      • WoZ with avatar and TTS (text-to-speech) as
        the system
      • WoZ without TTS, i.e. with a human interviewer
  • Use of TTS (although theoretically more
    realistic) slowed down the experiments too much
  • Photos shown one at a time; participation tails
    off after about 20 min

9
Senior Companion Data Collection at Napier
  • September 2007
  • 45 sessions / 30 hours
  • Gender: 27 male / 13 female
  • Age: 19 - 73
  • 7 sessions in homes, 38 at Napier
  • With avatar: 16 sessions; without avatar: 29
    sessions
  • Early sessions were not transcribed to ASR
    standards, later sessions used Transcriber tool
    (Barras et al., 2001)

10
Current status (English data)
  • TOTAL SESSIONS: 101 (approx. 70 hours)
  • In .trs format: 42 (approx. 30 hours)
      • 27 sessions with full video and .trs files
        (waiting on transcription for 16)
      • 15 sessions with .trs files and no video
  • In simple text format: 59 (approx. 40 hours)
      • 55 sessions pre-Transcriber (.trs files)
      • 4 sessions pre-Transcriber with full video
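
The session counts on this slide can be cross-checked with a few lines of arithmetic. This is only a sketch: the dictionary keys are hypothetical labels chosen for illustration, and the numbers are copied from the bullets above.

```python
# Session counts as reported on the slide (English data).
# Key names are illustrative labels, not part of the data set.
trs_sessions = {
    "video_and_trs": 27,   # full video plus .trs files
    "trs_no_video": 15,    # .trs files, no video
}
text_sessions = {
    "pre_transcriber": 55,             # simple text format
    "pre_transcriber_with_video": 4,   # simple text format, full video
}

trs_total = sum(trs_sessions.values())
text_total = sum(text_sessions.values())
total_sessions = trs_total + text_total

# Approximate hours per format, as stated on the slide.
approx_hours = 30 + 40

print(trs_total, text_total, total_sessions, approx_hours)
```

The sub-counts add up to the reported totals: 42 sessions in .trs format, 59 in simple text format, 101 sessions and roughly 70 hours overall.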

11
Moira Ross, 68, Aberdeen, Scotland
12
Data Collection Sample
  • M1: Okay, I think we're ready to start looking at
    your pictures now. Please tell me about your
    first photo.
  • F1: Okay, that's at a friend's wedding and that's
    Martin and my son, Stefan, that's a few years old
    now, wearing their kilts.
  • M1: How old is Stefan?
  • F1: I think in that picture he must have been
    about five?
  • M1: Is that Stefan on the right?
  • F1: It is, yes.
  • M1: Great. Is there anything else you would like
    to say about them?
  • F1: Yeah, well I remember that day, about it was
    a friend, Chris's wedding, so and I think it was
    a yeah Stefan had his kilt outfit on that day.
  • M1: That's very interesting, how does this photo
    make you feel?
  • F1: It just it reminds me that it was winter, it
    was right after Christmas, that wedding, it was
    very cold that day.
  • M1: Okay, let's move on to the next.
  • F1: That's me and Martin in Gibraltar. That was
    very, very many years ago. We were visiting a
    friend in Gibraltar.
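
For readers who want to process a sample like this, here is a minimal sketch of splitting it into speaker turns. It assumes each turn starts with a speaker label such as M1 or F1 and that wrapped lines continue the previous turn; the function name `parse_turns` and the regular expression are illustrative and not part of the Companions tooling.

```python
import re
from typing import List, Tuple

# A turn starts with a speaker label (M1, F1, ...), optionally followed by a colon.
TURN_START = re.compile(r"^(?P<speaker>[MF]\d+):?\s+(?P<text>.*)$")


def parse_turns(lines: List[str]) -> List[Tuple[str, str]]:
    """Group transcript lines into (speaker, utterance) pairs."""
    turns: List[Tuple[str, str]] = []
    for raw in lines:
        # Drop indentation and a leading bullet, if present.
        line = raw.strip().lstrip("•").strip()
        if not line:
            continue
        match = TURN_START.match(line)
        if match:
            turns.append((match.group("speaker"), match.group("text")))
        elif turns:
            # Continuation of the previous turn (a wrapped line).
            speaker, text = turns[-1]
            turns[-1] = (speaker, text + " " + line)
    return turns


sample = [
    "• M1 How old is Stefan?",
    "• F1 I think in that picture he must have been",
    "  about five?",
    "• M1 Is that Stefan on the right?",
    "• F1 It is, yes.",
]
turns = parse_turns(sample)
for speaker, text in turns:
    print(f"{speaker}: {text}")
```

A wrapped line that happens to begin with something resembling a speaker label would be misread as a new turn, so real transcripts would need a stricter format.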

13
Czech SC data recording
  • The setup with the avatar and the Wizard was chosen
  • a dedicated room has been established for the
    recording
  • the subject sees on the screen:
      • the photo currently being discussed
      • the avatar (talking head)
  • audio is captured by high-quality wireless
    microphones
  • the subject is also simultaneously recorded by 3
    miniDV cameras; video is intended for future use
    in emotion detection, gesture recognition, etc.

14
Recording room setup
15
Current state of the Czech Data
  • 50 subjects recorded (mostly seniors)
  • average length of an interview is 55 minutes,
    average number of photos being discussed during
    the session is 8.5
  • it turns out that old people really DO enjoy
    discussing their photos with an artificial
    companion and reminiscing about them
  • on the other hand, people of working age often
    tend to give just a technical description of the
    photo being discussed
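
The averages above allow a rough size estimate for the Czech recordings. The sketch below assumes one interview per subject, which the slide does not state, and the per-photo figure is a ratio of two averages rather than a measured mean.

```python
# Figures reported on the slide.
subjects = 50
avg_interview_min = 55
avg_photos_per_session = 8.5

# Assumption: each subject was recorded in exactly one interview.
total_hours = subjects * avg_interview_min / 60

# Ratio of the two averages; only a rough indication of time per photo.
minutes_per_photo = avg_interview_min / avg_photos_per_session

print(f"about {total_hours:.1f} hours in total")
print(f"about {minutes_per_photo:.1f} minutes per photo")
```

Under that assumption the recordings come to roughly 46 hours, with about six and a half minutes of conversation per photo.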

16
Availability and Format
  • We plan to make all this data publicly available
  • Most appropriate format is still an open issue
  • At least four data streams:
      • Audio
      • Transcription
      • Video
      • Images discussed
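
Since the release format is still open, the following is only one possible way to bundle the four streams per session. Every name here (`SessionRecord`, the field names, the example file paths) is hypothetical. The optional fields reflect that some sessions lack a stream, for example video.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional


@dataclass
class SessionRecord:
    """One recorded session; all names are illustrative, not a published schema."""
    session_id: str
    audio: Optional[Path] = None
    transcription: Optional[Path] = None
    video: Optional[Path] = None
    images: List[Path] = field(default_factory=list)

    def available_streams(self) -> List[str]:
        """Return the names of the streams present for this session."""
        streams = []
        if self.audio is not None:
            streams.append("audio")
        if self.transcription is not None:
            streams.append("transcription")
        if self.video is not None:
            streams.append("video")
        if self.images:
            streams.append("images")
        return streams


# Example: a session with a transcript and photos but no audio or video files.
record = SessionRecord(
    session_id="example-001",
    transcription=Path("example-001.trs"),
    images=[Path("photo1.jpg"), Path("photo2.jpg")],
)
print(record.available_streams())
```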

17
Thank You
  • Comments and advice - welcome!