Modelling Personality Features by Changing Prosody in Synthetic Speech

1
Modelling Personality Features by Changing Prosody in Synthetic Speech
  • Jürgen Trouvain (1, 2), Sarah Schmidt (3), Marc Schröder (4), Michael Schmitz (3), Bill Barry (2)
  • (1) Phonetik-Büro Trouvain, Saarbrücken
  • (2) Institute of Phonetics, Saarland University
  • (3) Institute of Computer Science, Saarland University
  • (4) DFKI GmbH, Saarbrücken

2
Dimensions of human personality
Five factor model
Personality dimension  | High level               | Low level
Neuroticism            | sensitive, nervous       | secure, confident
Extraversion           | outgoing, energetic      | shy, withdrawn
Openness to experience | inventive, curious       | cautious, conservative
Agreeableness          | friendly, compassionate  | competitive, outspoken
Conscientiousness      | efficient, organized     | easy-going, careless
3
Features of personality in synthetic speech
  • Nass & Lee (2001)
  • "introverted vs. extroverted" (among others)
  • manipulated parameters in synthetic speech
      • F0 range
      • F0 mean
      • tempo
  • listeners perceive degree of introversion as predicted

4
Dimensions of brand personality
Aaker (1997)
Personality dimension | Attributes                                | Examples
Sincerity             | down-to-earth, honest, wholesome, cheerful | plant
Excitement            | daring, spirited, imaginative, up-to-date  | snowboard
Competence            | reliable, intelligent, successful          | lexicon
Sophistication        | upper class, charming                      | fragrance
Ruggedness            | outdoorsy, tough                           | tractor
5
Prosody of brand personality
findings of possible correlates in literature
[Table: rows Sincerity, Excitement, Competence, Sophistication, Ruggedness; columns F0 mean, F0 range, tempo; cell entries not preserved in the transcript]
6
Synthetic speech
  • MARY speech synthesis (mary.dfki.de)
  • two voices
      • male voice (MBROLA de6)
      • female voice (MBROLA de7)
  • one utterance
      • "Hallo, ich bin Produkt XY. Ich möchte mich kurz vorstellen. Ich werde nun meine Eigenschaften erläutern." ("Hello, I am product XY. I would like to introduce myself briefly. I will now explain my features.")

7
Parametrisation of prosody
Setting  | F0 mean | F0 range | tempo
lowered  | 30      | 2 st     | 0
baseline | 0       | 4 st     | 15
raised   | 30      | 8 st     | 30
default: rather slow
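The F0 range settings on this slide are given in semitones (st). As a minimal sketch of what those values mean in terms of frequency, the snippet below converts a semitone interval into a frequency ratio using the standard equal-tempered relation ratio = 2^(st/12). The function names and the 100 Hz floor are illustrative assumptions, not values from the study, and the range is assumed to extend upward from that floor purely for the sake of the example.

```python
def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio spanned by an interval of `semitones` (12 st = one octave)."""
    return 2.0 ** (semitones / 12.0)


def range_top_hz(floor_hz: float, range_st: float) -> float:
    """Upper F0 bound when a range of `range_st` semitones starts at `floor_hz`."""
    return floor_hz * semitones_to_ratio(range_st)


if __name__ == "__main__":
    # 100 Hz is a hypothetical floor chosen only for illustration.
    for label, st in [("lowered", 2), ("baseline", 4), ("raised", 8)]:
        ratio = semitones_to_ratio(st)
        top = range_top_hz(100.0, st)
        print(f"{label:8s} {st} st -> ratio {ratio:.3f}, top {top:.1f} Hz")
```

Because the relation is exponential, doubling the range in semitones squares the frequency ratio rather than doubling it.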
8
Manipulating prosody
(-1 = lowered, 0 = baseline, 1 = raised)
Model          | F0 mean | F0 range | tempo
Sincerity      |  0      |  1       |  0
Excitement     |  1      |  1       |  1
Competence     | -1      |  1       |  1
Sophistication | -1      |  1       |  0
Ruggedness     | -1      |  0       | -1
Baseline       |  0      |  0       |  0
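The level codes on this slide can be combined with the three parameter levels from the "Parametrisation of prosody" slide. The following is a rough sketch of that lookup, not the authors' implementation. It assumes that the F0 mean and tempo figures are percentage changes relative to the synthesiser default and that the lowered F0 mean is negative; the transcript shows only bare numbers, so both points are assumptions. The names `LEVELS`, `MODELS` and `settings_for` are hypothetical.

```python
# Parameter levels from the "Parametrisation of prosody" slide.
# ASSUMPTION: F0 mean and tempo are percentage changes relative to the
# synthesiser default, and the lowered F0 mean is -30; the transcript
# preserves neither units nor signs.
LEVELS = {
    "f0_mean_pct": {-1: -30, 0: 0, 1: 30},
    "f0_range_st": {-1: 2, 0: 4, 1: 8},
    "tempo_pct": {-1: 0, 0: 15, 1: 30},
}

# Level codes per model from the "Manipulating prosody" slide,
# in the order (F0 mean, F0 range, tempo).
MODELS = {
    "Sincerity": (0, 1, 0),
    "Excitement": (1, 1, 1),
    "Competence": (-1, 1, 1),
    "Sophistication": (-1, 1, 0),
    "Ruggedness": (-1, 0, -1),
    "Baseline": (0, 0, 0),
}


def settings_for(model: str) -> dict:
    """Translate a model's level codes into concrete parameter values."""
    codes = dict(zip(LEVELS, MODELS[model]))
    return {param: LEVELS[param][code] for param, code in codes.items()}


if __name__ == "__main__":
    for name in MODELS:
        print(name, settings_for(name))
```

Keeping the level codes separate from the concrete values means a parameter level can be retuned in one place without touching the model definitions.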
9
Listening test
  • Schmidt (2005)
  • judging on a scale from 1 (does not fit at all) to 5 (fits very well)
  • 36 native speakers of German
  • online test

13
Judgements female voice
judged as     | baseline | as intended | best rated version
sincere       | 3.4      | 3.5         |
excited       | 2.9      | 4.1         |
competent     | 3.5      | 3.3         | 3.6 (sincere)
sophisticated | 3.1      | 3.2         | 3.4 (sincere)
rugged        | 2.6      | 2.7         | 3.3 (sophisticated)

Scale: 1 = "does not fit at all", 5 = "fits very well"
Significance levels: p < 0.01, p < 0.05, (p < 0.06)
17
Judgements male voice
judged as     | baseline | as intended | best rated version
sincere       | 3.2      | 3.5         | 3.7 (sophisticated)
excited       | 2.5      | 3.4         |
competent     | 3.5      | 4.0         | 4.1 (sophisticated)
sophisticated | 2.7      | 3.6         |
rugged        | 3.0      | 3.6         | 4.1 (competent)

Scale: 1 = "does not fit at all", 5 = "fits very well"
Significance levels: p < 0.01, p < 0.05, (p < 0.06)
18
Summary
  • tendency for statistically significant differences
      • between baseline and models
      • between baseline and best versions
  • different preferences for different voices
      • "excited": 3.4 (male) vs. 4.1 (female)
      • "rugged": 4.1 (male) vs. 3.3 (female)
  • improved default settings for synthesis
      • male: "sophisticated" model
      • female: "sincere" model
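The comparison between baseline and intended models can be made concrete with a small calculation. The sketch below copies the mean ratings from the judgement slides and computes, per attribute, how much the intended model gained over the baseline. The names `FEMALE`, `MALE` and `gains` are hypothetical, and the differences say nothing about statistical significance, which the slides mark separately.

```python
# Mean ratings on the 1-5 scale, copied from the judgement slides:
# attribute -> (baseline, as intended)
FEMALE = {
    "sincere": (3.4, 3.5),
    "excited": (2.9, 4.1),
    "competent": (3.5, 3.3),
    "sophisticated": (3.1, 3.2),
    "rugged": (2.6, 2.7),
}
MALE = {
    "sincere": (3.2, 3.5),
    "excited": (2.5, 3.4),
    "competent": (3.5, 4.0),
    "sophisticated": (2.7, 3.6),
    "rugged": (3.0, 3.6),
}


def gains(ratings: dict) -> dict:
    """Rating of the intended model minus the baseline rating, per attribute."""
    return {
        attr: round(intended - baseline, 1)
        for attr, (baseline, intended) in ratings.items()
    }


if __name__ == "__main__":
    for voice, ratings in [("female", FEMALE), ("male", MALE)]:
        print(voice, gains(ratings))
```

For the female voice the gain is concentrated in "excited", while for the male voice every intended model is rated above the baseline.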

19
Conclusions
  • modelling personality in synthesis is possible
  • more research needed, e.g. with regard to "excited" (also important for emotional synthesis)
  • parametrical synthesis vs. unit selection
  • applications
      • talking objects
      • speech prostheses for the voice-handicapped
      • tuning of a synthetic corporate voice

20
Outlook
www.icphs2007.de