Title: Modelling Personality Features by Changing Prosody in Synthetic Speech
1 Modelling Personality Features by Changing Prosody in Synthetic Speech
- Jürgen Trouvain (1,2), Sarah Schmidt (3), Marc Schröder (4), Michael Schmitz (3), Bill Barry (2)
- (1) Phonetik-Büro Trouvain, Saarbrücken
- (2) Institute of Phonetics, Saarland University
- (3) Institute of Computer Science, Saarland University
- (4) DFKI GmbH, Saarbrücken
2 Dimensions of human personality
Five factor model
Personality dimension | High level | Low level
Neuroticism | sensitive, nervous | secure, confident
Extraversion | outgoing, energetic | shy, withdrawn
Openness to experience | inventive, curious | cautious, conservative
Agreeableness | friendly, compassionate | competitive, outspoken
Conscientiousness | efficient, organized | easy-going, careless
3 Features of personality in synthetic speech
- Nass & Lee (2001)
  - "introverted-extroverted" (among others)
- manipulated parameters in synthetic speech
  - F0 range
  - F0 mean
  - tempo
- listeners perceive the degree of introversion as predicted
4 Dimensions of brand personality
Aaker (1997)
Personality dimension | Attributes | Example
Sincerity | down-to-earth, honest, wholesome, cheerful | plant
Excitement | daring, spirited, imaginative, up-to-date | snowboard
Competence | reliable, intelligent, successful | lexicon
Sophistication | upper class, charming | fragrance
Ruggedness | outdoorsy, tough | tractor
5 Prosody of brand personality
Findings on possible correlates in the literature
F0 mean F0 range tempo
Sincerity
Excitement
Competence
Sophistication
Ruggedness
6 Synthetic speech
- MARY speech synthesis: mary.dfki.de
- two voices
  - male voice (Mbrola de6)
  - female voice (Mbrola de7)
- one utterance
  - "Hallo, ich bin Produkt XY. Ich möchte mich kurz vorstellen. Ich werde nun meine Eigenschaften erläutern." (English: "Hello, I am product XY. I would like to introduce myself briefly. I will now explain my features.")
7 Parametrisation of prosody
Level | F0 mean | F0 range | tempo
lowered | 30 | 2 st | 0
baseline | 0 | 4 st | 15
raised | 30 | 8 st | 30
Note: the default tempo is rather slow.
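The F0 range levels are given in semitones (st). As a minimal sketch of what these values mean, the snippet below uses the standard definition of a semitone (twelve equal steps per octave) to convert a range in semitones into the ratio between the top and bottom of the F0 range. The function names are hypothetical and not part of MARY.

```python
import math


def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio spanned by an interval of the given size in semitones."""
    return 2.0 ** (semitones / 12.0)


def ratio_to_semitones(f_high: float, f_low: float) -> float:
    """Size in semitones of the interval between two frequencies in Hz."""
    return 12.0 * math.log2(f_high / f_low)


# The three F0 range levels from the parametrisation table.
for label, st in [("lowered", 2), ("baseline", 4), ("raised", 8)]:
    ratio = semitones_to_ratio(st)
    print(f"{label}: {st} st -> top/bottom F0 ratio {ratio:.3f}")
```

Under this definition the raised level spans a ratio of about 1.59 between the lowest and highest F0, compared with about 1.26 for the baseline.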
8 Manipulating prosody
Model | F0 mean | F0 range | tempo
Sincerity | 0 | 1 | 0
Excitement | 1 | 1 | 1
Competence | -1 | 1 | 1
Sophistication | -1 | 1 | 0
Ruggedness | -1 | 0 | -1
Baseline | 0 | 0 | 0
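To make the link between the level codes and the parameter values concrete, here is a minimal sketch that looks up the settings for each model. It assumes that -1, 0 and 1 stand for the lowered, baseline and raised levels of the parametrisation slide, and that a lowered F0 mean is a negative shift. The slides do not state the unit of the F0 mean and tempo values, so the numbers are kept unitless. All names are hypothetical.

```python
# Parameter value for each level code (-1 lowered, 0 baseline, 1 raised).
# Assumption: the sign of the F0 mean shift is inferred from "lowered"/"raised";
# units for F0 mean and tempo are not given in the slides.
LEVELS = {
    "f0_mean": {-1: -30, 0: 0, 1: 30},
    "f0_range_st": {-1: 2, 0: 4, 1: 8},
    "tempo": {-1: 0, 0: 15, 1: 30},
}

# Level codes per model as (F0 mean, F0 range, tempo), copied from the table.
MODELS = {
    "Sincerity": (0, 1, 0),
    "Excitement": (1, 1, 1),
    "Competence": (-1, 1, 1),
    "Sophistication": (-1, 1, 0),
    "Ruggedness": (-1, 0, -1),
    "Baseline": (0, 0, 0),
}


def settings(model: str) -> dict:
    """Translate a model's level codes into concrete parameter values."""
    f0_mean, f0_range, tempo = MODELS[model]
    return {
        "f0_mean": LEVELS["f0_mean"][f0_mean],
        "f0_range_st": LEVELS["f0_range_st"][f0_range],
        "tempo": LEVELS["tempo"][tempo],
    }


for name in MODELS:
    print(name, settings(name))
```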
9 Listening test
- Schmidt (2005)
- judging on a scale from 1 (does not fit at all) to 5 (fits very well)
- 36 native speakers of German
- online test
10 Judgements female voice
judged as | baseline | as intended
sincere | 3.4 | 3.5
excited | 2.9 | 4.1
competent | 3.5 | 3.3
sophistic. | 3.1 | 3.2
rugged | 2.6 | 2.7
Scale: 1 = "does not fit at all", 5 = "fits very well"
Significance levels: p < 0.01, p < 0.05, (p < 0.06)
12 Judgements female voice
judged as | baseline | as intended | best rated version
sincere | 3.4 | 3.5 |
excited | 2.9 | 4.1 |
competent | 3.5 | 3.3 | 3.6 (sincere)
sophistic. | 3.1 | 3.2 | 3.4 (sincere)
rugged | 2.6 | 2.7 | 3.3 (sophist.)
Scale: 1 = "does not fit at all", 5 = "fits very well"
Significance levels: p < 0.01, p < 0.05, (p < 0.06)
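The table can be read as two comparisons per trait: how much the intended model gains over the baseline, and which model received the highest rating for that trait. The sketch below computes both from the female-voice means. It assumes that an empty "best rated version" cell means no other model was rated above the intended one. The names are hypothetical.

```python
# Mean ratings for the female voice: (baseline, as intended, best rated version).
# Assumption: None means no other model was rated above the intended one.
FEMALE = {
    "sincere": (3.4, 3.5, None),
    "excited": (2.9, 4.1, None),
    "competent": (3.5, 3.3, (3.6, "sincere")),
    "sophistic.": (3.1, 3.2, (3.4, "sincere")),
    "rugged": (2.6, 2.7, (3.3, "sophist.")),
}


def gain(trait: str) -> float:
    """Rating difference between the intended model and the baseline."""
    baseline, intended, _ = FEMALE[trait]
    return round(intended - baseline, 1)


def best_model(trait: str) -> tuple:
    """Return (model name, rating) of the best rated version for a trait."""
    _, intended, best = FEMALE[trait]
    if best is None:
        return trait, intended
    rating, model = best
    return model, rating


for trait in FEMALE:
    model, rating = best_model(trait)
    print(f"{trait}: gain {gain(trait):+.1f}, best version {model} ({rating})")
```

On these means the intended "excited" model shows the largest gain over the baseline, while the intended "competent" model is rated slightly below it.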
14 Judgements male voice
judged as | baseline | as intended
sincere | 3.2 | 3.5
excited | 2.5 | 3.4
competent | 3.5 | 4.0
sophistic. | 2.7 | 3.6
rugged | 3.0 | 3.6
Scale: 1 = "does not fit at all", 5 = "fits very well"
Significance levels: p < 0.01, p < 0.05, (p < 0.06)
16 Judgements male voice
judged as | baseline | as intended | best rated version
sincere | 3.2 | 3.5 | 3.7 (sophistic.)
excited | 2.5 | 3.4 |
competent | 3.5 | 4.0 | 4.1 (sophistic.)
sophistic. | 2.7 | 3.6 |
rugged | 3.0 | 3.6 | 4.1 (competent)
Scale: 1 = "does not fit at all", 5 = "fits very well"
Significance levels: p < 0.01, p < 0.05, (p < 0.06)
18 Summary
- tendency for statistically significant differences
  - between baseline and models
  - between baseline and best versions
- different preferences for different voices
  - "excited": 3.4 (male) vs. 4.1 (female)
  - "rugged": 4.1 (male) vs. 3.3 (female)
- improved default settings for synthesis
  - male: "sophisticated" model
  - female: "sincere" model
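The voice preferences named in the summary can be checked against the judgement tables. The sketch below takes the highest mean rating reported per trait and voice and ranks the traits by the gap between the voices. It assumes that where no separate best rated version is listed, the "as intended" rating is the best one. The names are hypothetical.

```python
# Highest mean rating reported per trait, taken from the judgement tables.
# Assumption: where no best rated version is listed, the "as intended"
# rating is used.
BEST = {
    "female": {"sincere": 3.5, "excited": 4.1, "competent": 3.6,
               "sophistic.": 3.4, "rugged": 3.3},
    "male": {"sincere": 3.7, "excited": 3.4, "competent": 4.1,
             "sophistic.": 3.6, "rugged": 4.1},
}


def difference(trait: str) -> float:
    """Male minus female best rating for a trait."""
    return round(BEST["male"][trait] - BEST["female"][trait], 1)


# Traits ordered by how strongly the two voices differ.
ranked = sorted(BEST["male"], key=lambda t: abs(difference(t)), reverse=True)

for trait in ranked:
    print(f"{trait}: male - female = {difference(trait):+.1f}")
```

The two largest gaps are for "rugged" (in favour of the male voice) and "excited" (in favour of the female voice), the two traits singled out in the summary.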
19 Conclusions
- modelling personality in synthesis is possible
- more research needed, e.g. with regard to "excited" (also important for emotional synthesis)
- parametrical synthesis vs. unit-selection
- applications
  - talking objects
  - speech prostheses for voice-handicapped
  - tuning of a synthetic corporate voice
20 Outlook
www.icphs2007.de