Modelling Personality Features by Changing Prosody in Synthetic Speech

1
Modelling Personality Features by Changing Prosody in Synthetic Speech

Jürgen Trouvain (1,2), Sarah Schmidt (3), Marc Schröder (4), Michael Schmitz (3), Bill Barry (2)
(1) Phonetik-Büro Trouvain, Saarbrücken
(2) Institute of Phonetics, Saarland University
(3) Institute of Computer Science, Saarland University
(4) DFKI GmbH, Saarbrücken

2
Dimensions of human personality
Five factor model
Personality dimension     High level                 Low level
Neuroticism               sensitive, nervous         secure, confident
Extraversion              outgoing, energetic        shy, withdrawn
Openness to experience    inventive, curious         cautious, conservative
Agreeableness             friendly, compassionate    competitive, outspoken
Conscientiousness         efficient, organized       easy-going, careless
3
Features of personality in synthetic speech

Nass & Lee (2001)
"introverted" vs. "extroverted" (among others)
manipulated parameters in synthetic speech
  F0 range
  F0 mean
  tempo
listeners perceive the degree of introversion as predicted

4
Dimensions of brand personality
Aaker (1997)
Personality dimension   Attributes                                    Examples
Sincerity               down-to-earth, honest, wholesome, cheerful    plant
Excitement              daring, spirited, imaginative, up-to-date     snowboard
Competence              reliable, intelligent, successful             lexicon
Sophistication          upper class, charming                         fragrance
Ruggedness              outdoorsy, tough                              tractor
5
Prosody of brand personality
findings of possible correlates in the literature
                  F0 mean   F0 range   tempo
Sincerity
Excitement
Competence
Sophistication
Ruggedness
6
Synthetic speech

MARY speech synthesis (mary.dfki.de)
two voices
  male voice (Mbrola de6)
  female voice (Mbrola de7)
one utterance
  "Hallo, ich bin Produkt XY. Ich möchte mich kurz vorstellen. Ich werde nun meine Eigenschaften erläutern."
  ("Hello, I am product XY. I would like to introduce myself briefly. I will now explain my features.")

7
Parametrisation of prosody
            F0 mean   F0 range   tempo
lowered     30        2 st       0
baseline    0         4 st       15
raised      30        8 st       30
tempo: default is rather slow
8
Manipulating prosody
                  F0 mean   F0 range   tempo
Sincerity          0         1          0
Excitement         1         1          1
Competence        -1         1          1
Sophistication    -1         1          0
Ruggedness        -1         0         -1
Baseline           0         0          0
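
The level codes above (-1, 0, 1) can be read as pointers into the lowered, baseline and raised rows of the parametrisation slide. The following is a minimal sketch of that lookup, not the authors' implementation. The names `LEVEL_NAMES`, `SETTINGS`, `MODELS` and `resolve` are hypothetical. The transcript does not preserve signs or units for F0 mean and tempo, so the sketch keeps the slide's bare numbers together with the level name rather than guessing them.

```python
# Level codes as used on the "Manipulating prosody" slide.
LEVEL_NAMES = {-1: "lowered", 0: "baseline", 1: "raised"}

# Settings per level from the "Parametrisation of prosody" slide.
# Assumption: signs and units for F0 mean and tempo are missing in the
# transcript, so the values are stored as bare numbers. Only the F0 range
# has a stated unit (semitones, "st").
SETTINGS = {
    "f0_mean": {"lowered": 30, "baseline": 0, "raised": 30},
    "f0_range_st": {"lowered": 2, "baseline": 4, "raised": 8},
    "tempo": {"lowered": 0, "baseline": 15, "raised": 30},
}

# Level codes per model, in the order (F0 mean, F0 range, tempo).
MODELS = {
    "Sincerity": (0, 1, 0),
    "Excitement": (1, 1, 1),
    "Competence": (-1, 1, 1),
    "Sophistication": (-1, 1, 0),
    "Ruggedness": (-1, 0, -1),
    "Baseline": (0, 0, 0),
}


def resolve(model):
    """Return {parameter: (level name, slide value)} for one model."""
    resolved = {}
    # SETTINGS keeps insertion order, which matches the order of the codes.
    for parameter, code in zip(SETTINGS, MODELS[model]):
        level = LEVEL_NAMES[code]
        resolved[parameter] = (level, SETTINGS[parameter][level])
    return resolved


if __name__ == "__main__":
    for name in MODELS:
        print(name, resolve(name))
```

Keeping the level name next to each number makes the direction of the change explicit even though the sign itself is not recoverable from the transcript.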
9
Listening test

Schmidt (2005)
judging on a scale from 1 (does not fit at all) to 5 (fits very well)
36 native speakers of German
online test

10
Judgements female voice
judged as       baseline   as intended
sincere         3.4        3.5
excited         2.9        4.1
competent       3.5        3.3
sophisticated   3.1        3.2
rugged          2.6        2.7

scale: 1 = "does not fit at all", 5 = "fits very well"
significance levels: p < 0.01, p < 0.05, () p < 0.06
12
Judgements female voice
judged as       baseline   as intended   best rated version
sincere         3.4        3.5
excited         2.9        4.1
competent       3.5        3.3           3.6 (sincere)
sophisticated   3.1        3.2           3.4 (sincere)
rugged          2.6        2.7           3.3 (sophisticated)

scale: 1 = "does not fit at all", 5 = "fits very well"
significance levels: p < 0.01, p < 0.05, () p < 0.06
14
Judgements male voice
judged as       baseline   as intended
sincere         3.2        3.5
excited         2.5        3.4
competent       3.5        4.0
sophisticated   2.7        3.6
rugged          3.0        3.6

scale: 1 = "does not fit at all", 5 = "fits very well"
significance levels: p < 0.01, p < 0.05, () p < 0.06
16
Judgements male voice
judged as       baseline   as intended   best rated version
sincere         3.2        3.5           3.7 (sophisticated)
excited         2.5        3.4
competent       3.5        4.0           4.1 (sophisticated)
sophisticated   2.7        3.6
rugged          3.0        3.6           4.1 (competent)

scale: 1 = "does not fit at all", 5 = "fits very well"
significance levels: p < 0.01, p < 0.05, () p < 0.06
18
Summary

tendency for statistically significant differences
  between baseline and models
  between baseline and best versions
different preferences for different voices
  "excited": 3.4 (male) vs. 4.1 (female)
  "rugged": 4.1 (male) vs. 3.3 (female)
improved default settings for synthesis
  male: "sophisticated" model
  female: "sincere" model
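
The per-voice figures quoted in this summary can be recomputed from the mean ratings on the judgement slides. The sketch below is illustrative only. The names `RATINGS`, `best_rating` and `gain_over_baseline` are hypothetical, and it assumes that an empty "best rated version" cell means the intended model was itself the best rated one, which agrees with the "excited" and "rugged" values quoted above.

```python
# Mean ratings transcribed from the judgement slides (scale 1 to 5).
# Each entry: (baseline, as intended, best other version), where the last
# item is (rating, model name) or None if the slide leaves the cell empty.
RATINGS = {
    "female": {
        "sincere": (3.4, 3.5, None),
        "excited": (2.9, 4.1, None),
        "competent": (3.5, 3.3, (3.6, "sincere")),
        "sophisticated": (3.1, 3.2, (3.4, "sincere")),
        "rugged": (2.6, 2.7, (3.3, "sophisticated")),
    },
    "male": {
        "sincere": (3.2, 3.5, (3.7, "sophisticated")),
        "excited": (2.5, 3.4, None),
        "competent": (3.5, 4.0, (4.1, "sophisticated")),
        "sophisticated": (2.7, 3.6, None),
        "rugged": (3.0, 3.6, (4.1, "competent")),
    },
}


def best_rating(voice, trait):
    """Return (rating, model name) of the best rated version for a trait."""
    _, intended, other = RATINGS[voice][trait]
    if other is None:
        # Assumption: an empty cell means the intended model was rated best.
        return intended, trait
    return other


def gain_over_baseline(voice, trait):
    """Difference between the intended model and the baseline, 1 decimal."""
    baseline, intended, _ = RATINGS[voice][trait]
    return round(intended - baseline, 1)


if __name__ == "__main__":
    for trait in ("excited", "rugged"):
        for voice in ("male", "female"):
            rating, model = best_rating(voice, trait)
            print(f"{trait} / {voice}: {rating} ({model} model)")
```

These are differences of mean ratings only. The sketch does not reproduce the significance tests reported on the slides.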

19
Conclusions

modelling personality in synthesis is possible
more research needed, e.g. with respect to "excited" (also important for emotional synthesis)
parametric synthesis vs. unit selection
applications
  talking objects
  speech prostheses for the voice-handicapped
  tuning of a synthetic corporate voice

20
Outlook
www.icphs2007.de
