PRESENTED BY OMKAR PUND. VIKRAM RAJIVADE. SHAILESH RASKAR - PowerPoint PPT Presentation


1
PRESENTED BY OMKAR PUND, VIKRAM RAJIVADE,
SHAILESH RASKAR
  • DR. D. Y. PATIL POLYTECHNIC, AMBI, COMPUTER
    DEPARTMENT
  • TOPIC: VOICE MORPHING

2
CONTENTS
  • WHAT IS VOICE MORPHING?
  • APPROACHES TO THE PROBLEM.
  • SPEECH PRODUCTION.
  • CONVERSION OF VOICE.
  • TYPES OF VOICE MORPHING.
  • REFERENCES OR METHODS.
  • APPLICATIONS OF VOICE MORPHING.
  • AVAILABLE SOFTWARE FOR VOICE MORPHING.
  • SUMMARY.
  • CONCLUSION.

3
WHAT IS VOICE MORPHING?
  • Voice morphing, also referred to as voice
    transformation or voice conversion, is a
    technique that modifies a source speaker's
    utterance so that it sounds as if it were spoken
    by a target speaker.
  • Many applications can benefit from this
    technology. For example, a text-to-speech (TTS)
    system with integrated voice morphing can produce
    many different voices. Where speaker identity
    plays a key role, such as in dubbing movies and
    TV shows, high-quality voice morphing allows the
    appropriate voice to be generated (possibly in
    different languages) without the original actors
    being present.

4
APPROACHES TO THE PROBLEM
  • Voice conversion is performed in two phases.
  • In the first phase, training, the speech
    signals of the source and target speakers are
    analyzed and their voice characteristics are
    extracted by means of Linear Predictive Coding
    (LPC), a mathematical optimization technique
    that is very popular in speech processing.
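
The slides do not say how the LPC coefficients are computed. As a minimal sketch, assuming the common autocorrelation method solved with the Levinson-Durbin recursion, the analysis of one frame could look like the code below. The function names, the model order of 2 and the test signal are illustrative assumptions, not part of the presentation.

```python
def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """Estimate LPC coefficients with the Levinson-Durbin recursion.

    Returns (a, err) with a[0] == 1.0. The predictor is
    s[n] ~ -(a[1]*s[n-1] + ... + a[order]*s[n-order]) and err is the
    remaining prediction-error energy.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err == 0.0:          # silent frame: nothing left to predict
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err          # reflection coefficient for this order
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err


# Assumed test signal: impulse response of a known 2-pole filter
# s[n] = 1.3*s[n-1] - 0.6*s[n-2] + x[n], so the true answer is known.
signal = []
for n in range(200):
    x = 1.0 if n == 0 else 0.0
    s1 = signal[n - 1] if n >= 1 else 0.0
    s2 = signal[n - 2] if n >= 2 else 0.0
    signal.append(x + 1.3 * s1 - 0.6 * s2)

a, err = lpc(signal, 2)
print([round(c, 4) for c in a])  # close to [1.0, -1.3, 0.6]
```

In a real system the same analysis would be run on every frame of both speakers, typically with a higher order than the 2 used here.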

5
APPROACHES TO THE PROBLEM
  • In the second phase, the transformed features
    are used to synthesize speech that will,
    hopefully, resemble that of the target speaker.
  • Speech synthesis is again performed by means of
    Linear Predictive Coding.
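
LPC synthesis can be sketched as driving an all-pole filter with an excitation signal. The example below is a simplified illustration, not the presenters' implementation: it inverse-filters a frame with one set of coefficients to obtain the residual, then re-synthesizes it with another set. Both coefficient sets and the frame are made-up values chosen only so that the filters are stable.

```python
def lpc_residual(signal, a):
    """Inverse filter: e[n] = sum_j a[j] * s[n-j], with a[0] == 1."""
    order = len(a) - 1
    return [sum(a[j] * signal[n - j]
                for j in range(order + 1) if n - j >= 0)
            for n in range(len(signal))]


def lpc_synthesize(excitation, a):
    """All-pole synthesis: s[n] = e[n] - sum_{j>=1} a[j] * s[n-j]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for j in range(1, len(a)):
            if n - j >= 0:
                acc -= a[j] * out[n - j]
        out.append(acc)
    return out


# Hypothetical coefficients standing in for source and target speakers.
a_source = [1.0, -1.3, 0.6]
a_target = [1.0, -0.9, 0.5]

# Hypothetical frame of samples.
frame = [((7 * n) % 11 - 5) / 5.0 for n in range(40)]

residual = lpc_residual(frame, a_source)      # strip the source envelope
morphed = lpc_synthesize(residual, a_target)  # impose the target envelope
```

Synthesizing the residual with the same coefficients that produced it returns the original frame, which is a convenient sanity check before any transformation is applied.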

6
Speech production
  • The respiratory subsystem is composed of the
    lungs, the trachea (windpipe), the diaphragm and
    the chest cavity.
  • The larynx and the pharyngeal cavity (throat)
    constitute the laryngeal subsystem.
  • The articulatory subsystem includes the oral
    cavity and the nasal cavity.

7
Speech production
  • The oral cavity comprises the velum, the
    tongue, the lips, the jaw and the teeth.
  • In technical discussions of speech processing,
    the term vocal tract refers to the combination
    of the larynx, the pharyngeal cavity and the
    oral cavity.
  • The respiratory subsystem behaves like an air
    pump, supplying the aerodynamic energy for the
    other two subsystems.
  • In speech processing, the basic aerodynamic
    parameters are air volume, flow, pressure and
    resistance.

8
Conversion of voice
  • Techniques:
  • Wavelet decomposition.
  • Proposed model.
  • Wavelet decomposition:
  • Wavelets are a class of functions that
    possess compact support and form a basis for all
    finite-energy signals.
  • They can capture the non-stationary spectral
    characteristics of a signal by decomposing it
    over a set of atoms that are localized in both
    time and frequency. The discrete wavelet
    transform (DWT) uses dyadic scales and
    translates of the mother wavelet to form an
    orthonormal basis for signal analysis.
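
In the usual textbook notation (the symbols below are standard conventions, not taken from the slides), the dyadic scales and translates of a mother wavelet and the resulting expansion of a finite-energy signal can be written as:

```latex
\psi_{j,k}(t) = 2^{-j/2}\,\psi\!\left(2^{-j}t - k\right), \qquad j,k \in \mathbb{Z}
\qquad\text{and}\qquad
x(t) = \sum_{j}\sum_{k} d_{j,k}\,\psi_{j,k}(t), \quad
d_{j,k} = \langle x, \psi_{j,k} \rangle .
```

Here j indexes the scale, k the translation, and the coefficients d are the inner products of the signal with each basis function.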

9
Example
  • The original signal S is split into an
    approximation cA1 and a detail cD1.
  • The approximation is then itself split into an
    approximation and a detail, and so on.
  • Decomposing a signal into k levels therefore
    results in k+1 sets of coefficients at different
    frequency resolutions: k levels of detail
    coefficients and 1 level of approximation
    coefficients.
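
The repeated splitting described above can be sketched with the Haar wavelet, the simplest orthonormal choice. The slides do not name the wavelet that was used, so Haar is an assumption here, and the function names are illustrative.

```python
import math


def haar_step(signal):
    """One DWT level with the Haar wavelet: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def haar_wavedec(signal, levels):
    """Return [cA_k, cD_k, ..., cD_1] for a k-level decomposition.

    The signal length must be divisible by 2**levels so that every
    level can be halved without boundary handling.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("length must be divisible by 2**levels")
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)  # split only the approximation
        details.append(detail)
    return [approx] + details[::-1]


coeffs = haar_wavedec([0, 1, 2, 3, 4, 5, 6, 7], levels=3)
print(len(coeffs))               # 4 sets: k + 1 with k = 3
print([len(c) for c in coeffs])  # [1, 1, 2, 4]
```

Because the Haar basis is orthonormal, the sum of squared coefficients equals the energy of the original signal.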

10
Conversion of voice
  • Proposed model
  • Voice morphing is performed in two steps:
    training and transformation. The training data
    consist of repetitions of the same phonemes
    uttered by both the source and target speakers.
  • The source and target training data are divided
    into frames of 128 samples, and the frames are
    randomly divided into training and validation
    sets.
  • A 5-level wavelet decomposition is then performed
    on the source and target training data.
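
The framing and random split can be sketched as follows. Only the frame length of 128 comes from the slide. The non-overlapping frames, the one-to-one pairing of source and target frames, the 20% validation share and the fixed seed are assumptions made for illustration.

```python
import random

FRAME_LEN = 128  # frame size stated on the slide


def split_into_frames(samples, frame_len=FRAME_LEN):
    """Cut samples into non-overlapping frames, dropping any short tail."""
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len]
            for i in range(n_frames)]


def train_validation_split(src_frames, tgt_frames,
                           val_fraction=0.2, seed=0):
    """Randomly split paired (source, target) frames into two sets."""
    if len(src_frames) != len(tgt_frames):
        raise ValueError("source and target need the same frame count")
    pairs = list(zip(src_frames, tgt_frames))
    random.Random(seed).shuffle(pairs)  # keeps each pair together
    n_val = int(len(pairs) * val_fraction)
    return pairs[n_val:], pairs[:n_val]


# Hypothetical recordings of 1000 samples each.
source = [float(n) for n in range(1000)]
target = [float(-n) for n in range(1000)]

train, val = train_validation_split(split_into_frames(source),
                                    split_into_frames(target))
print(len(train), len(val))  # 6 1
```

With 128-sample frames, a 5-level dyadic decomposition without boundary extension leaves 4 approximation coefficients and detail sets of 4, 8, 16, 32 and 64 coefficients per frame.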

11
Types of voice morphing
  • This section shows the forms in which a normal
    voice can be transformed between female (F) and
    male (M) speakers.

CONVERSION   SOURCE    TARGET    RESULT1   RESULT2
F TO M       SPEECH1   TARGET1   RESULT1   VOICE1
M TO F       SPEECH2   TARGET2   RESULT2   VOICE2
F TO F       SPEECH3   TARGET3   RESULT3   VOICE3
M TO M       SPEECH4   TARGET4   RESULT4   VOICE4

12
Types of voice morphing
  • The "SOURCE" column contains the utterances of
    the source speaker.
  • The "TARGET" column contains the target
    speaker's utterances.
  • The utterances in these two columns are NOT
    included in the training data used to estimate
    the conversion function.
  • The next two columns contain the conversion
    results.
  • The difference between them is that "RESULT1"
    applies the target prosody extracted from the
    target utterance, whereas "RESULT2" keeps the
    original prosody of the source utterance.

13
REFERENCES OR METHODS.
  • Abe M., Nakamura S., Shikano K. and Kuwabara H.,
    Voice Conversion through Vector Quantization,
    Proceedings of ICASSP, 1988.
  • Stylianou Y., Cappe O. and Moulines E.,
    Statistical Methods for Voice Quality
    Transformation, Proceedings of Eurospeech, 1995.
  • Arslan L. and Talkin D., Voice Conversion by
    Codebook Mapping of Line Spectral Frequencies and
    Excitation Spectrum, Proceedings of Eurospeech,
    1997.

14
APPLICATIONS OF VOICE MORPHING
  • ENTERTAINMENT.
  • FILM INDUSTRY.
  • SECURITY.
  • COMPUTER GAMING.

15
AVAILABLE SOFTWARE FOR VOICE MORPHING.
  • MORPH VOX PRO VOICE CHANGER 2.0.6.
  • MORPH VOX PRO VOICE CHANGER 4.2.2.
  • MORPH VOX PRO VOICE CHANGER 4.3.8.
  • TERA VOICE SERVER 2004.
  • FLASH VOICE BUTTONS 3.0.
  • VOICE TWISTER 1.0.4.
  • VOICE AGAIN 1.5.2.
  • QUICK VOICE FOR OSX 2.2.0.
  • QUICK VOICE FOR WINDOWS 2.2.0.

16
SUMMARY.
  • Voice morphing is the process of changing voice
    personality, i.e. speech uttered by a source
    speaker is modified to sound as if the target
    speaker had uttered it.
  • Our treatment of voice morphing began by
    introducing the basic properties of speech
    signals.
  • It then introduced the basic techniques of voice
    morphing.
  • Finally, it covered the concept behind voice
    morphing.

17
CONCLUSION.
  • Voice morphing is a technology with many
    interesting, useful and fun applications.
    Further research on the subject, with or without
    the GTM (Generative Topographic Mapping) model,
    is bound to follow and should lead to morphed
    speech of excellent quality.

18
THANK YOU