Statistical Parametric Speech Synthesis Based on Speaker and Language Factorization

If you have a question about this talk, please contact Kai Yu.

An increasingly common scenario in building hidden Markov model-based speech synthesis and recognition systems is training on inhomogeneous data, for example data drawn from multiple sources or of multiple types. This seminar introduces a new technique for training hidden Markov models on such inhomogeneous speech data, here spanning both speaker and language variations. The proposed technique, speaker and language factorization, attempts to factorize the speaker-specific and language-specific characteristics in the data and to model each by an individual transform. Language-specific factors are represented by transforms based on cluster mean interpolation with cluster-dependent decision trees, while acoustic variations caused by speaker characteristics are handled by transforms based on constrained maximum likelihood linear regression. This allows multi-speaker, multi-language adaptive training to be performed. Since each factor is represented by its own transform, it is possible to factor in only one of them. Experimental results on statistical parametric speech synthesis show that the proposed technique enables the speaker and language to be factorized, allowing a speaker transform estimated in one language to be used to synthesize speech in a different language while preserving the speaker's voice characteristics.
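As a rough illustration of the two transform types described in the abstract, the language factor can be modelled as an interpolation of cluster means and the speaker factor as a CMLLR-style affine transform of the observations. The function names, shapes, and simplified maths below are assumptions made for this sketch, not the seminar's actual implementation:

```python
import numpy as np

# Hypothetical sketch of the two factor transforms. Names, shapes,
# and simplifications are illustrative assumptions only.

def language_mean(cluster_means, weights):
    """Language factor: interpolate P cluster means (P x D) with
    language-dependent weights (P,) into one D-dim Gaussian mean."""
    return np.sum(weights[:, None] * cluster_means, axis=0)

def speaker_transform(obs, A, b):
    """Speaker factor: apply a constrained-MLLR-style affine
    transform o' = A o + b to observation vectors (T x D)."""
    return obs @ A.T + b

# Two mean clusters in 2-D; a language weighting halfway between them.
mu = np.array([[0.0, 0.0], [2.0, 2.0]])
lam = np.array([0.5, 0.5])
print(language_mean(mu, lam))

# A speaker transform that scales and shifts the observations.
obs = np.array([[1.0, 1.0]])
A = 2.0 * np.eye(2)
b = np.array([0.5, -0.5])
print(speaker_transform(obs, A, b))
```

Because the two factors are applied by separate transforms, a speaker transform estimated against one language's interpolated means can, in principle, be reused with a different language's weights — the cross-lingual use case the abstract describes.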

This talk is part of the speech synthesis seminar series.

© 2006-2019 Talks.cam, University of Cambridge.