Towards Weaker Supervision and Simpler Pipelines in Speech Recognition

If you have a question about this talk, please contact Louise Segar.

Traditionally, speech recognition required feature engineering, fine-grained transcription, a phonetics step, and long pipelines gluing all of these together. Progress in computational power and dataset sizes has enabled the success of a less rigid class of deep learning models. We will present our recent work towards coarser annotation: either training acoustic models directly on pairs of same or different words (with no class or phonetic information) using Siamese neural networks, or training on sentences annotated as bags of words with large convolutional neural networks and temporal pooling. We will also show promising results on unsupervised acoustic model training. Finally, we will present results on training directly from the raw waveform to graphemes with an efficient sequence-based loss. This talk spans joint work with Neil Zeghidour, Ronan Collobert, Dimitri Palaz, Nicolas Usunier, Christian Puhrsch, and Emmanuel Dupoux.
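The same/different-word training mentioned above can be illustrated with a standard contrastive objective over a shared (Siamese) encoder. This is a minimal NumPy sketch, not the speakers' actual model: the encoder, dimensions, and margin value are illustrative assumptions, and only the idea of pulling same-word pairs together while pushing different-word pairs apart is taken from the abstract.

```python
import numpy as np

def embed(x, W):
    # Shared encoder (one linear layer + tanh). The SAME weights W are
    # applied to both members of a pair -- this weight sharing is what
    # makes the architecture "Siamese".
    return np.tanh(x @ W)

def contrastive_loss(e1, e2, same, margin=1.0):
    # Standard contrastive loss: pull embeddings of same-word pairs
    # together; push different-word pairs at least `margin` apart.
    d = np.linalg.norm(e1 - e2)
    if same:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

# Illustrative usage with random "acoustic features" (40-dim frames).
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((40, 16))
x1, x2 = rng.standard_normal((2, 40))
loss_same = contrastive_loss(embed(x1, W), embed(x2, W), same=True)
loss_diff = contrastive_loss(embed(x1, W), embed(x2, W), same=False)
```

No word identities or phone labels are needed: the only supervision is the binary same/different flag per pair, which is what makes the annotation "weaker".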

This talk is part of the Machine Learning @ CUED series.
