Deep Learning for Multifarious Speech Processing: Tackling Multiple Speakers, Microphones, and Languages

Jonathan Le Roux, Mitsubishi Electric Research Labs, USA
Friday 10 May 2019, 15:00-16:00
LT6, Baker Building, CUED.

If you have a question about this talk, please contact Prof. Ramji Venkataramanan.

Speech processing has been at the forefront of the recent deep learning revolution, with major breakthroughs in automatic speech recognition, speech enhancement, and source separation. I will give an overview of deep learning techniques developed at MERL towards the goal of cracking the Tower of Babel version of the cocktail party problem, that is, separating and/or recognizing the speech of multiple unknown speakers speaking simultaneously in multiple languages. I will also attempt to present live demonstrations with audience participation (weather, time, and network conditions permitting).

Bio: Jonathan Le Roux is a Senior Principal Research Scientist and the Speech and Audio Team Leader at Mitsubishi Electric Research Laboratories (MERL) in Cambridge, Massachusetts. He completed his B.Sc. and M.Sc. degrees in Mathematics at the Ecole Normale Supérieure (Paris, France), his Ph.D. degree at the University of Tokyo (Japan) and the Université Pierre et Marie Curie (Paris, France), and worked as a postdoctoral researcher at NTT’s Communication Science Laboratories from 2009 to 2011. His research interests are in signal processing and machine learning applied to speech and audio. He has contributed to more than 80 peer-reviewed papers and 20 patents in these fields. He is a founder and chair of the Speech and Audio in the Northeast (SANE) series of workshops, a Senior Member of the IEEE, and a member of the IEEE Audio and Acoustic Signal Processing Technical Committee (AASP).

This talk is part of the Probabilistic Systems, Information, and Inference Group Seminars series.
