Interpretable representation learning for speech and audio signals

Dr Purvi Agrawal, Indian Institute of Science (IISc) and Microsoft India
Tuesday 02 March 2021, 12:00-13:00
Zoom: https://zoom.us/j/95352633552?pwd=RzJVK2UzOGZyNU5mVHd1Y1VPT2tDUT09

If you have a question about this talk, please contact Dr Kate Knill.

Seminar on Zoom

The learning of interpretable representations from raw data presents significant challenges for time series data like speech. In this talk, we will discuss a relevance weighting scheme that allows the interpretation of the speech representations during the forward propagation of the model itself.

The relevance weighting is achieved in a two-stage deep representation learning framework, where the weighting performs feature selection at each stage.
A relevance sub-network, applied on the first stage operating on raw speech signals, acts as an acoustic filterbank layer with relevance weighting. A similar relevance sub-network applied on the second convolutional layer performs modulation filterbank learning with relevance weighting.
All the layers are trained jointly for a speech recognition task on noisy and reverberant speech. The proposed representation learning framework is also extended for the task of sound classification.

We will discuss a detailed analysis of the relevance weights and intermediate representations learned by the model. The analysis reveals that the relevance weights capture information about the underlying speech/audio content, in addition to improving system performance.

Bio: Purvi Agrawal recently defended her Ph.D. thesis, titled “Neural Representation learning for Speech and Audio Signals”, at the Learning and Extraction of Acoustic Patterns (LEAP) lab with Dr. Sriram Ganapathy, Dept. of Electrical Engineering, Indian Institute of Science (IISc), Bangalore. Prior to joining IISc, she obtained her Master's in Speech Communications from DA-IICT, Gandhinagar, in 2015. She also worked at Sony R&D Labs, Tokyo, in 2017. She will be joining Microsoft India as an Applied Researcher-II with the speech research team in Feb. 2021. Her research interests include interpretable deep learning, raw waveform modeling, low-resource data modeling, and unsupervised/self-supervised learning.

This talk is part of the CUED Speech Group Seminars series.
