Towards Improving End-to-End Neural Diarization

If you have a question about this talk, please contact Simon Webster McKnight.

Until recently, diarization systems were built from separate submodules, such as voice activity detection, embedding extraction, and clustering of those embeddings. Over the last five years, however, diarization has moved increasingly towards end-to-end models. Unlike modular systems, these models are trained to optimize a diarization-related loss and offer more straightforward inference. Nevertheless, end-to-end systems still pose certain challenges. In this talk, I will discuss some of my work addressing these problems, in particular synthetic training data generation and the handling of variable numbers of speakers.

This talk is part of the CUED Speech Group Seminars series.

