
Audio-Visual Learning: Challenges and New Approaches


If you have a question about this talk, please contact Dr Kate Knill.

This talk will be held on Zoom.

Abstract: Audio-visual learning is a research topic that aims to exploit the relationship between the audio and visual modalities. By leveraging both modalities, we can either improve performance on tasks previously tackled with a single modality or address new, challenging problems. With the success of deep-learning-based methods, some audio-visual problems that were previously infeasible have become possible, e.g. audio-visual generation. In this talk, I will present recent developments in audio-visual learning, along with my PhD work. Several interesting applications of audio-visual learning will be covered, such as audio-visual separation and localisation, audio-visual speech recognition and enhancement, and audio-visual generation. I will review state-of-the-art approaches to these applications, and also discuss some of the challenges and opportunities ahead.

Bio: Jie Pu is a research associate in the Machine Intelligence Laboratory, University of Cambridge. He is a member of the Speech Research Group and works with Professor Mark Gales. His work primarily focuses on audio-visual learning, computer vision and speech analysis. Prior to this, Jie completed his PhD with Professor Maja Pantic at Imperial College London.

This talk is part of the CUED Speech Group Seminars series.




© 2006-2024, University of Cambridge.