
New Advances in Multimodal Reasoning


If you have a question about this talk, please contact Shun Shao.

Abstract: Today’s language models are increasingly capable of multi-step reasoning, with verification and backtracking, to solve challenging problems. However, multimodal reasoning models that can reason over an integrated set of modalities, such as text, images, audio, video, and knowledge graphs, remain sorely lacking, yet they could pave the way for the next frontier of AI. I will describe our group’s work on advancing the frontiers of multimodal reasoning, from new multimodal reasoning benchmarks to training multimodal foundation models with modern reasoning approaches, along with applications to social understanding and education.

Bio: Paul Liang is an Assistant Professor at the MIT Media Lab and MIT EECS. His research advances the foundations of multisensory artificial intelligence to enhance the human experience. He is a recipient of the Siebel Scholars Award, the Waibel Presidential Fellowship, the Facebook PhD Fellowship, the Center for ML and Health Fellowship, Rising Stars in Data Science, and three best paper awards. Outside of research, he received the Alan J. Perlis Graduate Student Teaching Award for developing new courses on multimodal machine learning.

This talk is part of the Language Technology Lab Seminars series.
