Narrative Summarization from Multiple Views

  • Pinelopi Papalampidi (DeepMind)
  • Friday 17 February 2023, 12:00-13:00
  • Computer Lab, SS03.

If you have a question about this talk, please contact Rami Aly.

Abstract:

Although summarizing movies and TV shows comes naturally to humans, it is very challenging for machines. They have to combine different input sources (i.e., video, audio, subtitles), process long videos of 1-2 hours together with their transcripts, and learn from only a handful of examples, since collecting and processing such videos is hard. Given the challenges of multimodal summarization, most prior work does not consider all facets of the computational problem at once, but instead focuses either on processing multiple but short input sources or on long text-only narratives.

In contrast, we aim to summarize full-length movies and TV episodes while considering all input sources, in order to create video trailers and textual summaries. For trailer creation, we propose an algorithm for selecting trailer moments in movies based on interpretable criteria such as the narrative importance and sentiment intensity of events. We further demonstrate how we can convert our algorithm into an interactive tool for trailer creation with a human in the loop. Next, for producing textual summaries from full-length TV episodes, we move to a video-to-text setting and hypothesize that multimodal information from the full-length video and audio can directly facilitate abstractive dialogue summarization. We propose a parameter-efficient way of incorporating such information into a pre-trained textual summarizer and demonstrate improvements in the generated summaries.

Bio:

Pinelopi (Nelly) Papalampidi is a Research Scientist at DeepMind working at the intersection of language and vision. She recently completed her PhD at the University of Edinburgh under the supervision of Mirella Lapata and Frank Keller and interned as a Research Scientist at DeepMind and Meta AI. Her PhD thesis focuses on structure-aware movie understanding and summarization via multimodal and graph-based methods.

This talk is part of the NLIP Seminar Series.
