
Narrative Summarization From Multiple Views


If you have a question about this talk, please contact Panagiotis Fytas.

Although summarizing movies and TV shows comes naturally to humans, it is very challenging for machines. They must combine different input sources (i.e., video, audio, subtitles), process long videos of one to two hours together with their transcripts, and learn from only a handful of examples, since collecting and processing such videos is hard. Given these challenges, most prior work on multimodal summarization does not consider all facets of the computational problem at once, but instead focuses either on multiple but short input sources or on long text-only narratives.

In contrast, we aim to summarize full-length movies and TV episodes while considering all input sources, producing both video trailers and textual summaries. For trailer creation, we propose an algorithm that selects trailer moments in movies based on interpretable criteria such as the narrative importance and sentiment intensity of events. We further demonstrate how this algorithm can be turned into an interactive tool for trailer creation with a human in the loop. Next, for producing textual summaries of full-length TV episodes, we move to a video-to-text setting and hypothesize that multimodal information from the full-length video and audio can directly facilitate abstractive dialogue summarization. We propose a parameter-efficient way of incorporating such information into a pre-trained textual summarizer and demonstrate improvements in the generated summaries.
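To make the interpretable-criteria idea concrete, here is a minimal toy sketch (not the speaker's actual algorithm, whose details are not given in this abstract): each candidate shot is scored by a weighted combination of narrative importance and sentiment intensity, and the highest-scoring shots are greedily packed into a fixed trailer duration budget. The `Shot` class, score weights, and budget are all illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Shot:
    # Hypothetical representation of a movie shot with precomputed scores.
    start: float       # seconds into the movie
    end: float
    importance: float  # narrative-importance score in [0, 1] (assumed given)
    sentiment: float   # sentiment-intensity score in [0, 1] (assumed given)

def select_trailer_shots(shots, budget_s=120.0, w_imp=0.7, w_sent=0.3):
    """Greedily pick the highest-scoring shots until the trailer budget is spent."""
    ranked = sorted(
        shots,
        key=lambda s: w_imp * s.importance + w_sent * s.sentiment,
        reverse=True,
    )
    selected, used = [], 0.0
    for shot in ranked:
        duration = shot.end - shot.start
        if used + duration <= budget_s:
            selected.append(shot)
            used += duration
    # Restore chronological order so the trailer plays in story order.
    return sorted(selected, key=lambda s: s.start)
```

An interactive tool in the spirit of the talk could expose the weights and budget as user-adjustable knobs, with a human accepting or rejecting each proposed shot.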

This talk is part of the Language Technology Lab Seminars series.



