Narrative Summarization from Multiple Views
If you have a question about this talk, please contact Rami Aly.

Abstract: Although summarizing movies and TV shows comes naturally to humans, it is very challenging for machines. A system has to combine different input sources (i.e., video, audio, subtitles), process long videos of 1-2 hours along with their transcripts, and learn from a handful of examples, since collecting and processing such videos is hard. Given the challenges of multimodal summarization, most prior work does not consider all facets of the computational problem at once but instead focuses on either multiple but short input sources or long text-only narratives. In contrast, we aim to summarize full-length movies and TV episodes while considering all input sources, producing both video trailers and textual summaries.

For trailer creation, we propose an algorithm for selecting trailer moments in movies based on interpretable criteria such as the narrative importance and sentiment intensity of events. We further demonstrate how the algorithm can be converted into an interactive tool for trailer creation with a human in the loop.

Next, for producing textual summaries from full-length TV episodes, we move to a video-to-text setting and hypothesize that multimodal information from the full-length video and audio can directly facilitate abstractive dialogue summarization. We propose a parameter-efficient way of incorporating such information into a pre-trained textual summarizer and demonstrate improvements in the generated summaries.

Bio: Pinelopi (Nelly) Papalampidi is a Research Scientist at DeepMind working at the intersection of language and vision. She recently completed her PhD at the University of Edinburgh under the supervision of Mirella Lapata and Frank Keller, and interned as a Research Scientist at DeepMind and Meta AI. Her PhD thesis focuses on structure-aware movie understanding and summarization via multimodal and graph-based methods.

This talk is part of the NLIP Seminar Series.