Document Summarisation: Modelling, Datasets and Verification of Content

If you have a question about this talk, please contact Marinela Parovic.

Document summarisation is one of the central problems in Natural Language Processing, with both short-term societal implications and long-term implications for the success of AI. I will describe advances in this area along three lines: methodology and modelling, dataset development, and enforcing the factuality of summaries. On modelling, I will show how reinforcement learning can be used to directly maximise the metric by which summaries are evaluated. On dataset development, I will describe XSum, a dataset we released for summarisation in which a single sentence describes the content of a whole article; it has become a standard benchmark for summarisation. Finally, on factuality, I will show how the factuality of summaries, as measured quantitatively, can be improved by re-ranking the candidate summaries in a beam using a “verification” model.
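
As a rough illustration of the re-ranking idea mentioned in the abstract, the sketch below re-orders beam candidates by combining each candidate's model score with a verification score. It is only a sketch under stated assumptions, not the speaker's method: the function names `rerank_beam` and `toy_verify`, the additive combination, and the `weight` parameter are all hypothetical, and the word-overlap scorer merely stands in for a learned verification model.

```python
from typing import Callable, List, Tuple


def rerank_beam(
    candidates: List[Tuple[str, float]],
    verify: Callable[[str, str], float],
    document: str,
    weight: float = 1.0,
) -> List[Tuple[str, float]]:
    """Re-order beam candidates by model score plus a weighted verification score.

    candidates: (summary, model_score) pairs, e.g. log-probabilities from beam search.
    verify: hypothetical scorer returning a higher value for better-supported summaries.
    """
    rescored = [
        (summary, model_score + weight * verify(document, summary))
        for summary, model_score in candidates
    ]
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


def toy_verify(document: str, summary: str) -> float:
    """Toy stand-in for a verification model (assumption for illustration only):
    the fraction of summary words that also appear in the document."""
    doc_words = set(document.lower().split())
    words = summary.lower().split()
    if not words:
        return 0.0
    return sum(w in doc_words for w in words) / len(words)


if __name__ == "__main__":
    document = "the cat sat on the mat"
    beam = [
        ("the dog sat on the mat", -0.5),  # preferred by the model, but unsupported
        ("the cat sat on the mat", -0.6),  # slightly lower model score, fully supported
    ]
    for summary, score in rerank_beam(beam, toy_verify, document):
        print(f"{score:.3f}  {summary}")
```

In this toy run the fully supported candidate moves to the top of the beam even though the summarisation model scored it slightly lower, which is the effect a verification-based re-ranker aims for.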

This talk is part of the Language Technology Lab Seminars series.
