Unsupervised Question Answering

If you have a question about this talk, please contact Edoardo Maria Ponti.

Obtaining training data for Question Answering (QA) is time-consuming and costly, and existing QA datasets are available only for a limited set of domains and languages. In this talk, we will explore to what extent high-quality training data is actually required for Extractive QA, and investigate the possibility of unsupervised Extractive QA. We approach this problem by first learning to generate (context, question, answer) triples in an unsupervised manner, which we then use to synthesize Extractive QA training data automatically. We find that modern QA models can learn to answer human questions surprisingly well using only synthetic training data. We demonstrate that, without using the SQuAD training data at all, our approach achieves 56.4 F1 on SQuAD v1 (64.5 F1 when the answer is a named entity mention), outperforming early supervised models. We will also explore methods for building cross-lingual Question Answering models that do not require cross-lingual supervision (zero-shot language transfer), as well as the challenge of evaluating their performance fairly across many target languages.
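
For readers unfamiliar with the F1 figures quoted above, the following is a minimal sketch of token-overlap F1 as it is commonly computed for SQuAD-style extractive QA. It is an illustration under stated assumptions, not the speakers' evaluation code: the function names `normalize` and `token_f1` are hypothetical, and the normalization steps (lowercasing, dropping punctuation and English articles) are assumed to follow the usual convention.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace.

    Assumption: this mirrors the usual SQuAD-style answer normalization.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall for one answer pair."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this sketch, a predicted answer "Cambridge" against a gold answer "University of Cambridge" has precision 1 and recall 1/3, giving an F1 of 0.5. A corpus-level score such as those quoted above would typically be the average of the per-question scores, expressed as a percentage.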

This talk is part of the Language Technology Lab Seminars series.


© 2006-2024, University of Cambridge.