Unsupervised cross-lingual representation learning

If you have a question about this talk, please contact James Thorne.

Abstract: Research in natural language processing (NLP) has seen many advances in recent years, from word embeddings to pretrained language models. Most of these approaches still rely on large labelled datasets, which has constrained their success to languages where such data is plentiful (mostly English). In this talk, I will give an overview of approaches that learn cross-lingual representations and enable us to scale NLP models to more of the world’s 7,000 languages. I will cover the spectrum of such cross-lingual representations, from word embeddings to deep pretrained models, with a focus on unsupervised approaches and their limitations. The talk will conclude with a discussion of the cutting edge of learning such representations and future directions.

Bio: Sebastian is a research scientist at DeepMind, London. He completed his PhD in Natural Language Processing at the National University of Ireland while working as a research scientist at a Dublin-based NLP startup. Previously, he studied Computational Linguistics at the University of Heidelberg, Germany and at Trinity College, Dublin. His main research interests are transfer and cross-lingual learning. He is also interested in helping make ML and NLP more accessible. You can find him at his blog http://ruder.io/.

This talk is part of the NLIP Seminar Series.


© 2006-2019 Talks.cam, University of Cambridge.