Geographically Grounded Language Models

If you have a question about this talk, please contact Panagiotis Fytas.

Textual data exhibit pronounced variation along geographical dimensions (e.g., due to dialect differences). Common practices for training and deploying language models do not take this variation into account, which harms their robustness and their performance on downstream tasks. In this talk, I will explore how some shortcomings of text-only NLP pipelines can be alleviated by grounding language models in geography. I will first give a brief overview of prior research on geographical variation in NLP. I will then present geoadaptation, a method for geographically grounding language models that combines language modeling with geolocation prediction in a multi-task learning setup. Geoadaptation leads to consistent performance improvements across a range of tasks and language areas, especially in zero-shot settings. Finally, I will show that the effectiveness of geoadaptation stems from its ability to geographically retrofit the representation space of language models.
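
As a rough illustration of what combining language modeling with geolocation prediction in a multi-task setup could look like, the sketch below adds a geolocation term to a token-level language modeling loss. It is not the speaker's implementation: the abstract does not say how the two objectives are combined, so the weighted sum, the squared-error loss on (longitude, latitude) pairs, and the names `cross_entropy`, `geolocation_loss`, `joint_loss`, and `geo_weight` are all assumptions made for illustration.

```python
import math


def cross_entropy(logits, target_index):
    """Negative log-likelihood of the target token under softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_normaliser = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_normaliser - logits[target_index]


def geolocation_loss(predicted_coords, true_coords):
    """Mean squared error between predicted and true (longitude, latitude).

    Assumption: plain squared error on raw coordinates; the abstract does not
    specify the geolocation objective.
    """
    squared = [(p - t) ** 2 for p, t in zip(predicted_coords, true_coords)]
    return sum(squared) / len(squared)


def joint_loss(token_logits, token_target, predicted_coords, true_coords,
               geo_weight=1.0):
    """Language modeling loss plus a weighted geolocation loss.

    Assumption: a fixed-weight sum (geo_weight is hypothetical).
    """
    lm = cross_entropy(token_logits, token_target)
    geo = geolocation_loss(predicted_coords, true_coords)
    return lm + geo_weight * geo


# One toy training example: two-token vocabulary, coordinates in degrees.
loss = joint_loss(
    token_logits=[0.0, 0.0],
    token_target=0,
    predicted_coords=(1.0, 2.0),
    true_coords=(1.0, 4.0),
    geo_weight=0.5,
)
print(loss)
```

In a real model both terms would be computed from a shared encoder, so that gradients from the geolocation term also shape the representations used for language modeling.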

This talk is part of the Language Technology Lab Seminars series.
