Mind the Data

If you have a question about this talk, please contact Panagiotis Fytas.

Today’s mainstream NLP research focuses on general-purpose models that are scaled up to work with extremely large datasets. This direction has had many benefits, evidenced by performance on research benchmarks and by new use cases for AI in general, and language models specifically, imagined by an ever wider community of stakeholders. What I believe is coming next is a strong demand for customization. More people than ever will want to adapt language models to create new applications. To enable them, I believe we need new affordances for working with the most important ingredient for NLP systems: the data. In this talk, I’ll present recent work from my group showing benefits and risks of new methods for data selection, organization, and synthesis. I’ll advocate for a future in which artifacts like language models are developed to support adaptation to unexpected and diverging demands of a wide population of users, who in turn should be empowered to direct models to serve their own interests.

This talk is part of the Language Technology Lab Seminars series.


© 2006-2024 Talks.cam, University of Cambridge.