Measuring Causal Effects of Data Statistics on Language Model Predictions
If you have a question about this talk, please contact Michael Schlichtkrull.

Abstract: Training data is one of the major reasons for the success of state-of-the-art NLP models. But what exactly in the training data causes a model to make a certain prediction? We seek to answer this research question by formalizing it in a causal framework that provides a useful language for investigating how training data influences predictions. Importantly, our causal framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone. Addressing the problem of extracting factual knowledge from pretrained language models (PLMs), we focus on one simple data statistic, co-occurrence counts, and show that it influences the predictions of PLMs. This establishes a causal link between simple statistics from the training data (co-occurrence counts) and PLMs’ behavior, and shows that their language understanding is limited. Our causal framework and our results demonstrate the importance of categorizing and studying datasets used for model training and the benefits of causality in our field for understanding NLP models.

Bio: Yanai Elazar is a fourth-year PhD student at Bar-Ilan University, working with Prof. Yoav Goldberg on NLP. His main interests involve model interpretation, analysis, biases in datasets and models, and commonsense reasoning. Yanai was awarded multiple scholarships, including the PBC fellowship for outstanding PhD candidates in Data Science, and the Google PhD Fellowship.

This talk is part of the NLIP Seminar Series.