Analysing Memorisation in Classification and Translation through Localisation and Cartography
If you have a question about this talk, please contact Suchir Salhan.

Memorisation is a natural part of learning from real-world data: neural models pick up on atypical input-output combinations and store those training examples in their parameter space. That this happens is well known, but which examples require memorisation, and where in the millions (or billions) of parameters memorisation occurs, are questions that remain largely unanswered.

In this talk, I first elaborate on the localisation question by examining memorisation in classification with fine-tuned pre-trained language models (PLMs) across 12 tasks. Our findings add nuance to the generalisation-first, memorisation-second hypothesis dominant in the literature and suggest that memorisation is a gradual process rather than a localised one.

Secondly, I discuss memorisation from the viewpoint of the data, using neural machine translation (NMT) models and placing individual data points on a memorisation-generalisation map. I illustrate how the data points’ characteristics are predictive of memorisation in NMT and describe the influence that subsets of that map have on NMT systems’ performance.

The talk is based on the following two publications:

Dankers, V., & Titov, I. (2024). Generalisation First, Memorisation Second? Memorisation Localisation for Natural Language Classification Tasks. ACL Findings.

Dankers, V., Titov, I., & Hupkes, D. (2023). Memorisation Cartography: Mapping out the Memorisation-Generalisation Continuum in Neural Machine Translation. EMNLP.

This talk is part of the NLIP Seminar Series.