A Mutual Information Maximization Perspective of Language Representation Learning
If you have a question about this talk, please contact James Thorne.

In this talk, we show that state-of-the-art word representation learning methods maximize an objective function that is a lower bound on the mutual information between different parts of a word sequence (i.e., a sentence). Our formulation provides an alternative perspective that unifies classical word embedding models (e.g., Skip-gram) and modern contextual embeddings (e.g., BERT, XLNet). In addition to enhancing our theoretical understanding of these methods, our derivation leads to a principled framework that can be used to construct new self-supervised tasks. We provide an example by drawing inspiration from related methods based on mutual information maximization that have been successful in computer vision, and introduce a simple self-supervised objective that maximizes the mutual information between a global sentence representation and n-grams in the sentence. Our analysis offers a holistic view of representation learning methods to transfer knowledge and translate progress across multiple domains (e.g., natural language processing, computer vision, audio processing).

This talk is part of the NLIP Seminar Series.
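Objectives of this kind are commonly estimated with a contrastive (InfoNCE-style) bound on mutual information. The sketch below is a minimal illustration under that assumption, not the speaker's exact implementation: the tensor names global_repr and ngram_repr are hypothetical stand-ins for encoder outputs, and other sentences in the batch serve as negative samples, so minimizing the cross-entropy maximizes a lower bound on the mutual information between a sentence representation and an n-gram drawn from it.

```python
import torch
import torch.nn.functional as F

def infonce_lower_bound(global_repr, ngram_repr):
    """Contrastive (InfoNCE-style) bound on mutual information.

    global_repr: (batch, dim) global sentence representations
    ngram_repr:  (batch, dim) representations of an n-gram drawn
                 from the corresponding sentence (positive pairs)
    """
    # Pairwise scores: diagonal entries are positive pairs,
    # off-diagonal entries are in-batch negatives.
    scores = global_repr @ ngram_repr.t()  # (batch, batch)
    labels = torch.arange(scores.size(0), device=scores.device)
    # Cross-entropy over the batch is the negative InfoNCE objective;
    # minimizing it maximizes the MI lower bound.
    return F.cross_entropy(scores, labels)

# Example usage with random features standing in for encoder outputs:
g = torch.randn(32, 128)  # hypothetical sentence encoder outputs
n = torch.randn(32, 128)  # hypothetical n-gram encoder outputs
loss = infonce_lower_bound(g, n)
```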