NMT Analysis: The Trade-Off Between Source and Target, and (a Bit of) the Training Process
If you have a question about this talk, please contact Huiyuan Xie.

Join Zoom Meeting
https://cl-cam-ac-uk.zoom.us/j/91424580226?pwd=WHFKRW1ORCtBck15SUVOdXowd29uUT09
Meeting ID: 914 2458 0226
Passcode: 333459

In Neural Machine Translation (and, more generally, conditional language modeling), the generation of a target token is influenced by two types of context: the source and the prefix of the target sequence. While many attempts have been made to understand the internal workings of NMT models, none of them explicitly evaluates the relative contributions of source and target to a generation decision. We propose a way to evaluate these relative contributions explicitly, and use it to analyse a Transformer NMT model.

Looking at how the contributions change when conditioning on different types of prefixes, we show that models suffering from exposure bias are more prone to over-relying on target history (and hence to hallucinating) than models in which exposure bias is mitigated. Additionally, we analyse how the source and target contributions change with the amount of training data and over the course of training. We find that models trained with more data tend to rely more on source information and to have sharper token contributions, and that the training process is non-monotonic, with several stages of a different nature.

If we have time, I'll also talk about our ongoing work that takes a closer look at the phenomena learned during these training stages.

This talk is part of the NLIP Seminar Series.