Language (In)Equality in Parsing and Machine Translation: Data Size is Only One Term in the Equation
If you have a question about this talk, please contact Michael Schlichtkrull.

Abstract: Despite the tremendous improvements achieved in less than a decade by neural models, NLP is still far from reaching language equality (i.e. comparable performance in all languages). The uneven amount of data available in different languages is often recognized as the main culprit. In this talk I will discuss recent work that acknowledges this situation and attempts to address it, not by collecting or synthesizing more data, but by exploiting linguistic information that already exists for a large number of high- to very-low-resourced languages. In particular, I will show how typological features can be used to learn language embeddings that boost the quality of a multilingual dependency parser. In the second part of the talk, I will discuss another obstacle to language equality, namely the fact that some languages are intrinsically more difficult to model than others, even when controlling for training data size. Specifically, I will present recent results on the effect of word order freedom and case marking on the quality of state-of-the-art neural machine translation.

Bio: Arianna Bisazza is Assistant Professor in Computational Linguistics at the University of Groningen, The Netherlands. Her research aims to identify the intrinsic limitations of current language modeling paradigms, and to improve the quality of machine translation for challenging language pairs. She previously worked as a postdoc at the University of Amsterdam and as a research assistant at Fondazione Bruno Kessler, Trento.

Topic: NLIP Seminar
Time: May 19, 2022 01:00 PM London
Join Zoom Meeting: https://cl-cam-ac-uk.zoom.us/j/94112888558?pwd=aGN2Skg2UFlnUkxWMmFuRjV6SCs0dz09
Meeting ID: 941 1288 8558
Passcode: 420834

This talk is part of the NLIP Seminar Series.