Linear Transformers for Efficient Sequence Modeling
If you have a question about this talk, please contact shun shao.

Abstract: Transformers are still the dominant architecture for language modeling (and generative AI more broadly). The attention mechanism in Transformers is considered core to the architecture and enables accurate sequence modeling at scale. However, attention requires explicitly modeling pairwise interactions amongst all elements of a sequence, and thus its complexity is quadratic in input length. This talk will describe some recent work from our group on efficient architectural alternatives to Transformers for language modeling, in particular linear Transformers, which can be reparameterized as an RNN and thus allow for linear-time, constant-memory sequence modeling. We also provide connections between linear Transformers and recent state-space models such as Mamba.

Bio: Yoon Kim is an assistant professor at MIT (EECS/CSAIL). He obtained his PhD in computer science from Harvard University, where he was advised by Alexander Rush. Prof. Kim works on natural language processing and machine learning. Current interests include:

- Efficient training and deployment of large-scale models
- Understanding the capabilities and limitations of language models
- Symbolic mechanisms for controlling and augmenting neural networks

This talk is part of the Language Technology Lab Seminars series.
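As a minimal sketch of the linear-attention-as-RNN view mentioned in the abstract (assuming the elu+1 feature map of Katharopoulos et al., 2020, a single attention head, and no learned projections; the function names below are illustrative, not code from the talk), causal linear attention can be computed either by materializing the quadratic pairwise-interaction matrix or by a recurrence over a constant-size state:

```python
import numpy as np

def phi(x):
    # A simple positive feature map, elu(x) + 1; other choices are possible.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention_parallel(Q, K, V):
    """Quadratic-time form: materializes the (T x T) interaction matrix."""
    A = phi(Q) @ phi(K).T                 # (T, T) pairwise interactions
    A = np.tril(A)                        # causal mask
    A = A / A.sum(axis=1, keepdims=True)  # normalize each row
    return A @ V

def linear_attention_recurrent(Q, K, V):
    """Linear-time, constant-memory form: the same map written as an RNN."""
    T, d = Q.shape
    d_v = V.shape[1]
    S = np.zeros((d, d_v))                # running sum of outer(phi(k_t), v_t)
    z = np.zeros(d)                       # running sum of phi(k_t)
    out = np.zeros((T, d_v))
    for t in range(T):
        q, k, v = phi(Q[t]), phi(K[t]), V[t]
        S += np.outer(k, v)
        z += k
        out[t] = (q @ S) / (q @ z)
    return out

# The two forms agree up to floating-point error.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((6, 4)) for _ in range(3))
assert np.allclose(linear_attention_parallel(Q, K, V),
                   linear_attention_recurrent(Q, K, V))
```

In the recurrent form only the state S and normalizer z are carried across time steps, which is why memory stays constant and total work grows linearly in sequence length.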