Adaptive Tokenization and Memory in Foundation Models
If you have a question about this talk, please contact Suchir Salhan.

Abstract: State-of-the-art foundation models (FMs) process information as a sequence of internal representations; however, the length of this sequence is fixed and entirely determined by tokenization. This decouples representation granularity from information content, which inflates the deployment costs of FMs and narrows their “horizons” in long sequences. What if, instead, we could dynamically adapt tokenization and memory in FMs to save computation while maintaining or even enhancing performance? First, I will show how we can dynamically compress the key-value cache of Transformers by deciding when to append new items to memory and when to merge them with existing ones (sketched below). This offers a compromise between Transformers, whose linearly growing key-value cache exhausts memory and increases latency, and State Space Models, whose finite capacity may result in forgetting. Second, I will demonstrate how FMs can be “freed” from the tokenizers they are bound to by swapping them on the fly with arbitrary ones. Taking a step further, we can even do away with tokenizers entirely by learning end-to-end how to jointly segment and model language. Crucially, this new family of FM architectures equipped with adaptive memory and tokenization does not need to be trained from scratch; instead, pre-existing open-weight FMs can be retrofitted for this purpose with a negligible amount of data.
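To make the append-or-merge idea concrete, here is a minimal sketch of a bounded key-value cache. Everything in it (the class name AppendOrMergeCache, the fixed-capacity trigger, and merging by averaging into the most similar cached key under cosine similarity) is an illustrative assumption, not the specific mechanism presented in the talk.

    # Illustrative sketch only: the capacity rule and the merge-by-averaging
    # policy below are assumptions, not the talk's actual method.
    import torch

    class AppendOrMergeCache:
        """Bounded KV cache: append new entries until full, then merge the
        incoming key/value pair into its nearest stored neighbour."""

        def __init__(self, capacity: int, dim: int):
            self.capacity = capacity
            self.keys = torch.empty(0, dim)
            self.values = torch.empty(0, dim)

        def update(self, k: torch.Tensor, v: torch.Tensor) -> None:
            # k, v: (dim,) tensors for a single new token.
            if self.keys.shape[0] < self.capacity:
                # Below capacity: behave like a standard growing KV cache.
                self.keys = torch.cat([self.keys, k[None]], dim=0)
                self.values = torch.cat([self.values, v[None]], dim=0)
            else:
                # At capacity: merge into the most similar cached key,
                # so the memory footprint stays constant.
                sims = torch.nn.functional.cosine_similarity(self.keys, k[None])
                i = int(sims.argmax())
                self.keys[i] = 0.5 * (self.keys[i] + k)
                self.values[i] = 0.5 * (self.values[i] + v)

    cache = AppendOrMergeCache(capacity=128, dim=64)
    for _ in range(1000):                      # stream of incoming tokens
        cache.update(torch.randn(64), torch.randn(64))
    print(cache.keys.shape)                    # stays at torch.Size([128, 64])

Averaging is only one possible merge operator; a learned policy could instead decide per token whether to append or merge, and how to weight the merge.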
Bio: Edoardo M. Ponti is a Lecturer (≈ Assistant Professor) in Natural Language Processing at the University of Edinburgh, an Affiliated Lecturer at the University of Cambridge, and a visiting professor at NVIDIA. Previously, he was a visiting postdoctoral scholar at Stanford University and a postdoctoral fellow at Mila and McGill University in Montreal. In 2021, he obtained a PhD in computational linguistics from the University of Cambridge, St John’s College. His main research foci are efficient memory and tokenization, modular deep learning, and computational typology. His research has earned him a Google Research Faculty Award and two Best Paper Awards (at EMNLP 2021 and RepL4NLP 2019). He is a board member and co-founder of SIGTYP, the ACL special interest group for computational typology, and a scholar of the European Lab for Learning and Intelligent Systems (ELLIS). He is a (terrible) violinist, football player, and an aspiring practitioner of heroic viticulture.

This talk is part of the NLIP Seminar Series.