Paying Attention to Efficiency: LLM Deployment on Mobile and Edge Devices
If you have a question about this talk, please contact Mateja Jamnik.

Transformers have recently sparked significant interest in AI, driving advancements in accuracy and enabling a wide range of applications, from multi-modal intelligent assistants to autonomous systems. While their scaling laws promise even greater capabilities, the demands on hardware and data present significant challenges. In response, there is growing interest in compressing these models into smaller, more efficient forms, making them feasible to deploy with lower resource requirements. As edge and mobile devices integrate increasingly powerful Systems-on-Chip (SoCs), deploying these models locally becomes viable, enabling new use-cases while enhancing privacy, sustainability and task-specific customization.

In this talk, I will touch upon two areas: first, measuring the execution efficiency and deployability of Large Language Models (LLMs) on mobile and edge devices; and second, optimising DNN workloads for efficiency through low-rank decompositions. I will introduce MELT (MobiCom’24), a benchmarking framework designed to assess the computational, memory, energy, and thermal characteristics of LLMs running on device and to identify the associated bottlenecks. Following this, I will present Maestro (ICML’24), a novel approach that leverages trainable low-rank decompositions to make training and deployment of DNNs more efficient through data-informed progressive shrinking of networks.

This talk is part of the Artificial Intelligence Research Group Talks (Computer Laboratory) series.