Efficient Few-Shot Continual Learning in Vision-Language Models
If you have a question about this talk, please contact Cat Spencer.

This talk is part of the CBL Research Talks series.

Vision-language models (VLMs) excel at tasks such as visual question answering and image captioning. However, VLMs are often limited by their use of pretrained image encoders, such as CLIP, leading to image-understanding errors that hinder overall performance. Moreover, real-world applications often require the model to be continually adapted as new, and often limited, data arrive. To address this, we propose LoRSU (Low-Rank Adaptation with Structured Updates), a robust and computationally efficient method for selectively updating image encoders within VLMs. LoRSU introduces structured and localized parameter updates, effectively correcting performance on previously error-prone data while preserving the model's general robustness. Our approach leverages theoretical insights to identify and update only the most critical parameters, achieving significant resource efficiency. Specifically, we demonstrate that LoRSU reduces computational overhead by over 25x compared to full VLM updates, without sacrificing performance. Experimental results on VQA tasks in the few-shot continual learning setting validate LoRSU's scalability, efficiency, and effectiveness, making it a compelling solution for image encoder adaptation in resource-constrained environments.
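The abstract does not give implementation details, but the core idea it describes (keeping a pretrained image encoder frozen and training only a small, selectively chosen set of low-rank parameters) can be illustrated with a short PyTorch sketch. Everything below is hypothetical: the toy encoder, the gradient-magnitude selection rule, and the `LoRALinear` wrapper are placeholders for illustration, not LoRSU's actual criterion or code.

```python
# Hypothetical sketch: attach low-rank adapters only to the "most important" blocks
# of a frozen vision encoder. ToyViTBlock, select_blocks_by_grad and LoRALinear are
# illustrative stand-ins, not the method presented in the talk.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank residual (W x + B A x)."""

    def __init__(self, base: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # keep the pretrained weights fixed
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        return self.base(x) + x @ self.A.t() @ self.B.t()


class ToyViTBlock(nn.Module):
    """Minimal stand-in for one transformer block of a CLIP-like image encoder."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.attn_proj = nn.Linear(dim, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x):
        x = x + self.attn_proj(x)
        return x + self.mlp(x)


def select_blocks_by_grad(blocks, x, y, loss_fn, top_k=2):
    """Placeholder importance criterion: rank blocks by gradient magnitude on a few shots."""
    h = x
    for blk in blocks:
        h = blk(h)
    loss_fn(h.mean(dim=1), y).backward()
    scores = [float(sum(p.grad.abs().sum() for p in blk.parameters())) for blk in blocks]
    for blk in blocks:  # clear gradients after scoring
        blk.zero_grad()
    return sorted(range(len(blocks)), key=lambda i: -scores[i])[:top_k]


# Build a tiny encoder, score its blocks on a few-shot batch, then freeze everything
# and add trainable low-rank adapters only where the score says it matters.
torch.manual_seed(0)
blocks = nn.ModuleList([ToyViTBlock() for _ in range(6)])

x = torch.randn(8, 16, 64)      # a few-shot batch of patch embeddings (toy data)
y = torch.randint(0, 10, (8,))  # toy labels for the scoring loss
head = nn.Linear(64, 10)
loss_fn = lambda feats, labels: nn.functional.cross_entropy(head(feats), labels)

chosen = select_blocks_by_grad(blocks, x, y, loss_fn, top_k=2)

for p in blocks.parameters():
    p.requires_grad = False                       # freeze the whole encoder ...
for i in chosen:                                  # ... then adapt only the selected blocks
    blocks[i].attn_proj = LoRALinear(blocks[i].attn_proj, rank=4)

trainable = [p for p in blocks.parameters() if p.requires_grad]
print(f"updating blocks {chosen}; {sum(p.numel() for p in trainable)} trainable parameters")
```

Because only the adapters of the selected blocks require gradients, the optimizer touches a small fraction of the encoder's parameters, which is the kind of saving the abstract's claimed reduction in computational overhead points to; the actual selection rule and update structure used by LoRSU are described in the talk itself.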