Learning generalizable models on large-scale multi-modal data
If you have a question about this talk, please contact James Fergusson.

The abundance of multi-modal data offers a significant opportunity to extend the training of foundational models beyond text alone. In this talk, I will introduce two lines of work that use large-scale models, trained on Internet-scale multi-modal datasets, to achieve good generalization performance. The first trains an audio-visual model on YouTube video datasets and enables automatic video translation and dubbing. The model learns the correspondence between audio and visual features and uses it to translate videos from one language to another. The second trains a multi-modal, multi-task, multi-embodiment generalist policy on a massive collection of data spanning simulated control tasks, vision, language, and robotics. The model learns to perform a variety of tasks, including controlling a robot arm, playing a game, and translating text. Both lines of work point to a possible future trajectory for foundational models and highlight the transformative power of integrating multi-modal inputs and outputs.

This talk is part of the Data Intensive Science Seminar Series.