Self-supervised Learning from Images, Videos, and a single Image plus Augmentations

If you have a question about this talk, please contact Gwangbin Bae.

Abstract

In this talk I will discuss pushing the limits of what can be learnt without using any human annotations. After an overview of what self-supervised learning is, we will dive into how clustering can be combined with representation learning using optimal transport and analyze how semantic the resulting clusters are ([1] at ICLR ’20). We will then talk about how multi-modal data can be leveraged for detecting objects in a fully unsupervised manner [2]. Finally, as augmentations are crucial for all of self-supervised learning, we will analyze these in more detail in a recent preprint [3]. Here, we show that it is possible to extrapolate to semantic classes such as those of ImageNet using just a single datum as visual input when combined with strong augmentations.

[1] https://arxiv.org/abs/1911.05371

[2] https://arxiv.org/abs/2104.06401

[3] https://arxiv.org/abs/2112.00725

Bio

Yuki Asano is an assistant professor for computer vision and machine learning at the Qualcomm-UvA lab at the University of Amsterdam, where he works with Cees Snoek, Max Welling and Efstratios Gavves. His current research interests are multi-modal and self-supervised learning and ethics in computer vision. Prior to his current appointment, he finished his PhD at the Visual Geometry Group (VGG) at the University of Oxford, working with Andrea Vedaldi and Christian Rupprecht. During his time as a PhD student, he also interned at Facebook AI Research and worked at TransferWise. Prior to the PhD, he studied physics at the University of Munich (LMU) and economics in Hagen, and completed an MSc in Mathematical Modelling and Scientific Computing at the Mathematical Institute in Oxford.

Location

The talk will be given in Lecture Theatre 1 (LT1) in the Engineering Department (Trumpington St, Cambridge CB2 1PZ).

Zoom link

Zoom link: https://zoom.us/j/6492509351?pwd=U0hoSzJ0anlhRGhzYVFmTzltNk9wZz09

Meeting ID: 649 250 9351 / Passcode: 7mu5ZJ

Google Calendar

To get updates on future seminars, please subscribe to the following Google calendar: https://calendar.google.com/calendar/u/0?cid=c2pjcHN0YXM2N3QyMWU3c2FqNjBqNWNiYXNAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ

This talk is part of the CUED Computer Vision Research Seminars series.
