BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//talks.cam.ac.uk//v3//EN
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:19701025T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
CATEGORIES:CAPE Advanced Technology Lecture Series
SUMMARY:DetCLIP: Scalable Open-Vocabulary Object Detection
  via Fine-grained Visual-language Alignment - Dr
  Wei Zhang\, Huawei London Research Center
DTSTART;TZID=Europe/London:20230118T140000
DTEND;TZID=Europe/London:20230118T150000
UID:TALK194356AThttp://talks.cam.ac.uk
URL:http://talks.cam.ac.uk/talk/index/194356
DESCRIPTION:<b>Abstract</b><p>We will present DetCLIP\, an efficient and
  scalable training framework that incorporates large-
 scale image-text pairs to achieve open-vocabulary 
 object detection (OVD). Unlike previous OVD framew
 orks that typically rely on a pre-trained vision-l
 anguage model (e.g.\, CLIP) or exploit image-text 
 pairs via a pseudo labelling process\, DetCLIP dir
 ectly learns the fine-grained word-region alignmen
 t from massive image-text pairs in an end-to-end m
 anner. We employ a maximum word-region similarity 
 between region proposals and textual words to guid
 e the contrastive objective. To enable the model t
 o gain localization capability while learning broa
 d concepts\, DetCLIP is trained with a hybrid supe
 rvision from detection\, grounding and image-text 
 pair data under a unified data formulation. By joi
 ntly training with an alternating scheme and adopt
 ing low-resolution input for image-text pairs\, De
 tCLIP exploits image-text pair data efficiently an
 d effectively.\n</p>\n<b>Biography</b><br>\n<p>Dr 
 Wei Zhang joined Huawei in 2012. Before that\, he
  was an assistant researcher in Shenzhen Institute
  of Advanced Technology\, Chinese Academy of Sciences
  and in The Chinese University of Hong Kong (CUHK)
 . He received his Ph.D. degree in computer science
  from CUHK in 2010\, his M.S. degree from Tsinghua University in 2005
  and his B.S. degree from Nankai University in 2002. He co-organized
  the “Self-supervised Learning for Next-Generation Industry-level
  Autonomous Driving” workshop at ECCV 2022 and ICCV 2021.
 </p>\n
LOCATION:EEDB Seminar Room\, Electrical Engineering and Onl
 ine (registration required)
CONTACT:Dr Mark Leadbeater
END:VEVENT
END:VCALENDAR
