Rich semantic representations for detailed visual recognition

If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.

This event may be recorded and made available internally or externally via http://research.microsoft.com. Microsoft will own the copyright of any recordings made. If you do not wish to have your image or voice recorded, please consider this before attending.

Several problems in computer vision can be cast as a mapping from input (e.g., images and video) to richly structured spaces (e.g., attributes, 3D layout, and pose). The choice of the underlying representation of the input is often crucial to the success of automatic methods for such mappings. On the one hand, semantically aligned representations can enable better human-centric applications; on the other hand, representations that are not necessarily semantic, when learned from “big data”, tend to have better empirical performance.

I’ll show that with careful design of the learning/inference method and small amounts of additional supervision, one can learn representations that achieve both goals. Our methods leverage noisy annotations collected via “crowdsourcing” to discover semantically aligned representations that enable several high-level recognition tasks. In particular, we achieve state-of-the-art results for person detection and attribute recognition on the PASCAL VOC datasets, and for material recognition on the KTH-TIPS/Flickr datasets. I’ll also present instances where algorithms consider humans “in the loop” to solve challenging tasks such as fine-grained category recognition (e.g., is this bird a Quetzal?) and discriminative part/attribute discovery, and to enable faster annotation interfaces.

This talk is part of the Microsoft Research Cambridge, public talks series.
