Rich semantic representations for detailed visual recognition

Subhransu Maji, Toyota Technological Institute at Chicago
Wednesday 26 March 2014, 11:00-12:00
Auditorium, Microsoft Research Ltd, 21 Station Road, Cambridge, CB1 2FB.

If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.

This event may be recorded and made available internally or externally via http://research.microsoft.com. Microsoft will own the copyright of any recordings made. If you do not wish to have your image or voice recorded, please consider this before attending.

Several problems in computer vision can be cast as a mapping from input (e.g., images and video) to richly structured spaces (e.g., attributes, 3D layout, and pose). Often the choice of the underlying representation of the input is crucial to the success of automatic methods for such mappings. On one hand, representations that are semantically aligned can enable better human-centric applications; on the other hand, representations that are not necessarily semantic, when learned from ‘big data’, tend to have better empirical performance.

I’ll show that with careful design of the learning/inference method and small amounts of additional supervision, one can learn representations that achieve both goals. Our methods leverage noisy annotations collected via “crowdsourcing” to discover semantically aligned representations that enable several high-level recognition tasks. In particular, we achieve state-of-the-art results for person detection and attribute recognition on the PASCAL VOC datasets, and for material recognition on the KTH-TIPS/Flickr datasets. I’ll also present instances where algorithms consider humans “in the loop” to solve challenging tasks such as fine-grained category recognition (e.g., is this bird a Quetzal?) and discriminative part/attribute discovery, and to enable faster annotation interfaces.

This talk is part of the Microsoft Research Cambridge, public talks series.
