
Semantic Image Segmentation and Web-Supervised Visual Learning


If you have a question about this talk, please contact Dr Fabien Petitcolas.

Abstract: In object recognition, the goal is to recognise objects of certain categories, usually known and trained in advance, despite intra-class appearance variations and small inter-class differences. The appearance of objects is influenced by lighting, scale, different poses, viewpoints, articulation of objects, clutter and occlusion. Two different aspects of object recognition are investigated in this thesis. The first part develops models for semantic object segmentation of natural images and relies on groundtruth labelling for training. The second part uses the implicit supervision that is available on the Internet to learn visual object-class models automatically. It can then provide a groundtruth labelling for object detection or segmentation algorithms.

The goal in the first part is to label connected regions in an image as belonging to specific object classes, such as grass or cow. We introduce a compact model for the bag of visual words approach, in which each class is modelled by a single histogram of visual words. This is in contrast to common nearest-neighbour approaches, which model each class by many histograms. After introducing segmentation algorithms based on these histogram models, we extend the Random Forest classifier and evaluate its feature selection properties as well as the suitability of certain low-level features for the semantic object segmentation task.
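The single-histogram idea can be illustrated with a small sketch. It assumes that visual words have already been extracted and quantised into integer ids, and it uses a chi-squared distance to compare histograms; the abstract does not say which distance is used, so that choice, along with all function names and the toy data, is hypothetical.

```python
from collections import Counter


def normalised_histogram(word_ids, vocab_size):
    """Return an L1-normalised histogram over visual-word ids."""
    counts = Counter(word_ids)
    total = len(word_ids) or 1  # avoid division by zero for empty regions
    return [counts.get(w, 0) / total for w in range(vocab_size)]


def fit_class_histograms(training_regions, vocab_size):
    """Pool the visual words of every labelled region into ONE histogram per class.

    training_regions: iterable of (class_label, list_of_word_ids) pairs.
    """
    pooled = {}
    for label, word_ids in training_regions:
        pooled.setdefault(label, []).extend(word_ids)
    return {label: normalised_histogram(ids, vocab_size)
            for label, ids in pooled.items()}


def chi_squared(p, q):
    """Chi-squared distance between two normalised histograms (assumed metric)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def classify_region(word_ids, class_histograms, vocab_size):
    """Assign the region to the class whose single histogram is closest."""
    h = normalised_histogram(word_ids, vocab_size)
    return min(class_histograms,
               key=lambda label: chi_squared(h, class_histograms[label]))


# Toy data: a vocabulary of 4 visual words and two labelled regions per class.
VOCAB_SIZE = 4
training = [
    ("grass", [0, 0, 1, 0]),
    ("grass", [0, 1, 1]),
    ("cow", [2, 3, 3]),
    ("cow", [3, 2, 2, 3]),
]
models = fit_class_histograms(training, VOCAB_SIZE)
prediction = classify_region([0, 1, 0, 0], models, VOCAB_SIZE)
```

However many training regions a class has, the model stores only one histogram for it, so classifying a region costs one comparison per class instead of one per training example.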

Most object recognition methods rely on labelled training images. For each object category to be recognised, the system is trained on a set of images containing instances of that category. The last part of this thesis focuses on the automatic creation of sets of images that contain a certain object class. The idea is to download an initial set of images from the Internet based on a search query (e.g. penguin). The downloaded images are then ranked using text-based features that exploit the information on the web pages where they appear. This ranking is used to automatically learn visual models for 18 object categories. We compare the performance of our system to previous work and show that it performs equally well without the need for explicit manual supervision.
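The two-stage idea (rank by text first, then learn a visual model from the top-ranked images) can be sketched as follows. This is only an illustration under strong assumptions: the text score is a plain keyword count over hypothetical metadata fields, and a nearest-centroid rule stands in for the visual classifier. The abstract specifies neither, and the image records and feature vectors are invented toy data.

```python
def text_score(query, metadata):
    """Count case-insensitive occurrences of the query in each text field."""
    q = query.lower()
    return sum(text.lower().count(q) for text in metadata.values())


def rank_by_text(query, images):
    """Order downloaded images by their text score, best first."""
    return sorted(images, key=lambda img: text_score(query, img["meta"]),
                  reverse=True)


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def train_visual_model(ranked, top_k):
    """Treat the top_k text-ranked images as noisy positives, the rest as negatives."""
    positives = centroid([img["feat"] for img in ranked[:top_k]])
    negatives = centroid([img["feat"] for img in ranked[top_k:]])
    return positives, negatives


def visual_score(model, feat):
    """Positive when the feature vector is closer to the positive centroid."""
    positives, negatives = model

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return sq_dist(feat, negatives) - sq_dist(feat, positives)


# Toy download results for the query "penguin" (all values are invented).
images = [
    {"id": "c", "meta": {"filename": "dsc001.jpg", "title": "Book cover"},
     "feat": [0.1, 0.9]},
    {"id": "b", "meta": {"filename": "img22.jpg", "title": "Penguin on ice"},
     "feat": [0.8, 0.2]},
    {"id": "a", "meta": {"filename": "penguin_01.jpg",
                         "title": "Emperor penguin colony"},
     "feat": [0.9, 0.1]},
    {"id": "d", "meta": {"filename": "cat.jpg", "title": "My cat"},
     "feat": [0.2, 0.8]},
]
ranked = rank_by_text("penguin", images)
model = train_visual_model(ranked, top_k=2)
```

No image is labelled by hand at any point: the text ranking supplies the (noisy) training labels, which is the sense in which the supervision is implicit.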

Biography: Florian Schroff is currently a fourth-year DPhil student in the Department of Engineering Science at the University of Oxford, funded by Microsoft Research through the European PhD Scholarship Programme. He is jointly supervised by Professor Andrew Zisserman and Antonio Criminisi at Microsoft Research Cambridge. Before joining the Visual Geometry Group (VGG) in Oxford, he worked as a researcher at the German Research Center for Artificial Intelligence in Kaiserslautern. He received his degree (Diploma) in computer science from the University of Karlsruhe at the end of 2004, where he worked with Professor H.-H. Nagel on camera calibration and focused on artificial intelligence, cryptography and algebra. In 2003 he received a Master of Science in computer science from the University of Massachusetts – Amherst, where he had started his studies under the Baden-Württemberg exchange scholarship in 2002.

This talk is part of the Dr Fabien Petitcolas's list series.


