Training Random Forests with Ambiguously Labeled Data

Christian Leistner
Thursday 07 April 2011, 10:30-11:30
Small lecture theatre, Microsoft Research Ltd, 7 J J Thomson Avenue (Off Madingley Road), Cambridge.

If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.

An increasing number of computer vision applications rely on powerful machine learning algorithms. Learning is usually supervised, which demands large amounts of hand-labeled samples to yield accurate results. Although the number of digital images is exploding, collecting large amounts of labeled data remains tedious, and existing labels can be noisy or provided in a form that the learning method cannot exploit optimally; bounding-box annotations in images are one example. This motivates the development and use of learning algorithms that can exploit both small amounts of labeled data and large amounts of unlabeled data, which are usually easy to obtain, and that additionally allow a certain amount of flexibility in the labeling.

In this talk, I will show how to use Random Forests (RFs) to tackle these challenges. RFs deliver state-of-the-art results in various applications. They are fast in both training and evaluation, are inherently multi-class, run on parallel architectures, and are robust to label noise. This makes them natural candidates for exploiting large amounts of unlabeled or ambiguously labeled samples. On the other hand, they demand large amounts of data to reach their full potential, which in turn motivates incorporating unlabeled samples into their training. In particular, I will present extensions of RFs to semi-supervised and multiple-instance learning, as well as to online learning, which is needed in many applications. Finally, I will present a new method that can benefit from unlabeled data even when the samples come from different distributions or are only weakly related to the actual task.

This talk is part of the Microsoft Research Cambridge, public talks series.
