Training Random Forests with Ambiguously Labeled Data

If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.

A growing number of computer vision applications rely on powerful machine learning algorithms. Learning is usually done with supervised algorithms, which demand large amounts of hand-labeled samples to yield accurate results. Although the number of digital images is exploding, collecting large amounts of labeled data can still be tedious, and even when labels exist they can be noisy or provided in a format that the learning method cannot exploit optimally, for example bounding-box annotations in images. This motivates the development and use of learning algorithms that can exploit both small amounts of labeled data and large amounts of unlabeled data, which are usually easy to obtain, and that additionally allow a certain amount of flexibility in the labeling.

In this talk, I will show how to use Random Forests (RFs) to tackle these challenges. RFs deliver state-of-the-art results in various applications. They are fast in both training and evaluation, are inherently multi-class, run on parallel architectures, and are robust to label noise. This makes them well suited to exploiting large amounts of unlabeled or ambiguously labeled samples. On the other hand, they demand large amounts of data to reach their full potential, which in turn motivates incorporating unlabeled samples into their training. In particular, I will present extensions of RFs to semi-supervised and multiple-instance learning, as well as to online learning, which is needed in many applications. Finally, I will present a new method that can benefit from unlabeled data even when the samples come from different distributions or are only weakly related to the actual task.

This talk is part of the Microsoft Research Cambridge, public talks series.

© 2006-2024 Talks.cam, University of Cambridge.