Training Random Forests with Ambiguously Labeled Data
If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.

An increasing number of computer vision applications rely on powerful machine learning algorithms. Learning is usually done with supervised algorithms, which demand large amounts of hand-labeled samples in order to yield accurate results. Although the number of digital images is exploding, collecting large amounts of labeled data can still be tedious, and even when labels exist they can be noisy or provided in a form that the learning method cannot exploit optimally (consider bounding box annotations in images). This motivates the development and use of learning algorithms that can exploit both small amounts of labeled data and large amounts of unlabeled data, which are usually easy to obtain, and that additionally allow a certain amount of flexibility in the labeling.

In this talk, I will show how to use Random Forests (RFs) to tackle these challenges. RFs deliver state-of-the-art results in various applications. They are fast in both training and evaluation, are inherently multi-class, run on parallel architectures, and are robust to label noise. This makes them strong candidates for exploiting large amounts of unlabeled or ambiguously labeled samples. On the other hand, they demand large amounts of data to reach their full potential, which in turn motivates incorporating unlabeled samples into their training. In particular, I will present extensions of RFs to semi-supervised and multiple-instance learning as well as to online learning, which is needed in many applications. Finally, I will present a new method that is able to benefit from unlabeled data even when the samples come from different distributions or are only weakly related to the actual task.

This talk is part of the Microsoft Research Cambridge, public talks series.