
Novel Methods for Data Integration in Bioinformatics


If you have a question about this talk, please contact Andrew Teschendorff.

In this talk we will outline some novel methods for data integration for supervised and unsupervised learning. The principal applications will be to bioinformatics and cancer informatics.

In the first part of the talk we briefly review previous work on the use of Bayesian unsupervised and semi-supervised methods to determine structure in cancer datasets: specifically we consider expression array datasets for breast cancer and prostate cancer (work with Luke Carrivick, Mark Girolami and others).

We then extend these approaches to the joint unsupervised modelling of two types of data which are assumed to be functionally dependent. The model we propose is loosely based on correspondence Latent Dirichlet Allocation (LDA), and we illustrate its performance on a dataset consisting of breast cancer microRNA and expression array data, with both types of data derived from the same patients (work with Phaedra Agius, Yiming Ying and others).

Next we consider supervised learning with multiple types of data (work with Yiming Ying and others). A classifier based on multiple types of input data is potentially more accurate than one that uses only a single type. Such algorithms are applicable to many types of problem in bioinformatics, ranging from network inference to protein fold prediction. We present several new approaches based on probabilistic multi-kernel multi-class algorithms and consider several application domains for them, including protein fold prediction on a dataset with 27 fold classes, in which the proposed method outperforms the closest rival by 4 percentage points (work with Yiming Ying).
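
To make the multi-kernel idea concrete, the following is a minimal sketch and not the probabilistic method presented in the talk. It assumes two synthetic data types measured on the same samples, builds one Gaussian kernel per data type, mixes them with fixed convex weights, and classifies with a simple one-vs-rest kernel ridge rule. All function names, the toy data, the kernel widths and the regularisation value are illustrative assumptions; in a real multi-kernel method the mixing weights would typically be learned rather than fixed.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def combine_kernels(kernels, weights):
    """Convex combination of kernel matrices, one matrix per data type."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the weights sum to one
    return sum(wi * K for wi, K in zip(w, kernels))


def fit_kernel_ridge(K_train, y, n_classes, lam=1.0):
    """One-vs-rest kernel ridge fit: solve (K + lam * I) alpha = Y."""
    Y = np.eye(n_classes)[y]  # one-hot class targets
    return np.linalg.solve(K_train + lam * np.eye(len(y)), Y)


def predict(K_test_train, alpha):
    """Assign each test sample to the class with the largest score."""
    return (K_test_train @ alpha).argmax(axis=1)


def make_toy_views(n_per_class=30, seed=0):
    """Hypothetical data: each view separates only one of three classes."""
    rng = np.random.default_rng(seed)
    y = np.repeat(np.arange(3), n_per_class)
    X1 = rng.normal(size=(len(y), 5))
    X1[y == 0, 0] += 5.0  # view 1 only distinguishes class 0
    X2 = rng.normal(size=(len(y), 8))
    X2[y == 2, 0] += 5.0  # view 2 only distinguishes class 2
    return X1, X2, y


def demo(seed=0):
    """Test accuracy using view 1 alone, view 2 alone, and both combined."""
    X1, X2, y = make_toy_views(seed=seed)
    idx = np.random.default_rng(seed + 1).permutation(len(y))
    tr, te = idx[:60], idx[60:]
    settings = {"view1": (1, 0), "view2": (0, 1), "both": (0.5, 0.5)}
    results = {}
    for name, w in settings.items():
        K_tr = combine_kernels(
            [rbf_kernel(X1[tr], X1[tr], 1 / 5), rbf_kernel(X2[tr], X2[tr], 1 / 8)], w
        )
        K_te = combine_kernels(
            [rbf_kernel(X1[te], X1[tr], 1 / 5), rbf_kernel(X2[te], X2[tr], 1 / 8)], w
        )
        alpha = fit_kernel_ridge(K_tr, y[tr], n_classes=3)
        results[name] = float((predict(K_te, alpha) == y[te]).mean())
    return results


if __name__ == "__main__":
    print(demo())
```

In this toy setting neither view on its own can tell all three classes apart, so the combined kernel is the only one of the three settings that has access to the full class structure. This mirrors, in a deliberately simplified form, the motivation for classifiers built on several types of input data.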

This talk is part of the Bioinformatics joint CRI-BSU series.



© 2006-2024, University of Cambridge.