Efficient multi-task Gaussian process models for genome-wide association studies

If you have a question about this talk, please contact Carl Edward Rasmussen.

Population-level data, where genotype and phenotype data are available for large samples, have enabled genome-wide association studies (GWAS), both in humans and in a wide range of model organisms. GWAS present several critical analysis challenges that current approaches address only in isolation. First, confounding factors such as population structure result in non-IID sample structure. Second, for many complex traits, genetic effects can be weak and dispersed across a large number of genetic features. Finally, individual phenotypes can rarely be considered independent, so it is both important and beneficial to model the correlation structure between them.

In this talk, I will present approaches based on multi-task Gaussian processes that comprehensively address the challenges above. The method enables testing for association between sets of genetic features and multiple (correlated) phenotypes while simultaneously accounting for non-IID sample structure in the data. I will discuss both the modeling aspects and alternative scalable exact and approximate inference schemes for applications to large datasets. Finally, I will present applications to real data with thousands of samples and tens of traits, where we find that our method outperforms established methods in GWAS.
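
The abstract does not spell out the model, so the following is only an illustrative sketch of one common way to write a multi-task Gaussian process over samples and traits, not the speaker's method. It assumes a Kronecker-structured covariance: a sample-by-sample kinship matrix `K` (built here from genotypes with a simple linear kernel) captures non-IID sample structure, a trait-by-trait matrix `C` captures correlation between phenotypes, and `Sigma` is a trait-level noise covariance. All function and variable names are hypothetical, and the likelihood is computed naively on the full covariance, which is not one of the scalable schemes the talk refers to.

```python
import numpy as np


def kinship(X):
    """Linear kinship matrix (N x N) from a genotype matrix X (N samples x S SNPs).

    Assumption: SNPs are standardized column-wise before taking inner products.
    """
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    return Xs @ Xs.T / X.shape[1]


def multitask_cov(K, C, Sigma):
    """Covariance of vec(Y) for an N x T phenotype matrix Y (column-stacked).

    Assumed form: C kron K (genetic signal) + Sigma kron I_N (noise).
    """
    N = K.shape[0]
    return np.kron(C, K) + np.kron(Sigma, np.eye(N))


def log_marginal_likelihood(Y, K, C, Sigma):
    """Zero-mean Gaussian log marginal likelihood of Y under the covariance above.

    Naive dense computation via a Cholesky factor, cubic in N * T.
    """
    N, T = Y.shape
    y = Y.flatten(order="F")            # stack the columns (traits) of Y
    V = multitask_cov(K, C, Sigma)
    L = np.linalg.cholesky(V)           # V = L @ L.T
    alpha = np.linalg.solve(L, y)       # L^{-1} y, so alpha @ alpha = y^T V^{-1} y
    log_det_half = np.log(np.diag(L)).sum()  # equals 0.5 * log det V
    return -0.5 * alpha @ alpha - log_det_half - 0.5 * N * T * np.log(2 * np.pi)


# Toy usage with synthetic data (all values are made up for illustration).
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 3, size=(30, 100)).astype(float)  # genotypes coded 0/1/2
Y_demo = rng.normal(size=(30, 2))                          # two phenotypes
C_demo = np.array([[1.0, 0.6], [0.6, 1.0]])                # trait covariance
Sigma_demo = 0.5 * np.eye(2)                               # noise covariance
print(log_marginal_likelihood(Y_demo, kinship(X_demo), C_demo, Sigma_demo))
```

In a sketch like this, an association test for a set of genetic features could compare the likelihood with and without an additional covariance term built from those features. The dense covariance has size (N·T) × (N·T), which is why scalable inference matters for thousands of samples and tens of traits.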

This talk is part of the Machine Learning @ CUED series.

