Nested Sampling for Motif Discovery

If you have a question about this talk, please contact Phil Cowans.

Many key processes in molecular biology, including transcription and alternative splicing, are regulated by short sequence motifs which serve as specific binding sites for protein factors. Building a comprehensive catalogue of such motifs is an important step towards understanding gene regulation. A number of methods have been developed which apply statistical and learning techniques to discover likely functional motifs from biological sequences. However, such techniques have generally been aimed at discovering one motif at a time from small amounts of sequence data, and they have generally suffered from limited sensitivity.

NestedMICA is a new motif-discovery program which aims to address these issues. It uses a mixture model, which allows many motifs to be discovered simultaneously in a single run. It also applies a new inference technique called Nested Sampling, which allows very complex model spaces to be explored in a principled manner and gives substantially better sensitivity than previously described heuristic methods.
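
For readers unfamiliar with Nested Sampling, the general idea can be illustrated on a toy problem. The sketch below is not NestedMICA's implementation and has nothing to do with motifs: it assumes a one-dimensional parameter with a uniform prior on [0, 1] and a narrow Gaussian likelihood, so that the evidence is close to 1 (log evidence close to 0). All names (`log_likelihood`, `nested_sampling`, `n_live`, `n_iter`) are hypothetical. Because the toy likelihood is symmetric and unimodal, the constrained prior is simply an interval, which is a shortcut that would not be available in a realistic model space.

```python
import math
import random
from functools import reduce

MU, SIGMA = 0.5, 0.05  # toy Gaussian likelihood (assumed for illustration)


def log_likelihood(theta):
    """Log of a normalised Gaussian density centred on MU."""
    z = (theta - MU) / SIGMA
    return -0.5 * z * z - math.log(SIGMA * math.sqrt(2.0 * math.pi))


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without overflow."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def nested_sampling(n_live=200, n_iter=2000, seed=0):
    """Estimate the log evidence for the toy problem."""
    rng = random.Random(seed)
    # Start with live points drawn from the uniform prior on [0, 1].
    live = [rng.random() for _ in range(n_live)]
    log_z = -math.inf   # running log evidence
    log_x_prev = 0.0    # log of enclosed prior mass, X_0 = 1

    for i in range(1, n_iter + 1):
        # The lowest-likelihood live point defines the next contour.
        worst = min(live, key=log_likelihood)
        log_l = log_likelihood(worst)

        # Standard approximation: prior mass shrinks as X_i = exp(-i / n_live).
        log_x = -i / n_live
        # Weight of this shell is X_{i-1} - X_i, computed in log space.
        log_w = log_x_prev + math.log1p(-math.exp(log_x - log_x_prev))
        log_z = log_add(log_z, log_l + log_w)
        log_x_prev = log_x

        # Replace the worst point with a prior draw of higher likelihood.
        # Toy shortcut: the region L > L_worst is the interval |theta - MU| < r.
        r = abs(worst - MU)
        lo, hi = max(0.0, MU - r), min(1.0, MU + r)
        live[live.index(worst)] = rng.uniform(lo, hi)

    # Add the contribution of the remaining live points.
    log_sum_l = reduce(log_add, (log_likelihood(t) for t in live))
    log_z = log_add(log_z, log_x_prev + log_sum_l - math.log(n_live))
    return log_z


if __name__ == "__main__":
    print("estimated log evidence:", nested_sampling())
```

In this sketch the estimate carries statistical noise of roughly the square root of (information / `n_live`), so increasing `n_live` tightens it at proportional cost. In a realistic setting the replacement step is the hard part, and would typically use some form of constrained Monte Carlo move rather than a closed-form interval.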

NestedMICA has been tested on several large sets of non-coding genomic sequence from the fruit fly, Drosophila melanogaster. In this talk, we present results which we believe represent a significant part of the fly's core regulatory vocabulary, and discuss strategies for analysing and validating large sets of candidate motifs.

This talk is part of the Inference Group series.

© 2006-2023, University of Cambridge.