Predictive Modeling Approaches to Gene Regulation

If you have a question about this talk, please contact rbg24.

Transcription factors regulate gene expression by binding to target DNA sequences. With the help of motif finding, we can predict transcription factor binding sites and thus decode the regulatory network. The adoption of large-scale biological data generation techniques such as mRNA microarrays has enabled researchers to tackle the gene regulation problem in a global way. We will describe some computational and statistical strategies developed by our group for combining gene upstream sequence information with mRNA expression microarray data to dissect the gene regulatory network. We term methods of this type “predictive modeling approaches” because they take the expression data (or other gene-level quantitative measurements) as the response variable and attempt to use sequence information to predict the response. The main advantages of these approaches are: (a) they are much more sensitive and specific than sequence-only motif-discovery approaches when the motif signal is weak; (b) many advanced statistical learning tools can be used, and various sophisticated dimension reduction and variable selection techniques can be applied, under this coherent framework; (c) the discovered motifs or other sequence patterns can be “statistically” confirmed by cross-validation instead of relying purely on previous biological knowledge or further follow-up experiments.

We first demonstrate a re-analysis of the dataset from Beer and Tavazoie (2004), which serves to warn against “over-interpretation” when a predictive modeling approach is used. Then we describe some successful applications of the methods, such as statistical analyses of histone modification and nucleosome occupancy data. If time permits, I will also discuss some related statistical problems.
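
As a rough illustration of the predictive modeling setup described in the abstract, the following minimal sketch treats expression as the response and k-mer counts in upstream sequences as candidate predictors, then scores each candidate by cross-validated prediction error. It is not the speakers' method: the data are synthetic, and the planted motif `TGACG`, the k-mer length, the effect size, the noise level and the one-predictor linear model are all assumptions made for the example.

```python
import random

random.seed(0)                  # fixed seed so the sketch is repeatable
K = 5                           # k-mer length (assumed)
MOTIF = "TGACG"                 # hypothetical planted motif, not from the talk
N_GENES, SEQ_LEN = 100, 100     # synthetic data set size (assumed)


def count_kmers(seq, k):
    """Count every k-mer in seq using overlapping windows."""
    counts = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        counts[kmer] = counts.get(kmer, 0) + 1
    return counts


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:                # constant predictor: fall back to the mean
        return mean_y, 0.0
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def cv_error(x, y, folds=5):
    """Mean squared prediction error under k-fold cross-validation."""
    n = len(x)
    total = 0.0
    for f in range(folds):
        train = [i for i in range(n) if i % folds != f]
        test = [i for i in range(n) if i % folds == f]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        total += sum((y[i] - (a + b * x[i])) ** 2 for i in test)
    return total / n


# Synthetic "upstream sequences": about half carry the planted motif, and
# expression rises with the motif count plus Gaussian noise.
kmer_tables, expression = [], []
for _ in range(N_GENES):
    seq = "".join(random.choice("ACGT") for _ in range(SEQ_LEN))
    if random.random() < 0.5:
        pos = random.randrange(SEQ_LEN - K + 1)
        seq = seq[:pos] + MOTIF + seq[pos + K:]
    table = count_kmers(seq, K)
    kmer_tables.append(table)
    expression.append(2.0 * table.get(MOTIF, 0) + random.gauss(0.0, 0.5))

# Score every observed k-mer as a single predictor of expression.
candidates = sorted({kmer for table in kmer_tables for kmer in table})
scores = {
    kmer: cv_error([t.get(kmer, 0) for t in kmer_tables], expression)
    for kmer in candidates
}
baseline = cv_error([0] * N_GENES, expression)   # intercept-only model
best = min(scores, key=scores.get)

print("best k-mer:", best)
print("its cross-validated error:", round(scores[best], 3))
print("intercept-only error:", round(baseline, 3))
```

Comparing each k-mer's held-out error with the intercept-only baseline is one simple way to read the "statistical confirmation by cross-validation" idea; a real analysis would use many predictors jointly and the variable selection tools the abstract mentions.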

This talk is part of the Statistics series.


© 2006-2019 Talks.cam, University of Cambridge.