Machine Learning Reveals the Genetic Code Controlling Splicing

If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.

Abstract: Thirty years after the proposal of DNA, Roberts and Sharp discovered that DNA does not directly encode messenger RNA, but that a process called splicing assembles each mRNA from carefully selected DNA subsequences. Because of this, a gene can encode many different mRNAs, and which mRNAs are generated can depend on tissue type, age and disease. There are 22,000 human genes, but over 1,000,000 different mRNAs are produced by splicing. One gene encodes 38,000 mRNAs that are involved in wiring together neurons. Another gene encodes two mRNAs that determine the organism's sexual preference. Roberts and Sharp received the Nobel Prize for their work in 1993, but the genetic information responsible for controlling splicing has mostly remained a mystery. In the past 3 years it has become possible to detect mRNAs with sufficient resolution that researchers can attempt, for the first time, to infer such a "splicing code". In this talk, I'll describe a machine learning technique that we used to infer a splicing code that is explanatory as well as predictive. Its interpretation is consistent with known mechanisms, but also suggests new ones. The code achieves 93% prediction accuracy and was verified using different genes, different species and different experimental assays. Mutation of the identified genetic information leads to corresponding changes in splicing. In addition to describing these results, I'll talk about how the objective was formulated as a machine learning problem, how the need for human interpretability shaped the approach, and what was done to isolate causation from correlation.

This talk is part of the Microsoft Research Cambridge, general interest public talks series.



© 2006-2023, University of Cambridge.