Model-based base-calling and de novo error correction algorithms for short-read sequencing

If you have a question about this talk, please contact Mustapha Amrani.

Statistical Challenges Arising from Genome Resequencing

An important computational challenge associated with recent advances in sequencing technology is to develop efficient methods that can extract accurate sequence information from raw instrument data. In this talk, I will describe two algorithms that significantly improve the accuracy of short-read sequence data, particularly in the later cycles of a sequencing run. First, I will describe a novel model-based base-calling algorithm for the Illumina sequencing platform. Because it is built on the tools of statistical learning, our approach is flexible enough to incorporate various features of the sequencing process. In particular, it can easily incorporate cycle-dependent parameters and model residual effects. I will then describe an efficient algorithm for correcting base-call errors. Our algorithm does not require a reference genome, and it significantly outperforms previous error correction algorithms under various realistic settings. Finally, I will demonstrate how the improved data quality resulting from our algorithms may facilitate de novo assembly and SNP calling.
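The abstract does not say how the error correction algorithm works. As a generic illustration of what correcting base-call errors without a reference genome can look like, here is a minimal sketch of k-mer counting correction. This is a common textbook technique, not necessarily the speaker's method. The function names, the k-mer length `k`, and the `min_count` threshold are all hypothetical choices made for this example.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def correct_read(read, counts, k, min_count=2):
    """Replace rare k-mers by a frequent k-mer one substitution away.

    Assumption for this sketch: a k-mer seen fewer than `min_count`
    times probably contains a base-call error, while frequent k-mers
    reflect the true sequence. No reference genome is consulted.
    """
    bases = "ACGT"
    read = list(read)
    for i in range(len(read) - k + 1):
        kmer = "".join(read[i:i + k])
        if counts[kmer] >= min_count:
            continue  # k-mer is well supported, leave it alone
        best = None  # (support, offset within k-mer, replacement base)
        for j in range(k):
            for b in bases:
                if b == kmer[j]:
                    continue
                candidate = kmer[:j] + b + kmer[j + 1:]
                support = counts[candidate]
                if support >= min_count and (best is None or support > best[0]):
                    best = (support, j, b)
        if best is not None:
            # Apply the best-supported single-base substitution in place.
            read[i + best[1]] = best[2]
    return "".join(read)
```

In use, one would call `count_kmers` once over all reads and then pass each read through `correct_read`. A practical corrector would also weigh base quality scores and handle repeats and uneven coverage, all of which this sketch ignores.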

This talk is part of the Isaac Newton Institute Seminar Series.
