Model-based base-calling and de novo error correction algorithms for short-read sequencing

Song, Y (UC Berkeley)
Tuesday 13 July 2010, 17:00-17:30
Seminar Room 1, Newton Institute.

If you have a question about this talk, please contact Mustapha Amrani.

Statistical Challenges Arising from Genome Resequencing

An important computational challenge raised by recent advances in sequencing technology is to develop efficient methods that extract accurate sequence information from raw instrument data. In this talk, I will describe two algorithms that significantly improve the accuracy of short-read sequence data, particularly in the later cycles of a sequencing run. First, I will describe a novel model-based base-calling algorithm for the Illumina sequencing platform. Because it is founded on the tools of statistical learning, our approach is flexible enough to incorporate various features of the sequencing process; in particular, it can easily incorporate cycle-dependent parameters and model residual effects. I will then describe an efficient algorithm for correcting base-call errors. This algorithm does not require a reference genome, and it significantly outperforms previous error correction algorithms under various realistic settings. Finally, I will demonstrate how the improved data quality resulting from our algorithms may facilitate de novo assembly and SNP calling.

This talk is part of the Isaac Newton Institute Seminar Series.
