Stochastic Errors vs. Modeling Errors in Distance Based Phylogenetic Reconstructions
If you have a question about this talk, please contact Mustapha Amrani.

Phylogenetics

Distance-based phylogenetic reconstruction methods rely heavily on accurate pairwise distance estimates. There are two separate sources of error in this estimation process: (1) the relatively short sequence alignments used to obtain distance estimates induce a “stochastic error” corresponding to estimation of model parameters from finite data; (2) model misspecification leads to a “fixed error” which does not depend on sequence length. It is common practice to assume some substitution model over the sequence data and use an additive substitution rate function for that model when computing pairwise distances. In the providential case when the assumed model coincides with the true model, which is typically unknown, the distance estimates will not be afflicted with fixed error. But even then, there is no reason to enforce a zero fixed error a priori when doing so causes elevated rates of stochastic error, especially in the case of short sequence alignments.

This work challenges this paradigm of “using the most additive distance function at any cost”. We do this by studying the contribution and effect of both fixed and stochastic error in distance estimation. We present a formal framework for quantifying the fixed error associated with a specific distance function and a given phylogenetic tree in a homogeneous substitution model. As an example, we study the behavior of the Jukes-Cantor distance formula in homogeneous instances of Kimura’s two parameter substitution model. The effects of fixed error are observed through analytic results and experiments on simulated data. In addition, we compare the performance of various distance functions on biological sequences. We evaluate reconstruction accuracy by comparing the reconstructed trees to an independently validated species tree.
Our study indicates that simple distance functions often outperform more sophisticated ones, even though the given sequence data appear to be a poor fit for the substitution model they assume.

This talk is part of the Isaac Newton Institute Seminar Series.
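
The abstract's example of applying the Jukes-Cantor formula to data generated under Kimura's two parameter model can be illustrated for a single pair of sequences. The sketch below is not the formal framework from the talk, which concerns a whole tree; it only computes the pairwise bias that remains with infinitely long sequences, using the standard closed forms for the two models. The function names and the parameterisation by a true distance `d` and a rate ratio `kappa` (transition rate divided by the rate of each transversion type) are assumptions made for illustration.

```python
import math


def k2p_expected_differences(d, kappa):
    """Expected transition (P) and transversion (Q) difference proportions
    under Kimura's two parameter model, for a true distance d (expected
    substitutions per site) and rate ratio kappa = alpha / beta."""
    beta_t = d / (kappa + 2.0)   # d = (alpha + 2 * beta) * t
    alpha_t = kappa * beta_t
    p_ts = 0.25 + 0.25 * math.exp(-4.0 * beta_t) - 0.5 * math.exp(-2.0 * (alpha_t + beta_t))
    q_tv = 0.5 - 0.5 * math.exp(-4.0 * beta_t)
    return p_ts, q_tv


def jc_distance(p):
    """Jukes-Cantor distance from the total proportion of differing sites."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def k2p_distance(p_ts, q_tv):
    """Kimura two parameter distance from transition and transversion proportions."""
    return -0.5 * math.log(1.0 - 2.0 * p_ts - q_tv) - 0.25 * math.log(1.0 - 2.0 * q_tv)


def jc_pairwise_bias(d, kappa):
    """Jukes-Cantor estimate minus the true distance, evaluated at the expected
    difference proportions (that is, with no stochastic error)."""
    p_ts, q_tv = k2p_expected_differences(d, kappa)
    return jc_distance(p_ts + q_tv) - d


if __name__ == "__main__":
    for kappa in (1.0, 2.0, 4.0, 10.0):
        print(kappa, jc_pairwise_bias(0.5, kappa))
```

With `kappa = 1` the two models coincide and the bias vanishes; for other values the Jukes-Cantor estimate falls below the true distance, and the gap does not shrink as sequence length grows. The stochastic error discussed in the abstract is a separate effect that would require simulating finite alignments.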