BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//talks.cam.ac.uk//v3//EN
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:19701025T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
CATEGORIES:Isaac Newton Institute Seminar Series
SUMMARY:Stochastic Errors vs. Modeling Errors in Distance
Based Phylogenetic Reconstructions - Doerr\, D (Bi
elefeld)
DTSTART;TZID=Europe/London:20110624T102000
DTEND;TZID=Europe/London:20110624T104000
UID:TALK31882AThttp://talks.cam.ac.uk
URL:http://talks.cam.ac.uk/talk/index/31882
DESCRIPTION:Distance-based phylogenetic reconstruction methods
rely heavily on accurate pairwise distance estima
tes. There are two separate sources of error in th
is estimation process: \n\n(1) the relatively shor
t sequence alignments used to obtain distance esti
mates induce a "stochastic error" corresponding to
estimation of model parameters from finite data\;
\n\n(2) model misspecification leads to a "fixed
error" which does not depend on sequence length. \
n\nIt is common practice to assume some substituti
on model over the sequence data and use an additiv
e substitution rate function for that model when c
omputing pairwise distances. In the providential c
ase when the assumed model coincides with the true
model\, which is typically unknown\, the distance
estimates will not be afflicted with fixed error.
But even then\, there is no reason to a priori enf
orce a zero fixed error\, when this causes elevate
d rates of stochastic error\, especially in the ca
se of short sequence alignments.\n\nThis work chal
lenges this paradigm of "using the most additive d
istance function at any cost". We do this by study
ing the contribution and effect of both fixed and
stochastic error in distance estimation. We presen
t a formal framework for quantifying the fixed err
or associated with a specific distance function an
d a given phylogenetic tree in a homogeneous subst
itution model. As an example\, we study the behav
ior of the Jukes-Cantor distance formula in homoge
neous instances of Kimura's two-parameter substitu
tion model. The effects of fixed error are observe
d through analytic results and experiments on simu
lated data. In addition\, we compare the performan
ce of various distance functions on biological seq
uences. We evaluate reconstruction accuracy by com
paring the reconstructed trees to an independently
validated species tree. Our study indicates that
often enough simple distance functions outperform
more sophisticated functions\, despite the fact th
at the given sequence data appears to have poor fi
t to the substitution model they assume.\n\n
LOCATION:Seminar Room 1\, Newton Institute
CONTACT:Mustapha Amrani
END:VEVENT
END:VCALENDAR