An introduction to counts-of-counts data
If you have a question about this talk, please contact Samantha Noel.

Meetings are planned to take place in person. Seminars are principally for MPhil students. Please email the administrator should you wish to attend as a guest.

Counts-of-counts data arise in many areas of biology and medicine, and have been studied by statisticians since the 1940s. One of the first examples, discussed by R. A. Fisher and collaborators in 1943 [1], concerns estimation of the number of unobserved species based on summary counts of the number of species observed once, twice, … in a sample of specimens. The data are summarized by the numbers C1, C2, … of species represented once, twice, … in a sample of size N = C1 + 2 C2 + 3 C3 + … containing S = C1 + C2 + … species; the vector C = (C1, C2, …) gives the counts-of-counts. Other examples include the frequencies of the distinct alleles in a human genetics sample, the counts of distinct variants of the SARS-CoV-2 S protein obtained from consensus sequencing experiments, counts of sizes of components in certain combinatorial structures [2], and counts of the numbers of SNVs arising in one cell, two cells, … in a cancer sequencing experiment.

In this talk I will outline some of the stochastic models used to describe the distribution of C, and some of the inferential issues that arise in estimating the parameters of these models. I will touch on the celebrated Ewens Sampling Formula [3] and Fisher’s multiple sampling problem concerning the variance expected between values of S in samples taken from the same population [4]. Variants of birth-death-immigration processes can be used, for example when different variants grow at different rates. The classical Yule process with immigration can be used to derive some of the combinatorial results in a simple way, through a probabilistic trick known as embedding.

References
[1] Fisher RA, Corbet AS & Williams CB. J Animal Ecology, 12, 1943
[2] Arratia R, Barbour AD & Tavaré S. Logarithmic Combinatorial Structures, EMS, 2002
[3] Ewens WJ. Theoret Popul Biol, 3, 1972
[4] Da Silva P, Jamshidpey A, McCullagh P & Tavaré S. Bernoulli, in press, 2022

This talk is part of the Computational and Systems Biology Seminar Series 2023 - 24 series.
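To make the counts-of-counts notation in the abstract concrete, here is a minimal Python sketch. It computes C from a list of specimen labels and checks the identities for N and S. It also evaluates the Ewens Sampling Formula in its standard textbook form, P(C = c) = n! / (θ(θ+1)…(θ+n−1)) × ∏ (θ/j)^cj / cj!. The abstract does not state this formula, so it is included as background and not as the speaker's own presentation. The function names, the parameter name theta, and the six-specimen sample are illustrative assumptions.

```python
from collections import Counter
from math import factorial, prod


def counts_of_counts(specimens):
    """Return {j: C_j}, where C_j is the number of species seen exactly j times."""
    per_species = Counter(specimens)            # species label -> number of specimens
    return dict(Counter(per_species.values()))  # j -> C_j


def sample_size(c):
    """N = C1 + 2 C2 + 3 C3 + ..."""
    return sum(j * cj for j, cj in c.items())


def species_seen(c):
    """S = C1 + C2 + ..."""
    return sum(c.values())


def ewens_probability(c, theta):
    """Probability of counts-of-counts c under the Ewens Sampling Formula.

    theta > 0 is the single parameter of the formula (an assumed name here).
    """
    n = sample_size(c)
    rising = prod(theta + i for i in range(n))  # theta (theta+1) ... (theta+n-1)
    weight = prod((theta / j) ** cj / factorial(cj) for j, cj in c.items())
    return factorial(n) / rising * weight


# Hypothetical sample: six specimens drawn from three species.
specimens = ["a", "b", "b", "c", "c", "c"]
c = counts_of_counts(specimens)
print(c, sample_size(c), species_seen(c))
```

For this sample one species is seen once, one twice and one three times, so C1 = C2 = C3 = 1, N = 6 and S = 3. As a sanity check on the formula, the two possible configurations for n = 2 have probabilities that sum to one for any theta.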