University of Cambridge > > Statistics > Decoding from Pooled Data: Information-Theoretic bounds and a Message-Passing Algorithm

Decoding from Pooled Data: Information-Theoretic bounds and a Message-Passing Algorithm

Add to your list(s) Download to your calendar using vCal

If you have a question about this talk, please contact Quentin Berthet.

Consider a population consisting of n individuals, each of whom has one of d types (e.g. their blood type, in which case d=4). We are allowed to query this database by specifying a subset of the population, and in response we observe a noiseless histogram (a d-dimensional vector of counts) of types of the pooled individuals. This measurement model arises in practical situations such as pooling of genetic data and may also be motivated by privacy considerations. We study this model in the “random dense” setting where each query involves a random subset of individuals of size proportional to n. We are interested in the number of queries one needs to unambiguously determine the type of each individual, both information-theoretically and in an algorithmically efficient way. We discuss the information-theoretic question and present a message-passing algorithm for recovering the signal from a minimal number of measurements and characterize its exact asymptotic behavior. The analysis reveals a gap between what is statistically possible and what is achieved by our algorithm.

This is joint work with Aaditya Ramdas, Florent Krzakala, Lenka Zdeborova, and Michael Jordan. Preprints: arxiv:1611.09981, arxiv:1702.02279

This talk is part of the Statistics series.

Tell a friend about this talk:

This talk is included in these lists:

Note that ex-directory lists are not shown.


© 2006-2023, University of Cambridge. Contact Us | Help and Documentation | Privacy and Publicity