
Bayesian Clustering with the Dirichlet-Process Prior


If you have a question about this talk, please contact Richard Samworth.

We consider the problem of clustering measurements collected from multiple replicates at multiple time points, with an unknown number of clusters. We propose a mixture random-effects model coupled with a Dirichlet-process prior. The mixture model formulation allows for probabilistic cluster assignments. The random-effects formulation decomposes the total variability in the data into components that are consistent with the experimental design. The Dirichlet-process prior induces a prior distribution on partitions and helps to estimate the number of clusters (or mixture components) from the data. We also tackle two challenges associated with methods based on the Dirichlet-process prior. One is efficient sampling, for which we develop a novel Metropolis-Hastings Markov chain Monte Carlo (MCMC) procedure. The other is efficient use of the MCMC samples in forming clusters, for which we propose a two-step posterior inference procedure that uses resampling and relabeling to estimate the posterior allocation probability matrix. The effectiveness of this model and sampling procedure is demonstrated on simulated data. We use this method to analyze time-course gene expression data from Drosophila cells to characterize the genome-wide temporal responses to Notch activation.
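
To illustrate how a Dirichlet-process prior induces a prior distribution on partitions, the following minimal sketch draws one partition from the Chinese restaurant process, a standard representation of that prior. It is not the model or the MCMC procedure presented in the talk; the function name `sample_crp_partition`, the concentration parameter `alpha`, and the choice of 50 items are assumptions made purely for illustration.

```python
import random


def sample_crp_partition(n_items, alpha, rng):
    """Draw cluster labels for n_items from a Chinese restaurant process.

    Hypothetical helper for illustration only: item i joins an existing
    cluster with probability proportional to that cluster's current size,
    or opens a new cluster with probability proportional to alpha.
    """
    labels = []          # labels[i] is the cluster index of item i
    cluster_sizes = []   # cluster_sizes[k] is the number of items in cluster k
    for _ in range(n_items):
        # Unnormalised weights: one per existing cluster, plus one for a new cluster.
        weights = cluster_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_sizes):
            cluster_sizes.append(1)   # open a new cluster
        else:
            cluster_sizes[k] += 1     # join an existing cluster
        labels.append(k)
    return labels, cluster_sizes


rng = random.Random(0)  # fixed seed so the draw is reproducible
labels, sizes = sample_crp_partition(n_items=50, alpha=1.0, rng=rng)
print("number of clusters:", len(sizes))
print("cluster sizes:", sizes)
```

The number of clusters is not fixed in advance: it is a random outcome of the draw, and larger values of `alpha` tend to produce more clusters. This is the sense in which the prior lets the number of mixture components be estimated from the data.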

Fraley, C and Raftery, AE (2002) Model-based clustering, discriminant analysis, and density estimation. JASA 97, 611-631.

Heard, N et al. (2006) A quantitative study of gene regulation involved in the immune response of Anopheline mosquitoes: an application of Bayesian hierarchical clustering of curves. JASA 101, 18-29.

Neal, RM (2000) Markov chain sampling methods for Dirichlet process mixture models. J. Comput. Graph. Stat. 9, 249-265.

This talk is part of the Statistics Reading Group series.



© 2006-2019, University of Cambridge.