Bayesian Semi-supervised Multicategory Classification under Nonparanormality
RCL - Representing, calibrating & leveraging prediction uncertainty from statistics to machine learning

Semi-supervised learning combines supervised and unsupervised learning, using both labeled and unlabeled data to train statistical models for classification and regression tasks. This paper addresses semi-supervised binary classification, assuming that the underlying data have only a few labeled observations in each class. Several semi-supervised classification methods have been developed outside the Bayesian domain, but most Bayesian work relies on Gaussian mixture models, and the assumption that the subpopulations are Gaussian may be unrealistic in some situations.

We generalize the data-generating process to the nonparanormal setting: the observations result from an unknown component-wise monotone increasing transformation applied to a hidden layer of multivariate normal latent variables. We assign a prior distribution to the transformation functions using B-splines, which naturally maintain monotonicity and satisfy the required identifiability constraints.

We use a Gibbs sampler to coordinate draws from the posterior distribution of four objects: the missing labels, the coefficients of the B-spline expansions of the transformation functions, the parameters of the multivariate normal distributions of the component populations, and the population mixing proportions. The posterior draws of these objects use the Bayes formula for categories, Hamiltonian Monte Carlo, normal-normal conjugacy, and beta-binomial conjugacy, respectively. Using a low-density separation assumption, we tune the number of terms in the B-spline expansions. We evaluate the performance of the proposed method through extensive simulation studies.
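To make the model concrete, here is a minimal sketch of two of the ingredients above: generating nonparanormal data (component-wise monotone transforms of a hidden Gaussian layer) and the "Bayes formula for categories" step that assigns posterior label probabilities given the latent variables and current parameters. The specific transforms, means, and covariance below are arbitrary choices for illustration; the actual method models the transforms with constrained B-splines and updates all parameters within a Gibbs sweep.

```python
import numpy as np

rng = np.random.default_rng(0)

# Component-wise monotone increasing transforms. These two are arbitrary
# stand-ins; the talk models them with B-spline expansions subject to
# identifiability constraints.
transforms = [np.tanh, lambda z: z + z**3 / 3.0]

def sample_nonparanormal(n, mean, cov, rng):
    """Observations = monotone transforms applied to a hidden Gaussian layer."""
    z = rng.multivariate_normal(mean, cov, size=n)
    x = np.column_stack([f(z[:, j]) for j, f in enumerate(transforms)])
    return x, z

def mvn_logpdf(z, mean, cov):
    """Row-wise log-density of a multivariate normal."""
    d = z - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = np.einsum("ij,jk,ik->i", d, np.linalg.inv(cov), d)
    return -0.5 * (quad + logdet + len(mean) * np.log(2.0 * np.pi))

def label_posterior(z, means, cov, pi):
    """Bayes formula for categories: P(label = k | latent z, parameters)."""
    logp = np.stack([np.log(pi[k]) + mvn_logpdf(z, means[k], cov)
                     for k in range(len(means))])
    logp -= logp.max(axis=0)  # stabilise before exponentiating
    p = np.exp(logp)
    return p / p.sum(axis=0)

# Two component populations with a shared covariance (a simplification of
# the model, in which each component has its own normal parameters).
means = [np.array([-1.5, -1.5]), np.array([1.5, 1.5])]
cov = np.array([[1.0, 0.3], [0.3, 1.0]])
pi = np.array([0.5, 0.5])  # mixing proportions

x, z = sample_nonparanormal(200, means[0], cov, rng)
probs = label_posterior(z, means, cov, pi)  # columns sum to 1 over labels
```

In the full Gibbs sampler, a draw from `probs` imputes each missing label, after which the B-spline coefficients, normal parameters, and mixing proportions are updated in turn (by Hamiltonian Monte Carlo, normal-normal conjugacy, and beta-binomial conjugacy, respectively).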
We conclude that the proposed method yields low classification error rates, even when the nonparanormality assumption is violated, and outperforms many state-of-the-art semi-supervised machine learning techniques. The method also performs well on several benchmark binary classification datasets.

This talk is part of the Isaac Newton Institute Seminar Series.