Efficient multivariate entropy estimation via k-nearest neighbour distances

  • Tom Berrett, DPMMS/Statslab
  • Thursday 27 April 2017, 14:40-15:20
  • MR3, CMS.

If you have a question about this talk, please contact Jack Smith.

Many widely used statistical procedures, including methods for goodness-of-fit tests, feature selection and changepoint analysis, rely critically on the estimation of the entropy of a distribution. I will initially present new results on a commonly used generalisation of the estimator originally proposed by Kozachenko and Leonenko (1987), which is based on the k-nearest neighbour distances of a sample of independent and identically distributed random vectors. These results show that, in up to three dimensions and under regularity conditions, the estimator is efficient for certain choices of k, in the sense of achieving the local asymptotic minimax lower bound. However, they also show that in higher dimensions a non-trivial bias precludes its efficiency regardless of the choice of k. This motivates us to consider a new entropy estimator, formed as a weighted average of Kozachenko-Leonenko estimators for different values of k. A careful choice of weights enables us to reduce the bias of the first estimator and thus obtain an efficient estimator in arbitrary dimensions, given sufficient smoothness. Our results provide theoretical insight and have important methodological implications.
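
For readers unfamiliar with the estimator, the following is a minimal Python sketch of a k-nearest neighbour entropy estimate of the Kozachenko-Leonenko type, together with a weighted average over several values of k. It is an illustration under stated assumptions, not the speaker's implementation: it assumes the commonly stated form log(rho^d V_d (n-1)) - psi(k) averaged over the sample, where rho is the Euclidean distance from a point to its k-th nearest neighbour and V_d is the volume of the unit ball, and it assumes all sample points are distinct. The function names (`kl_entropy`, `weighted_kl_entropy`, and the helpers) are hypothetical, and the weights are supplied by the caller; the specific bias-reducing weights discussed in the talk are not reproduced here.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def digamma_int(k):
    """Digamma function at a positive integer k: -gamma + sum_{j<k} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, k))


def unit_ball_log_volume(d):
    """Log-volume of the d-dimensional Euclidean unit ball."""
    return 0.5 * d * math.log(math.pi) - math.lgamma(1.0 + 0.5 * d)


def knn_distances(points, k_max):
    """For each point, the sorted distances to its k_max nearest other points.

    Brute force, O(n^2 log n); fine for a sketch, too slow for large samples.
    """
    result = []
    for i, x in enumerate(points):
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        result.append(dists[:k_max])
    return result


def kl_entropy(points, k=1):
    """k-nearest neighbour entropy estimate (in nats), assumed form as above."""
    n, d = len(points), len(points[0])
    rho = knn_distances(points, k)
    const = unit_ball_log_volume(d) + math.log(n - 1) - digamma_int(k)
    return const + d * sum(math.log(r[k - 1]) for r in rho) / n


def weighted_kl_entropy(points, weights):
    """Weighted average of the estimates for k = 1, ..., len(weights).

    The weights are an assumption supplied by the caller and should sum to 1.
    """
    n, d = len(points), len(points[0])
    rho = knn_distances(points, len(weights))
    base = unit_ball_log_volume(d) + math.log(n - 1)
    total = 0.0
    for j, w in enumerate(weights, start=1):
        if w == 0:
            continue
        h_j = base - digamma_int(j) + d * sum(math.log(r[j - 1]) for r in rho) / n
        total += w * h_j
    return total


if __name__ == "__main__":
    random.seed(0)
    # 300 draws from a standard bivariate normal; true entropy is log(2*pi*e).
    sample = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(300)]
    print("k = 3 estimate:   ", kl_entropy(sample, k=3))
    print("weighted estimate:", weighted_kl_entropy(sample, [0.0, 0.5, 0.5]))
    print("true entropy:     ", math.log(2 * math.pi * math.e))
```

The brute-force neighbour search keeps the sketch dependency-free; in practice a spatial index such as a k-d tree would replace `knn_distances`.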

This talk is part of the DPMMS PhD student colloquium series.

© 2006-2024, University of Cambridge.