Analysis by synthesis for interpretable image collection analysis

I will present our recent work on analyzing the content of image collections by learning a simple prototype-based model of images. I will start by introducing the idea and framework of Deep Transformation-Invariant image analysis in the case of image clustering [1], where I will show that a simple modification of the standard K-means algorithm can lead to state-of-the-art image clustering while computing distances in pixel space and remaining easy to interpret. I will then show how the idea can be extended to perform object recovery [2], decomposing every image in a collection into layers derived from a small set of image prototypes. This can be applied to real-world data, such as collections of Instagram images, and provides models and segmentations of repeated objects. Finally, I will explain how a similar idea can be used to perform single-view reconstruction from a categorical image collection without any supervision [3].

[1] Deep Transformation-Invariant Clustering, T. Monnier, T. Groueix, M. Aubry, NeurIPS 2020

[2] Unsupervised Layered Image Decomposition into Object Prototypes, T. Monnier, E. Vincent, J. Ponce, M. Aubry, ICCV 2021

[3] Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency, T. Monnier, M. Fisher, A. Efros, M. Aubry, ECCV 2022


Mathieu Aubry is a tenured researcher in the Imagine team of Ecole des Ponts ParisTech. His work is mainly focussed on Computer Vision and Deep Learning, and their intersection with Computer Graphics, Machine Learning, and Digital Humanities. His PhD on 3D shape representations, obtained in 2015 at ENS, was co-advised by Josef Sivic (INRIA) and Daniel Cremers (TUM). In 2015, he spent a year working as a postdoc with Alexei Efros at UC Berkeley.


