
The Automatic Statistician - an AI for Data Science


If you have a question about this talk, please contact David Greaves.

We live in an era of abundant data, and there is an increasing need for methods to automate data analysis and statistics. I will describe the “Automatic Statistician” (http://www.automaticstatistician.com/), a project which aims to automate the exploratory analysis and modelling of data. Our approach starts by defining a large space of related probabilistic models via a grammar over models, and then uses Bayesian marginal likelihood computations to search over this space for one or a few good models of the data. The aim is to find models which both have good predictive performance and are somewhat interpretable. Our initial work has focused on learning unknown nonparametric regression functions and on learning models of time series data, both using Gaussian processes. Once a good model has been found, the Automatic Statistician generates a natural language summary of the analysis, producing a 10-15 page report with plots and tables. I will discuss challenges such as how to trade off predictive performance and interpretability, how to translate complex statistical concepts into natural language text that is understandable by a numerate non-statistician, and how to integrate model checking.
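To make the idea of a grammar over models concrete, the following is a minimal sketch in Python, not the project's actual implementation. It assumes three base Gaussian process kernels (squared exponential, periodic, linear) combined with + and ×, fixed hyperparameters and noise level rather than optimised ones, and a simple greedy search; the abstract does not specify the search strategy, and all names here (gram, log_marginal_likelihood, greedy_search) are hypothetical.

```python
import numpy as np

# Base kernels with fixed, illustrative hyperparameters (assumption:
# a real system would optimise these for every candidate model).
def se(x1, x2, ell=1.0):
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def per(x1, x2, period=1.0, ell=1.0):
    d = x1[:, None] - x2[None, :]
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / ell ** 2)

def lin(x1, x2):
    return x1[:, None] * x2[None, :]

BASE = {"SE": se, "PER": per, "LIN": lin}

# A model is either a base kernel name or a tuple (op, left, right),
# where op is "+" or "*". This is the toy grammar.
def gram(expr, x):
    """Build the Gram matrix of a kernel expression on inputs x."""
    if isinstance(expr, str):
        return BASE[expr](x, x)
    op, left, right = expr
    k_left, k_right = gram(left, x), gram(right, x)
    return k_left + k_right if op == "+" else k_left * k_right

def log_marginal_likelihood(expr, x, y, noise=0.1):
    """Log marginal likelihood of a zero-mean GP with Gaussian noise."""
    n = len(x)
    K = gram(expr, x) + noise ** 2 * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * n * np.log(2.0 * np.pi))

def expand(expr):
    """Grammar step: extend an expression by adding or multiplying a base kernel."""
    return [(op, expr, b) for op in ("+", "*") for b in BASE]

def greedy_search(x, y, depth=2):
    """Start from the best base kernel; keep expanding while the score improves."""
    score = lambda e: log_marginal_likelihood(e, x, y)
    best = max(BASE, key=score)
    for _ in range(depth):
        challenger = max(expand(best), key=score)
        if score(challenger) <= score(best):
            break
        best = challenger
    return best, score(best)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 4.0, 40)
    # Synthetic data: a periodic component plus a linear trend.
    y = np.sin(2 * np.pi * x) + 0.5 * x + 0.1 * rng.standard_normal(40)
    print(greedy_search(x, y))
```

Because each expression is a readable composition of a few base kernels, the selected structure can in principle be described component by component, which is what makes this kind of model space a plausible starting point for generated reports.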

This is joint work with James Lloyd, David Duvenaud, Roger Grosse and Josh Tenenbaum.

This talk is part of the Computer Laboratory Wednesday Seminars series.

