Frank-Wolfe optimization insights in machine learning
If you have a question about this talk, please contact Zoubin Ghahramani.
The Frank-Wolfe optimization algorithm (also called
conditional gradient) is a very simple and intuitive optimization
algorithm proposed in the 1950s by Marguerite Frank and Phil Wolfe. It
was partly forgotten as it was superseded by faster algorithms, but it
has recently seen a revival in machine learning, thanks to its ability
to exploit the structure of machine learning optimization problems. In
this talk, I will present two recent advances that make use of
Frank-Wolfe.
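For readers unfamiliar with the method, the following minimal sketch shows the
basic Frank-Wolfe iteration (a linear minimization oracle over the feasible set
followed by a convex-combination update) on an illustrative quadratic objective
over the probability simplex; the function names and problem sizes are
hypothetical and are not taken from the work presented in the talk.

    # Minimal sketch of the Frank-Wolfe (conditional gradient) algorithm for a
    # smooth convex objective over the probability simplex; the quadratic
    # objective and sizes below are illustrative placeholders.
    import numpy as np

    def frank_wolfe_simplex(A, b, n_iters=200):
        """Minimize f(x) = 0.5 * ||A x - b||^2 over the probability simplex."""
        n = A.shape[1]
        x = np.ones(n) / n                   # start at the simplex barycentre
        for t in range(n_iters):
            grad = A.T @ (A @ x - b)         # gradient of the quadratic
            # Linear minimization oracle: over the simplex, the minimizer of
            # <grad, s> is the vertex (coordinate) with the smallest gradient.
            s = np.zeros(n)
            s[np.argmin(grad)] = 1.0
            gamma = 2.0 / (t + 2)            # standard Frank-Wolfe step size
            x = (1 - gamma) * x + gamma * s  # convex combination stays feasible
        return x

    # Illustrative usage with random data.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((30, 10))
    b = rng.standard_normal(30)
    x_hat = frank_wolfe_simplex(A, b)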
In the first part, I will describe how it can be efficiently applied to
large-margin learning for structured prediction. I will show how several
previous algorithms are special cases of Frank-Wolfe, and I will present
a new block-coordinate version of Frank-Wolfe that yields a simple
algorithm outperforming the state of the art.
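As a rough illustration of the block-coordinate idea (not the structured
prediction algorithm presented in the talk), the sketch below performs a
block-wise linear minimization and update over a product of probability
simplices; the quadratic objective and equal-sized block structure are
assumptions made only for the example.

    # Minimal sketch of a block-coordinate Frank-Wolfe update over a product of
    # probability simplices (one simplex per block); at each iteration only one
    # randomly chosen block is touched.  Assumes d is divisible by n_blocks.
    import numpy as np

    def block_coordinate_frank_wolfe(A, b, n_blocks, n_iters=2000, seed=0):
        """Minimize f(x) = 0.5 * ||A x - b||^2 with x a concatenation of
        n_blocks equal-sized probability-simplex blocks."""
        rng = np.random.default_rng(seed)
        d = A.shape[1]
        block_size = d // n_blocks
        x = np.full(d, 1.0 / block_size)     # each block starts at its barycentre
        for t in range(n_iters):
            i = rng.integers(n_blocks)       # pick one block uniformly at random
            sl = slice(i * block_size, (i + 1) * block_size)
            grad_block = A[:, sl].T @ (A @ x - b)  # gradient restricted to block i
            s_block = np.zeros(block_size)
            s_block[np.argmin(grad_block)] = 1.0   # block-wise linear minimization
            gamma = 2.0 * n_blocks / (t + 2.0 * n_blocks)
            x[sl] = (1 - gamma) * x[sl] + gamma * s_block  # update only block i
        return x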
In the second part, I will describe how the herding algorithm recently
proposed by Max Welling is actually equivalent to the Frank-Wolfe
optimization of a quadratic moment discrepancy. This link enables us to
obtain a weighted version of herding which converges faster for the task
of approximating integrals (yielding adaptive quadrature rules). On the
other hand, our experiments indicate that standard herding could still
be better for the learning task, shedding more light on the properties
of the herding algorithm.
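To make the stated equivalence concrete, the sketch below runs Frank-Wolfe on
the quadratic moment discrepancy 0.5 * ||mu - g||^2 over the convex hull of
candidate feature vectors: the step size 1/(t+1) recovers standard herding,
while exact line search gives a weighted variant. The finite candidate set,
feature map and target moments are simplifying assumptions made for
illustration and are not the speaker's setup.

    # Minimal sketch of the herding / Frank-Wolfe connection: minimize the
    # quadratic moment discrepancy 0.5 * ||mu - g||^2 over the convex hull of
    # candidate feature vectors phi(x).  Everything below is illustrative.
    import numpy as np

    def fw_herding(Phi, mu, n_iters=100, line_search=True):
        """Phi: (n_candidates, d) feature vectors; mu: (d,) target moments.
        Returns quadrature weights per selected candidate index and the
        current moment approximation g."""
        # Initialize at the vertex (candidate) most aligned with the target.
        idx = int(np.argmax(Phi @ mu))
        g = Phi[idx].copy()
        weights = {idx: 1.0}
        for t in range(1, n_iters):
            # Linear minimization oracle: candidate most aligned with mu - g.
            idx = int(np.argmax(Phi @ (mu - g)))
            phi = Phi[idx]
            if line_search:
                # Exact line search on the quadratic -> weighted herding.
                denom = float(np.dot(phi - g, phi - g))
                gamma = min(max(np.dot(mu - g, phi - g) / denom, 0.0), 1.0) if denom > 0 else 0.0
            else:
                gamma = 1.0 / (t + 1)  # this step size recovers standard herding
            for k in weights:          # rescale existing quadrature weights
                weights[k] *= (1.0 - gamma)
            weights[idx] = weights.get(idx, 0.0) + gamma
            g = (1.0 - gamma) * g + gamma * phi
        return weights, g

    # Illustrative usage: match the first two moments of Gaussian samples with
    # a small weighted point set drawn from a grid of candidates.
    candidates = np.linspace(-3, 3, 200)
    Phi = np.stack([candidates, candidates ** 2], axis=1)   # phi(x) = (x, x^2)
    samples = np.random.default_rng(1).standard_normal(10000)
    mu = np.array([samples.mean(), (samples ** 2).mean()])  # target moments
    weights, g = fw_herding(Phi, mu, n_iters=50)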
This is joint work with Francis Bach, Martin Jaggi, Guillaume
Obozinski, Mark Schmidt and Patrick Pletscher.
This talk is part of the Machine Learning @ CUED series.