University of Cambridge > > Isaac Newton Institute Seminar Series > On computational barriers in data science and the paradoxes of deep learning

On computational barriers in data science and the paradoxes of deep learning

Add to your list(s) Download to your calendar using vCal

If you have a question about this talk, please contact

VMVW02 - Generative models, parameter learning and sparsity

The use of regularisation techniques such as l^1 and Total Variation in Basis Pursuit and Lasso, as well as linear and semidefinite programming and neural networks (deep learning) has seen great success in data science. Yet, we will discuss the following paradox: it is impossible to design algorithms to find minimisers accurately for these problems when given inaccurate input data, even when the inaccuracies can be made arbitrarily small. The paradox implies that any algorithm designed to solve these problems will fail in the following way: For fixed dimensions and any small accuracy parameter epsilon > 0, one can choose an arbitrary large time T and find an input such that the algorithm will run for longer than T and still not have reached epsilon accuracy. Moreover, it is impossible to determine when the algorithm should halt to achieve an epsilon accurate solution. The largest epsilon for which this failure happens is called the Breakdown-epsilon. Typically, the Breakdown-epsilon > 1/2 even when the the input is bounded by one, is well-conditioned, and the objective function can be computed with arbitrary accuracy.

Despite the paradox we explain why empirically many modern algorithms perform very well in real-world scenarios. In particular, when restricting to subclasses of problems the Breakdown epsilon may shrink. Moreover, typically one can find polynomial time algorithms in L and n, where L < log(1/Breakdown-epsilon) is the number of correct digits in the computed solution and n is the size of the input data. However, the Breakdown-epsilon is the breaking point, and for L >  log(1/Breakdown-epsilon), any algorithm, even randomised, becomes arbitrarily slow and will not be able to halt and guarantee L correct digits in the output.

The above result leads to the paradoxes of deep learning: (1) One cannot guarantee the existence of algorithms for accurately training the neural network, even if there is one minimum and no local minima. Moreover, (2) one can have 100% success rate on arbitrarily many test cases, yet uncountably many misclassifications on elements that are arbitrarily close to the training set.

This is joint work with Alex Bastounis (Cambridge) and Verner Vlacic (ETH)

This talk is part of the Isaac Newton Institute Seminar Series series.

Tell a friend about this talk:

This talk is included in these lists:

Note that ex-directory lists are not shown.


© 2006-2018, University of Cambridge. Contact Us | Help and Documentation | Privacy and Publicity