We consider neural networks with a single hidden layer and non-decreasing positively homogeneous activation functions, such as the rectified linear units. By letting the number of hidden units grow unbounded and using classical non-Euclidean regularization tools on the output weights, they lead to a convex optimization problem, and we provide a detailed theoretical analysis of their generalization performance, with a study of both the approximation and the estimation errors. We show in particular that they are adaptive to unknown underlying linear structures, such as the dependence on the projection of the input variables onto a low-dimensional subspace. Moreover, when using sparsity-inducing norms on the input weights, we show that high-dimensional non-linear variable selection may be achieved, without any strong assumption regarding the data and with a total number of variables potentially exponential in the number of observations. However, solving this convex optimization problem in infinite dimensions is possible only if the non-convex subproblem of adding a new unit can be solved efficiently. We provide a simple geometric interpretation for our choice of activation functions and describe simple conditions under which convex relaxations of the finite-dimensional non-convex subproblem achieve the same generalization error bounds, even when constant-factor approximations cannot be found. We were not able to find strong enough convex relaxations to obtain provably polynomial-time algorithms, and we leave open the existence or non-existence of such tractable algorithms with non-exponential sample complexities.
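The two ingredients above can be illustrated concretely. A minimal sketch, assuming numpy: the ReLU is positively homogeneous of degree one, and once the input weights of finitely many hidden units are fixed, the network output is linear in the output weights, so l1-regularized fitting of those weights is a convex Lasso-type problem (solved here with a few proximal-gradient steps). All data and weight values are synthetic placeholders, not the paper's construction.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Positive homogeneity (degree 1): relu(a*z) = a*relu(z) for any a > 0.
z = np.array([-1.5, 0.0, 2.0])
a = 3.0
assert np.allclose(relu(a * z), a * relu(z))

# With the input weights of m hidden units fixed, fitting the output
# weights with an l1 penalty is convex. Minimal proximal-gradient sketch:
rng = np.random.default_rng(0)
n, d, m = 50, 5, 20
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
W = rng.normal(size=(d, m))            # fixed input weights (hypothetical)
H = relu(X @ W)                        # hidden-layer features, shape (n, m)
lam = 0.1                              # l1 regularization strength
step = 1.0 / np.linalg.norm(H, 2) ** 2 # safe step for the smooth part
eta = np.zeros(m)                      # output weights
for _ in range(500):
    grad = H.T @ (H @ eta - y) / n     # gradient of the squared loss
    eta = eta - step * grad
    # soft-thresholding = proximal operator of the l1 norm
    eta = np.sign(eta) * np.maximum(np.abs(eta) - step * lam, 0.0)
```

The infinite-dimensional problem studied in the paper corresponds to letting m grow unbounded, with the hard non-convex step being the search for the next unit's input weights rather than the convex output-weight fit shown here.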

Related links: http://jmlr.org/papers/volume18/14-546/14-546.pdf - JMLR paper
Location: Seminar Room 1, Newton Institute