Numerical properties of solutions of lasso regression
If you have a question about this talk, please contact Nicolas Boulle.

The determination of a concise model of a linear system when there are fewer samples m than predictors n requires the solution of the equation Ax = b, where A ∈ R^{m×n} and rank A = m, such that the solution selected from the infinite set of solutions is sparse, that is, many of its components are zero. This leads to the minimisation with respect to x of

f(x, λ) = ||Ax − b||_2^2 + λ||x||_1,

where λ is the regularisation parameter. This problem, which is called lasso regression, is more difficult than Tikhonov regularisation because there does not exist a 1-norm matrix decomposition and the 1-norm of a vector or matrix is not a differentiable function.

Lasso regression yields a family of solutions x_lasso(λ), and it is necessary to determine the optimal value of λ, that is, the value of λ that balances the fidelity of the model, ||A x_lasso(λ) − b|| ≈ 0, against the satisfaction of the constraint that x_lasso(λ) be sparse. A solution that satisfies both these conditions, that is, for which the objective function and the constraint assume, approximately, their minimum values, is called an optimal sparse solution. In particular, a sparse solution exists for many values of λ, but restricting attention to an optimal sparse solution places restrictions on λ, A and b. It is shown that an optimal sparse solution does not exist when the least squares (LS) solution x_ls = A^†b = A^T(AA^T)^{-1}b is well conditioned. It is also shown that, by contrast, an optimal sparse solution that has few zero entries exists if x_ls is ill conditioned. This association between sparsity and numerical stability has been observed in feature selection and the analysis of fMRI images of the brain.

The relationship between the numerical condition of the LS problem and the existence of an optimal sparse solution requires that a refined condition number of x_ls be developed, because this condition number cannot be obtained from the condition number κ(A) of A. The inadequacy of κ(A) follows because κ(A) is a function of A only, whereas x_ls is a function of A and b.

The optimal value of λ is usually computed by cross validation (CV), but CV is problematic because it requires the determination of a shallow minimum of a function. A better method, which requires the computation of the coordinates of the corner of a curve that has the form of an L, is described, and examples that demonstrate its superiority over CV are presented. The talk includes examples of well conditioned and ill conditioned solutions x_ls of the LS problem that do, and do not, possess an optimal sparse solution. The examples include the effect of noise on the results obtained by the L-curve and CV.

This talk is part of the Applied and Computational Analysis series.
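The following is a minimal, self-contained sketch (not code from the talk) of the comparison described above, assuming Python with NumPy and scikit-learn: it traces the lasso path x_lasso(λ) for a synthetic under-determined problem, locates the corner of the L-curve of log ||A x_lasso(λ) − b||_2 against log ||x_lasso(λ)||_1 by a simple maximum-curvature rule, and compares the resulting λ with the value chosen by cross validation. The matrix A, the sparse x_true, the noise level and the corner detector are illustrative assumptions rather than the method of the talk; note also that scikit-learn's lasso objective scales the residual term by 1/(2m), so its α differs from the λ in f(x, λ) above by a constant factor.

```python
import numpy as np
from sklearn.linear_model import lasso_path, LassoCV

# Synthetic under-determined problem: fewer samples m than predictors n (assumed sizes).
rng = np.random.default_rng(0)
m, n = 50, 200
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[:5] = rng.standard_normal(5)               # sparse ground truth (illustrative assumption)
b = A @ x_true + 0.01 * rng.standard_normal(m)    # noise level is an assumption

# Lasso path: x_lasso(lambda) for a decreasing sequence of lambda values.
lambdas, coefs, _ = lasso_path(A, b, n_alphas=100)

# L-curve data: residual norm ||A x - b||_2 versus solution norm ||x||_1.
res_norm = np.array([np.linalg.norm(A @ coefs[:, k] - b) for k in range(coefs.shape[1])])
sol_norm = np.array([np.linalg.norm(coefs[:, k], 1) for k in range(coefs.shape[1])])

# Drop the degenerate all-zero solution at the largest lambda, then work on log axes.
mask = sol_norm > 0
lam, xi, eta = lambdas[mask], np.log(res_norm[mask]), np.log(sol_norm[mask])

# Corner located as the point of maximum curvature of the (xi, eta) curve
# (finite-difference estimate; a stand-in for the corner-finding method of the talk).
dxi, deta = np.gradient(xi), np.gradient(eta)
d2xi, d2eta = np.gradient(dxi), np.gradient(deta)
curvature = np.abs(dxi * d2eta - deta * d2xi) / (dxi**2 + deta**2) ** 1.5
lambda_corner = lam[np.nanargmax(curvature)]

# Cross validation for comparison.
lambda_cv = LassoCV(cv=5).fit(A, b).alpha_

print(f"lambda (L-curve corner):   {lambda_corner:.4g}")
print(f"lambda (cross validation): {lambda_cv:.4g}")
```

The two selected values of λ can then be compared by counting the nonzero entries of the corresponding columns of the path and by evaluating the residual ||A x_lasso(λ) − b||_2, mirroring the trade-off between fidelity and sparsity discussed in the abstract.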