Truncated $L_1$ Regularized Linear Regression: Theory and Algorithm
DOI:
https://doi.org/10.4208/cicp.OA-2020-0250
Keywords:
High-dimensional linear regression, sparsity, truncated $L_1$ regularization, primal dual active set algorithm.
Abstract
Truncated $L_1$ regularization, proposed by Fan in [5], is an approximation to $L_0$ regularization in high-dimensional sparse models. In this work, we prove a non-asymptotic error bound for the global optimal solution to the truncated $L_1$ regularized linear regression problem and study its support recovery property. Moreover, we propose a primal dual active set (PDAS) algorithm for variable estimation and selection. Coupling PDAS with continuation via a warm-start strategy yields the primal dual active set with continuation (PDASC) algorithm. Data-driven parameter selection rules such as cross validation, BIC, or a voting method can be applied to select a proper regularization parameter. The proposed method is demonstrated on simulated data and on a breast cancer gene expression data set (bcTCGA).
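
To make the penalized objective concrete, the following is a minimal sketch that assumes one common form of the truncated $L_1$ penalty from the literature, $\lambda \sum_j \min(|\beta_j|, \tau)$. The abstract does not state the paper's exact parameterization, so the function names, the threshold `tau`, and the $1/(2n)$ loss scaling are illustrative assumptions and not the authors' implementation.

```python
import numpy as np


def truncated_l1_penalty(beta, lam, tau):
    """Assumed penalty form: lam * sum_j min(|beta_j|, tau).

    Coefficients with magnitude above tau all contribute the same constant
    lam * tau, which is how the penalty mimics L_0-style counting.
    """
    return lam * np.sum(np.minimum(np.abs(beta), tau))


def objective(X, y, beta, lam, tau):
    """Least-squares loss (assumed 1/(2n) scaling) plus the assumed penalty."""
    n = X.shape[0]
    residual = y - X @ beta
    return 0.5 / n * (residual @ residual) + truncated_l1_penalty(beta, lam, tau)
```

Under this assumed form, a large coefficient incurs no more penalty than one of magnitude `tau`, in contrast to the plain $L_1$ penalty, which grows linearly with the coefficient.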