Truncated $L_1$ Regularized Linear Regression: Theory and Algorithm

Authors

  • Mingwei Dai
  • Shuyang Dai
  • Junjun Huang
  • Lican Kang
  • Xiliang Lu

DOI:

https://doi.org/10.4208/cicp.OA-2020-0250

Keywords:

High-dimensional linear regression, sparsity, truncated $L_1$ regularization, primal dual active set algorithm.

Abstract

Truncated $L_1$ regularization, proposed by Fan in [5], is an approximation to $L_0$ regularization in high-dimensional sparse models. In this work, we prove a non-asymptotic error bound for the global optimal solution of the truncated $L_1$ regularized linear regression problem and study its support recovery property. Moreover, we propose a primal dual active set (PDAS) algorithm for variable estimation and selection. Coupling PDAS with continuation via a warm-start strategy yields the primal dual active set with continuation (PDASC) algorithm. Data-driven parameter selection rules, such as cross validation, BIC, or a voting method, can be applied to select a proper regularization parameter. The proposed method is demonstrated on simulated data and on a breast cancer gene expression data set (bcTCGA).
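
For orientation, one commonly used form of the truncated (capped) $L_1$ regularized least squares problem is sketched below. The symbols $X \in \mathbb{R}^{n \times p}$, $y \in \mathbb{R}^n$, $\lambda > 0$ and $\tau > 0$ are assumptions introduced here for illustration, and the scaling and parametrization used in the paper may differ:

$$\min_{\beta \in \mathbb{R}^p} \; \frac{1}{2n} \| y - X\beta \|_2^2 + \lambda \sum_{j=1}^{p} \min(|\beta_j|, \tau).$$

Under this form, writing $\lambda \min(|\beta_j|, \tau) = \lambda\tau \min(|\beta_j|/\tau, 1)$ shows the sense in which the penalty approximates $L_0$ regularization: if $\tau \to 0$ while the product $\lambda\tau = \mu$ is held fixed, then $\min(|\beta_j|/\tau, 1) \to \mathbf{1}\{\beta_j \neq 0\}$, so the penalty tends to $\mu \|\beta\|_0$.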

Published

2021-04-30

Section

Articles