Reduced-Rank Modeling for High-Dimensional Model-Based Clustering

Authors

Lei Yang, Department of Environmental Medicine, New York University, New York, NY, USA
Junhui Wang, Department of Mathematics, City University of Hong Kong, Kowloon Tong, Hong Kong
Shiqian Ma, Department of Mathematics, University of California, Davis, CA 95616, USA

DOI:

https://doi.org/10.4208/jcm.1708-m2016-0830

Keywords:

Clustering, Gaussian mixture model, Group Lasso, ADMM, Reduced-rank model.

Abstract

Model-based clustering is widely used in the statistical literature and often models the data with a Gaussian mixture model. As a consequence, it requires estimating a large number of parameters, especially when the data dimension is relatively large. In this paper, we equip model-based clustering with a reduced-rank model and group-sparsity regularization, which substantially reduce the number of parameters and thus facilitate high-dimensional clustering and variable selection simultaneously. We propose an EM algorithm for this task, in which the M-step is solved using alternating minimization. One of the alternating steps involves both a nonsmooth function and a nonconvex constraint, and thus we propose a linearized alternating direction method of multipliers (ADMM) for solving it. This leads to an efficient algorithm whose subproblems are all easy to solve. In addition, a model selection criterion based on the concept of clustering stability is developed for tuning the clustering model. The effectiveness of the proposed method is supported by a variety of simulated and real examples, and by its asymptotic estimation and selection consistency.
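To illustrate how group-sparsity regularization can perform variable selection, the sketch below applies row-wise group soft-thresholding, the standard closed-form proximal operator of a group Lasso penalty. It is a minimal illustration under stated assumptions and not necessarily the update used in the paper: the function name `group_soft_threshold`, the toy matrix `means` (rows as variables, columns as clusters), and the penalty level `lam` are all hypothetical.

```python
import numpy as np

def group_soft_threshold(M, lam):
    """Row-wise prox of lam * sum_j ||M[j, :]||_2 (group Lasso penalty)."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    # Shrink each row by max(0, 1 - lam / ||row||); rows with norm <= lam
    # become exactly zero. The small floor avoids division by zero.
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return scale * M

# Hypothetical p x K matrix of cluster means (assumption for illustration):
# each row is one variable, each column is one cluster.
means = np.array([[3.0, 4.0],
                  [0.3, 0.4],
                  [0.0, 0.0]])

shrunk = group_soft_threshold(means, lam=1.0)

# Variables whose whole row was zeroed no longer separate the clusters,
# so only the remaining rows count as selected.
selected = np.flatnonzero(np.linalg.norm(shrunk, axis=1) > 0)
print(shrunk)
print(selected)
```

Because a whole row is shrunk at once, a variable is either kept for all clusters or dropped for all of them, which is what makes a group penalty suitable for selecting variables rather than individual entries.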

Published

2018-09-17

Issue

Vol. 36 No. 3 (2018)

Section

Articles