Proximal-Proximal-Gradient Method

Authors

Ernest K. Ryu, Department of Mathematics, University of California, Los Angeles, CA 90095, USA
Wotao Yin, Department of Mathematics, University of California, Los Angeles, CA 90095, USA

DOI:

https://doi.org/10.4208/jcm.1906-m2018-0282

Keywords:

Proximal-gradient, ADMM, Finito, MISO, SAGA, Operator splitting, First-order methods, Distributed optimization, Stochastic optimization, Almost sure convergence, Linear convergence.

Abstract

In this paper, we present the proximal-proximal-gradient method (PPG), a novel optimization method that is simple to implement and simple to parallelize. PPG generalizes the proximal-gradient method and ADMM and is applicable to minimization problems written as a sum of many differentiable and many non-differentiable convex functions. The non-differentiable functions can be coupled. We furthermore present a related stochastic variation, which we call stochastic PPG (S-PPG). S-PPG can be interpreted as a generalization of Finito and MISO to the sum of many coupled non-differentiable convex functions.
We present many applications that can benefit from PPG and S-PPG and prove convergence for both methods. We demonstrate the empirical effectiveness of both methods through experiments on a CUDA GPU. Compared to existing methods, a key strength of PPG and S-PPG is their ability to directly handle a large sum of non-differentiable, non-separable functions with a constant step size independent of the number of functions. Such a non-diminishing step size allows them to be fast.

Published

2021-07-01

Issue

Vol. 37 No. 6 (2019)

Section

Articles