Convergence of Backpropagation with Momentum for Network Architectures with Skip Connections

Authors

  • Chirag Agarwal Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, IL 60607, USA
  • Joe Klobusicky Department of Mathematical Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
  • Dan Schonfeld Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, IL 60607, USA

DOI:

https://doi.org/10.4208/jcm.1912-m2018-0279

Keywords:

Backpropagation with momentum, Autoencoders, Directed acyclic graphs.

Abstract

We study a class of deep neural networks whose architectures form a directed acyclic graph (DAG). For backpropagation defined by gradient descent with adaptive momentum, we show that the weights converge for a large class of nonlinear activation functions. The proof generalizes the results of Wu et al. (2008), who showed convergence for a feed-forward network with one hidden layer. To illustrate the effectiveness of DAG architectures, we describe an example of compression through an autoencoder and compare it against sequential feed-forward networks under several metrics.
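For orientation, gradient descent with momentum (the heavy-ball form) updates the weights as sketched below; the symbols $E$, $\eta$, and $\mu_t$ are illustrative notation, not taken from the paper, and the paper's adaptive scheme selects the momentum coefficient per step rather than fixing it:

\[
  w_{t+1} \;=\; w_t \;-\; \eta\,\nabla E(w_t) \;+\; \mu_t\,\bigl(w_t - w_{t-1}\bigr),
\]

where $w_t$ denotes the weight vector at step $t$, $E$ the error function, $\eta > 0$ the learning rate, and $\mu_t \in [0,1)$ the (possibly adaptive) momentum coefficient applied to the previous weight increment.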

Published

2021-06-10

Section

Articles