LCA: Loss Change Allocation for Neural Network Training

Abstract

Neural networks enjoy widespread use, but many aspects of their training, representation, and operation are poorly understood. In particular, our view into the training process is limited, with a single scalar loss being the most common viewport into this high-dimensional, dynamic process. We propose a new window into training called Loss Change Allocation (LCA), in which credit for changes to the network loss is conservatively partitioned to the parameters. This measurement is accomplished by decomposing the components of an approximate path integral along the training trajectory using a Runge-Kutta integrator. This rich view shows which parameters are responsible for decreasing or increasing the loss during training, or which parameters “help” or “hurt” the network’s learning, respectively. LCA may be summed over training iterations and/or over neurons, channels, or layers for increasingly coarse views. This new measurement device produces several insights into training. (1) We find that barely over 50% of parameters help during any given iteration. (2) Some entire layers hurt overall, moving on average against the training gradient, a phenomenon we hypothesize may be due to phase lag in an oscillatory training process. (3) Finally, increments in learning proceed in a synchronized manner across layers, often peaking on identical iterations.
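As a rough illustration of the allocation described above, the sketch below credits each parameter with its share of the loss change over one training step: the path integral of the gradient along the straight segment between consecutive iterates. For an integral of this kind, a fourth-order Runge-Kutta step reduces to Simpson's rule, which is what the code uses. The toy loss, the function names (`loss`, `grad`, `lca_step`, `train_and_allocate`), and the full-batch gradient descent loop are all assumptions made for illustration, not the authors' implementation.

```python
def loss(theta):
    """Toy loss: a coupled quadratic bowl (hypothetical, for illustration)."""
    x, y = theta
    return 0.5 * (3.0 * x * x + y * y) + x * y


def grad(theta):
    """Analytic gradient of the toy loss above."""
    x, y = theta
    return [3.0 * x + y, y + x]


def lca_step(theta_a, theta_b, grad_fn):
    """Allocate the loss change over one step to each parameter.

    Approximates the path integral of the gradient along the straight
    segment from theta_a to theta_b with Simpson's rule, using the
    gradient at the start, midpoint, and end of the segment.
    """
    mid = [(a + b) / 2.0 for a, b in zip(theta_a, theta_b)]
    g_a, g_m, g_b = grad_fn(theta_a), grad_fn(mid), grad_fn(theta_b)
    return [
        (ga + 4.0 * gm + gb) / 6.0 * (b - a)
        for ga, gm, gb, a, b in zip(g_a, g_m, g_b, theta_a, theta_b)
    ]


def train_and_allocate(theta, lr, steps):
    """Run plain gradient descent and sum each parameter's allocation."""
    total = [0.0] * len(theta)
    for _ in range(steps):
        g = grad(theta)
        new_theta = [t - lr * gi for t, gi in zip(theta, g)]
        step_lca = lca_step(theta, new_theta, grad)
        total = [s + c for s, c in zip(total, step_lca)]
        theta = new_theta
    return theta, total


theta0 = [1.0, -2.0]
final_theta, allocation = train_and_allocate(theta0, lr=0.1, steps=20)
# Negative entries mark parameters that "helped" (lowered the loss);
# positive entries mark parameters that "hurt".
print(allocation, sum(allocation), loss(final_theta) - loss(theta0))
```

Because the toy loss is quadratic, Simpson's rule is exact here, so the per-parameter allocations sum to the total loss change up to floating-point error. On a real network the sum would only approximate the loss change, and the allocations could be summed further over neurons, channels, or layers for coarser views.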

Authors

Janice Lan, Rosanne Liu, Hattie Zhou, Jason Yosinski

Conference

NeurIPS 2019

Full Paper

‘LCA: Loss Change Allocation for Neural Network Training’ (PDF)

Uber AI

Janice Lan
Janice Lan is a research scientist with Uber AI.
Rosanne Liu
Rosanne Liu is a senior research scientist and a founding member of Uber AI. She obtained her PhD in Computer Science at Northwestern University, where she used neural networks to help discover novel materials. She currently works on multiple fronts where machine learning and neural networks remain mysterious. She attempts to write in her spare time.
Hattie Zhou
Hattie Zhou is a data scientist with Uber's Marketing Analytics team.
Jason Yosinski
Jason Yosinski is a founding member of Uber AI Labs, where he leads the Deep Collective research group. He is known for contributions to understanding neural network modeling, representations, and training. Prior to Uber, Jason worked on robotics at Caltech, co-founded two web companies, and started a robotics program in Los Angeles middle schools that now serves over 500 students. He completed his PhD working at the Cornell Creative Machines Lab, University of Montreal, JPL, and Google DeepMind. He is a recipient of the NASA Space Technology Research Fellowship, has co-authored over 50 papers and patents, and was VP of ML at Geometric Intelligence, which Uber acquired. His work has been profiled by NPR, the BBC, Wired, The Economist, Science, and the NY Times. In his free time, Jason enjoys cooking, reading, paragliding, and pretending he's an artist.