Janice Lan

Janice Lan is a research scientist with Uber AI.

Engineering Blog Articles

Introducing LCA: Loss Change Allocation for Neural Network Training

Uber AI Labs proposes Loss Change Allocation (LCA), a new method that provides a rich window into the neural network training process.

Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask

Uber builds upon the Lottery Ticket Hypothesis by proposing explanations for the mechanisms behind it and deriving a surprising by-product: the Supermask.

Research Papers

First-Order Preconditioning via Hypergradient Descent

T. Moskovitz, R. Wang, J. Lan, S. Kapoor, T. Miconi, J. Yosinski, A. Rawal
Standard gradient descent methods are susceptible to a range of issues that can impede training, such as high correlations and different scaling in parameter space. These difficulties can be addressed by second-order approaches that apply a preconditioning matrix to the gradient to improve convergence. Unfortunately, such algorithms typically struggle to scale to high-dimensional problems, in part because the calculation of specific preconditioners such as the inverse Hessian or Fisher information matrix is highly expensive. We introduce first-order preconditioning (FOP), a fast, scalable approach that generalizes previous work on hypergradient descent (Almeida et al., 1998; Maclaurin et al., 2015; Baydin et al., 2017) to learn a preconditioning matrix that only makes use of first-order information. [...] [PDF]
Conference on Neural Information Processing Systems (NeurIPS), 2019

Plug and Play Language Models: A Simple Approach to Controlled Text Generation

S. Dathathri, A. Madotto, J. Lan, J. Hung, E. Frank, P. Molino, J. Yosinski, R. Liu
Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g., switching topic or sentiment) is difficult without modifying the model architecture or fine-tuning on attribute-specific data, both of which entail the significant cost of retraining. We propose a simple alternative: the Plug and Play Language Model (PPLM) for controllable language generation, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM. [PDF]
International Conference on Learning Representations (ICLR), 2020

LCA: Loss Change Allocation for Neural Network Training

J. Lan, R. Liu, H. Zhou, J. Yosinski
Neural networks enjoy widespread use, but many aspects of their training, representation, and operation are poorly understood. In particular, our view into the training process is limited, with a single scalar loss being the most common viewport into this high-dimensional, dynamic process. We propose a new window into training called Loss Change Allocation (LCA), in which credit for changes to the network loss is conservatively partitioned to the parameters. [...] [PDF]
Conference on Neural Information Processing Systems (NeurIPS), 2019

Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask

H. Zhou, J. Lan, R. Liu, J. Yosinski
This paper builds upon the Lottery Ticket Hypothesis by proposing explanations for the mechanisms behind it and deriving a surprising by-product: the Supermask. [PDF]
Conference on Neural Information Processing Systems (NeurIPS), 2019
