
Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints



In most practical settings and theoretical analyses, one assumes that a model can be trained until convergence. However, the growing complexity of machine learning datasets and models may violate such assumptions. Moreover, current approaches to hyperparameter tuning and neural architecture search tend to be limited by practical resource constraints. We therefore introduce a formal setting for studying training under the non-asymptotic, resource-constrained regime, i.e., budgeted training. We analyze the following problem: “given a dataset, algorithm, and resource budget, what is the best achievable performance?” We focus on the number of optimization iterations as the representative resource. Under this setting, we show that it is critical to adjust the learning rate schedule to the given budget. Among budget-aware learning rate schedules, we find simple linear decay to be both robust and high-performing. We support this claim through extensive experiments with state-of-the-art models on ImageNet (image classification), Cityscapes (semantic segmentation), MS COCO (object detection and instance segmentation), and Kinetics (video classification). Analyzing our results, we find that the key to a good schedule is budgeted convergence, a phenomenon whereby the gradient vanishes at the end of each allowed budget. Finally, we revisit existing approaches for fast convergence and show that budget-aware learning rate schedules readily outperform them in the practical but under-explored budgeted setting.
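As a rough sketch of the budget-aware linear decay described above: the learning rate starts at its base value and anneals linearly to zero exactly when the iteration budget is exhausted, so the schedule is a function of the fraction of budget consumed. The function name and signature below are illustrative, not from the paper's released code.

```python
def linear_decay_lr(base_lr: float, step: int, budget_steps: int) -> float:
    """Budget-aware linear decay sketch.

    The learning rate falls linearly from base_lr at step 0 to 0 at
    step == budget_steps, i.e. lr = base_lr * (1 - t/T), clamped so it
    never goes negative if training runs past the budget.
    """
    remaining_fraction = max(0.0, 1.0 - step / budget_steps)
    return base_lr * remaining_fraction
```

For example, with a base learning rate of 0.1 and a budget of 100 iterations, the schedule gives 0.1 at step 0, roughly 0.05 at the halfway point, and 0.0 at step 100. Note how the budget is an explicit input: halving the budget rescales the whole schedule rather than truncating it, which is the core of the budget-aware idea.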


Mengtian Li, Ersin Yumer, Deva Ramanan


ICLR 2020

Full Paper

‘Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints’ (PDF)

Uber ATG

Ersin Yumer is a Staff Research Scientist, leading the San Francisco research team within Uber ATG R&D. Prior to joining Uber, he led the perception machine learning team at Argo AI, and before that he spent three years at Adobe Research. He completed his PhD at Carnegie Mellon University, during which he also spent several summers at Google Research. His current research interests lie at the intersection of machine learning, 3D computer vision, and graphics. He develops end-to-end learning systems and holistic machine learning applications that bring signals of the visual world together: images, point clouds, videos, 3D shapes, and depth scans.