Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data

December 18, 2019 / Global
Figure 1: An overview of generative teaching networks (GTNs). The generator (a deep neural network) generates synthetic data that a newly created learner neural network trains on. After training on GTN-produced data, the learner is able to perform well on the target task despite never having seen real data.
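To make the loop in Figure 1 concrete, here is a minimal, hedged PyTorch sketch of the idea: an outer (meta) optimizer trains a generator so that a freshly initialized learner, trained for a few SGD steps on the generator's synthetic data, performs well on real data. The dimensions, the toy linear learner, and every hyperparameter below are illustrative assumptions, not the paper's actual architecture or code.

```python
import torch
import torch.nn.functional as F

# Toy dimensions for an MNIST-like task (illustrative values, not from the paper).
NOISE_DIM, IMG_DIM, N_CLASSES = 64, 784, 10
INNER_STEPS, BATCH_SIZE, INNER_LR = 5, 128, 0.02

# Generator: maps a noise vector plus a one-hot label to a synthetic training example.
generator = torch.nn.Sequential(
    torch.nn.Linear(NOISE_DIM + N_CLASSES, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, IMG_DIM),
    torch.nn.Tanh(),
)
meta_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)

def inner_train(synthetic_x, synthetic_y):
    """Train a fresh (here: linear) learner on the synthetic batches with
    differentiable SGD steps, so gradients can flow back into the generator."""
    w = torch.zeros(IMG_DIM, N_CLASSES, requires_grad=True)
    b = torch.zeros(N_CLASSES, requires_grad=True)
    for x, y in zip(synthetic_x, synthetic_y):
        loss = F.cross_entropy(x @ w + b, y)
        gw, gb = torch.autograd.grad(loss, (w, b), create_graph=True)
        w, b = w - INNER_LR * gw, b - INNER_LR * gb   # keep the graph for the meta-gradient
    return w, b

def meta_step(real_x, real_y):
    """One outer-loop update: a learner trained on generated data should do well on real data."""
    noise = torch.randn(INNER_STEPS, BATCH_SIZE, NOISE_DIM)
    labels = torch.randint(0, N_CLASSES, (INNER_STEPS, BATCH_SIZE))
    gen_in = torch.cat([noise, F.one_hot(labels, N_CLASSES).float()], dim=-1)
    synthetic_x = generator(gen_in)                    # (steps, batch, IMG_DIM)
    w, b = inner_train(synthetic_x, labels)
    meta_loss = F.cross_entropy(real_x @ w + b, real_y)
    meta_opt.zero_grad()
    meta_loss.backward()                               # backprop through the whole inner loop
    meta_opt.step()
    return meta_loss.item()
```

Because the inner updates are kept in the autograd graph (create_graph=True), the meta-gradient tells the generator how to change its synthetic data so that the resulting trained learner does better on real data.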
Figure 2: Training is faster on GTN-produced synthetic data than on real data, leading to higher MNIST performance when training for only a small number of SGD steps.
Figure 3: MNIST images generated by a GTN with a curriculum. The curriculum proceeds from left to right (each column is one of the 32 batches of data).
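Figure 3 refers to the curriculum variant: instead of sampling fresh random generator inputs, the inputs for each of the 32 batches can themselves be learned and always presented in the same order, so each batch is tailored to a specific stage of learner training. A rough sketch of what that could look like, reusing the toy generator from the sketch above; the fixed, balanced labels per batch are an illustrative assumption, not necessarily how the paper implements it.

```python
import torch
import torch.nn.functional as F

N_BATCHES, BATCH_SIZE, NOISE_DIM, N_CLASSES = 32, 128, 64, 10

# Learned curriculum: the generator inputs for each of the 32 inner-loop batches are
# themselves parameters, meta-optimized alongside the generator weights and always
# presented in the same order (left to right in Figure 3).
curriculum_noise = torch.nn.Parameter(torch.randn(N_BATCHES, BATCH_SIZE, NOISE_DIM))
# Balanced, fixed labels per batch -- an illustrative choice, not the paper's.
fixed_labels = torch.arange(BATCH_SIZE) % N_CLASSES

def curriculum_batch(generator, step):
    """Synthetic batch for inner-loop step `step` (0..N_BATCHES-1)."""
    onehot = F.one_hot(fixed_labels, N_CLASSES).float()
    x = generator(torch.cat([curriculum_noise[step], onehot], dim=-1))
    return x, fixed_labels
```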
Figure 4: On CIFAR, training is also faster on GTN-produced synthetic data than on real data, enabling a 4x speedup at the same performance level.
Figure 5: Correlation between final performance after training for 30 seconds on GTN synthetic data and after training for four hours on real data, for the top 50 percent of architectures according to the GTN estimate. The correlation is high enough (0.5582 Spearman rank correlation) that selecting the top architectures according to the GTN estimate also selects architectures that are truly high-performing. Blue squares represent the top 10 percent of architectures according to the GTN estimate.
Table 1: GTNs can serve as a drop-in replacement for real data to speed up NAS. Here, results are with simple random search NAS, but GTNs should speed up any NAS method. The number of parameters refers to the number of weights in the learner neural network.
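Table 1 frames GTNs as a drop-in replacement for real data inside NAS. Below is a hedged sketch of how that plugs into plain random search, where `proxy_score` stands for "train briefly on GTN synthetic data" and `true_score` stands for "train for hours on real data". Both callables, the search space, and all counts are hypothetical stand-ins, not the paper's code.

```python
import random
import numpy as np
from scipy.stats import spearmanr

def sample_architecture():
    """Random-search proposal from an illustrative (made-up) search space."""
    return {"depth": random.choice([2, 4, 6]),
            "width": random.choice([32, 64, 128]),
            "use_skip": random.random() < 0.5}

def gtn_random_search(proxy_score, true_score, n_candidates=800, n_finalists=10):
    """Random-search NAS with a cheap GTN proxy.

    proxy_score(arch): cheap estimate, e.g. accuracy after ~30 s of training on GTN data.
    true_score(arch):  expensive ground truth, e.g. accuracy after hours on real data.
    """
    candidates = [sample_architecture() for _ in range(n_candidates)]
    proxy = [proxy_score(a) for a in candidates]          # cheap pass over every candidate
    order = np.argsort(proxy)[::-1]                       # best proxy scores first
    finalists = [candidates[i] for i in order[:n_finalists]]
    truth = [true_score(a) for a in finalists]            # expensive pass, top few only
    # Sanity check in the spirit of Figure 5: proxy and true performance should rank-correlate.
    rho, _ = spearmanr([proxy[i] for i in order[:n_finalists]], truth)
    return finalists[int(np.argmax(truth))], rho

if __name__ == "__main__":
    # Toy stand-in scorer so the sketch runs end to end; replace with real training code.
    fake = lambda a: a["depth"] * 0.1 + a["width"] * 0.001 + random.random() * 0.05
    best, rho = gtn_random_search(fake, fake, n_candidates=50, n_finalists=5)
    print(best, rho)
```

The same cheap-proxy-then-verify pattern should compose with any NAS search strategy, not just random search.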
Felipe Petroski Such

Felipe Petroski Such is a research scientist focusing on deep neuroevolution, reinforcement learning, and HPC. Prior to joining Uber AI Labs, he obtained a BS/MS from RIT, where he developed deep learning architectures for graph applications and ICR, as well as hardware acceleration using FPGAs.

Aditya Rawal

Aditya Rawal is a research scientist at Uber AI Labs. His interests lie at the convergence of two research fields: neuroevolution and deep learning. He believes that evolutionary search can replace human ingenuity in creating the next generation of deep networks. Previously, Aditya received his MS/PhD in Computer Science from the University of Texas at Austin, advised by Prof. Risto Miikkulainen. During his PhD, he developed neuroevolution algorithms to evolve recurrent architectures for sequence-prediction problems and to construct multi-agent systems that cooperate, compete, and communicate.

Joel Lehman

Joel Lehman was previously an assistant professor at the IT University of Copenhagen, and researches neural networks, evolutionary algorithms, and reinforcement learning.

Kenneth O. Stanley

Before joining Uber AI Labs full time, Ken was an associate professor of computer science at the University of Central Florida (he is currently on leave). He is a leader in neuroevolution (combining neural networks with evolutionary techniques), where he helped invent prominent algorithms such as NEAT, CPPNs, HyperNEAT, and novelty search. His ideas have also reached a broader audience through the recent popular science book, Why Greatness Cannot Be Planned: The Myth of the Objective.

Jeff Clune

Jeff Clune is the former Loy and Edith Harris Associate Professor in Computer Science at the University of Wyoming, a Senior Research Manager and founding member of Uber AI Labs, and currently a Research Team Leader at OpenAI. Jeff focuses on robotics and training neural networks via deep learning and deep reinforcement learning. He has also researched open questions in evolutionary biology using computational models of evolution, including studying the evolutionary origins of modularity, hierarchy, and evolvability. Prior to becoming a professor, he was a Research Scientist at Cornell University, received a PhD in computer science and an MA in philosophy from Michigan State University, and received a BA in philosophy from the University of Michigan. More about Jeff's research can be found at JeffClune.com.

Posted by Felipe Petroski Such, Aditya Rawal, Joel Lehman, Kenneth O. Stanley, Jeff Clune

Category: AI