# Joel Lehman

## Engineering Blog Articles

### Creating a Zoo of Atari-Playing Agents to Catalyze the Understanding of Deep Reinforcement Learning

Uber AI Labs releases Atari Model Zoo, an open source repository of both trained Atari Learning Environment agents and tools to better understand them.

### POET: Endlessly Generating Increasingly Complex and Diverse Learning Environments and their Solutions through the...

Uber AI Labs introduces the Paired Open-Ended Trailblazer (POET), an algorithm that leverages open-endedness to push the bounds of machine learning.

### Montezuma’s Revenge Solved by Go-Explore, a New Algorithm for Hard-Exploration Problems (Sets Records on...

Uber AI Labs introduces Go-Explore, a new reinforcement learning algorithm for solving a variety of challenging problems, especially in robotics.

### An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution

As powerful and widespread as convolutional neural networks are in deep learning, AI Labs’ latest research reveals both an underappreciated failing and a simple fix.

## Research Papers

### Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their...

**R. Wang**,

**J. Lehman**,

**J. Clune**,

**K. Stanley**

While the history of machine learning so far encompasses a series of problems posed by researchers and algorithms that learn their solutions, an important question is whether the problems themselves can be generated by the algorithm at the same time as they are being solved. [...]

**[PDF at arXiv]**

*2019*

### Go-Explore: a New Approach for Hard-Exploration Problems

**A. Ecoffet**,

**J. Huizinga**,

**J. Lehman**,

**K. Stanley**,

**J. Clune**

A grand challenge in reinforcement learning is intelligent exploration, especially when rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard-exploration domains: Montezuma's Revenge and Pitfall. On both games, current RL algorithms perform poorly, even those with intrinsic motivation, which is the dominant method to improve performance on hard-exploration domains. [...]

**[PDF at arXiv]**

*2019*

### An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents

**F. Such**,

**V. Madhavan**,

**R. Liu**,

**R. Wang**, P. Castro,

**Y. Li**, L. Schubert, M. Bellemare,

**J. Clune**,

**J. Lehman**

Much human and computational effort has aimed to improve how deep reinforcement learning algorithms perform on benchmarks such as the Atari Learning Environment. Comparatively less effort has focused on understanding what has been learned by such methods, and investigating and comparing the representations learned by different families of reinforcement learning (RL) algorithms. [...]

**[PDF at arXiv]**

*2019*

### An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution

**R. Liu**,

**J. Lehman**,

**P. Molino**,

**F. Such**,

**E. Frank**,

**A. Sergeev**,

**J. Yosinski**

Few ideas have enjoyed as large an impact on deep learning as convolution. For any problem involving pixels or spatial representations, common intuition holds that convolutional neural networks may be appropriate. In this paper we show a striking counterexample to this intuition via the seemingly trivial coordinate transform problem, which simply requires learning a mapping between coordinates in (x,y) Cartesian space and one-hot pixel space. [...]
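As a rough illustration of the coordinate transform task described above, the sketch below maps an (x, y) coordinate to a one-hot pixel grid and back. The function names, the 64-pixel default grid size, and the row/column convention are assumptions made for this example and are not taken from the paper.

```python
def coords_to_onehot(x, y, size=64):
    """Return a size-by-size grid that is 1 at row y, column x and 0 elsewhere.

    The grid size and the (row=y, column=x) convention are assumptions.
    """
    if not (0 <= x < size and 0 <= y < size):
        raise ValueError("coordinates must lie inside the grid")
    grid = [[0] * size for _ in range(size)]
    grid[y][x] = 1
    return grid


def onehot_to_coords(grid):
    """Inverse mapping: recover (x, y) from a one-hot grid."""
    for y, row in enumerate(grid):
        for x, value in enumerate(row):
            if value == 1:
                return x, y
    raise ValueError("grid has no active pixel")
```

Written by hand the mapping is a single index assignment, which is what makes the task look trivial; the abstract's claim concerns how convolutional networks fare when asked to learn it.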

**[PDF at arXiv]**

*Advances in Neural Information Processing Systems*

**(NeurIPS)**, 2018

### The surprising creativity of digital evolution: A collection of anecdotes from the evolutionary computation...

**J. Lehman**,

**J. Clune**, D. Misevic, C. Adami, L. Altenberg, J. Beaulieu, P. Bentley, S. Bernard, G. Beslon, D. Bryson, P. Chrabaszcz, N. Cheney, A. Cully, S. Doncieux, F. Dyer, K. Ellefsen, R. Feldt, S. Fischer, S. Forrest, A. Frénoy, C. Gagné, L. Goff, L. Grabowski, B. Hodjat, F. Hutter, L. Keller, C. Knibbe, P. Krcah, R. Lenski, H. Lipson, R. MacCurdy, C. Maestre, R. Miikkulainen, S. Mitri, D. Moriarty, J. Mouret, A. Nguyen, C. Ofria, M. Parizeau, D. Parsons, R. Pennock, W. Punch, T. Ray, M. Schoenauer, E. Shulte, K. Sims,

**K. Stanley**, F. Taddei, D. Tarapore, S. Thibault, W. Weimer, R. Watson,

**J. Yosinski**

Biological evolution provides a creative fount of complex and subtle adaptations, often surprising the scientists who discover them. However, because evolution is an algorithmic process that transcends the substrate in which it occurs, evolution's creativity is not limited to nature. [...]

**[PDF at arXiv]**

*2018*

### Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking...

E. Conti,

**V. Madhavan**,

**F. Such**,

**J. Lehman**,

**K. Stanley**,

**J. Clune**

Evolution strategies (ES) are a family of black-box optimization algorithms able to train deep neural networks roughly as well as Q-learning and policy gradient methods on challenging deep reinforcement learning (RL) problems, but are much faster (e.g. hours vs. days) because they parallelize better. [...]

**[PDF at arXiv]**

*ViGIL @ NeurIPS 2017*

**(NeurIPS)**, 2017

### Safe Mutations for Deep and Recurrent Neural Networks through Output Gradients

**J. Lehman**,

**J. Chen**,

**J. Clune**,

**K. Stanley**

While neuroevolution (evolving neural networks) has a successful track record across a variety of domains from reinforcement learning to artificial life, it is rarely applied to large, deep neural networks. A central reason is that while random mutation generally works in low dimensions, a random perturbation of thousands or millions of weights is likely to break existing functionality, providing no learning signal even if some individual weight changes were beneficial. [...]

**[PDF at arXiv]**

*The Genetic and Evolutionary Computation Conference*

**(GECCO)**, 2018

### Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for...

**F. Such**,

**V. Madhavan**, E. Conti,

**J. Lehman**,

**K. Stanley**,

**J. Clune**

Deep artificial neural networks (DNNs) are typically trained via gradient-based learning algorithms, namely backpropagation. Evolution strategies (ES) can rival backprop-based algorithms such as Q-learning and policy gradients on challenging deep reinforcement learning (RL) problems. [...]

**[PDF at arXiv]**

*Deep RL @ NeurIPS 2018, 2017*

### ES Is More Than Just a Traditional Finite-Difference Approximator

**J. Lehman**,

**J. Chen**,

**J. Clune**,

**K. Stanley**

An evolution strategy (ES) variant based on a simplification of a natural evolution strategy recently attracted attention because it performs surprisingly well in challenging deep reinforcement learning domains. It searches for neural network parameters by generating perturbations to the current set of parameters, checking their performance, and moving in the aggregate direction of higher reward. [...]
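The perturb, evaluate, and move loop described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation studied in the paper: the toy reward, the hyperparameter values, the reward standardization step, and all function names are hypothetical choices made for this example.

```python
import random


def es_step(theta, reward_fn, rng, pop_size=50, sigma=0.1, lr=0.03):
    """One update of a simplified evolution strategy (illustrative only)."""
    dim = len(theta)
    # Generate Gaussian perturbations of the current parameters.
    noises = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    # Check the performance of each perturbed parameter vector.
    rewards = [
        reward_fn([t + sigma * e for t, e in zip(theta, eps)]) for eps in noises
    ]
    # Standardize rewards (an assumption) so the step does not depend on reward scale.
    mean = sum(rewards) / pop_size
    std = (sum((r - mean) ** 2 for r in rewards) / pop_size) ** 0.5 or 1.0
    # Move in the reward-weighted aggregate direction of the perturbations.
    step = [
        sum((r - mean) / std * eps[i] for r, eps in zip(rewards, noises))
        / (pop_size * sigma)
        for i in range(dim)
    ]
    return [t + lr * s for t, s in zip(theta, step)]


def toy_reward(params):
    # Hypothetical reward: maximal (zero) when params equals [3.0, -2.0].
    target = [3.0, -2.0]
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def run_es(steps=200, seed=0):
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(steps):
        theta = es_step(theta, toy_reward, rng)
    return theta
```

Calling `run_es()` starts from `[0.0, 0.0]` and should drift toward the toy optimum. Only reward evaluations are used, never an analytic gradient of the reward, which is why the method is described as black-box.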

**[PDF at arXiv]**

*The Genetic and Evolutionary Computation Conference*

**(GECCO)**, 2018