Engineering Blog Articles

Get Your Vacation Started With Hotels.com™ and Uber

Starting today, when you bring up your accommodation reservation on the Hotels.com Android app, on the day of your stay, you’ll be able to call an Uber with one tap.

Research Papers

Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their...

R. Wang, J. Lehman, J. Clune, K. Stanley
While the history of machine learning so far encompasses a series of problems posed by researchers and algorithms that learn their solutions, an important question is whether the problems themselves can be generated by the algorithm at the same time as they are being solved. [...] [PDF at arXiv]
arXiv, 2019

Go-Explore: a New Approach for Hard-Exploration Problems

A. Ecoffet, J. Huizinga, J. Lehman, K. Stanley, J. Clune
A grand challenge in reinforcement learning is intelligent exploration, especially when rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard-exploration domains: Montezuma's Revenge and Pitfall. On both games, current RL algorithms perform poorly, even those with intrinsic motivation, which is the dominant method to improve performance on hard-exploration domains. [...] [PDF at arXiv]
arXiv, 2019

An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents

F. Such, V. Madhavan, R. Liu, R. Wang, P. Castro, Y. Li, L. Schubert, M. Bellemare, J. Clune, J. Lehman
Much human and computational effort has aimed to improve how deep reinforcement learning algorithms perform on benchmarks such as the Atari Learning Environment. Comparatively less effort has focused on understanding what has been learned by such methods, and investigating and comparing the representations learned by different families of reinforcement learning (RL) algorithms. [...] [PDF at arXiv]
arXiv, 2019

Robustness to out-of-distribution inputs via task-aware generative uncertainty

R. McAllister, G. Kahn, J. Clune, S. Levine
Deep learning provides a powerful tool for machine perception when the observations resemble the training data. However, real-world robotic systems must react intelligently to their observations even in unexpected circumstances. This requires a system to reason about its own uncertainty given unfamiliar, out-of-distribution observations. [...] [PDF at arXiv]
International Conference on Robotics and Automation (ICRA), 2019

LanczosNet: Multi-Scale Deep Graph Convolutional Networks

R. Liao, Z. Zhao, R. Urtasun, R. Zemel
Relational data can generally be represented as graphs. For processing such graph structured data, we propose LanczosNet, which uses the Lanczos algorithm to construct low rank approximations of the graph Laplacian for graph convolution. [...] [PDF at University of Toronto]
Neural Information Processing Systems (NIPS), 2018

From Nodes to Networks: Evolving Recurrent Neural Networks

A. Rawal, R. Miikkulainen
Gated recurrent networks such as those composed of Long Short-Term Memory (LSTM) nodes have recently been used to improve state of the art in many sequential processing tasks such as speech recognition and machine translation. However, the basic structure of the LSTM node is essentially the same as when it was first conceived 25 years ago. Recently, evolutionary and reinforcement learning mechanisms have been employed to create new variations of this structure. This paper proposes a new method, evolution of a tree-based encoding of the gated memory nodes, and shows that it makes it possible to explore new variations more effectively than other methods. [...] [PDF at arXiv]
Workshop on Meta-Learning at Conference on Neural Information Processing Systems (MetaLearn @ NeurIPS), 2018

Rotated Rectangles for Symbolized Building Footprint Extraction

M. Dickenson, L. Gueguen
Building footprints (BFP) provide useful visual context for users of digital maps when navigating in space. This paper proposes a method for extracting and symbolizing building footprints from satellite imagery using a convolutional neural network (CNN). [...] [PDF at Computer Vision Foundation open access]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Profiling Android Applications with Nanoscope

L. Liu, L. Takamine, A. Welc
User-level tooling support for profiling Java applications executing on modern JVMs for desktop and server is quite mature – from OpenJDK's Java Flight Recorder enabling low-overhead CPU and heap profiling, through third-party async profilers (e.g. async-profiler, honest-profiler), to OpenJDK's support for low-overhead tracking of allocation call sites. [...] [PDF at UCLA]
Virtual Machines and Language Implementations (VMIL), 2018

Joint Mapping and Calibration via Differentiable Sensor Fusion

J. Chen, F. Obermeyer, V. Lyapunov, L. Gueguen, N. Goodman
We leverage automatic differentiation (AD) and probabilistic programming to develop an end-to-end optimization algorithm for batch triangulation of a large number of unknown objects. Given noisy detections extracted from noisily geo-located street level imagery without depth information, we jointly estimate the number and location of objects of different types, together with parameters for sensor noise characteristics and prior distribution of objects conditioned on side information. [...] [PDF at arXiv]
Computer Vision and Pattern Recognition (CVPR), 2018

SurfConv: Bridging 3D and 2D Convolution for RGBD Images

H. Chu, W. Ma, K. Kundu, R. Urtasun, S. Fidler
The last few years have seen approaches trying to combine the increasing popularity of depth sensors and the success of the convolutional neural networks. Using depth as an additional channel alongside the RGB input inherits the scale-variance problem present in image-convolution-based approaches. [...] [PDF at Computer Vision Foundation open access]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a...

W. Luo, B. Yang, R. Urtasun
In this paper we propose a novel deep neural network that is able to jointly reason about 3D detection, tracking and motion forecasting given data captured by a 3D sensor. By jointly reasoning about these tasks, our holistic approach is more robust to occlusion as well as sparse data at range. [...] [PDF at Computer Vision Foundation open access]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Metropolis-Hastings Generative Adversarial Networks

R. Turner, J. Hung, Y. Saatci, J. Yosinski
We introduce the Metropolis-Hastings generative adversarial network (MH-GAN), which combines aspects of Markov chain Monte Carlo and GANs. The MH-GAN draws samples from the distribution implicitly defined by a GAN's discriminator-generator pair, as opposed to sampling in a standard GAN which draws samples from the distribution defined by the generator. [...] [PDF at arXiv]
arXiv, 2018
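The selection step the abstract describes can be sketched in a few lines: a chain is initialized from the generator, and each new generator sample is accepted or rejected using the discriminator score as a density-ratio estimate. The snippet below is only an illustration of the Metropolis-Hastings acceptance rule under assumed caller-supplied `generator` and `discriminator` callables, not the paper's implementation.

```python
import numpy as np

def mh_gan_sample(generator, discriminator, n_steps=100, rng=np.random.default_rng()):
    """One MH-GAN-style chain: propose from the generator, accept/reject with discriminator scores.

    `generator()` returns one sample x; `discriminator(x)` returns D(x) in (0, 1),
    interpreted as the (calibrated) probability that x is real. Both are assumed
    to be supplied by the caller.
    """
    x = generator()                      # initialize the chain from the generator
    d_x = discriminator(x)
    for _ in range(n_steps):
        x_prop = generator()             # independence proposal q = p_g
        d_prop = discriminator(x_prop)
        # density-ratio trick: p_data(x)/p_g(x) ~ D(x)/(1 - D(x)), so the MH
        # acceptance ratio reduces to (1/D(x) - 1)/(1/D(x') - 1)
        alpha = min(1.0, (1.0 / d_x - 1.0) / (1.0 / d_prop - 1.0))
        if rng.uniform() < alpha:
            x, d_x = x_prop, d_prop      # accept the proposal
    return x                             # approximate sample from the data distribution
```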

Learning to Localize Using a LiDAR Intensity Map

I. Bârsan, S. Wang, A. Pokrovsky, R. Urtasun
In this paper we propose a real-time, calibration-agnostic and effective localization system for self-driving cars. Our method learns to embed the online LiDAR sweeps and intensity map into a joint deep embedding space. [...] [PDF at Proceedings of Machine Learning Research]
Conference on Robot Learning (CORL), 2018

HDNET: Exploiting HD Maps for 3D Object Detection

B. Yang, M. Liang, R. Urtasun
In this paper we show that High-Definition (HD) maps provide strong priors that can boost the performance and robustness of modern 3D object detectors. Towards this goal, we design a single stage detector that extracts geometric and semantic features from the HD maps. [...] [PDF at Proceedings of Machine Learning Research]
Conference on Robot Learning (CORL), 2018

Probabilistic Meta-Representations Of Neural Networks

T. Karaletsos, P. Dayan, Z. Ghahramani
Existing Bayesian treatments of neural networks are typically characterized by weak prior and approximate posterior distributions according to which all the weights are drawn independently. Here, we consider a richer prior distribution in which units in the network are represented by latent variables, and the weights between units are drawn conditionally on the values of the collection of those variables. [...] [PDF at arXiv]
UAI 2018 Uncertainty In Deep Learning Workshop (UDL), 2018

Deep Continuous Fusion for Multi-Sensor 3D Object Detection

M. Liang, B. Yang, S. Wang, R. Urtasun
In this paper, we propose a novel 3D object detector that can exploit both LIDAR as well as cameras to perform very accurate localization. Towards this goal, we design an end-to-end learnable architecture that exploits continuous convolutions to fuse image and LIDAR feature maps at different levels of resolution. [...] [PDF at drive.google.com]
European Conference on Computer Vision (ECCV), 2018

IntentNet: Learning to Predict Intention from Raw Sensor Data

S. Casas, W. Luo, R. Urtasun
In order to plan a safe maneuver, self-driving vehicles need to understand the intent of other traffic participants. We define intent as a combination of discrete high level behaviors as well as continuous trajectories describing future motion. In this paper we develop a one-stage detector and forecaster that exploits both 3D point clouds produced by a LiDAR sensor as well as dynamic maps of the environment. [...] [PDF at Proceedings of Machine Learning Research]
Conference on Robot Learning (CORL), 2018

Incremental Few-Shot Learning with Attention Attractor Networks

M. Ren, R. Liao, E. Fetaya, R. Zemel
This paper addresses the problem of incremental few-shot learning, where a regular classification network has already been trained to recognize a set of base classes, and several extra novel classes are being considered, each with only a few labeled examples. [...] [PDF at Workshop on Meta-Learning (MetaLearn 2018)]
Neural Information Processing Systems (NIPS), 2018

Predicting Motion of Vulnerable Road Users using High-Definition Maps and Efficient ConvNets

F. Chou, T.-H. Lin, H. Cui, V. Radosavljevic, T. Nguyen, T. Huang, M. Niedoba, J. Schneider, N. Djuric
Following detection and tracking of traffic actors, prediction of their future motion is the next critical component of a self-driving vehicle (SDV), allowing the SDV to move safely and efficiently in its environment. This is particularly important when it comes to vulnerable road users (VRUs), such as pedestrians and bicyclists. We present a deep learning method for predicting VRU movement in which we rasterize high-definition maps and the actor's surroundings into a bird's-eye view image used as input to convolutional networks. [...] [PDF at OpenReview.net]
Neural Information Processing Systems (NeurIPS) - MLITS workshop, 2018

LSQ++: lower running time and higher recall in multi-codebook quantization

J. Martinez, S. Zakhmi, H. Hoos, and J. Little
Multi-codebook quantization (MCQ) is the task of expressing a set of vectors as accurately as possible in terms of discrete entries in multiple bases. Work in MCQ is heavily focused on lowering quantization error, thereby improving distance estimation and recall on benchmarks of visual descriptors at a fixed memory budget. [...] [PDF at Computer Vision Foundation open access]
European Conference on Computer Vision (ECCV), 2018

End-to-End Deep Structured Models for Drawing Crosswalks

J. Liang, R. Urtasun
In this paper we address the problem of detecting crosswalks from LiDAR and camera imagery. Towards this goal, given multiple LiDAR sweeps and the corresponding imagery, we project both inputs onto the ground surface to produce a top down view of the scene. [...] [PDF at drive.google.com]
European Conference on Computer Vision (ECCV), 2018

Neural Guided Constraint Logic Programming for Program Synthesis

L. Zhang, G. Rosenblatt, E. Fetaya, R. Liao, W. Byrd, M. Might, R. Urtasun, R. Zemel
Synthesizing programs using example input/outputs is a classic problem in artificial intelligence. We present a method for solving Programming By Example (PBE) problems by using a neural model to guide the search of a constraint logic programming system called miniKanren. [...] [PDF at arXiv]
Advances in Neural Information Processing Systems (NIPS), 2018

Efficient Convolutions for Real-Time Semantic Segmentation of 3D Point Clouds

C. Zhang, W. Luo, R. Urtasun
We propose an approach for semi-automatic annotation of object instances. While most current methods treat object segmentation as a pixel-labeling problem, we here cast it as a polygon prediction task, mimicking how most current datasets have been annotated. [...] [PDF at University of Toronto]
International Conference on 3D Vision (3DV), 2018

Single Image Intrinsic Decomposition Without a Single Intrinsic Image

W. Ma, H. Chu, B. Zhou, R. Urtasun, A. Torralba
We propose an approach for semi-automatic annotation of object instances. While most current methods treat object segmentation as a pixel-labeling problem, we here cast it as a polygon prediction task, mimicking how most current datasets have been annotated. [...] [PDF at MIT]
European Conference on Computer Vision (ECCV), 2018

Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity

T. Miconi, A. Rawal, J. Clune, K. Stanley
A grand challenge in reinforcement learning is intelligent exploration, especially when rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard-exploration domains: Montezuma's Revenge and Pitfall. On both games, current RL algorithms perform poorly, even those with intrinsic motivation, which is the dominant method to improve performance on hard-exploration domains. To address this shortfall, we introduce a new algorithm called Go-Explore. [...] [PDF at OpenReview]
International Conference on Learning Representations (ICLR), 2019

Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks

H. Cui, V. Radosavljevic, F. Chou, T.-H. Lin, T. Nguyen, T. Huang, J. Schneider, N. Djuric
Autonomous driving presents one of the largest problems that the robotics and artificial intelligence communities are facing at the moment, both in terms of difficulty and potential societal impact. Self-driving vehicles (SDVs) are expected to prevent road accidents and save millions of lives while improving the livelihood and life quality of many more. [...] [PDF at arXiv]
International Conference on Robotics and Automation (ICRA), 2019

Functional Programming for Modular Bayesian Inference

A. Ścibior, O. Kammar, Z. Ghahramani
A grand challenge in reinforcement learning is intelligent exploration, especially when rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard-exploration domains: Montezuma's Revenge and Pitfall. On both games, current RL algorithms perform poorly, even those with intrinsic motivation, which is the dominant method to improve performance on hard-exploration domains. [...] [PDF at University of Cambridge]
arXiv, 2019

Manifold: A Model-Agnostic Framework for Interpretation and Diagnosis of Machine Learning Models

J. Zhang, Y. Wang, P. Molino, L. Li, D. Ebert
Interpretation and diagnosis of machine learning models have gained renewed interest in recent years with breakthroughs in new approaches. We present Manifold, a framework that utilizes visual analysis techniques to support interpretation, debugging, and comparison of machine learning models in a more transparent and interactive manner. [...] [PDF at arXiv]
IEEE Visualization (IEEE VIS), 2018

Discovering Interpretable Representations for Both Deep Generative and Discriminative Models

T. Adel, Z. Ghahramani, A. Weller
Interpretability of representations in both deep generative and discriminative models is highly desirable. Current methods jointly optimize an objective combining accuracy and interpretability. However, this may reduce accuracy, and is not applicable to already trained models. We propose two interpretability frameworks. First, we provide an interpretable lens for an existing model. We use a generative model which takes as input the representation in an existing (generative or discriminative) model, weakly supervised by limited side information. [...] [PDF at Proceedings of Machine Learning Research]
International Conference on Machine Learning (ICML), 2018

COTA: Improving the Speed and Accuracy of Customer Support through Ranking and Deep Networks

P. Molino, H. Zheng, Y.-C. Wang
For a company looking to provide delightful user experiences, it is of paramount importance to take care of any customer issues. This paper proposes COTA, a system to improve speed and reliability of customer support for end users through automated ticket classification and answers selection for support representatives. [...] [PDF at arXiv]
ACM SIGKDD International Conference on Knowledge Discovery and Data Science (KDD), 2018

Variational Bayesian dropout: pitfalls and fixes

J. Hron, A. Matthews, Z. Ghahramani
Dropout, a stochastic regularisation technique for training of neural networks, has recently been reinterpreted as a specific type of approximate inference algorithm for Bayesian neural networks. The main contribution of the reinterpretation is in providing a theoretical framework useful for analysing and extending the algorithm [...] [PDF on arXiv]
International Conference on Machine Learning (ICML), 2018

Evolving Multimodal Robot Behavior via Many Stepping Stones with the Combinatorial Multi-Objective Evolutionary Algorithm

J. Huizinga, J. Clune
An important challenge in reinforcement learning, including evolutionary robotics, is to solve multimodal problems, where agents have to act in qualitatively different ways depending on the circumstances. Because multimodal problems are often too difficult to solve directly, it is helpful to take advantage of staging, where a difficult task is divided into simpler subtasks that can serve as stepping stones for solving the overall problem. [...] [PDF at arXiv]
arXiv, 2017

An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution

R. Liu, J. Lehman, P. Molino, F. Such, E. Frank, A. Sergeev, J. Yosinski
Few ideas have enjoyed as large an impact on deep learning as convolution. For any problem involving pixels or spatial representations, common intuition holds that convolutional neural networks may be appropriate. In this paper we show a striking counterexample to this intuition via the seemingly trivial coordinate transform problem, which simply requires learning a mapping between coordinates in (x,y) Cartesian space and one-hot pixel space. [...] [PDF at arXiv]
Advances in Neural Information Processing Systems (NIPS), 2018
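The coordinate transform problem above boils down to giving the convolution explicit access to position. Below is a minimal, framework-agnostic sketch of the CoordConv-style input augmentation (plain NumPy, channel-first layout assumed); it illustrates the idea only, not the paper's layer implementation.

```python
import numpy as np

def add_coord_channels(feature_map):
    """Append normalized x/y coordinate channels to a (C, H, W) feature map,
    the core of the CoordConv idea: let the convolution see where it is."""
    c, h, w = feature_map.shape
    ys = np.linspace(-1.0, 1.0, h).reshape(h, 1).repeat(w, axis=1)   # row coordinate
    xs = np.linspace(-1.0, 1.0, w).reshape(1, w).repeat(h, axis=0)   # column coordinate
    return np.concatenate([feature_map, ys[None], xs[None]], axis=0)  # (C + 2, H, W)

# The augmented tensor is then fed to an ordinary convolution layer.
features = np.random.randn(16, 64, 64)
print(add_coord_channels(features).shape)  # (18, 64, 64)
```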

Differentiable Compositional Kernel Learning for Gaussian Processes

S. Sun, G. Zhang, C. Wang, W. Zeng, J. Li, R. Grosse
The generalization properties of Gaussian processes depend heavily on the choice of kernel, and this choice remains a dark art. We present the Neural Kernel Network (NKN), a flexible family of kernels represented by a neural network. [...] [PDF at arXiv]
International Conference on Machine Learning (ICML), 2018

GeoNet: Geometric Neural Network for Joint Depth and Surface Normal Estimation

X. Qi, R. Liao, Z. Liu, R. Urtasun, J. Jia
In this paper, we propose Geometric Neural Network (GeoNet) to jointly predict depth and surface normal maps from a single image. Building on top of two-stream CNNs, our GeoNet incorporates geometric relation between depth and surface normal via the new depth-to-normal and normal-to-depth networks. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Matching Adversarial Networks

G. Mattyus, R. Urtasun
Generative Adversarial Nets (GANs) and Conditional GANs (CGANs) show that using a trained network as a loss function (discriminator) makes it possible to synthesize highly structured outputs (e.g. natural images). However, applying a discriminator network as a universal loss function for common supervised tasks (e.g. semantic segmentation, line detection, depth estimation) is considerably less successful. [...] [PDF at arXiv]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Deep Parametric Continuous Convolutional Neural Networks

S. Wang, S. Suo, W. Ma, A. Pokrovsky, R. Urtasun
We propose an approach for semi-automatic annotation of object instances. While most current methods treat object segmentation as a pixel-labeling problem, we here cast it as a polygon prediction task, mimicking how most current datasets have been annotated. [...] [PDF at Computer Vision Foundation open access]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

End-to-end Learning of Multi-sensor 3D Tracking by Detection

D. Frossard, R. Urtasun
In this paper we propose a novel approach to tracking by detection that can exploit both cameras as well as LIDAR data to produce very accurate 3D trajectories. Towards this goal, we formulate the problem as a linear program that can be solved exactly, and learn convolutional networks for detection as well as matching in an end-to-end manner. [...] [PDF at arXiv]
International Conference on Robotics and Automation (ICRA), 2018

Pathwise Derivatives for Multivariate Distributions

M. Jankowiak, T. Karaletsos
We exploit the link between the transport equation and derivatives of expectations to construct efficient pathwise gradient estimators for multivariate distributions. We focus on two main threads. [...] [PDF at arXiv]
International Conference on Artificial Intelligence and Statistics (AISTATS) (in submission), 2019

Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning

M. Norouzzadeh, A. Nguyen, M. Kosmala, A. Swanson, M. Palmer, C. Parker, J. Clune
Having accurate, detailed, and up-to-date information about the location and behavior of animals in the wild would revolutionize our ability to study and conserve ecosystems. We investigate the ability to automatically, accurately, and inexpensively collect such data, which could transform many fields of biology, ecology, and zoology into "big data" sciences. [...] [PDF at University of Wyoming]
PNAS Vol. 115 no. 25, 2018

Hierarchical Recurrent Attention Networks for Structured Online Maps

N. Homayounfar, W. Ma, S. Lakshmikanth, R. Urtasun
In this paper, we tackle the problem of online road network extraction from sparse 3D point clouds. Our method is inspired by how an annotator builds a lane graph, by first identifying how many lanes there are and then drawing each one in turn. [...] [PDF at Computer Vision Foundation open access]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Deep Curiosity Search: Intra-Life Exploration Can Improve Performance on Challenging Deep Reinforcement Learning Problems

C. Stanton, J. Clune
Traditional exploration methods in RL require agents to perform random actions to find rewards. But these approaches struggle on sparse-reward domains like Montezuma's Revenge where the probability that any random action sequence leads to reward is extremely low. Recent algorithms have performed well on such tasks by encouraging agents to visit new states or perform new actions in relation to all prior training episodes (which we call across-training novelty). [...] [PDF at arXiv]
arXiv:1806.00553v3, 2018

Pathwise Derivatives Beyond the Reparameterization Trick

M. Jankowiak, F. Obermeyer
We observe that gradients computed via the reparameterization trick are in direct correspondence with solutions of the transport equation in the formalism of optimal transport. We use this perspective to compute (approximate) pathwise gradients for probability distributions not directly amenable to the reparameterization trick: Gamma, Beta, and Dirichlet. [...] [PDF at arXiv]
International Conference on Machine Learning (ICML), 2018
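For reference, the baseline this work generalizes is the standard Gaussian reparameterization, where writing z = μ + σε makes the sampling path differentiable. A toy NumPy sketch under assumed inputs (the objective f and its derivative df are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def pathwise_grad_gaussian(mu, sigma, f, df, n=10_000):
    """Standard reparameterization (pathwise) estimator for d/d(mu, sigma) E[f(z)],
    z ~ N(mu, sigma^2): write z = mu + sigma * eps with eps ~ N(0, 1) and
    differentiate through the deterministic path."""
    eps = rng.standard_normal(n)
    z = mu + sigma * eps
    grad_mu = np.mean(df(z))           # dz/dmu = 1
    grad_sigma = np.mean(df(z) * eps)  # dz/dsigma = eps
    return grad_mu, grad_sigma

# Toy check with f(z) = z**2: E[z^2] = mu^2 + sigma^2, so the gradients are (2*mu, 2*sigma).
print(pathwise_grad_gaussian(1.5, 0.7, f=lambda z: z**2, df=lambda z: 2 * z))
```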

VINE: An Open Source Interactive Data Visualization Tool for Neuroevolution

R. Wang, J. Clune, K. Stanley
Recent advances in deep neuroevolution have demonstrated that evolutionary algorithms, such as evolution strategies (ES) and genetic algorithms (GA), can scale to train deep neural networks to solve difficult reinforcement learning (RL) problems. However, it remains a challenge to analyze and interpret the underlying process of neuroevolution in such high dimensions. To begin to address this challenge, this paper presents an interactive data visualization tool called VINE (Visual Inspector for NeuroEvolution) aimed at helping neuroevolution researchers and end-users better understand and explore this family of algorithms. [...] [PDF on arXiv]
Visualization Workshop at The Genetic and Evolutionary Computation Conference (GECCO), 2018

Robust Dense Mapping for Large-Scale Dynamic Environments

I. Bârsan, P. Liu, M. Pollefeys, A. Geiger
We present a stereo-based dense mapping algorithm for large-scale dynamic urban environments. In contrast to other existing methods, we simultaneously reconstruct the static background, the moving objects, and the potentially moving but currently stationary objects separately, which is desirable for high-level mobile robotic tasks such as path planning in crowded environments. [...] [PDF at cvlibs.net]
Video: [LINK]
Project Page: [LINK]
International Conference on Robotics and Automation (ICRA), 2018

Measuring the Intrinsic Dimension of Objective Landscapes

C. Li, H. Farkhoor, R. Liu, J. Yosinski
Many recently trained neural networks employ large numbers of parameters to achieve good performance. One may intuitively use the number of parameters required as a rough gauge of the difficulty of a problem. But how accurate are such notions? How many parameters are really needed? In this paper we attempt to answer this question by training networks not in their native parameter space, but instead in a smaller, randomly oriented subspace. [...] [PDF at arXiv]
International Conference on Learning Representations (ICLR), 2018
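The subspace training described above can be sketched by freezing the native parameters θ0 and a random projection P, and optimizing only a low-dimensional vector d with θ = θ0 + P d. The snippet below is an illustrative NumPy parameterization under assumed shapes, not the authors' code.

```python
import numpy as np

def make_subspace_params(theta0, intrinsic_dim, rng=np.random.default_rng(0)):
    """Parameterize a D-dimensional network by a d-dimensional vector:
    theta = theta0 + P @ d, with P a fixed random projection with unit-norm columns."""
    D = theta0.size
    P = rng.standard_normal((D, intrinsic_dim))
    P /= np.linalg.norm(P, axis=0, keepdims=True)   # normalize columns
    d = np.zeros(intrinsic_dim)                      # only d is trained
    return P, d

def theta_from_subspace(theta0, P, d):
    return theta0 + P @ d

# Only `d` would be updated by the optimizer; theta0 and P stay frozen. The smallest
# intrinsic_dim that reaches good performance is the task's intrinsic dimension
# in the paper's sense.
theta0 = np.random.randn(10_000)
P, d = make_subspace_params(theta0, intrinsic_dim=100)
print(theta_from_subspace(theta0, P, d).shape)  # (10000,)
```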

Gaussian Process Behaviour in Wide Deep Neural Networks

A. Matthews, M. Rowland, J. Hron, R. Turner, Z. Ghahramani
Whilst deep neural networks have shown great empirical success, there is still much work to be done to understand their theoretical properties. In this paper, we study the relationship between random, wide, fully connected, feedforward networks with more than one hidden layer and Gaussian processes with a recursive kernel definition. [...] [PDF at OpenReview.net]
International Conference on Learning Representations (ICLR), 2018
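The recursive kernel mentioned above is, in the standard NNGP formulation, built layer by layer. The block below sketches that construction under assumed notation (nonlinearity φ, weight and bias variances σ_w², σ_b², input dimension d), which may differ from the paper's exact definitions.

```latex
% Sketch of the standard recursive NNGP kernel construction (assumed notation,
% not taken verbatim from the paper).
\begin{align}
  K^{(1)}(x, x') &= \sigma_b^2 + \frac{\sigma_w^2}{d}\, x^\top x' \\
  K^{(\ell+1)}(x, x') &= \sigma_b^2 + \sigma_w^2\,
      \mathbb{E}_{f \sim \mathcal{GP}\left(0,\, K^{(\ell)}\right)}
      \big[\phi(f(x))\, \phi(f(x'))\big]
\end{align}
```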

Differentiable plasticity: training plastic neural networks with backpropagation

T. Miconi, J. Clune, K. Stanley
How can we build agents that keep learning from experience, quickly and efficiently, after their initial training? Here we take inspiration from the main mechanism of learning in biological brains: synaptic plasticity, carefully tuned by evolution to produce efficient lifelong learning. We show that plasticity, just like connection weights, can be optimized by gradient descent in large (millions of parameters) recurrent networks with Hebbian plastic connections. [...] [PDF at arXiv]
International Conference on Machine Learning (ICML), 2018
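A toy sketch of the mechanism: each connection combines a fixed weight with a fast Hebbian trace scaled by a learned plasticity coefficient. Everything below (shapes, initialization, the exact update rule) is an illustrative assumption; in the paper the fixed weights, plasticity coefficients, and plasticity learning rate are themselves trained by backpropagation.

```python
import numpy as np

class PlasticLayer:
    """Minimal sketch of a layer with Hebbian plasticity: the effective weight is
    a fixed part W plus a fast trace Hebb scaled element-wise by alpha."""

    def __init__(self, n_in, n_out, eta=0.1, rng=np.random.default_rng(0)):
        self.W = rng.standard_normal((n_out, n_in)) * 0.1      # fixed (slow) weights
        self.alpha = rng.standard_normal((n_out, n_in)) * 0.1  # per-connection plasticity
        self.eta = eta                                          # plasticity learning rate
        self.hebb = np.zeros((n_out, n_in))                     # fast Hebbian trace

    def forward(self, x):
        y = np.tanh((self.W + self.alpha * self.hebb) @ x)
        # Hebbian update of the fast trace (running average of outer products)
        self.hebb = (1 - self.eta) * self.hebb + self.eta * np.outer(y, x)
        return y

layer = PlasticLayer(n_in=8, n_out=4)
for _ in range(5):
    out = layer.forward(np.random.randn(8))
print(out.shape)  # (4,)
```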

Sports Field Localization via Deep Structured Models

N. Homayounfar, S. Fidler, R. Urtasun
In this work, we propose a novel way of efficiently localizing a soccer field from a single broadcast image of the game. Related work in this area relies on manually annotating a few key frames and extending the localization to similar images, or installing fixed specialized cameras in the stadium from which the layout of the field can be obtained. [...] [PDF at MIT]
Reference & Citations: [LINK]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017

The surprising creativity of digital evolution: A collection of anecdotes from the evolutionary computation...

J. Lehman, J. Clune, D. Misevic, C. Adami, L. Altenberg, J. Beaulieu, P. Bentley, S. Bernard, G. Beslon, D. Bryson, P. Chrabaszcz, N. Cheney, A. Cully, S. Doncieux, F. Dyer, K. Ellefsen, R. Feldt, S. Fischer, S. Forrest, A. Frénoy, C. Gagné, L. Goff, L. Grabowski, B. Hodjat, F. Hutter, L. Keller, C. Knibbe, P. Krcah, R. Lenski, H. Lipson, R. MacCurdy, C. Maestre, R. Miikkulainen, S. Mitri, D. Moriarty, J. Mouret, A. Nguyen, C. Ofria, M. Parizeau, D. Parsons, R. Pennock, W. Punch, T. Ray, M. Schoenauer, E. Shulte, K. Sims, K. Stanley, F. Taddei, D. Tarapore, S. Thibault, W. Weimer, R. Watson, J. Yosinski
Biological evolution provides a creative fount of complex and subtle adaptations, often surprising the scientists who discover them. However, because evolution is an algorithmic process that transcends the substrate in which it occurs, evolution's creativity is not limited to nature. [...] [PDF on arXiv]
arXiv, 2018

Understanding Short-Horizon Bias in Stochastic Meta-Optimization

Y. Wu*, M. Ren*, R. Liao, R. Grosse
Careful tuning of the learning rate, or even schedules thereof, can be crucial to effective neural net training. There has been much recent interest in gradient-based meta-optimization, where one tunes hyperparameters, or even learns an optimizer, in order to minimize the expected loss when the training procedure is unrolled. [...] [PDF on arXiv]
International Conference on Learning Representations (ICLR), 2018

Learning deep structured active contours end-to-end

D. Marcos, D. Tuia, B. Kellenberger, L. Zhang, M. Bai, R. Liao, R. Urtasun
The world is covered with millions of buildings, and precisely knowing each instance's position and extents is vital to a multitude of applications. Recently, automated building footprint segmentation models have shown superior detection accuracy thanks to the usage of Convolutional Neural Networks (CNN). [...] [PDF at Computer Vision Foundation open access]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Meta-Learning for Semi-Supervised Few-Shot Classification

M. Ren, E. Triantafillou, S. Ravi, J. Snell, K. Swersky, J. Tenenbaum, H. Larochelle, R. Zemel
In few-shot classification, we are interested in learning algorithms that train a classifier from only a handful of labeled examples. Recent progress in few-shot classification has featured meta-learning, in which a parameterized model for a learning algorithm is defined and trained on episodes representing different classification problems, each with a small labeled training set and its corresponding test set. [...] [PDF at University of Toronto]
Code & Datasets: [LINK]
International Conference on Learning Representations (ICLR), 2018

Inference in Probabilistic Graphical Models by Graph Neural Networks

K. Yoon, R. Liao, Y. Xiong, L. Zhang, E. Fetaya, R. Urtasun, R. Zemel, X. Pitkow
A fundamental computation for statistical inference and accurate decision-making is to compute the marginal probabilities or most probable states of task-relevant variables. Probabilistic graphical models can efficiently represent the structure of such complex data, but performing these inferences is generally difficult. [...] [PDF at arXiv]
International Conference on Learning Representations (ICLR), 2018

Reviving and Improving Recurrent Back Propagation

R. Liao, Y. Xiong, E. Fetaya, L. Zhang, K. Yoon, X. Pitkow, R. Urtasun, R. Zemel
In this paper, we revisit the recurrent back-propagation (RBP) algorithm, discuss the conditions under which it applies as well as how to satisfy them in deep neural networks. We show that RBP can be unstable and propose two variants based on conjugate gradient on the normal equations (CG-RBP) and Neumann series (Neumann-RBP). [...] [PDF at arXiv]
International Conference on Machine Learning (ICML), 2018

Learning to Reweight Examples for Robust Deep Learning

M. Ren, W. Zeng, B. Yang, R. Urtasun
Deep neural networks have been shown to be very powerful modeling tools for many supervised learning tasks involving complex input patterns. However, they can also easily overfit to training set biases and label noise. [...] [PDF at arXiv]
International Conference on Machine Learning (ICML), 2018

Incorporating the Structure of the Belief State in End-to-End Task-Oriented Dialogue Systems

L. Shu, P. Molino, M. Namazifar, B. Liu, H. Xu, H. Zheng, and G. Tur
End-to-end trainable networks try to overcome error propagation, lack of generalization and overall brittleness of traditional modularized task-oriented dialogue system architectures. Most proposed models expand on the sequence-to-sequence architecture. Some of them don’t track belief state, which makes it difficult to interact with ever-changing knowledge bases, while the ones that explicitly track the belief state do it with classifiers. The use of classifiers suffers from the out-of-vocabulary words problem, making these models hard to use in real-world applications with ever-changing knowledge bases. We propose Structured Belief Copy Network (SBCN), a novel end-to-end trainable architecture that allows for interaction with external symbolic knowledge bases and solves the out-of-vocabulary problem at the same time. [...] [PDF at alborz-geramifard.com]
Conversational Intelligence Challenge at Conference on Neural Information Processing Systems (ConvAI @ NeurIPS), 2018

Can You be More Polite and Positive? Infusing Social Language into Task-Oriented Conversational Agents

Y.-C. Wang, R. Wang, G. Tur, H. Williams
Goal-oriented conversational agents are becoming ubiquitous in daily life for tasks ranging from personal assistants to customer support systems. For these systems to engage users and achieve their goals in a more natural manner, they need to not just provide informative replies and guide users through the problems but also to socialize with users. To this end, we extend the line of style transfer research on developing generative deep learning models to control for a specific style such as sentiment and personality. [...] [PDF at alborz-geramifard.com]
Conversational Intelligence Challenge at Conference on Neural Information Processing Systems (ConvAI @ NeurIPS), 2018

Graph Partition Neural Networks for Semi-Supervised Classification

R. Liao, M. Brockschmidt, D. Tarlow, A. Gaunt, R. Urtasun, R. Zemel
We present graph partition neural networks (GPNN), an extension of graph neural networks (GNNs) able to handle extremely large graphs. GPNNs alternate between locally propagating information between nodes in small subgraphs and globally propagating information between the subgraphs. [...] [PDF at arXiv]
International Conference on Learning Representations (ICLR) Workshop, 2018

The Mirage of Action-Dependent Baselines in Reinforcement Learning

G. Tucker, S. Bhupatiraju, S. Gu, R. Turner, Z. Ghahramani, S. Levine
Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance. Several recent papers extend the baseline to depend on both the state and action and suggest that this significantly reduces variance and improves sample efficiency without introducing bias into the gradient estimates. [...] [PDF]
International Conference on Machine Learning (ICML), 2018

Leveraging Constraint Logic Programming for Neural Guided Program Synthesis

L. Zhang, G. Rosenblatt, E. Fetaya, R. Liao, W. Byrd, R. Urtasun, R. Zemel
We present a method for solving Programming by Example (PBE) problems that tightly integrates a neural network with a constraint logic programming system called miniKanren. Internally, miniKanren searches for a program that satisfies the recursive constraints imposed by the provided examples. [...] [PDF at OpenReview.net]
International Conference on Learning Representations (ICLR), 2018

Weakly supervised collective feature learning from curated media

Y. Mukuta, A. Kimura, D. Adrian, Z. Ghahramani
The current state-of-the-art in feature learning relies on the supervised learning of large-scale datasets consisting of target content items and their respective category labels. However, constructing such large-scale fully-labeled datasets generally requires painstaking manual effort. One possible solution to this problem is to employ community contributed text tags as weak labels; however, the concepts underlying a single text tag strongly depend on the user. [...] [PDF on arXiv]
AAAI Conference on Artificial Intelligence (AAAI), 2018

NerveNet: Learning Structured Policy with Graph Neural Networks

T. Wang, R. Liao, J. Ba, S. Fidler
We address the problem of learning structured policies for continuous control. In traditional reinforcement learning, policies of agents are learned by multi-layer perceptrons (MLPs) which take the concatenation of all observations from the environment as input for predicting actions. [...] [PDF at OpenReview.net]
International Conference on Learning Representations (ICLR), 2018

Faster Neural Networks Straight from JPEG

L. Gueguen, A. Sergeev, B. Kadlec, R. Liu, J. Yosinski
The simple, elegant approach of training convolutional neural networks (CNNs) directly from RGB pixels has enjoyed overwhelming empirical success. But can more performance be squeezed out of networks by using different input representations? In this paper we propose and explore a simple idea: train CNNs directly on the blockwise discrete cosine transform (DCT) coefficients computed and available in the middle of the JPEG codec. [...] [PDF at NIPS Proceedings]
Advances in Neural Information Processing Systems (NIPS), 2018
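The input representation described above replaces RGB pixels with JPEG-style blockwise DCT coefficients. A rough sketch of producing such a representation from a grayscale array follows (illustrative only; the paper takes the coefficients directly from the JPEG codec rather than recomputing them):

```python
import numpy as np
from scipy.fftpack import dct

def blockwise_dct(gray, block=8):
    """Rearrange a grayscale image into JPEG-style 8x8 blockwise DCT coefficients:
    output shape (H/8, W/8, 64), i.e. a spatially smaller map with 64 frequency
    channels per block, the kind of representation fed to a CNN."""
    h, w = gray.shape
    h, w = h - h % block, w - w % block
    gray = gray[:h, :w].astype(np.float64)
    blocks = gray.reshape(h // block, block, w // block, block).transpose(0, 2, 1, 3)
    coeffs = dct(dct(blocks, axis=-1, norm='ortho'), axis=-2, norm='ortho')  # 2D DCT per block
    return coeffs.reshape(h // block, w // block, block * block)

img = np.random.rand(64, 64)
print(blockwise_dct(img).shape)  # (8, 8, 64)
```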

SBNet: Sparse Blocks Network for Fast Inference

M. Ren, A. Pokrovsky, B. Yang, R. Urtasun
Conventional deep convolutional neural networks (CNNs) apply convolution operators uniformly in space across all feature maps for hundreds of layers - this incurs a high computational cost for real-time applications. For many problems such as object detection and semantic segmentation, we are able to obtain a low-cost computation mask, either from a priori problem knowledge, or from a low-resolution segmentation network. [...] [PDF at arXiv]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking...

E. Conti, V. Madhavan, F. Such, J. Lehman, K. Stanley, J. Clune
Evolution strategies (ES) are a family of black-box optimization algorithms able to train deep neural networks roughly as well as Q-learning and policy gradient methods on challenging deep reinforcement learning (RL) problems, but are much faster (e.g. hours vs. days) because they parallelize better. [...] [PDF at arXiv]
Advances in Neural Information Processing Systems (NIPS), 2017

On the Relationship Between the OpenAI Evolution Strategy and Stochastic Gradient Descent

X. Zhang, J. Clune, K. Stanley
Because stochastic gradient descent (SGD) has shown promise optimizing neural networks with millions of parameters and few if any alternatives are known to exist, it has moved to the heart of leading approaches to reinforcement learning (RL). [...] [PDF at arXiv]
arXiv, 2017

Safe Mutations for Deep and Recurrent Neural Networks through Output Gradients

J. Lehman, J. Chen, J. Clune, K. Stanley
While neuroevolution (evolving neural networks) has a successful track record across a variety of domains from reinforcement learning to artificial life, it is rarely applied to large, deep neural networks. A central reason is that while random mutation generally works in low dimensions, a random perturbation of thousands or millions of weights is likely to break existing functionality, providing no learning signal even if some individual weight changes were beneficial. [...] [PDF at arXiv]
The Genetic and Evolutionary Computation Conference (GECCO), 2018

Characterizing how Visual Question Answering models scale with the world

E. Bingham, P. Molino, P. Szerlip, F. Obermeyer, N. Goodman
Detecting differences in generalization ability between models for visual question answering tasks has proven to be surprisingly difficult. We propose a new statistic, asymptotic sample complexity, for model comparison, and construct a synthetic data distribution to compare a strong baseline CNN-LSTM model to a structured neural network with powerful inductive biases. [...] [PDF at Github]
Advances in Neural Information Processing Systems (NeurIPS), 2017

Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for...

F. Such, V. Madhavan, E. Conti, J. Lehman, K. Stanley, J. Clune
Deep artificial neural networks (DNNs) are typically trained via gradient-based learning algorithms, namely backpropagation. Evolution strategies (ES) can rival backprop-based algorithms such as Q-learning and policy gradients on challenging deep reinforcement learning (RL) problems. [...] [PDF at arXiv]
Deep RL @ NeurIPS 2018, 2017

Open-endedness: The last grand challenge you’ve never heard of

K. Stanley
Artificial intelligence (AI) is a grand challenge for computer science. Lifetimes of effort and billions of dollars have powered its pursuit. Yet, today its most ambitious vision remains unmet: though progress continues, no human-competitive general digital intelligence is within our reach. [...] [HTML at O’Reilly Online]
O’Reilly Online, 2017

The Reversible Residual Network: Backpropagation Without Storing Activations

A. Gomez, M. Ren, R. Urtasun, R. Grosse
Residual Networks (ResNets) have demonstrated significant improvement over traditional Convolutional Neural Networks (CNNs) on image classification, increasing in performance as networks grow both deeper and wider. However, memory consumption becomes a bottleneck as one needs to store all the intermediate activations for calculating gradients using backpropagation. [...] [PDF on NIPS Proceedings]
Supplemental: [LINK]
Advances in Neural Information Processing Systems (NIPS), 2017

ES Is More Than Just a Traditional Finite-Difference Approximator

J. Lehman, J. Chen, J. Clune, K. Stanley
An evolution strategy (ES) variant based on a simplification of a natural evolution strategy recently attracted attention because it performs surprisingly well in challenging deep reinforcement learning domains. It searches for neural network parameters by generating perturbations to the current set of parameters, checking their performance, and moving in the aggregate direction of higher reward. [...] [PDF at arXiv]
The Genetic and Evolutionary Computation Conference (GECCO), 2018

Automated Identification of Northern Leaf Blight-Infected Maize Plants from Field Imagery Using Deep Learning

C. DeChant, T. Wiesner-Hanks, S. Chen, E. Stewart, J. Yosinski, M. Gore, R. Nelson, H. Lipson
Northern leaf blight (NLB) can cause severe yield loss in maize; however, scouting large areas to accurately diagnose the disease is time consuming and difficult. We demonstrate a system capable of automatically identifying NLB lesions in field-acquired images of maize plants with high reliability. [...] [PDF at Phytopathology]
Phytopathology, 2017

Variational Gaussian Dropout is not Bayesian

J. Hron, A. Matthews, Z. Ghahramani
Gaussian multiplicative noise is commonly used as a stochastic regularisation technique in training of deterministic neural networks. A recent paper reinterpreted the technique as a specific algorithm for approximate inference in Bayesian neural networks; several extensions ensued. [...] [PDF on Bayesian Deep Learning]
Advances in Neural Information Processing Systems (NeurIPS), 2017

Diffusion-based neuromodulation can eliminate catastrophic forgetting in simple neural networks

R. Velez, J. Clune
A long-term goal of AI is to produce agents that can learn a diversity of skills throughout their lifetimes and continuously improve those skills via experience. A longstanding obstacle towards that goal is catastrophic forgetting, which is when learning new information erases previously learned information. [...] [PDF at University of Wyoming]
PLoS One, 2017

Be Your Own Prada: Fashion Synthesis With Structural Coherence

S. Zhu, R. Urtasun, S. Fidler, D. Lin, C. Loy
We present a novel and effective approach for generating new clothing on a wearer through generative adversarial learning. Given an input image of a person and a sentence describing a different outfit, our model "redresses" the person as desired, while at the same time keeping the wearer and her/his pose unchanged. [...] [PDF at arXiv]
International Conference on Computer Vision (ICCV), 2017

DeepRoadMapper: Extracting Road Topology From Aerial Images

G. Máttyus, W. Luo, R. Urtasun
Creating road maps is essential for applications such as autonomous driving and city planning. Most approaches in industry focus on leveraging expensive sensors mounted on top of a fleet of cars. This results in very accurate estimates when exploiting a user in the loop. [...] [PDF at University of Toronto]
International Conference on Computer Vision (ICCV), 2017

3D Graph Neural Networks for RGBD Semantic Segmentation

X. Qi, R. Liao, J. Jia, S. Fidler, R. Urtasun
RGBD semantic segmentation requires joint reasoning about 2D appearance and 3D geometric information. In this paper we propose a 3D graph neural network (3DGNN) that builds a k-nearest neighbor graph on top of 3D point cloud. [...] [PDF at University of Toronto]
International Conference on Computer Vision (ICCV), 2017

SGN: Sequential Grouping Networks for Instance Segmentation

S. Liu, J. Jia, S. Fidler, R. Urtasun
In this paper, we propose Sequential Grouping Networks (SGN) to tackle the problem of object instance segmentation. SGNs employ a sequence of neural networks, each solving a sub-grouping problem of increasing semantic complexity in order to gradually compose objects out of pixels. [...] [PDF at University of Toronto]
International Conference on Computer Vision (ICCV), 2017

Synthesizing Entity Matching Rules by Examples

R. Singh, V. Meduri, A. Elmagarmid, S. Madden, P. Papotti, J. Quiané-Ruiz, A. Solar-Lezama, N. Tang
Entity matching (EM) is a critical part of data integration. We study how to synthesize entity matching rules from positive-negative matching examples. The core of our solution is program synthesis, a powerful tool to automatically generate rules (or programs) that satisfy a given high-level specification, via a predefined grammar. [...] [PDF at Very Large Data Base Endowment Inc.]
Proceedings of the VLDB Endowment (PVLDB) 11(2): 189-202, 2017

Situation Recognition With Graph Neural Networks

R. Li, M. Tapaswi, R. Liao, J. Jia, R. Urtasun, S. Fidler
We address the problem of recognizing situations in images. Given an image, the task is to predict the most salient verb (action), and fill its semantic roles such as who is performing the action, what is the source and target of the action, etc. [...] [PDF at arXiv]
International Conference on Computer Vision (ICCV), 2017

Lost Relatives of the Gumbel Trick

M. Balog, N. Tripuraneni, Z. Ghahramani, A. Weller
The Gumbel trick is a method to sample from a discrete probability distribution, or to estimate its normalizing partition function. The method relies on repeatedly applying a random perturbation to the distribution in a particular way, each time solving for the most likely configuration. [...] [PDF at Proceedings of Machine Learning Research]
International Conference on Machine Learning (ICML), 2017
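As a concrete reference point, the basic Gumbel(-max) trick this work builds on: perturb unnormalized log-potentials with Gumbel noise, take the argmax to draw an exact sample, and use the perturbed maximum to estimate the log partition function. A small NumPy sketch (toy potentials assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
EULER_GAMMA = 0.5772156649015329

def gumbel_max_sample(log_potentials):
    """Gumbel-max trick: perturb unnormalized log-potentials with Gumbel noise and
    take the argmax to get an exact sample from the normalized distribution."""
    g = rng.gumbel(size=log_potentials.shape)
    return np.argmax(log_potentials + g)

def gumbel_logZ_estimate(log_potentials, n=10_000):
    """The maximum of the perturbed potentials, averaged over noise draws and shifted
    by the Euler-Mascheroni constant, estimates the log partition function."""
    g = rng.gumbel(size=(n,) + log_potentials.shape)
    return np.mean(np.max(log_potentials + g, axis=-1)) - EULER_GAMMA

phi = np.log(np.array([0.1, 0.2, 0.7])) + 2.0   # unnormalized log-potentials, log Z = 2.0
print(gumbel_max_sample(phi), gumbel_logZ_estimate(phi))  # sample index, estimate close to 2.0
```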

A birth-death process for feature allocation

K. Palla, D. Knowles, Z. Ghahramani
We propose a Bayesian nonparametric prior over feature allocations for sequential data, the birth-death feature allocation process (BDFP). The BDFP models the evolution of the feature allocation of a set of N objects across a covariate (e.g. time) by creating and deleting features. [...] [PDF at University of Oxford]
International Conference on Machine Learning (ICML), 2017

Automatic Discovery of the Statistical Types of Variables in a Dataset

I. Valera, Z. Ghahramani
A common practice in statistics and machine learning is to assume that the statistical data types (e.g., ordinal, categorical or real-valued) of variables, and usually also the likelihood model, is known. However, as the availability of real-world data increases, this assumption becomes too restrictive. [...] [PDF at Proceedings of Machine Learning Research]
International Conference on Machine Learning (ICML), 2017

Uber-Text: A Large-Scale Dataset for Optical Character Recognition from Street-Level Imagery

Y. Zhang, L. Gueguen, I. Zharkov, P. Zhang, K. Seifert, B. Kadlec
Optical Character Recognition (OCR) approaches have advanced considerably in recent years thanks to the resurgence of deep learning. The state-of-the-art models are mainly trained on datasets consisting of constrained scenes; detecting and recognizing text in real-world images remains a technical challenge. [...] [PDF at MIT]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017

Neuroevolution: A Different Kind of Deep Learning

K. Stanley
Neuroevolution is making a comeback. Prominent artificial intelligence labs and researchers are experimenting with it, a string of new successes have bolstered enthusiasm, and new opportunities for impact in deep learning are emerging. [...] [HTML at O'Reilly Online]
O’Reilly Online, 2017

Few-Shot Learning Through an Information Retrieval Lens

E. Triantafillou, R. Zemel, R. Urtasun
Few-shot learning refers to understanding new concepts from only a few examples. We propose an information retrieval-inspired approach for this problem that is motivated by the increased importance of maximally leveraging all the available information in this low-data regime. [PDF at NIPS Proceedings]
Code: [LINK]
Advances in Neural Information Processing Systems (NIPS), 2017

General Latent Feature Modeling for Data Exploration Tasks

I. Valera, M. Pradier, Z. Ghahramani
This paper introduces a general Bayesian nonparametric latent feature model suitable to perform automatic exploratory analysis of heterogeneous datasets, where the attributes describing each object can be either discrete, continuous or mixed variables. The proposed model presents several important properties. [...] [PDF at OpenReview.net]
ICML Workshop on Human Interpretability in Machine Learning (ICML), 2017

Time-series extreme event forecasting with neural networks at Uber

N. Laptev, J. Yosinski, L. Li, S. Smyl
Accurate time-series forecasting during high variance segments (e.g., holidays) is critical for anomaly detection, optimal resource allocation, budget planning and other related tasks. At Uber, accurate prediction for completed trips during special events can lead to more efficient driver allocation, resulting in a decreased wait time for riders. [PDF on roseyu.com]
International Conference on Machine Learning (ICML), 2017

Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning

S. Gu, T. Lillicrap, R. Turner, Z. Ghahramani, B. Schölkopf, S. Levine
Off-policy model-free deep reinforcement learning methods using previously collected data can improve sample efficiency over on-policy policy gradient techniques. On the other hand, on-policy algorithms are often more stable and easier to use. [...] [PDF at NIPS Proceedings]
Supplemental: [LINK]
Advances in Neural Information Processing Systems (NIPS), 2017

SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability

M. Raghu, J. Gilmer, J. Yosinski, J. Sohl-Dickstein
We propose a new technique, Singular Vector Canonical Correlation Analysis (SVCCA), a tool for quickly comparing two representations in a way that is both invariant to affine transform (allowing comparison between different layers and networks) and fast to compute (allowing more comparisons to be calculated than with previous methods). [...] [PDF at arXiv]
Neural Information Processing Systems (NIPS), 2017

Bayesian Generative Adversarial Networks

Y. Saatchi, A. Wilson
Generative adversarial networks (GANs) can implicitly learn rich distributions over images, audio, and data which are hard to model with an explicit likelihood. We present a practical Bayesian formulation for unsupervised and semi-supervised learning with GANs. [...] [PDF at arXiv]
Advances in Neural Information Processing Systems (NeurIPS), 2017

Annotating Object Instances with a Polygon-RNN

L. Castrejón, K. Kundu, R. Urtasun, S. Fidler
We propose an approach for semi-automatic annotation of object instances. While most current methods treat object segmentation as a pixel-labeling problem, we here cast it as a polygon prediction task, mimicking how most current datasets have been annotated. [...] [PDF at University of Toronto]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017

The emergence of canalization and evolvability in an open-ended, interactive evolutionary system

J. Huizinga, K. Stanley, J. Clune
Natural evolution has produced a tremendous diversity of functional organisms. Many believe an essential component of this process was the evolution of evolvability, whereby evolution speeds up its ability to innovate by generating a more adaptive pool of offspring. [...] [PDF on arXiv]
Artificial Life (to appear), 2017

Detail-Revealing Deep Video Super-Resolution

X. Tao, H. Gao, R. Liao, J. Wang, J. Jia, K. Kundu
Previous CNN-based video super-resolution approaches need to align multiple frames to the reference. In this paper, we show that proper frame alignment and motion compensation is crucial for achieving high quality results. [...] [PDF at University of Toronto]
International Conference on Computer Vision (ICCV), 2017

Deep Bayesian Active Learning with Image Data

Y. Gal, R. Islam, Z. Ghahramani
Even though active learning forms an important pillar of machine learning, deep learning tools are not prevalent within it. Deep learning poses several difficulties when used in an active learning setting. [...] [PDF at Proceedings of Machine Learning Research]
International Conference on Machine Learning (ICML), 2017

Towards Diverse and Natural Image Descriptions via a Conditional GAN

B. Dai, S. Fidler, R. Urtasun, D. Lin
Despite the substantial progress in recent years, the image captioning techniques are still far from being perfect. Sentences produced by existing methods, e.g. those based on RNNs, are often overly rigid and lacking in variability. [...] [PDF on arXiv]
International Conference on Computer Vision (ICCV), 2017

Bayesian inference on random simple graphs with power law degree distributions

J. Lee, C. Heaukulani, Z. Ghahramani, L. James, S. Choi
We present a model for random simple graphs with a degree distribution that obeys a power law (i.e., is heavy-tailed). To attain this behavior, the edge probabilities in the graph are constructed from Bertoin-Fujita-Roynette-Yor (BFRY) random variables, which have been recently utilized in Bayesian statistics for the construction of power law models in several applications. [...] [PDF at arXiv]
International Conference on Machine Learning (ICML), 2017

TorontoCity: Seeing the World With a Million Eyes

S. Wang, M. Bai, G. Mattyus, H. Chu, W. Luo, B. Yang, J. Liang, J. Cheverie, S. Fidler, R. Urtasun, D. Lin
In this paper we introduce the TorontoCity benchmark, which covers the full greater Toronto area (GTA) with 712.5 km² of land, 8439 km of road and around 400,000 buildings. Our benchmark provides different perspectives of the world captured from airplanes, drones and cars driving around the city. [...] [PDF on arXiv]
International Conference on Computer Vision (ICCV), 2017

Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space

A. Nguyen, J. Yosinski, Y. Bengio, A. Dosovitskiy, J. Clune
Generating high-resolution, photo-realistic images has been a long-standing goal in machine learning. Recently, Nguyen et al. (2016) showed one interesting way to synthesize novel images by performing gradient ascent in the latent space of a generator network to maximize the activations of one or multiple neurons in a separate classifier network. In this paper we extend this method by introducing an additional prior on the latent code, improving both sample quality and sample diversity, leading to a state-of-the-art generative model that produces high quality images at higher resolutions (227x227) than previous generative models, and does so for all 1000 ImageNet categories. [...] [PDF at arXiv]
Computer Vision and Pattern Recognition (CVPR), 2017

Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic

S. Gu, T. Lillicrap, Z. Ghahramani, R. Turner, S. Levine
Model-free deep reinforcement learning (RL) methods have been successful in a wide variety of simulated domains. However, a major obstacle facing deep RL in the real world is their high sample complexity. [...] [PDF at OpenReview.net]
International Conference on Learning Representations (ICLR), 2017

Deep Watershed Transform for Instance Segmentation

M. Bai, R. Urtasun
Most contemporary approaches to instance segmentation use complex pipelines involving conditional random fields, recurrent neural networks, object proposals, or template matching schemes. In our paper, we present a simple yet powerful end-to-end convolutional neural network to tackle this task. [...] [PDF at arXiv]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017

Magnetic Hamiltonian Monte Carlo

N. Tripuraneni, M. Rowland, Z. Ghahramani, R. Turner
Hamiltonian Monte Carlo (HMC) exploits Hamiltonian dynamics to construct efficient proposals for Markov chain Monte Carlo (MCMC). In this paper, we present a generalization of HMC which exploits non-canonical Hamiltonian dynamics. [...] [PDF at arXiv]
International Conference on Machine Learning (ICML), 2017
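For context, the canonical-dynamics baseline that this work generalizes is standard HMC with a leapfrog integrator and a Metropolis correction. The sketch below is a generic NumPy implementation of that baseline under assumed, caller-supplied `logp` and `grad_logp`; it is not the paper's magnetic variant.

```python
import numpy as np

rng = np.random.default_rng(0)

def hmc_step(q, logp, grad_logp, step_size=0.1, n_leapfrog=20):
    """One step of standard HMC: resample momentum, simulate leapfrog dynamics,
    then accept or reject based on the change in total energy."""
    p = rng.standard_normal(q.shape)                   # resample momentum
    q_new, p_new = q.copy(), p.copy()
    p_new += 0.5 * step_size * grad_logp(q_new)        # half step for momentum
    for _ in range(n_leapfrog - 1):
        q_new += step_size * p_new                     # full step for position
        p_new += step_size * grad_logp(q_new)          # full step for momentum
    q_new += step_size * p_new
    p_new += 0.5 * step_size * grad_logp(q_new)        # final half step
    # Metropolis correction on the joint (position, momentum) energy
    log_accept = (logp(q_new) - 0.5 * p_new @ p_new) - (logp(q) - 0.5 * p @ p)
    return q_new if np.log(rng.uniform()) < log_accept else q

# Example: sample from a standard 2D Gaussian.
logp = lambda q: -0.5 * q @ q
grad_logp = lambda q: -q
samples, q = [], np.zeros(2)
for _ in range(1000):
    q = hmc_step(q, logp, grad_logp)
    samples.append(q)
print(np.mean(samples, axis=0), np.std(samples, axis=0))  # roughly [0, 0] and [1, 1]
```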

End-To-End Instance Segmentation With Recurrent Attention

M. Ren, R. Zemel
While convolutional neural networks have gained impressive success recently in solving structured prediction problems such as semantic segmentation, it remains a challenge to differentiate individual object instances in the scene. Instance segmentation is very important in a variety of applications, such as autonomous driving, image captioning, and visual question answering. [...] [PDF at University of Toronto]
Supplementary Materials: [LINK]
Code: [LINK]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017

Forecasting Interactive Dynamics of Pedestrians with Fictitious Play

W. Ma, D. Huang, N. Lee, K. Kitani
We develop predictive models of pedestrian dynamics by encoding the coupled nature of multi-pedestrian interaction using game theory, and deep learning-based visual analysis to estimate person-specific behavior parameters. Building predictive models for multi-pedestrian interactions, however, is very challenging for two reasons [...] [PDF at arXiv]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017

Conditional Similarity Networks

A. Veit, S. Belongie, T. Karaletsos
What makes images similar? To measure the similarity between images, they are typically embedded in a feature-vector space, in which their distance preserves the relative dissimilarity. However, when learning such similarity embeddings the simplifying assumption is commonly made that images are only compared to one unique measure of similarity. [...] [PDF at arXiv]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017
