Director, Product Management @PaperlessPost

Engineering Blog Articles

Let’s Go to the Party.

An Uber Developers guest post from Paperless Post by Kaspar Alexander

There’s a party tonight, and it’s time to get dressed (what’s the vibe at that place again?), round up your friends (ugh, why are they always late?), and figure …

Research Papers

Physically Realizable Adversarial Examples for LiDAR Object Detection

J. Tu, M. Ren, S. Manivasagam, B. Yang, M. Liang, R. Du, F. Cheng, R. Urtasun
Modern autonomous driving systems rely heavily on deep learning models to process point cloud sensory data; meanwhile, deep models have been shown to be susceptible to adversarial attacks with visually imperceptible perturbations. Despite the fact that this poses a security concern for the self-driving industry, there has been very little exploration in terms of 3D perception, as most adversarial attacks have only been applied to 2D flat images. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2020

First-Order Preconditioning via Hypergradient Descent

T. Moskovitz, R. Wang, J. Lan, S. Kapoor, T. Miconi, J. Yosinski, A. Rawal
Standard gradient descent methods are susceptible to a range of issues that can impede training, such as high correlations and different scaling in parameter space. These difficulties can be addressed by second-order approaches that apply a preconditioning matrix to the gradient to improve convergence. Unfortunately, such algorithms typically struggle to scale to high-dimensional problems, in part because the calculation of specific preconditioners such as the inverse Hessian or Fisher information matrix is highly expensive. We introduce first-order preconditioning (FOP), a fast, scalable approach that generalizes previous work on hypergradient descent (Almeida et al., 1998; Maclaurin et al., 2015; Baydin et al., 2017) to learn a preconditioning matrix that only makes use of first-order information. [...] [PDF]
Conference on Neural Information Processing Systems (NeurIPS), 2019

Fiber: A Platform for Efficient Development and Distributed Training for Reinforcement Learning and Population-Based Methods

J. Zhi, R. Wang, J. Clune, K. Stanley
Recent advances in machine learning are consistently enabled by increasing amounts of computation. Reinforcement learning (RL) and population-based methods in particular pose unique challenges for efficiency and flexibility to the underlying distributed computing frameworks. These challenges include frequent interaction with simulations, the need for dynamic scaling, and the need for a user interface with low adoption cost and consistency across different backends. In this paper we address these challenges while still retaining development efficiency and flexibility for both research and practical applications by introducing Fiber, a scalable distributed computing framework for RL and population-based methods. [...] [PDF]

Estimating Q(s,s’) with Deep Deterministic Dynamics Gradients

A. Edwards, H. Sahni, R. Liu, J. Hung, A. Jain, R. Wang, A. Ecoffet, T. Miconi, C. Isbell, J. Yosinski
In this paper, we introduce a novel form of value function, Q(s,s′), that expresses the utility of transitioning from a state s to a neighboring state s′ and then acting optimally thereafter. In order to derive an optimal policy, we develop a forward dynamics model that learns to make next-state predictions that maximize this value. [...] [PDF]
International Conference on Machine Learning (ICML), 2020

Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions

R. Wang, J. Lehman, A. Rawal, J. Zhi, Y. Li, J. Clune, K. Stanley
Creating open-ended algorithms, which generate their own never-ending stream of novel and appropriately challenging learning opportunities, could help to automate and accelerate progress in machine learning. A recent step in this direction is the Paired Open-Ended Trailblazer (POET), an algorithm that generates and solves its own challenges, and allows solutions to goal-switch between challenges to avoid local optima. Here we introduce and empirically validate two new innovations to the original algorithm, as well as two external innovations designed to help elucidate its full potential. [...] [PDF]
International Conference on Machine Learning (ICML), 2020

Heterogeneous Causal Learning for Effectiveness Optimization in User Marketing

W. Y. Zou, S. Du, J. Lee, J. Pedersen
User marketing is a key focus of consumer-based internet companies. Learning algorithms are effective at optimizing marketing campaigns, which increase user engagement and facilitate cross-marketing to related products. By attracting users with rewards, marketing methods are effective at boosting user activity in the desired products. Rewards incur significant cost that can be offset by increases in future revenue. [...] [PDF]

Learning Continuous Treatment Policy and Bipartite Embeddings for Matching with Heterogeneous Causal Effects

W. Y. Zou, S. Shyam, M. Mui, M. Wang, J. Pedersen, Z. Ghahramani
Causal inference methods are widely applied in the fields of medicine, policy, and economics. Central to these applications is the estimation of treatment effects to make decisions. Current methods make binary yes-or-no decisions based on the treatment effect of a single outcome dimension. These methods are unable to capture continuous space treatment policies with a measure of intensity. [...] [PDF]

Discovering Essential Multiple Gene Effects through Large Scale Optimization: an Application to Human Cancer Metabolism


Plug and Play Language Models: A Simple Approach to Controlled Text Generation

S. Dathathri, A. Madotto, J. Lan, J. Hung, E. Frank, P. Molino, J. Yosinski, R. Liu
Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g. switching topic or sentiment) is difficult without modifying the model architecture or fine-tuning on attribute-specific data and entailing the significant cost of retraining. We propose a simple alternative: the Plug and Play Language Model (PPLM) for controllable language generation, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM. [PDF]
International Conference on Learning Representations (ICLR), 2020

Fully Automated HTML and JavaScript Rewriting for Constructing a Self‐healing Web Proxy

T. Durieux, Y. Hamadi, M. Monperrus
Over the last few years, the complexity of web applications has increased to provide more dynamic web applications to users. The drawback of this complexity is the growing number of errors in the front‐end applications. In this paper, we present an approach to provide self‐healing for the web. [...] [PDF]
Software Testing Verification and Reliability 30(2), March 2018

Joint Interaction and Trajectory Prediction for Autonomous Driving using Graph Neural Networks

D. Lee, Y. Gu, J. Hoang, M. Marchetti-Bowick
Using weak intent labels can potentially improve prediction of the interaction and the resulting trajectory. We use a graph neural network (GNN) to model the interaction. [PDF]
Conference on Neural Information Processing Systems (NeurIPS), 2019

Identifying Unknown Instances for Autonomous Driving

K. Wong, S. Wang, M. Ren, M. Liang, R. Urtasun
We propose a novel open-set instance segmentation algorithm for point clouds that identifies instances from both known and unknown classes. In particular, we train a deep convolutional neural network that projects points belonging to the same instance together in a category-agnostic embedding space. [PDF]
The Conference on Robot Learning (CoRL), 2019

Discrete Residual Flow for Probabilistic Pedestrian Behavior Prediction

A. Jain, S. Casas, R. Liao, Y. Xiong, S. Feng, S. Segal, R. Urtasun
Our research shows that non-parametric distributions can capture (erratic) pedestrian behavior extremely well. We propose Discrete Residual Flow, a convolutional neural network for human motion prediction that accurately models the temporal dependencies and captures the uncertainty inherent in long-range motion forecasting. In particular, our method captures multi-modal posteriors over future human motion very realistically. [PDF]
Conference on Neural Information Processing Systems (NeurIPS), 2019

Incremental Few-Shot Learning with Attention Attractor Networks

M. Ren, R. Liao, E. Fetaya, R. Zemel
This paper addresses this problem, incremental few-shot learning, where a regular classification network has already been trained to recognize a set of base classes, and several extra novel classes are being considered, each with only a few labeled examples. After learning the novel classes, the model is then evaluated on the overall classification performance on both base and novel classes. To this end, we propose a meta-learning model, the Attention Attractor Network, which regularizes the learning of novel classes. [PDF]
Conference on Neural Information Processing Systems (NeurIPS), 2019

Efficient Graph Generation with Graph Recurrent Attention Networks

R. Liao, Y. Li, Y. Song, S. Wang, C. Nash, W. L. Hamilton, D. Duvenaud, R. Urtasun, R. S. Zemel
We propose a new family of efficient and expressive generative models of graphs, called Graph Recurrent Attention Networks (GRANs). On standard benchmarks, our model generates graphs comparable in quality with the previous state-of-the-art, and is at least an order of magnitude faster. [PDF]
Conference on Neural Information Processing Systems (NeurIPS), 2019

Optimization of Swift Protocols

R. Barik, M. Sridharan, M. K. Ramanathan, M. Chabbi
Swift, an increasingly-popular programming language, advocates the use of protocols, which define a set of required methods and properties for conforming types. Protocols are commonly used in Swift programs for abstracting away implementation details; e.g., in a large industrial app from Uber, they are heavily used to enable mock objects for unit testing. Unfortunately, heavy use of protocols can result in significant performance overhead. [...] [PDF]
Object-Oriented Programming, Systems, Languages & Applications (OOPSLA), 2019

Learning Joint 2D-3D Representations for Depth Completion

Y. Chen, B. Yang, M. Liang, R. Urtasun
We design a simple yet effective architecture that fuses information between 2D and 3D representations at multiple levels to learn fully fused joint representations, and show state-of-the-art results on the KITTI depth completion benchmark. [PDF]
International Conference on Computer Vision (ICCV), 2019

Improving Movement Prediction of Traffic Actors using Off-road Loss and Bias Mitigation

M. Niedoba, H. Cui, K. Luo, D. Hegde, F.-C. Chou, N. Djuric
This work improves predictions for traffic actors with two novel methods: off-road losses and action category upweighting. The off-road losses complement the traditional L2 distance loss by penalizing unrealistic off-road predictions. [PDF]
Conference on Neural Information Processing Systems (NeurIPS), 2019

DAGMapper: Learning to Map by Discovering Lane Topology

N. Homayounfar, W.-C. Ma*, J. Liang*, X. Wu, J. Fan, R. Urtasun
We map complex lane topologies in highways by formulating the problem as a deep directed graphical model. As an interesting result, we can train our model in I40 and generalize to unseen highways in SF. [PDF]
International Conference on Computer Vision (ICCV), 2019

DMM-Net: Differentiable Mask-Matching Network for Video Instance Segmentation

X. Zeng, R. Liao, L. Gu, Y. Xiong, S. Fidler, R. Urtasun
We propose the differentiable mask-matching network (DMM-Net) for solving the video instance segmentation problem where the initial instance masks are provided. On the DAVIS 2017 dataset, DMM-Net achieves the best performance without online learning on the first frames and the second best with it. Without any fine-tuning, DMM-Net performs comparably to state-of-the-art methods on the SegTrack v2 dataset. [PDF]
International Conference on Computer Vision (ICCV), 2019

Uncertainty-aware Short-term Motion Prediction of Traffic Actors for Autonomous Driving

N. Djuric, V. Radosavljevic, H. Cui, T. Nguyen, F.-C. Chou, T.-H. Lin, N. Singh, J. Schneider
We introduce an approach that takes into account the current world state and produces rasterized representations of each traffic actor's vicinity. The raster images are then used as inputs to deep convnets to infer future movement of actors while also accounting for and capturing the inherent uncertainty of the prediction task. Extensive experiments on real-world data strongly suggest the benefits of the proposed approach. [PDF]
Winter Conference on Applications of Computer Vision (WACV), 2020

Hamiltonian Neural Networks

S. Greydanus, M. Dzamba, J. Yosinski
Even though neural networks enjoy widespread use, they still struggle to learn the basic laws of physics. How might we endow them with better inductive biases? In this paper, we draw inspiration from Hamiltonian mechanics to train models that learn and respect exact conservation laws in an unsupervised manner. [...] [PDF]
Conference on Neural Information Processing Systems (NeurIPS), 2019

LCA: Loss Change Allocation for Neural Network Training

J. Lan, R. Liu, H. Zhou, J. Yosinski
Neural networks enjoy widespread use, but many aspects of their training, representation, and operation are poorly understood. In particular, our view into the training process is limited, with a single scalar loss being the most common viewport into this high-dimensional, dynamic process. We propose a new window into training called Loss Change Allocation (LCA), in which credit for changes to the network loss is conservatively partitioned to the parameters. [...] [PDF]
Conference on Neural Information Processing Systems (NeurIPS), 2019

Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask

H. Zhou, J. Lan, R. Liu, J. Yosinski
Optical Character Recognition (OCR) approaches have been widely advanced in recent years thanks to the resurgence of deep learning. The state-of-the-art models are mainly trained on the datasets consisting of the constrained scenes. Detecting and recognizing text from the real-world images remains a technical challenge. [...] [PDF]
Conference on Neural Information Processing Systems (NeurIPS), 2019

DeepPruner: Learning Efficient Stereo Matching via Differentiable PatchMatch

S. Duggal, S. Wang, W.-C. Ma, R. Hu, R. Urtasun
We propose a real-time dense depth estimation approach using stereo image pairs, which utilizes differentiable Patch Match to progressively prune the stereo matching search space. Our model achieves competitive performance on the KITTI benchmark despite running in real time. [PDF]
International Conference on Computer Vision (ICCV), 2019

Maximum Relevance and Minimum Redundancy Feature Selection Methods for a Marketing Machine Learning Platform

Z. Zhao, R. Anand, M. Wang
In machine learning applications for online product offerings and marketing strategies, there are often hundreds or thousands of features available to build such models. Feature selection is one essential method in such applications for multiple objectives: improving the prediction accuracy by eliminating irrelevant features, accelerating the model training and prediction speed, reducing the monitoring and maintenance workload for feature data pipeline, and providing better model interpretation and diagnosis capability. [...] [PDF]
IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2019

Uplift Modeling for Multiple Treatments with Cost Optimization

Z. Zhao, T. Harinen
Uplift modeling is an emerging machine learning approach for estimating the treatment effect at an individual or subgroup level. It can be used for optimizing the performance of interventions such as marketing campaigns and product designs. [...] [PDF]
IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2019

DSIC: Deep Stereo Image Compression

J. Liu, S. Wang, R. Urtasun
We design a novel architecture for compressing a stereo image pair that tries to extract as much shared information as possible from the first image in order to reduce the bitrate of the second image. We demonstrate an impressive 30-50% reduction in the second image bitrate at low bitrates. [PDF]
International Conference on Computer Vision (ICCV), 2019

Flexibly-Structured Model for Task-Oriented Dialogues

L. Shu, P. Molino, M. Namazifar, H. Xu, B. Liu, H. Zheng, G. Tur
This paper proposes a novel end-to-end architecture for task-oriented dialogue systems. It is based on a simple and practical yet very effective sequence-to-sequence approach, where language understanding and state tracking tasks are modeled jointly with a structured copy-augmented sequential decoder and a multi-label decoder for each slot. The policy engine and language generation tasks are modeled jointly following that. [...] [PDF]

Improve User Retention with Causal Learning

S. Du, J. Lee, F. Ghaffarizadeh
User retention is a key focus for consumer-based internet companies, and promotions are an effective lever to improve retention. However, companies rely either on non-causal churn prediction to capture heterogeneity or on regular A/B testing to capture the average treatment effect. In this paper, we propose a heterogeneous treatment effect optimization framework to capture both heterogeneity and causal effect. [...] [PDF]
SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2019

NullAway: Practical Type-Based Null Safety for Java

S. Banerjee, L. Clapp, M. Sridharan
NullPointerExceptions (NPEs) are a key source of crashes in modern Java programs. Previous work has shown how such errors can be prevented at compile time via code annotations and pluggable type checking. However, such systems have been difficult to deploy on large-scale software projects, due to significant build-time overhead and / or a high annotation burden. This paper presents NullAway, a new type-based null safety checker for Java that overcomes these issues. [...] [PDF]
The ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (FSE), 2019

Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints

M. Li, E. Yumer, D. Ramanan
Current approaches for hyper-parameter tuning and neural architecture search tend to be limited by practical resource constraints. Therefore, we introduce a formal setting for studying training under the non-asymptotic, resource-constrained regime, i.e. budgeted training. We analyze the following problem: "given a dataset, algorithm, and resource budget, what is the best achievable performance?" [PDF]
International Conference on Learning Representations (ICLR), 2020

Evolvability ES: Scalable and Direct Optimization of Evolvability

A. Gajewski, J. Clune, K. O. Stanley, J. Lehman
Designing evolutionary algorithms capable of uncovering highly evolvable representations is an open challenge; such evolvability is important because it accelerates evolution and enables fast adaptation to changing circumstances. This paper introduces evolvability ES, an evolutionary algorithm designed to explicitly and efficiently optimize for evolvability, i.e. the ability to further adapt. [...] [PDF]
The Genetic and Evolutionary Computation Conference (GECCO), 2019

Probabilistic Programming for Birth-Death Models of Evolution Using an Alive Particle Filter with Delayed Sampling

J. Kudlicka, L. M. Murray, F. Ronquist, T. B. Schön
We consider probabilistic programming for birth-death models of evolution and introduce a new widely-applicable inference method that combines an extension of the alive particle filter (APF) with automatic Rao-Blackwellization via delayed sampling. [...] [PDF]
Conference on Uncertainty in Artificial Intelligence (UAI), 2019

Collaborative Multi-Agent Dialogue Model Training Via Reinforcement Learning

A. Papangelis, Y.-C. Wang, P. Molino, G. Tur
We present the first complete attempt at concurrently training conversational agents that communicate only via self-generated language. Using DSTC2 as seed data, we trained natural language understanding (NLU) and generation (NLG) networks for each agent and let the agents interact online. [...] [PDF]
Special Interest Group on Discourse and Dialogue (SIGDIAL), 2019

Stakeholders as Researchers: Empowering non-researchers to interact directly with consumers

Marta Ponte Fissgus
An investigation into the trends of user experience research revealed that businesses and stakeholders will increasingly value human insights, and hence, as research becomes more mainstream, “organizations will continue to develop new tools to democratize those practices and adapt to company needs (dscout, 2018).” [...] [PDF]
Delft University of Technology (TU Delft), 2019

LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving

G. P. Meyer, A. Laddha, E. Kee, C. Vallespi-Gonzalez, C. Wellington
In this paper, we present LaserNet, a computationally efficient method for 3D object detection from LiDAR data for autonomous driving. The efficiency results from processing LiDAR data in the native range view of the sensor, where the input data is naturally compact. [...]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Understanding and Designing for Deaf or Hard of Hearing Drivers on Uber

S. Lee, B. Hubert-Wallander, M. Stevens, J. M. Carroll
We used content analysis of in-app driver survey responses, customer support tickets, and tweets, and face-to-face interviews of DHH Uber drivers to better understand the DHH driver experience. Here we describe challenges DHH drivers experience and how they address those difficulties via Uber's accessibility features and their own workarounds. [...]
Conference on Human Factors in Computing Systems (CHI), 2019

End-to-end Interpretable Neural Motion Planner

W. Zeng, W. Luo, S. Suo, A. Sadat, B. Yang, S. Casas, R. Urtasun
In this paper, we propose a neural motion planner for learning to drive autonomously in complex urban scenarios that include traffic-light handling, yielding, and interactions with multiple road-users. Towards this goal, we design a holistic model that takes as input raw LIDAR data and an HD map and produces interpretable intermediate representations in the form of 3D detections and their future trajectories, as well as a cost volume defining the goodness of each position that the self-driving car can take within the planning horizon. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Learning to Localize through Compressed Binary Maps

X. Wei, I. A. Bârsan, S. Wang, J. Martinez, R. Urtasun
One of the main difficulties of scaling current localization systems to large environments is the on-board storage required for the maps. In this paper we propose to learn to compress the map representation such that it is optimal for the localization task. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Convolutional Recurrent Network for Road Boundary Extraction

J. Liang, N. Homayounfar, S. Wang, W.-C. Ma, R. Urtasun
Creating high definition maps that contain precise information of static elements of the scene is of utmost importance for enabling self driving cars to drive safely. In this paper, we tackle the problem of drivable road boundary extraction from LiDAR and camera imagery. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Multi-Task Multi-Sensor Fusion for 3D Object Detection

M. Liang, B. Yang, Y. Chen, R. Hu, R. Urtasun
In this paper we propose to exploit multiple related tasks for accurate multi-sensor 3D object detection. Towards this goal we present an end-to-end learnable architecture that reasons about 2D and 3D object detection as well as ground estimation and depth completion. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Deep Rigid Instance Scene Flow

W.-C. Ma, S. Wang, R. Hu, Y. Xiong, R. Urtasun
In this paper we tackle the problem of scene flow estimation in the context of self-driving. We leverage deep learning techniques as well as strong priors, since in our application domain the motion of the scene can be composed of the motion of the robot and the 3D motion of the actors in the scene. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Dimensionality Reduction for Representing the Knowledge of Probabilistic Models

M. T. Law, J. Snell, A.-M. Farahmand, R. Urtasun, R. S. Zemel
Most deep learning models rely on expressive high-dimensional representations to achieve good performance on tasks such as classification. However, the high dimensionality of these representations makes them difficult to interpret and prone to over-fitting. We propose a simple, intuitive and scalable dimension reduction framework that takes into account the soft probabilistic interpretation of standard deep models for classification. [...] [PDF]
International Conference on Learning Representations (ICLR), 2019

DARNet: Deep Active Ray Network for Building Segmentation

D. Cheng, R. Liao, S. Fidler, R. Urtasun
In this paper, we propose a Deep Active Ray Network (DARNet) for automatic building segmentation. Taking an image as input, it first exploits a deep convolutional neural network (CNN) as the backbone to predict energy maps, which are further utilized to construct an energy function. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Building Together: When Research Went Viral at Uber

B. Hubert-Wallander, E. G. Ruiz, M. Jain, L. G. Barrio, S. S. Mitra, M. Stevens
In late 2017, Uber was nearly a year into a complete redesign of its driver-facing mobile app. This case study describes the research program we executed to support the app's global beta launch, which aimed to "Build Together" with drivers across different geographies. [...]
Conference on Human Factors in Computing Systems (CHI), 2019

UPSNet: A Unified Panoptic Segmentation Network

Y. Xiong, R. Liao, H. Zhao, R. Hu, M. Bai, E. Yumer, R. Urtasun
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

DeepSignals: Predicting Intent of Drivers Through Visual Attributes

D. Frossard, E. Kee, R. Urtasun
Detecting the intention of drivers is an essential task in self-driving, necessary to anticipate sudden events like lane changes and stops. Turn signals and emergency flashers communicate such intentions, providing seconds of potentially critical reaction time. In this paper, we propose to detect these signals in video sequences by using a deep neural network that reasons about both spatial and temporal information. [...] [PDF]
International Conference on Robotics and Automation (ICRA), 2019

Learning a Generative Model for Multi-Step Human-Object Interactions from Videos

H. Wang, S. Pirk, V. Kim, E. Yumer, L. Guibas
Creating dynamic virtual environments consisting of humans interacting with objects is a fundamental problem in computer graphics. While it is well-accepted that agent interactions play an essential role in synthesizing such scenes, most extant techniques exclusively focus on static scenes, leaving the dynamic component out. In this paper, we present a generative model to synthesize plausible multi-step dynamic human–object interactions. [...] [PDF]
European Association for Computer Graphics (Eurographics), 2019

Exploratory Stage Lighting Design using Visual Objectives

E. Shimizu, S. Paris, M. Fisher, E. Yumer, K. Fatahalian
Lighting is a critical element of theater. A lighting designer is responsible for drawing the audience’s attention to a specific part of the stage, setting time of day, creating a mood, and conveying emotions. Designers often begin the lighting design process by collecting reference visual imagery that captures different aspects of their artistic intent. Then, they experiment with various lighting options to determine which ideas work best on stage. However, modern stages contain tens to hundreds of lights, and setting each light source’s parameters individually to realize an idea is both tedious and requires expert skill. In this paper, we describe an exploratory lighting design tool based on feedback from professional designers. [...] [PDF]
European Association for Computer Graphics (Eurographics), 2019

Metropolis-Hastings Generative Adversarial Networks

R. Turner, J. Hung, Y. Saatci, J. Yosinski
We introduce the Metropolis-Hastings generative adversarial network (MH-GAN), which combines aspects of Markov chain Monte Carlo and GANs. The MH-GAN draws samples from the distribution implicitly defined by a GAN's discriminator-generator pair, as opposed to sampling in a standard GAN which draws samples from the distribution defined by the generator. [...] [PDF]
International Conference on Machine Learning (ICML), 2019

Understanding Neural Networks via Feature Visualization: A survey

A. Nguyen, J. Yosinski, J. Clune
A neuroscience method for understanding the brain is to find and study the preferred stimuli that highly activate an individual cell or groups of cells. Recent advances in machine learning enable a family of methods to synthesize preferred stimuli that cause a neuron in an artificial or biological brain to fire strongly. [...] [PDF]
Interpretable AI: Interpreting, Explaining and Visualizing Deep Learning, 2019

Exact Gaussian Processes on a Million Data Points

K. A. Wang, G. Pleiss, J. R. Gardner, S. Tyree, K. Q. Weinberger, A. G. Wilson
Gaussian processes (GPs) are flexible models with state-of-the-art performance on many impactful applications. However, computational constraints with standard inference procedures have limited exact GPs to problems with fewer than about ten thousand training points, necessitating approximations for larger datasets. In this paper, we develop a scalable approach for exact GPs that leverages multi-GPU parallelization and methods like linear conjugate gradients, accessing the kernel matrix only through matrix multiplication. [...] [PDF]
arXiv, 2019

Keeping master green at scale

S. Ananthanarayanan, M. S. Ardekani, D. Haenikel, B. Varadarajan, S. Soriano, D. Patel, A.-R. Adl-Tabatabai
This paper presents the design and implementation of SubmitQueue. It guarantees an always green master branch at scale: all build steps (e.g., compilation, unit tests, UI tests) successfully execute for every commit point. SubmitQueue has been in production for over a year, and can scale to thousands of daily commits to giant monolithic repositories. [...] [PDF]
European Conference on Computer Systems (EuroSys), 2019

Quantum speedup at zero temperature via coherent catalysis

G. A. Durkin
This work proves that quantum speed-up is possible in certain models of quantum annealing with non-stoquastic drivers. The results contradict conventional mean-field analysis in the thermodynamic limit. Asymptotic analysis of finite-size systems predicts the dominant behaviour (both scaling and coefficients) of numerical results for systems of more than 50 qubits, indicating the legitimacy and importance of quantum transport by vacuum delocalization. [PDF]
American Physical Society (APS), 2019

Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions

R. Wang, J. Lehman, J. Clune, K. Stanley
While the history of machine learning so far encompasses a series of problems posed by researchers and algorithms that learn their solutions, an important question is whether the problems themselves can be generated by the algorithm at the same time as they are being solved. [...] [PDF]

Go-Explore: a New Approach for Hard-Exploration Problems

A. Ecoffet, J. Huizinga, J. Lehman, K. Stanley, J. Clune
A grand challenge in reinforcement learning is intelligent exploration, especially when rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard-exploration domains: Montezuma's Revenge and Pitfall. On both games, current RL algorithms perform poorly, even those with intrinsic motivation, which is the dominant method to improve performance on hard-exploration domains. [...] [PDF]

Photo-Sketching: Inferring Contour Drawings from Images

M. Li, Z. Lin, R. Mech, E. Yumer, D. Ramanan
Edges, boundaries and contours are important subjects of study in both computer graphics and computer vision. On one hand, they are the 2D elements that convey 3D shapes; on the other hand, they are indicative of occlusion events and thus separation of objects or semantic concepts. In this paper, we aim to generate contour drawings, boundary-like drawings that capture the outline of the visual scene. Prior art often casts this problem as boundary detection. [...] [PDF]
Winter Conference on Applications of Computer Vision (WACV), 2019

Neural Guided Constraint Logic Programming for Program Synthesis

L. Zhang, G. Rosenblatt, E. Fetaya, R. Liao, W. Byrd, M. Might, R. Urtasun, R. Zemel
Synthesizing programs using example input/outputs is a classic problem in artificial intelligence. We present a method for solving Programming By Example (PBE) problems by using a neural model to guide the search of a constraint logic programming system called miniKanren. [...] [PDF]
Advances in Neural Information Processing Systems (NeurIPS), 2018

Robustness to out-of-distribution inputs via task-aware generative uncertainty

R. McAllister, G. Kahn, J. Clune, S. Levine
Deep learning provides a powerful tool for machine perception when the observations resemble the training data. However, real-world robotic systems must react intelligently to their observations even in unexpected circumstances. This requires a system to reason about its own uncertainty given unfamiliar, out-of-distribution observations. [...] [PDF]
International Conference on Robotics and Automation (ICRA), 2019

LanczosNet: Multi-Scale Deep Graph Convolutional Networks

R. Liao, Z. Zhao, R. Urtasun, R. Zemel
Relational data can generally be represented as graphs. For processing such graph structured data, we propose LanczosNet, which uses the Lanczos algorithm to construct low rank approximations of the graph Laplacian for graph convolution. [...] [PDF]
Neural Information Processing Systems (NeurIPS), 2018

Graph HyperNetworks for Neural Architecture Search

C. Zhang, M. Ren, R. Urtasun
Neural architecture search (NAS) automatically finds the best task-specific neural network topology, outperforming many manual architecture designs. However, it can be prohibitively expensive as the search requires training thousands of different networks, while each can last for hours. In this work, we propose the Graph HyperNetwork (GHN) to amortize the search cost: given an architecture, it directly generates the weights by running inference on a graph neural network. [...] [PDF]
Meta Learning workshop @ Neural Information Processing Systems (NeurIPS), 2018

Predicting Motion of Vulnerable Road Users using High-Definition Maps and Efficient ConvNets

F. Chou, T.-H. Lin, H. Cui, V. Radosavljevic, T. Nguyen, T. Huang, M. Niedoba, J. Schneider, N. Djuric
Following detection and tracking of traffic actors, prediction of their future motion is the next critical component of a self-driving vehicle (SDV), allowing the SDV to move safely and efficiently in its environment. This is particularly important when it comes to vulnerable road users (VRUs), such as pedestrians and bicyclists. We present a deep learning method for predicting VRU movement where we rasterize high-definition maps and actor's surroundings into a bird's-eye view image used as input to convolutional networks. [...] [PDF]
MLITS workshop @ Neural Information Processing Systems (NeurIPS), 2018

Rotated Rectangles for Symbolized Building Footprint Extraction

M. Dickenson, L. Gueguen
Building footprints (BFP) provide useful visual context for users of digital maps when navigating in space. This paper proposes a method for extracting and symbolizing building footprints from satellite imagery using a convolutional neural network (CNN). [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents

F. Such, V. Madhavan, R. Liu, R. Wang, P. Castro, Y. Li, L. Schubert, M. Bellemare, J. Clune, J. Lehman
Much human and computational effort has aimed to improve how deep reinforcement learning algorithms perform on benchmarks such as the Atari Learning Environment. Comparatively less effort has focused on understanding what has been learned by such methods, and investigating and comparing the representations learned by different families of reinforcement learning (RL) algorithms. [...] [PDF]

Faster Neural Networks Straight from JPEG

L. Gueguen, A. Sergeev, B. Kadlec, R. Liu, J. Yosinski
The simple, elegant approach of training convolutional neural networks (CNNs) directly from RGB pixels has enjoyed overwhelming empirical success. But can more performance be squeezed out of networks by using different input representations? In this paper we propose and explore a simple idea: train CNNs directly on the blockwise discrete cosine transform (DCT) coefficients computed and available in the middle of the JPEG codec. [...] [PDF]
Advances in Neural Information Processing Systems (NeurIPS), 2018

Profiling Android Applications with Nanoscope

L. Liu, L. Takamine, A. Welc
User-level tooling support for profiling Java applications executing on modern JVMs for desktop and server is quite mature – from Open JDK’s Java Flight Recorder enabling low-overhead CPU and heap profiling, through third-party async profilers (e.g. async-profiler, honest-profiler), to Open JDK’s support for low-overhead tracking of allocation call sites. [...] [PDF]
Virtual Machines and Language Implementations (VMIL), 2018

Joint Mapping and Calibration via Differentiable Sensor Fusion

J. Chen, F. Obermeyer, V. Lyapunov, L. Gueguen, N. Goodman
We leverage automatic differentiation (AD) and probabilistic programming to develop an end-to-end optimization algorithm for batch triangulation of a large number of unknown objects. Given noisy detections extracted from noisily geo-located street level imagery without depth information, we jointly estimate the number and location of objects of different types, together with parameters for sensor noise characteristics and prior distribution of objects conditioned on side information. [...] [PDF]
Computing Research Repository (CoRR), 2018

Learning to Localize Using a LiDAR Intensity Map

I. Bârsan, S. Wang, A. Pokrovsky, R. Urtasun
In this paper we propose a real-time, calibration-agnostic and effective localization system for self-driving cars. Our method learns to embed the online LiDAR sweeps and intensity map into a joint deep embedding space. [...] [PDF]
Conference on Robot Learning (CoRL), 2018

Deep Multi-Sensor Lane Detection

M. Bai, G. Mattyus, N. Homayounfar, S. Wang, S. K. Lakshmikanth, R. Urtasun
Reliable and accurate lane detection has been a long-standing problem in the field of autonomous driving. In recent years, many approaches have been developed that use images (or videos) as input and reason in image space. In this paper we argue that accurate image estimates do not translate to precise 3D lane boundaries, which are the input required by modern motion planning algorithms. [...] [PDF]
International Conference on Intelligent Robots and Systems (IROS), 2018

HDNET: Exploiting HD Maps for 3D Object Detection

B. Yang, M. Liang, R. Urtasun
In this paper we show that High-Definition (HD) maps provide strong priors that can boost the performance and robustness of modern 3D object detectors. Towards this goal, we design a single stage detector that extracts geometric and semantic features from the HD maps. [...] [PDF]
Conference on Robot Learning (CoRL), 2018

Probabilistic Meta-Representations Of Neural Networks

T. Karaletsos, P. Dayan, Z. Ghahramani
Existing Bayesian treatments of neural networks are typically characterized by weak prior and approximate posterior distributions according to which all the weights are drawn independently. Here, we consider a richer prior distribution in which units in the network are represented by latent variables, and the weights between units are drawn conditionally on the values of the collection of those variables. [...] [PDF]
UAI 2018 Uncertainty In Deep Learning Workshop (UDL), 2018

Pyro: Deep Universal Probabilistic Programming

E. Bingham, J. Chen, M. Jankowiak, F. Obermeyer, N. Pradhan, T. Karaletsos, R. Singh, P. Szerlip, P. Horsfall, N. Goodman
Pyro is a probabilistic programming language built on Python as a platform for developing advanced probabilistic models in AI research. [...] [PDF]
Journal of Machine Learning Research (JMLR), 2018

Dynamic Pricing and Matching in Ride-Hailing Platforms

N. Korolko, D. Woodard, C. Yan, H. Zhu
Ride-hailing platforms such as Uber, Lyft and DiDi have achieved explosive growth and reshaped urban transportation. The theory and technologies behind these platforms have become one of the most active research areas in the fields of economics, operations research, computer science, and transportation engineering. [...] [PDF]

IntentNet: Learning to Predict Intention from Raw Sensor Data

S. Casas, W. Luo, R. Urtasun
In order to plan a safe maneuver, self-driving vehicles need to understand the intent of other traffic participants. We define intent as a combination of discrete high level behaviors as well as continuous trajectories describing future motion. In this paper we develop a one-stage detector and forecaster that exploits both 3D point clouds produced by a LiDAR sensor as well as dynamic maps of the environment. [...] [PDF]
Conference on Robot Learning (CoRL), 2018

The Perfect uberPOOL: A Case Study on Trade-Offs

J. Lo, S. Morseman
Case Study—One of Uber’s company missions is to make carpooling more affordable and reliable for riders, and effortless for drivers. In 2014 the company launched uberPOOL to make it easy for riders to share their trip with others heading in the same direction. Fundamental to the mechanics of uberPOOL is the intelligence that matches riders for a trip, which can introduce various uncertainties into the user experience. [...]
Ethnographic Praxis in Industry Conference (EPIC), 2018

Efficient Convolutions for Real-Time Semantic Segmentation of 3D Point Clouds

C. Zhang, W. Luo, R. Urtasun
We propose an approach for semi-automatic annotation of object instances. While most current methods treat object segmentation as a pixel-labeling problem, we here cast it as a polygon prediction task, mimicking how most current datasets have been annotated. [...] [PDF]
International Conference on 3D Vision (3DV), 2018

Deep Continuous Fusion for Multi-Sensor 3D Object Detection

M. Liang, B. Yang, S. Wang, R. Urtasun
In this paper, we propose a novel 3D object detector that can exploit both LIDAR as well as cameras to perform very accurate localization. Towards this goal, we design an end-to-end learnable architecture that exploits continuous convolutions to fuse image and LIDAR feature maps at different levels of resolution. [...] [PDF]
European Conference on Computer Vision (ECCV), 2018

Single Image Intrinsic Decomposition Without a Single Intrinsic Image

W. Ma, H. Chu, B. Zhou, R. Urtasun, A. Torralba
We propose an approach for semi-automatic annotation of object instances. While most current methods treat object segmentation as a pixel-labeling problem, we here cast it as a polygon prediction task, mimicking how most current datasets have been annotated. [...] [PDF]
European Conference on Computer Vision (ECCV), 2018

Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity

T. Miconi, A. Rawal, J. Clune, K. Stanley
A grand challenge in reinforcement learning is intelligent exploration, especially when rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard-exploration domains: Montezuma's Revenge and Pitfall. On both games, current RL algorithms perform poorly, even those with intrinsic motivation, which is the dominant method to improve performance on hard-exploration domains. To address this shortfall, we introduce a new algorithm called Go-Explore. [...] [PDF]
International Conference on Learning Representations (ICLR), 2019

Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks

H. Cui, V. Radosavljevic, F. Chou, T.-H. Lin, T. Nguyen, T. Huang, J. Schneider, N. Djuric
Autonomous driving presents one of the largest problems that the robotics and artificial intelligence communities are facing at the moment, both in terms of difficulty and potential societal impact. Self-driving vehicles (SDVs) are expected to prevent road accidents and save millions of lives while improving the livelihood and life quality of many more. [...] [PDF]
International Conference on Robotics and Automation (ICRA), 2019

LSQ++: lower running time and higher recall in multi-codebook quantization

J. Martinez, S. Zakhmi, H. Hoos, J. Little
Multi-codebook quantization (MCQ) is the task of expressing a set of vectors as accurately as possible in terms of discrete entries in multiple bases. Work in MCQ is heavily focused on lowering quantization error, thereby improving distance estimation and recall on benchmarks of visual descriptors at a fixed memory budget. [...] [PDF]
European Conference on Computer Vision (ECCV), 2018

Functional Programming for Modular Bayesian Inference

A. Ścibior, O. Kammar, Z. Ghahramani
We present an architectural design of a library for Bayesian modelling and inference in modern functional programming languages. The novel aspect of our approach is the modular implementation of existing state-of-the-art inference algorithms. Our design relies on three inherently functional features: higher-order functions, inductive data-types, and support for either type-classes or an expressive module system. [...] [PDF]

End-to-End Deep Structured Models for Drawing Crosswalks

J. Liang, R. Urtasun
In this paper we address the problem of detecting crosswalks from LiDAR and camera imagery. Towards this goal, given multiple LiDAR sweeps and the corresponding imagery, we project both inputs onto the ground surface to produce a top-down view of the scene. [...] [PDF]
European Conference on Computer Vision (ECCV), 2018

Safe stream-based programming with refinement types

B. Stein, L. Clapp, M. Sridharan, B.-Y. E. Chang
A type-based approach that can statically prove the thread-safety of UI accesses in stream-based software. We implement the system as an annotation-based Java typechecker for Android programs built upon the popular ReactiveX. We evaluate on 8 open-source apps and report on our experience applying the typechecker to two much larger apps from Uber. [...] [PDF]
IEEE/ACM International Conference on Automated Software Engineering (ASE), 2018

Labor Market Equilibration: Evidence from Uber

J. Hall, J. Horton, D. Knoepfle
Using a city-week panel of US ride-sharing markets created by Uber, we estimate the effects of sudden fare changes on market outcomes, focusing on the supply-side. [...] [PDF]

Safely and Quickly Deploying New Features with a Staged Rollout Framework Using Sequential Test and Adaptive Experimental Design

Z. Zhao, M. Liu, A. Deb
During the rapid development cycle for Internet products (websites and mobile apps), new features are developed and rolled out to users constantly. Features with code defects or design flaws can cause outages and significant degradation of user experience. The traditional method of code review and change management can be time-consuming and error-prone. In order to make the feature rollout process safe and fast, this paper proposes a methodology for rolling out features in an automated way using an adaptive experimental design. [...] [PDF]
International Conference on Computational Intelligence and Applications (ICCIA), 2018

Manifold: A Model-Agnostic Framework for Interpretation and Diagnosis of Machine Learning Models

J. Zhang, Y. Wang, P. Molino, L. Li, D. Ebert
Interpretation and diagnosis of machine learning models have gained renewed interest in recent years with breakthroughs in new approaches. We present Manifold, a framework that utilizes visual analysis techniques to support interpretation, debugging, and comparison of machine learning models in a more transparent and interactive manner. [...] [PDF]
IEEE Visualization (IEEE VIS), 2018

Uber Happy? Work and Well-being in the “Gig Economy”

T. Berger, C. B. Frey, G. Levin, S. R. Danda
We explore the rise of the so-called “gig economy” through the lens of Uber and its drivers in the United Kingdom. Using administrative data from Uber and a new representative survey of London drivers, we explore their backgrounds, earnings, and well being. [...] [PDF]
The 68th Panel Meeting of Economic Policy, 2019

Discovering Interpretable Representations for Both Deep Generative and Discriminative Models

T. Adel, Z. Ghahramani, A. Weller
Interpretability of representations in both deep generative and discriminative models is highly desirable. Current methods jointly optimize an objective combining accuracy and interpretability. However, this may reduce accuracy, and is not applicable to already trained models. We propose two interpretability frameworks. First, we provide an interpretable lens for an existing model. We use a generative model which takes as input the representation in an existing (generative or discriminative) model, weakly supervised by limited side information. [...] [PDF]
International Conference on Machine Learning (ICML), 2018

Differentiable Compositional Kernel Learning for Gaussian Processes

S. Sun, G. Zhang, C. Wang, W. Zeng, J. Li, R. Grosse
The generalization properties of Gaussian processes depend heavily on the choice of kernel, and this choice remains a dark art. We present the Neural Kernel Network (NKN), a flexible family of kernels represented by a neural network. [...] [PDF]
International Conference on Machine Learning (ICML), 2018

MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving

M. Teichmann, M. Weber, M. Zöllner, R. Cipolla, R. Urtasun
While most approaches to semantic reasoning have focused on improving performance, in this paper we argue that computational times are very important in order to enable real time applications such as autonomous driving. [...] [PDF]
IEEE Intelligent Vehicles Symposium (IV), 2018

COTA: Improving the Speed and Accuracy of Customer Support through Ranking and Deep Networks

P. Molino, H. Zheng, Y.-C. Wang
For a company looking to provide delightful user experiences, it is of paramount importance to take care of any customer issues. This paper proposes COTA, a system to improve speed and reliability of customer support for end users through automated ticket classification and answers selection for support representatives. [...] [PDF]
ACM SIGKDD International Conference on Knowledge Discovery and Data Science (KDD), 2018

Variational Bayesian dropout: pitfalls and fixes

J. Hron, A. Matthews, Z. Ghahramani
Dropout, a stochastic regularisation technique for training of neural networks, has recently been reinterpreted as a specific type of approximate inference algorithm for Bayesian neural networks. The main contribution of the reinterpretation is in providing a theoretical framework useful for analysing and extending the algorithm [...] [PDF]
International Conference on Machine Learning (ICML), 2018

Reviving and Improving Recurrent Back Propagation

R. Liao, Y. Xiong, E. Fetaya, L. Zhang, K. Yoon, X. Pitkow, R. Urtasun, R. Zemel
In this paper, we revisit the recurrent back-propagation (RBP) algorithm, discuss the conditions under which it applies as well as how to satisfy them in deep neural networks. We show that RBP can be unstable and propose two variants based on conjugate gradient on the normal equations (CG-RBP) and Neumann series (Neumann-RBP). [...] [PDF]
International Conference on Machine Learning (ICML), 2018
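
The Neumann series behind the second variant can be sketched in a few lines. This is a generic illustration of approximating (I - J)^{-1} v with a truncated series, which converges when the spectral radius of J is below 1; it is not the paper's Neumann-RBP algorithm, and the 2x2 matrix `J`, the vector, and the function names are hypothetical.

```python
def neumann_solve(matvec, v, num_terms):
    """Approximate (I - J)^{-1} v by the truncated series sum_{k=0}^{num_terms} J^k v."""
    term = list(v)   # J^0 v
    total = list(v)  # running partial sum
    for _ in range(num_terms):
        term = matvec(term)  # advance J^k v to J^(k+1) v
        total = [t + s for t, s in zip(total, term)]
    return total


# Hypothetical 2x2 matrix whose spectral radius is below 1, so the series converges.
J = [[0.2, 0.1],
     [0.0, 0.3]]


def apply_J(x):
    """Multiply J by the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in J]


approx = neumann_solve(apply_J, [1.0, 1.0], num_terms=50)
```

Only products with J are needed, so the inverse is never formed explicitly; truncating the series earlier trades accuracy for fewer matrix-vector products.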

Learning to Reweight Examples for Robust Deep Learning

M. Ren, W. Zeng, B. Yang, R. Urtasun
Deep neural networks have been shown to be very powerful modeling tools for many supervised learning tasks involving complex input patterns. However, they can also easily overfit to training set biases and label noises. [...] [PDF]
International Conference on Machine Learning (ICML), 2018

Evolving Multimodal Robot Behavior via Many Stepping Stones with the Combinatorial Multi-Objective Evolutionary Algorithm

J. Huizinga, J. Clune
An important challenge in reinforcement learning, including evolutionary robotics, is to solve multimodal problems, where agents have to act in qualitatively different ways depending on the circumstances. Because multimodal problems are often too difficult to solve directly, it is helpful to take advantage of staging, where a difficult task is divided into simpler subtasks that can serve as stepping stones for solving the overall problem. [...] [PDF]

An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution

R. Liu, J. Lehman, P. Molino, F. Such, E. Frank, A. Sergeev, J. Yosinski
Few ideas have enjoyed as large an impact on deep learning as convolution. For any problem involving pixels or spatial representations, common intuition holds that convolutional neural networks may be appropriate. In this paper we show a striking counterexample to this intuition via the seemingly trivial coordinate transform problem, which simply requires learning a mapping between coordinates in (x,y) Cartesian space and one-hot pixel space. [...] [PDF]
Advances in Neural Information Processing Systems (NeurIPS), 2018
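
To make the coordinate transform problem concrete, the sketch below builds the one-hot pixel target for a given (x, y) coordinate and inverts it. It only illustrates the input/output pairs of the task as described in the abstract, not any network or the CoordConv layer itself; the 4x4 grid size, the convention that x indexes columns and y indexes rows, and the function names are assumptions.

```python
def coord_to_one_hot(x, y, height, width):
    """Return a height x width grid of zeros with a single 1 at column x, row y."""
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("coordinate lies outside the grid")
    image = [[0] * width for _ in range(height)]
    image[y][x] = 1
    return image


def one_hot_to_coord(image):
    """Recover the (x, y) coordinate of the single active pixel."""
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value == 1:
                return x, y
    raise ValueError("no active pixel found")


# Hypothetical example on a 4x4 grid.
target = coord_to_one_hot(2, 1, height=4, width=4)
recovered = one_hot_to_coord(target)
```

A model trained on this task receives one representation as input and must produce the other, which is what makes the reported difficulty for convolutional networks notable.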

GeoNet: Geometric Neural Network for Joint Depth and Surface Normal Estimation

X. Qi, R. Liao, Z. Liu, R. Urtasun, J. Jia
In this paper, we propose Geometric Neural Network (GeoNet) to jointly predict depth and surface normal maps from a single image. Building on top of two-stream CNNs, our GeoNet incorporates geometric relation between depth and surface normal via the new depth-to-normal and normal-to-depth networks. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Matching Adversarial Networks

G. Mattyus, R. Urtasun
Generative Adversarial Nets (GANs) and Conditional GANs (CGANs) show that using a trained network as a loss function (discriminator) makes it possible to synthesize highly structured outputs (e.g. natural images). However, applying a discriminator network as a universal loss function for common supervised tasks (e.g. semantic segmentation, line detection, depth estimation) is considerably less successful. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Deep Parametric Continuous Convolutional Neural Networks

S. Wang, S. Suo, W. Ma, A. Pokrovsky, R. Urtasun
We propose an approach for semi-automatic annotation of object instances. While most current methods treat object segmentation as a pixel-labeling problem, we here cast it as a polygon prediction task, mimicking how most current datasets have been annotated. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Learning deep structured active contours end-to-end

D. Marcos, D. Tuia, B. Kellenberger, L. Zhang, M. Bai, R. Liao, R. Urtasun
The world is covered with millions of buildings, and precisely knowing each instance's position and extents is vital to a multitude of applications. Recently, automated building footprint segmentation models have shown superior detection accuracy thanks to the usage of Convolutional Neural Networks (CNN). [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

SBNet: Sparse Blocks Network for Fast Inference

M. Ren, A. Pokrovsky, B. Yang, R. Urtasun
Conventional deep convolutional neural networks (CNNs) apply convolution operators uniformly in space across all feature maps for hundreds of layers - this incurs a high computational cost for real-time applications. For many problems such as object detection and semantic segmentation, we are able to obtain a low-cost computation mask, either from a priori problem knowledge, or from a low-resolution segmentation network. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

iMapper: Interaction-guided Scene Mapping from Monocular Videos

A. Monszpart, P. Guerrero, D. Ceylan, E. Yumer, N. Mitra
A long-standing challenge in scene analysis is the recovery of scene arrangements under moderate to heavy occlusion, directly from monocular video. While the problem remains a subject of active research, concurrent advances have been made in the context of human pose reconstruction from monocular video, including image-space feature point detection and 3D pose recovery. These methods, however, start to fail under moderate to heavy occlusion as the problem becomes severely under-constrained. We approach the problems differently. We observe that people interact similarly in similar scenes. [...] [PDF]
Special Interest Group on Computer Graphics and Interactive Techniques Conference (SIGGRAPH), 2018

SurfConv: Bridging 3D and 2D Convolution for RGBD Images

H. Chu, W. Ma, K. Kundu, R. Urtasun, S. Fidler
The last few years have seen approaches trying to combine the increasing popularity of depth sensors and the success of the convolutional neural networks. Using depth as an additional channel alongside the RGB input has the scale variance problem present in image convolution based approaches. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net

W. Luo, B. Yang, R. Urtasun
In this paper we propose a novel deep neural network that is able to jointly reason about 3D detection, tracking and motion forecasting given data captured by a 3D sensor. By jointly reasoning about these tasks, our holistic approach is more robust to occlusion as well as sparse data at range. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

End-to-end Learning of Multi-sensor 3D Tracking by Detection

D. Frossard, R. Urtasun
In this paper we propose a novel approach to tracking by detection that can exploit both cameras as well as LIDAR data to produce very accurate 3D trajectories. Towards this goal, we formulate the problem as a linear program that can be solved exactly, and learn convolutional networks for detection as well as matching in an end-to-end manner. [...] [PDF]
International Conference on Robotics and Automation (ICRA), 2018

Pathwise Derivatives for Multivariate Distributions

M. Jankowiak, T. Karaletsos
We exploit the link between the transport equation and derivatives of expectations to construct efficient pathwise gradient estimators for multivariate distributions. We focus on two main threads. [...] [PDF]
International Conference on Artificial Intelligence and Statistics (AISTATS) (in submission), 2019

Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning

M. Norouzzadeh, A. Nguyen, M. Kosmala, A. Swanson, M. Palmer, C. Parker, J. Clune
Having accurate, detailed, and up-to-date information about the location and behavior of animals in the wild would revolutionize our ability to study and conserve ecosystems. We investigate the ability to automatically, accurately, and inexpensively collect such data, which could transform many fields of biology, ecology, and zoology into "big data" sciences. [...] [PDF]
PNAS Vol. 115 no. 25, 2018

Hierarchical Recurrent Attention Networks for Structured Online Maps

N. Homayounfar, W. Ma, S. Lakshmikanth, R. Urtasun
In this paper, we tackle the problem of online road network extraction from sparse 3D point clouds. Our method is inspired by how an annotator builds a lane graph, by first identifying how many lanes there are and then drawing each one in turn. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

PIXOR: Real-time 3D Object Detection from Point Clouds

B. Yang, W. Luo, R. Urtasun
We address the problem of real-time 3D object detection from point clouds in the context of autonomous driving. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

Deep Curiosity Search: Intra-Life Exploration Can Improve Performance on Challenging Deep Reinforcement Learning Problems

C. Stanton, J. Clune
Traditional exploration methods in RL require agents to perform random actions to find rewards. But these approaches struggle on sparse-reward domains like Montezuma's Revenge where the probability that any random action sequence leads to reward is extremely low. Recent algorithms have performed well on such tasks by encouraging agents to visit new states or perform new actions in relation to all prior training episodes (which we call across-training novelty). [...] [PDF]

Pathwise Derivatives Beyond the Reparameterization Trick

M. Jankowiak, F. Obermeyer
We observe that gradients computed via the reparameterization trick are in direct correspondence with solutions of the transport equation in the formalism of optimal transport. We use this perspective to compute (approximate) pathwise gradients for probability distributions not directly amenable to the reparameterization trick: Gamma, Beta, and Dirichlet. [...] [PDF]
International Conference on Machine Learning (ICML), 2018

Surge Pricing Moves Uber’s Driver-Partners

A. Lu, P. I. Frazier, O. Kislev
We study the impact of dynamic pricing (so-called "surge pricing") on relocation decisions by Uber's driver-partners and the corresponding revenue they collected. Using a natural experiment arising from an outage in the system that produces the surge pricing heatmap for a portion of Uber's driver-partners over 10 major cities, and a difference-in-differences approach, we study the short-run effect that visibility of the surge heatmap has on 1) drivers' decisions to relocate to areas with higher or lower prices and 2) drivers' revenue. [...] [PDF]
ACM Conference on Economics and Computation (ACM EC), 2018

VINE: An Open Source Interactive Data Visualization Tool for Neuroevolution

R. Wang, J. Clune, K. Stanley
Recent advances in deep neuroevolution have demonstrated that evolutionary algorithms, such as evolution strategies (ES) and genetic algorithms (GA), can scale to train deep neural networks to solve difficult reinforcement learning (RL) problems. However, it remains a challenge to analyze and interpret the underlying process of neuroevolution in such high dimensions. To begin to address this challenge, this paper presents an interactive data visualization tool called VINE (Visual Inspector for NeuroEvolution) aimed at helping neuroevolution researchers and end-users better understand and explore this family of algorithms. [...] [PDF]
Visualization Workshop at The Genetic and Evolutionary Computation Conference (GECCO), 2018

Robust Dense Mapping for Large-Scale Dynamic Environments

I. Bârsan, P. Liu, M. Pollefeys, A. Geiger
We present a stereo-based dense mapping algorithm for large-scale dynamic urban environments. In contrast to other existing methods, we simultaneously reconstruct the static background, the moving objects, and the potentially moving but currently stationary objects separately, which is desirable for high-level mobile robotic tasks such as path planning in crowded environments. [...] [PDF]
Video: [LINK]
Project Page: [LINK]
International Conference on Robotics and Automation (ICRA), 2018

Driver Surge Pricing

H. Nazerzadeh, N. Garg
Uber and Lyft ride-hailing marketplaces use dynamic pricing, often called surge, to balance the supply of available drivers with the demand for rides. We study pricing mechanisms for such marketplaces from the perspective of drivers, presenting the theoretical foundation that has informed the design of Uber's new additive driver surge mechanism. We present a dynamic stochastic model to capture the impact of surge pricing on driver earnings and their strategies to maximize such earnings. [...] [PDF]

Likelihood-free inference with emulator networks

J.-M. Lueckmann, G. Bassetto, T. Karaletsos, J. H. Macke
Approximate Bayesian Computation (ABC) provides methods for Bayesian inference in simulation-based stochastic models which do not permit tractable likelihoods. We present a new ABC method which uses probabilistic neural emulator networks to learn synthetic likelihoods on simulated data -- both local emulators which approximate the likelihood for specific observed data, as well as global ones which are applicable to a range of data. [...] [PDF]

Leveraging Constraint Logic Programming for Neural Guided Program Synthesis

L. Zhang, G. Rosenblatt, E. Fetaya, R. Liao, W. Byrd, R. Urtasun, R. Zemel
We present a method for solving Programming by Example (PBE) problems that tightly integrates a neural network with a constraint logic programming system called miniKanren. Internally, miniKanren searches for a program that satisfies the recursive constraints imposed by the provided examples. [...] [PDF]
International Conference on Learning Representations (ICLR), 2018

Understanding Short-Horizon Bias in Stochastic Meta-Optimization

Y. Wu, M. Ren, R. Liao, R. Grosse
Careful tuning of the learning rate, or even schedules thereof, can be crucial to effective neural net training. There has been much recent interest in gradient-based meta-optimization, where one tunes hyperparameters, or even learns an optimizer, in order to minimize the expected loss when the training procedure is unrolled. [...] [PDF]
International Conference on Learning Representations (ICLR), 2018

Measuring the Intrinsic Dimension of Objective Landscapes

C. Li, H. Farkhoor, R. Liu, J. Yosinski
Many recently trained neural networks employ large numbers of parameters to achieve good performance. One may intuitively use the number of parameters required as a rough gauge of the difficulty of a problem. But how accurate are such notions? How many parameters are really needed? In this paper we attempt to answer this question by training networks not in their native parameter space, but instead in a smaller, randomly oriented subspace. [...] [PDF]
International Conference on Learning Representations (ICLR), 2018

Gaussian Process Behaviour in Wide Deep Neural Networks

A. G. de G. Matthews, M. Rowland, J. Hron, R. E. Turner, Z. Ghahramani
Whilst deep neural networks have shown great empirical success, there is still much work to be done to understand their theoretical properties. In this paper, we study the relationship between random, wide, fully connected, feedforward networks with more than one hidden layer and Gaussian processes with a recursive kernel definition. [...] [PDF]
International Conference on Learning Representations (ICLR), 2018

Differentiable plasticity: training plastic neural networks with backpropagation

T. Miconi, J. Clune, K. Stanley
How can we build agents that keep learning from experience, quickly and efficiently, after their initial training? Here we take inspiration from the main mechanism of learning in biological brains: synaptic plasticity, carefully tuned by evolution to produce efficient lifelong learning. We show that plasticity, just like connection weights, can be optimized by gradient descent in large (millions of parameters) recurrent networks with Hebbian plastic connections. [...] [PDF]
International Conference on Machine Learning (ICML), 2018

Graph Partition Neural Networks for Semi-Supervised Classification

R. Liao, M. Brockschmidt, D. Tarlow, A. Gaunt, R. Urtasun, R. Zemel
We present graph partition neural networks (GPNN), an extension of graph neural networks (GNNs) able to handle extremely large graphs. GPNNs alternate between locally propagating information between nodes in small subgraphs and globally propagating information between the subgraphs. [...] [PDF]
Workshop @ International Conference on Learning Representations (ICLR), 2018

Sports Field Localization via Deep Structured Models

N. Homayounfar, S. Fidler, R. Urtasun
In this work, we propose a novel way of efficiently localizing a soccer field from a single broadcast image of the game. Related work in this area relies on manually annotating a few key frames and extending the localization to similar images, or installing fixed specialized cameras in the stadium from which the layout of the field can be obtained. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017

The surprising creativity of digital evolution: A collection of anecdotes from the evolutionary computation and artificial life research communities

J. Lehman, J. Clune, D. Misevic, C. Adami, L. Altenberg, J. Beaulieu, P. Bentley, S. Bernard, G. Beslon, D. Bryson, P. Chrabaszcz, N. Cheney, A. Cully, S. Doncieux, F. Dyer, K. Ellefsen, R. Feldt, S. Fischer, S. Forrest, A. Frénoy, C. Gagné, L. Goff, L. Grabowski, B. Hodjat, F. Hutter, L. Keller, C. Knibbe, P. Krcah, R. Lenski, H. Lipson, R. MacCurdy, C. Maestre, R. Miikkulainen, S. Mitri, D. Moriarty, J. Mouret, A. Nguyen, C. Ofria, M. Parizeau, D. Parsons, R. Pennock, W. Punch, T. Ray, M. Schoenauer, E. Shulte, K. Sims, K. Stanley, F. Taddei, D. Tarapore, S. Thibault, W. Weimer, R. Watson, J. Yosinski
Biological evolution provides a creative fount of complex and subtle adaptations, often surprising the scientists who discover them. However, because evolution is an algorithmic process that transcends the substrate in which it occurs, evolution's creativity is not limited to nature. [...] [PDF]

Inference in Probabilistic Graphical Models by Graph Neural Networks

K. Yoon, R. Liao, Y. Xiong, L. Zhang, E. Fetaya, R. Urtasun, R. Zemel, X. Pitkow
A fundamental computation for statistical inference and accurate decision-making is to compute the marginal probabilities or most probable states of task-relevant variables. Probabilistic graphical models can efficiently represent the structure of such complex data, but performing these inferences is generally difficult. [...] [PDF]
Workshop @ International Conference on Learning Representations (ICLR), 2018

Incorporating the Structure of the Belief State in End-to-End Task-Oriented Dialogue Systems

L. Shu, P. Molino, M. Namazifar, B. Liu, H. Xu, H. Zheng, and G. Tur
End-to-end trainable networks try to overcome error propagation, lack of generalization and overall brittleness of traditional modularized task-oriented dialogue system architectures. Most proposed models expand on the sequence-to-sequence architecture. Some of them don’t track belief state, which makes it difficult to interact with ever-changing knowledge bases, while the ones that explicitly track the belief state do it with classifiers. The use of classifiers suffers from the out-of-vocabulary words problem, making these models hard to use in real-world applications with ever-changing knowledge bases. We propose Structured Belief Copy Network (SBCN), a novel end-to-end trainable architecture that allows for interaction with external symbolic knowledge bases and solves the out-of-vocabulary problem at the same time. [...] [PDF]
Conversational Intelligence Challenge at Conference on Neural Information Processing Systems (ConvAI @ NeurIPS), 2018

Can You be More Polite and Positive? Infusing Social Language into Task-Oriented Conversational Agents

Y.-C. Wang, R. Wang, G. Tur, H. Williams
Goal-oriented conversational agents are becoming ubiquitous in daily life for tasks ranging from personal assistants to customer support systems. For these systems to engage users and achieve their goals in a more natural manner, they need to not just provide informative replies and guide users through the problems but also to socialize with users. To this end, we extend the line of style transfer research on developing generative deep learning models to control for a specific style such as sentiment and personality. [...] [PDF]
Conversational Intelligence Challenge at Conference on Neural Information Processing Systems (ConvAI @ NeurIPS), 2018

The Mirage of Action-Dependent Baselines in Reinforcement Learning

G. Tucker, S. Bhupatiraju, S. Gu, R. Turner, Z. Ghahramani, S. Levine
Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance. Several recent papers extend the baseline to depend on both the state and action and suggest that this significantly reduces variance and improves sample efficiency without introducing bias into the gradient estimates. [...] [PDF]
International Conference on Machine Learning (ICML), 2018

Weakly supervised collective feature learning from curated media

Y. Mukuta, A. Kimura, D. Adrian, Z. Ghahramani
The current state-of-the-art in feature learning relies on the supervised learning of large-scale datasets consisting of target content items and their respective category labels. However, constructing such large-scale fully-labeled datasets generally requires painstaking manual effort. One possible solution to this problem is to employ community contributed text tags as weak labels; however, the concepts underlying a single text tag strongly depend on the users. [...] [PDF]
AAAI Conference on Artificial Intelligence (AAAI), 2018

NerveNet: Learning Structured Policy with Graph Neural Networks

L. Castrejón, K. Kundu, R. Urtasun, S. Fidler
We address the problem of learning structured policies for continuous control. In traditional reinforcement learning, policies of agents are learned by multi-layer perceptrons (MLPs) which take the concatenation of all observations from the environment as input for predicting actions. [...] [PDF]
International Conference on Learning Representations (ICLR), 2018

The Gender Earnings Gap in the Gig Economy: Evidence from over a Million Rideshare Drivers

C. Cook, R. Diamond, J. Hall, J. A. List, P. Oyer
The growth of the “gig” economy generates worker flexibility that, some have speculated, will favor women. We explore this by examining labor supply choices and earnings among more than a million rideshare drivers on Uber in the U.S. [...] [PDF]

Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking agents

E. Conti, V. Madhavan, F. Such, J. Lehman, K. Stanley, J. Clune
Evolution strategies (ES) are a family of black-box optimization algorithms able to train deep neural networks roughly as well as Q-learning and policy gradient methods on challenging deep reinforcement learning (RL) problems, but are much faster (e.g. hours vs. days) because they parallelize better. [...] [PDF]
ViGIL @ NeurIPS, 2017

On the Relationship Between the OpenAI Evolution Strategy and Stochastic Gradient Descent

X. Zhang, J. Clune, K. Stanley
Because stochastic gradient descent (SGD) has shown promise optimizing neural networks with millions of parameters and few if any alternatives are known to exist, it has moved to the heart of leading approaches to reinforcement learning (RL). [...] [PDF]

Safe Mutations for Deep and Recurrent Neural Networks through Output Gradients

J. Lehman, J. Chen, J. Clune, K. Stanley
While neuroevolution (evolving neural networks) has a successful track record across a variety of domains from reinforcement learning to artificial life, it is rarely applied to large, deep neural networks. A central reason is that while random mutation generally works in low dimensions, a random perturbation of thousands or millions of weights is likely to break existing functionality, providing no learning signal even if some individual weight changes were beneficial. [...] [PDF]
The Genetic and Evolutionary Computation Conference (GECCO), 2018

Meta-Learning for Semi-Supervised Few-Shot Classification

M. Ren, E. Triantafillou, S. Ravi, J. Snell, K. Swersky, J. Tenenbaum, H. Larochelle, R. Zemel
In few-shot classification, we are interested in learning algorithms that train a classifier from only a handful of labeled examples. Recent progress in few-shot classification has featured meta-learning, in which a parameterized model for a learning algorithm is defined and trained on episodes representing different classification problems, each with a small labeled training set and its corresponding test set. [...] [PDF]
Code & Datasets: [LINK]
International Conference on Learning Representations (ICLR), 2018

Characterizing how Visual Question Answering models scale with the world

E. Bingham, P. Molino, P. Szerlip, F. Obermeyer, N. Goodman
Detecting differences in generalization ability between models for visual question answering tasks has proven to be surprisingly difficult. We propose a new statistic, asymptotic sample complexity, for model comparison, and construct a synthetic data distribution to compare a strong baseline CNN-LSTM model to a structured neural network with powerful inductive biases. [...] [PDF]
ViGIL @ NeurIPS, 2017

Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning

F. Such, V. Madhavan, E. Conti, J. Lehman, K. Stanley, J. Clune
Deep artificial neural networks (DNNs) are typically trained via gradient-based learning algorithms, namely backpropagation. Evolution strategies (ES) can rival backprop-based algorithms such as Q-learning and policy gradients on challenging deep reinforcement learning (RL) problems. [...] [PDF]
Deep RL @ NeurIPS, 2018

Open-endedness: The last grand challenge you’ve never heard of

K. Stanley
Artificial intelligence (AI) is a grand challenge for computer science. Lifetimes of effort and billions of dollars have powered its pursuit. Yet, today its most ambitious vision remains unmet: though progress continues, no human-competitive general digital intelligence is within our reach. [...] [HTML]
O’Reilly Online, 2017

The Reversible Residual Network: Backpropagation Without Storing Activations

A. Gomez, M. Ren, R. Urtasun, R. Grosse
Residual Networks (ResNets) have demonstrated significant improvement over traditional Convolutional Neural Networks (CNNs) on image classification, increasing in performance as networks grow both deeper and wider. However, memory consumption becomes a bottleneck as one needs to store all the intermediate activations for calculating gradients using backpropagation. [...] [PDF]
Advances in Neural Information Processing Systems (NeurIPS), 2017

ES Is More Than Just a Traditional Finite-Difference Approximator

J. Lehman, J. Chen, J. Clune, K. O. Stanley
An evolution strategy (ES) variant based on a simplification of a natural evolution strategy recently attracted attention because it performs surprisingly well in challenging deep reinforcement learning domains. It searches for neural network parameters by generating perturbations to the current set of parameters, checking their performance, and moving in the aggregate direction of higher reward. [...] [PDF]
The Genetic and Evolutionary Computation Conference (GECCO), 2018

Automated Identification of Northern Leaf Blight-Infected Maize Plants from Field Imagery Using Deep Learning

C. DeChant, T. Wiesner-Hanks, S. Chen, E. Stewart, J. Yosinski, M. Gore, R. Nelson, and H. Lipson
Northern leaf blight (NLB) can cause severe yield loss in maize; however, scouting large areas to accurately diagnose the disease is time consuming and difficult. We demonstrate a system capable of automatically identifying NLB lesions in field-acquired images of maize plants with high reliability. [...] [PDF]
Phytopathology, 2017

Variational Gaussian Dropout is not Bayesian

J. Hron, A. Matthews, Z. Ghahramani
Gaussian multiplicative noise is commonly used as a stochastic regularisation technique in training of deterministic neural networks. A recent paper reinterpreted the technique as a specific algorithm for approximate inference in Bayesian neural networks; several extensions ensued. [...] [PDF]
Bayesian Deep Learning Workshop @ NeurIPS, 2017

Diffusion-based neuromodulation can eliminate catastrophic forgetting in simple neural networks

R. Velez, J. Clune
A long-term goal of AI is to produce agents that can learn a diversity of skills throughout their lifetimes and continuously improve those skills via experience. A longstanding obstacle towards that goal is catastrophic forgetting, which is when learning new information erases previously learned information. [...] [PDF]
PLoS One, 2017

Be Your Own Prada: Fashion Synthesis With Structural Coherence

S. Zhu, R. Urtasun, S. Fidler, D. Lin, C. Loy
We present a novel and effective approach for generating new clothing on a wearer through generative adversarial learning. Given an input image of a person and a sentence describing a different outfit, our model "redresses" the person as desired, while at the same time keeping the wearer and her/his pose unchanged. [...] [PDF]
International Conference on Computer Vision (ICCV), 2017

DeepRoadMapper: Extracting Road Topology From Aerial Images

G. Máttyus, W. Luo, R. Urtasun
Creating road maps is essential for applications such as autonomous driving and city planning. Most approaches in industry focus on leveraging expensive sensors mounted on top of a fleet of cars. This results in very accurate estimates when exploiting a user in the loop. [...] [PDF]
International Conference on Computer Vision (ICCV), 2017

3D Graph Neural Networks for RGBD Semantic Segmentation

X. Qi, R. Liao, J. Jia, S. Fidler, R. Urtasun
RGBD semantic segmentation requires joint reasoning about 2D appearance and 3D geometric information. In this paper we propose a 3D graph neural network (3DGNN) that builds a k-nearest neighbor graph on top of 3D point cloud. [...] [PDF]
International Conference on Computer Vision (ICCV), 2017

SGN: Sequential Grouping Networks for Instance Segmentation

S. Liu, J. Jia, S. Fidler, R. Urtasun
In this paper, we propose Sequential Grouping Networks (SGN) to tackle the problem of object instance segmentation. SGNs employ a sequence of neural networks, each solving a sub-grouping problem of increasing semantic complexity in order to gradually compose objects out of pixels. [...] [PDF]
International Conference on Computer Vision (ICCV), 2017

Synthesizing Entity Matching Rules by Examples

R. Singh, V. Vamsikrishna Meduri, A. Elmagarmid, S. Madden, P. Papotti, J. Quiané-Ruiz, A. Solar-Lezama, N. Tang
Entity matching (EM) is a critical part of data integration. We study how to synthesize entity matching rules from positive-negative matching examples. The core of our solution is program synthesis, a powerful tool to automatically generate rules (or programs) that satisfy a given high-level specification, via a predefined grammar. [...] [PDF]
Proceedings of the VLDB Endowment (PVLDB) 11(2): 189-202, 2017

Uber vs Taxi: A Driver’s Eye View

J. Angrist, S. Caldwell, J. Hall
Ride-hailing drivers pay a proportion of their fares to the ride-hailing platform operator, a commission-based compensation model used by many internet-mediated service providers. To Uber drivers, this commission is known as the Uber fee. By contrast, traditional taxi drivers in most US cities make a fixed payment independent of their earnings, usually a weekly or daily medallion lease, but keep every fare dollar net of expenses. [...] [PDF]

Situation Recognition With Graph Neural Networks

R. Li, M. Tapaswi, R. Liao, J. Jia, R. Urtasun, S. Fidler
We address the problem of recognizing situations in images. Given an image, the task is to predict the most salient verb (action), and fill its semantic roles such as who is performing the action, what is the source and target of the action, etc. [...] [PDF]
International Conference on Computer Vision (ICCV), 2017

Deep Spectral Clustering Learning

M. T. Law, R. Urtasun, R. S. Zemel
Clustering is the task of grouping a set of examples so that similar examples are grouped into the same cluster while dissimilar examples are in different clusters. The quality of a clustering depends on two problem-dependent factors which are i) the chosen similarity metric and ii) the data representation. Supervised clustering approaches, which exploit labeled partitioned datasets have thus been proposed, for instance to learn a metric optimized to perform clustering. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

Lost Relatives of the Gumbel Trick

M. Balog, N. Tripuraneni, Z. Ghahramani, A. Weller
The Gumbel trick is a method to sample from a discrete probability distribution, or to estimate its normalizing partition function. The method relies on repeatedly applying a random perturbation to the distribution in a particular way, each time solving for the most likely configuration. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

A birth-death process for feature allocation

K. Palla, D. Knowles, Z. Ghahramani
We propose a Bayesian nonparametric prior over feature allocations for sequential data, the birth-death feature allocation process (BDFP). The BDFP models the evolution of the feature allocation of a set of N objects across a covariate (e.g. time) by creating and deleting features. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

Automatic Discovery of the Statistical Types of Variables in a Dataset

I. Valera, Z. Ghahramani
A common practice in statistics and machine learning is to assume that the statistical data types (e.g., ordinal, categorical or real-valued) of variables, and usually also the likelihood model, are known. However, as the availability of real-world data increases, this assumption becomes too restrictive. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

End-To-End Instance Segmentation With Recurrent Attention

M. Ren, R. Zemel
While convolutional neural networks have gained impressive success recently in solving structured prediction problems such as semantic segmentation, it remains a challenge to differentiate individual object instances in the scene. Instance segmentation is very important in a variety of applications, such as autonomous driving, image captioning, and visual question answering. [...] [PDF]
Supplementary Materials: [LINK]
Code: [LINK]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017

Efficient Multiple Instance Metric Learning Using Weakly Supervised Data

M. T. Law, Y. Yu, R. Urtasun, R. S. Zemel, E. P. Xing
We consider learning a distance metric in a weakly supervised setting where “bags” (or sets) of instances are labeled with “bags” of labels. A general approach is to formulate the problem as a Multiple Instance Learning (MIL) problem where the metric is learned so that the distances between instances inferred to be similar are smaller than the distances between instances inferred to be dissimilar. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017

Uber-Text: A Large-Scale Dataset for Optical Character Recognition from Street-Level Imagery

Y. Zhang, L. Gueguen, I. Zharkov, P. Zhang, K. Seifert, B. Kadlec
Optical Character Recognition (OCR) approaches have been widely advanced in recent years thanks to the resurgence of deep learning. The state-of-the-art models are mainly trained on the datasets consisting of the constrained scenes. Detecting and recognizing text from the real-world images remains a technical challenge. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017

Neuroevolution: A Different Kind of Deep Learning

K. Stanley
Neuroevolution is making a comeback. Prominent artificial intelligence labs and researchers are experimenting with it, a string of new successes has bolstered enthusiasm, and new opportunities for impact in deep learning are emerging. [...] [HTML]
O’Reilly Online, 2017

Surge Pricing Solves the Wild Goose Chase

J. C. Castillo, D. Knoepfle, E. G. Weyl
Ride-hailing apps usually match more efficiently than taxis, but they can enter a failure mode anticipated by Arnott (1996) that we call wild goose chases. High demand depletes the platform of idle drivers, so cars must be sent to pick up distant customers. Time wasted on pick-ups decreases drivers’ earnings, leading to exit and exacerbating the problem. [...] [PDF]
ACM Conference on Economics and Computation (ACM EC), 2018

Few-Shot Learning Through an Information Retrieval Lens

E. Triantafillou, R. Zemel, R. Urtasun
Few-shot learning refers to understanding new concepts from only a few examples. We propose an information retrieval-inspired approach for this problem that is motivated by the increased importance of maximally leveraging all the available information in this low-data regime. [PDF]
Code: [LINK]
Advances in Neural Information Processing Systems (NeurIPS), 2017

General Latent Feature Modeling for Data Exploration Tasks

I. Valera, M. Pradier, Z. Ghahramani
This paper introduces a general Bayesian nonparametric latent feature model suitable to perform automatic exploratory analysis of heterogeneous datasets, where the attributes describing each object can be either discrete, continuous or mixed variables. The proposed model presents several important properties. [...] [PDF]
ICML Workshop on Human Interpretability in Machine Learning (ICML), 2017

Time-series extreme event forecasting with neural networks at Uber

N. Laptev, J. Yosinski, L. Li, S. Smyl
Accurate time-series forecasting during high variance segments (e.g., holidays) is critical for anomaly detection, optimal resource allocation, budget planning and other related tasks. At Uber, accurate prediction of completed trips during special events can lead to more efficient driver allocation, resulting in decreased wait times for riders. [PDF]
International Conference on Machine Learning (ICML), 2017

Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning

S. Gu, T. Lillicrap, R. Turner, Z. Ghahramani, B. Schölkopf, S. Levine
Off-policy model-free deep reinforcement learning methods using previously collected data can improve sample efficiency over on-policy policy gradient techniques. On the other hand, on-policy algorithms are often more stable and easier to use. [...] [PDF]
Advances in Neural Information Processing Systems (NeurIPS), 2017

SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability

M. Raghu, J. Gilmer, J. Yosinski, J. Sohl-Dickstein
We propose a new technique, Singular Vector Canonical Correlation Analysis (SVCCA), a tool for quickly comparing two representations in a way that is both invariant to affine transform (allowing comparison between different layers and networks) and fast to compute (allowing more comparisons to be calculated than with previous methods). [...] [PDF]
Advances in Neural Information Processing Systems (NeurIPS), 2017

Find Your Way by Observing the Sun and Other Semantic Cues

W.-C. Ma, S. Wang, M. Brubaker, S. Fidler, R. Urtasun
In this paper we present a robust, efficient and affordable approach to self-localization which requires neither GPS nor knowledge about the appearance of the world. Towards this goal, we utilize freely available cartographic maps and derive a probabilistic model that exploits semantic cues in the form of sun direction, presence of an intersection, road type, speed limit as well as the ego-car trajectory in order to produce very reliable localization results. [...] [PDF]
International Conference on Robotics and Automation (ICRA), 2017

Normalizing the Normalizers: Comparing and Extending Network Normalization Schemes

M. Ren, R. Liao, R. Urtasun, F. H. Sinz, R. Zemel
Normalization techniques have only recently begun to be exploited in supervised learning tasks. Batch normalization exploits mini-batch statistics to normalize the activations. This was shown to speed up training and result in better models. However its success has been very limited when dealing with recurrent neural networks. On the other hand, layer normalization normalizes the activations across all activities within a layer. This was shown to work well in the recurrent setting. In this paper we propose a unified view of normalization techniques, as forms of divisive normalization, which includes layer and batch normalization as special cases. [...] [PDF]
International Conference on Learning Representations (ICLR), 2017

Bayesian Generative Adversarial Networks

Y. Saatchi, A. Wilson
Generative adversarial networks (GANs) can implicitly learn rich distributions over images, audio, and data which are hard to model with an explicit likelihood. We present a practical Bayesian formulation for unsupervised and semi-supervised learning with GANs. [...] [PDF]
Advances in Neural Information Processing Systems (NeurIPS), 2017

Annotating Object Instances with a Polygon-RNN

L. Castrejón, K. Kundu, R. Urtasun, S. Fidler
We propose an approach for semi-automatic annotation of object instances. While most current methods treat object segmentation as a pixel-labeling problem, we here cast it as a polygon prediction task, mimicking how most current datasets have been annotated. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017

The emergence of canalization and evolvability in an open-ended, interactive evolutionary system

J. Huizinga, K. Stanley, J. Clune
Natural evolution has produced a tremendous diversity of functional organisms. Many believe an essential component of this process was the evolution of evolvability, whereby evolution speeds up its ability to innovate by generating a more adaptive pool of offspring. [...] [PDF]
Artificial Life (to appear), 2017

Detail-Revealing Deep Video Super-Resolution

X. Tao, H. Gao, R. Liao, J. Wang, J. Jia, K. Kundu
Previous CNN-based video super-resolution approaches need to align multiple frames to the reference. In this paper, we show that proper frame alignment and motion compensation is crucial for achieving high quality results. [...] [PDF]
International Conference on Computer Vision (ICCV), 2017

Deep Bayesian Active Learning with Image Data

Y. Gal, R. Islam, Z. Ghahramani
Even though active learning forms an important pillar of machine learning, deep learning tools are not prevalent within it. Deep learning poses several difficulties when used in an active learning setting. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

Towards Diverse and Natural Image Descriptions via a Conditional GAN

B. Dai, S. Fidler, R. Urtasun, D. Lin
Despite the substantial progress in recent years, the image captioning techniques are still far from being perfect. Sentences produced by existing methods, e.g. those based on RNNs, are often overly rigid and lacking in variability. [...] [PDF]
International Conference on Computer Vision (ICCV), 2017

Bayesian inference on random simple graphs with power law degree distributions

J. Lee, C. Heaukulani, Z. Ghahramani, L. James, S. Choi
We present a model for random simple graphs with a degree distribution that obeys a power law (i.e., is heavy-tailed). To attain this behavior, the edge probabilities in the graph are constructed from Bertoin-Fujita-Roynette-Yor (BFRY) random variables, which have been recently utilized in Bayesian statistics for the construction of power law models in several applications. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

TorontoCity: Seeing the World With a Million Eyes

S. Wang; M. Bai; G. Mattyus; H. Chu; W. Luo; B. Yang; J. Liang; J. Cheverie; R. Urtasun; D. Lin.
Despite the substantial progress in recent years, the image captioning techniques are still far from being perfect. Sentences produced by existing methods, e.g. those based on RNNs, are often overly rigid and lacking in variability. [...] [PDF]
International Conference on Computer Vision (ICCV), 2017

Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space

A. Nguyen, J. Clune, Y. Bengio, A. Dosovitskiy, J. Yosinski
Generating high-resolution, photo-realistic images has been a long-standing goal in machine learning. Recently, Nguyen et al. (2016) showed one interesting way to synthesize novel images by performing gradient ascent in the latent space of a generator network to maximize the activations of one or multiple neurons in a separate classifier network. In this paper we extend this method by introducing an additional prior on the latent code, improving both sample quality and sample diversity, leading to a state-of-the-art generative model that produces high quality images at higher resolutions (227x227) than previous generative models, and does so for all 1000 ImageNet categories. [...] [PDF]
Computer Vision and Pattern Recognition (CVPR), 2017

Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic

S. Gu, T. Lillicrap, Z. Ghahramani, R. Turner, S. Levine
Model-free deep reinforcement learning (RL) methods have been successful in a wide variety of simulated domains. However, a major obstacle facing deep RL in the real world is their high sample complexity. [...] [PDF]
International Conference on Learning Representations (ICLR), 2017

An Analysis of the Labor Market for Uber’s Driver-Partners in the United States

J. Hall, A. Krueger
Uber, the ride-sharing company launched in 2010, has grown at an exponential rate. This paper provides the first comprehensive analysis of the labor market for Uber’s driver-partners, based on both survey and administrative data. [...] [PDF]

Deep Watershed Transform for Instance Segmentation

M. Bai, R. Urtasun
Most contemporary approaches to instance segmentation use complex pipelines involving conditional random fields, recurrent neural networks, object proposals, or template matching schemes. In our paper, we present a simple yet powerful end-to-end convolutional neural network to tackle this task. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017

Using Big Data to Estimate Consumer Surplus: The Case of Uber

P. Cohen, R. Hahn, J. Hall, S. Levitt, R. Metcalfe
Estimating consumer surplus is challenging because it requires identification of the entire demand curve. We rely on Uber’s “surge” pricing algorithm and the richness of its individual level data to first estimate demand elasticities at several points along the demand curve. We then use these elasticity estimates to estimate consumer surplus. [...] [PDF]

Magnetic Hamiltonian Monte Carlo

N. Tripuraneni, M. Rowland, Z. Ghahramani, R. Turner
Hamiltonian Monte Carlo (HMC) exploits Hamiltonian dynamics to construct efficient proposals for Markov chain Monte Carlo (MCMC). In this paper, we present a generalization of HMC which exploits non-canonical Hamiltonian dynamics. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence

J. Clune
Perhaps the most ambitious scientific quest in human history is the creation of general artificial intelligence, which roughly means AI that is as smart as or smarter than humans. The dominant approach in the machine learning community is to attempt to discover each of the pieces required for intelligence, with the implicit assumption that some future group will complete the Herculean task of figuring out how to combine all of those pieces into a complex thinking machine. [...] [PDF]

Forecasting Interactive Dynamics of Pedestrians with Fictitious Play

W. Ma, D. Huang, N. Lee, K. Kitani
We develop predictive models of pedestrian dynamics by encoding the coupled nature of multi-pedestrian interaction using game theory, and deep learning-based visual analysis to estimate person-specific behavior parameters. Building predictive models for multi-pedestrian interactions, however, is very challenging for two reasons [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017

Conditional Similarity Networks

A. Veit, S. Belongie, T. Karaletsos
What makes images similar? To measure the similarity between images, they are typically embedded in a feature-vector space, in which their distances preserve the relative dissimilarity. However, when learning such similarity embeddings the simplifying assumption is commonly made that images are only compared to one unique measure of similarity. [...] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2017

The Effects of Uber’s Surge Pricing: A Case Study

J. Hall, C. Kendrick, C. Nosko
A sold-out concert in Madison Square Garden provides an illustration of the power of surge to equilibrate supply of and demand for rides with Uber. Surge pricing draws more drivers into the area after the concert ends, and causes riders to sort into requesting a ride (or closing the app without requesting a ride) according to their willingness to pay relative to taking an alternative form of transportation. [...] [PDF]