
LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving

G. P. Meyer, A. Laddha, E. Kee, C. Vallespi-Gonzalez, C. Wellington

In this paper, we present LaserNet, a computationally efficient method for 3D object detection from LiDAR data for autonomous driving. The efficiency results from processing LiDAR data in the native range view of the sensor, where the input data is naturally compact. […]
[PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Understanding and Designing for Deaf or Hard of Hearing Drivers on Uber

S. Lee, B. Hubert-Wallander, M. Stevens, J. M. Carroll

We used content analysis of in-app driver survey responses, customer support tickets, and tweets, and face-to-face interviews of DHH Uber drivers to better understand the DHH driver experience. Here we describe challenges DHH drivers experience and how they address those difficulties via Uber’s accessibility features and their own workarounds. […]
[PDF]
Conference on Human Factors in Computing Systems (CHI), 2019

End-to-end Interpretable Neural Motion Planner

W. Zeng, W. Luo, S. Suo, A. Sadat, B. Yang, S. Casas, R. Urtasun
In this paper, we propose a neural motion planner for learning to drive autonomously in complex urban scenarios that include traffic-light handling, yielding, and interactions with multiple road-users. Towards this goal, we design a holistic model that takes as input raw LIDAR data and an HD map and produces interpretable intermediate representations in the form of 3D detections and their future trajectories, as well as a cost volume defining the goodness of each position that the self-driving car can take within the planning horizon. […] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Learning to Localize through Compressed Binary Maps

X. Wei, I. A. Bârsan, S. Wang, J. Martinez, R. Urtasun
One of the main difficulties of scaling current localization systems to large environments is the on-board storage required for the maps. In this paper we propose to learn to compress the map representation such that it is optimal for the localization task. […] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Convolutional Recurrent Network for Road Boundary Extraction

J. Liang, N. Homayounfar, S. Wang, W.-C. Ma, R. Urtasun
Creating high definition maps that contain precise information of static elements of the scene is of utmost importance for enabling self driving cars to drive safely. In this paper, we tackle the problem of drivable road boundary extraction from LiDAR and camera imagery. […] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Multi-Task Multi-Sensor Fusion for 3D Object Detection

M. Liang, B. Yang, Y. Chen, R. Hu, R. Urtasun
In this paper we propose to exploit multiple related tasks for accurate multi-sensor 3D object detection. Towards this goal we present an end-to-end learnable architecture that reasons about 2D and 3D object detection as well as ground estimation and depth completion. […] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Deep Rigid Instance Scene Flow

W.-C. Ma, S. Wang, R. Hu, Y. Xiong, R. Urtasun
In this paper we tackle the problem of scene flow estimation in the context of self-driving. We leverage deep learning techniques as well as strong priors as in our application domain the motion of the scene can be composed by the motion of the robot and the 3D motion of the actors in the scene. […] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Dimensionality Reduction for Representing the Knowledge of Probabilistic Models

M. T. Law, J. Snell, A.-M. Farahmand, R. Urtasun, R. S. Zemel
Most deep learning models rely on expressive high-dimensional representations to achieve good performance on tasks such as classification. However, the high dimensionality of these representations makes them difficult to interpret and prone to over-fitting. We propose a simple, intuitive and scalable dimension reduction framework that takes into account the soft probabilistic interpretation of standard deep models for classification. […] [PDF]
International Conference on Learning Representations (ICLR), 2019

DARNet: Deep Active Ray Network for Building Segmentation

D. Cheng, R. Liao, S. Fidler, R. Urtasun
In this paper, we propose a Deep Active Ray Network (DARNet) for automatic building segmentation. Taking an image as input, it first exploits a deep convolutional neural network (CNN) as the backbone to predict energy maps, which are further utilized to construct an energy function. […] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Building Together: When Research Went Viral at Uber

B. Hubert-Wallander, E. G. Ruiz, M. Jain, L. G. Barrio, S. S. Mitra, M. Stevens

In late 2017, Uber was nearly a year into a complete redesign of its driver-facing mobile app. This case study describes the research program we executed to support the app’s global beta launch, which aimed to “Build Together” with drivers across different geographies. […]
[PDF][VIDEO]
Conference on Human Factors in Computing Systems (CHI), 2019

UPSNet: A Unified Panoptic Segmentation Network

Y. Xiong, R. Liao, H. Zhao, R. Hu, M. Bai, E. Yumer, R. Urtasun
In this paper, we propose a unified panoptic segmentation network (UPSNet) for tackling the newly proposed panoptic segmentation task. […] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2019

Learning a Generative Model for Multi-Step Human-Object Interactions from Videos

H. Wang, S. Pirk, V. Kim, E. Yumer, L. Guibas
Creating dynamic virtual environments consisting of humans interacting with objects is a fundamental problem in computer graphics. While it is well-accepted that agent interactions play an essential role in synthesizing such scenes, most extant techniques exclusively focus on static scenes, leaving the dynamic component out. In this paper, we present a generative model to synthesize plausible multi-step dynamic human–object interactions. […] [PDF]
European Association for Computer Graphics (Eurographics), 2019

DeepSignals: Predicting Intent of Drivers Through Visual Attributes

D. Frossard, E. Kee, R. Urtasun
Detecting the intention of drivers is an essential task in self-driving, necessary to anticipate sudden events like lane changes and stops. Turn signals and emergency flashers communicate such intentions, providing seconds of potentially critical reaction time. In this paper, we propose to detect these signals in video sequences by using a deep neural network that reasons about both spatial and temporal information. […] [PDF]
International Conference on Robotics and Automation (ICRA), 2019

Exploratory Stage Lighting Design using Visual Objectives

E. Shimizu, S. Paris, M. Fisher, E. Yumer, K. Fatahalian
Lighting is a critical element of theater. A lighting designer is responsible for drawing the audience’s attention to a specific part of the stage, setting time of day, creating a mood, and conveying emotions. Designers often begin the lighting design process by collecting reference visual imagery that captures different aspects of their artistic intent. Then, they experiment with various lighting options to determine which ideas work best on stage. However, modern stages contain tens to hundreds of lights, and setting each light source’s parameters individually to realize an idea is both tedious and requires expert skill. In this paper, we describe an exploratory lighting design tool based on feedback from professional designers. […] [PDF]
European Association for Computer Graphics (Eurographics), 2019

Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask

H. Zhou, J. Lan, R. Liu, J. Yosinski

The recent “Lottery Ticket Hypothesis” paper by Frankle & Carbin showed that a simple approach to creating sparse networks (keep the large weights) results in models that are trainable from scratch, but only when starting from the same initial weights. […] [PDF]
2019

Metropolis-Hastings Generative Adversarial Networks

R. Turner, J. Hung, Y. Saatci, J. Yosinski
We introduce the Metropolis-Hastings generative adversarial network (MH-GAN), which combines aspects of Markov chain Monte Carlo and GANs. The MH-GAN draws samples from the distribution implicitly defined by a GAN’s discriminator-generator pair, as opposed to sampling in a standard GAN which draws samples from the distribution defined by the generator. […] [PDF]
International Conference on Machine Learning (ICML), 2019

Exact Gaussian Processes on a Million Data Points

K. A. Wang, G. Pleiss, J. R. Gardner, S. Tyree, K. Q. Weinberger, A. G. Wilson
Gaussian processes (GPs) are flexible models with state-of-the-art performance on many impactful applications. However, computational constraints with standard inference procedures have limited exact GPs to problems with fewer than about ten thousand training points, necessitating approximations for larger datasets. In this paper, we develop a scalable approach for exact GPs that leverages multi-GPU parallelization and methods like linear conjugate gradients, accessing the kernel matrix only through matrix multiplication. […] [PDF]
arXiv, 2019

Keeping master green at scale

S. Ananthanarayanan, M. S. Ardekani, D. Haenikel, B. Varadarajan, S. Soriano, D. Patel, A.-R. Adl-Tabatabai
This paper presents the design and implementation of SubmitQueue. It guarantees an always green master branch at scale: all build steps (e.g., compilation, unit tests, UI tests) successfully execute for every commit point. SubmitQueue has been in production for over a year, and can scale to thousands of daily commits to giant monolithic repositories. […] [PDF]
European Conference on Computer Systems (EuroSys), 2019

Quantum speedup at zero temperature via coherent catalysis

G. A. Durkin
We prove that quantum speed-up is possible in certain models of quantum annealing with non-stoquastic drivers. The results contradict conventional mean-field analysis in the thermodynamic limit. Asymptotic analysis of finite-size systems predicts the dominant behaviour, both scaling and coefficients, of numerical results for systems of more than 50 qubits, indicating the legitimacy and importance of quantum transport by vacuum delocalization. [PDF]
American Physical Society (APS), 2019

Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions

R. Wang, J. Lehman, J. Clune, K. Stanley
While the history of machine learning so far encompasses a series of problems posed by researchers and algorithms that learn their solutions, an important question is whether the problems themselves can be generated by the algorithm at the same time as they are being solved. […] [PDF]
2019

Go-Explore: a New Approach for Hard-Exploration Problems

A. Ecoffet, J. Huizinga, J. Lehman, K. Stanley, J. Clune
A grand challenge in reinforcement learning is intelligent exploration, especially when rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard-exploration domains: Montezuma’s Revenge and Pitfall. On both games, current RL algorithms perform poorly, even those with intrinsic motivation, which is the dominant method to improve performance on hard-exploration domains. […] [PDF]
2019

Photo-Sketching: Inferring Contour Drawings from Images

M. Li, Z. Lin, R. Mech, E. Yumer, D. Ramanan

Edges, boundaries and contours are important subjects of study in both computer graphics and computer vision. On one hand, they are the 2D elements that convey 3D shapes, on the other hand, they are indicative of occlusion events and thus separation of objects or semantic concepts. In this paper, we aim to generate contour drawings, boundary-like drawings that capture the outline of the visual scene. Prior art often cast this problem as boundary detection. […] [PDF]
Winter Conference on Applications of Computer Vision (WACV), 2019

Neural Guided Constraint Logic Programming for Program Synthesis

L. Zhang, G. Rosenblatt, E. Fetaya, R. Liao, W. Byrd, M. Might, R. Urtasun, R. Zemel
Synthesizing programs using example input/outputs is a classic problem in artificial intelligence. We present a method for solving Programming By Example (PBE) problems by using a neural model to guide the search of a constraint logic programming system called miniKanren. […] [PDF]
Advances in Neural Information Processing Systems (NeurIPS), 2018

Robustness to out-of-distribution inputs via task-aware generative uncertainty

R. McAllister, G. Kahn, J. Clune, S. Levine
Deep learning provides a powerful tool for machine perception when the observations resemble the training data. However, real-world robotic systems must react intelligently to their observations even in unexpected circumstances. This requires a system to reason about its own uncertainty given unfamiliar, out-of-distribution observations. […] [PDF]
International Conference on Robotics and Automation (ICRA), 2019

LanczosNet: Multi-Scale Deep Graph Convolutional Networks

R. Liao, Z. Zhao, R. Urtasun, R. Zemel
Relational data can generally be represented as graphs. For processing such graph structured data, we propose LanczosNet, which uses the Lanczos algorithm to construct low rank approximations of the graph Laplacian for graph convolution. […] [PDF]
Neural Information Processing Systems (NeurIPS), 2018

Graph HyperNetworks for Neural Architecture Search

C. Zhang, M. Ren, R. Urtasun
Neural architecture search (NAS) automatically finds the best task-specific neural network topology, outperforming many manual architecture designs. However, it can be prohibitively expensive as the search requires training thousands of different networks, while each can last for hours. In this work, we propose the Graph HyperNetwork (GHN) to amortize the search cost: given an architecture, it directly generates the weights by running inference on a graph neural network. […] [PDF]
Meta Learning workshop @ Neural Information Processing Systems (NeurIPS), 2018

Predicting Motion of Vulnerable Road Users using High-Definition Maps and Efficient ConvNets

F. Chou, T.-H. Lin, H. Cui, V. Radosavljevic, T. Nguyen, T. Huang, M. Niedoba, J. Schneider, N. Djuric
Following detection and tracking of traffic actors, prediction of their future motion is the next critical component of a self-driving vehicle (SDV), allowing the SDV to move safely and efficiently in its environment. This is particularly important when it comes to vulnerable road users (VRUs), such as pedestrians and bicyclists. We present a deep learning method for predicting VRU movement where we rasterize high-definition maps and actor’s surroundings into bird’s-eye view image used as input to convolutional networks. […] [PDF]
MLITS workshop @ Neural Information Processing Systems (NeurIPS), 2018

Rotated Rectangles for Symbolized Building Footprint Extraction

M. Dickenson, L. Gueguen
Building footprints (BFP) provide useful visual context for users of digital maps when navigating in space. This paper proposes a method for extracting and symbolizing building footprints from satellite imagery using a convolutional neural network (CNN). […] [PDF]
Conference on Computer Vision and Pattern Recognition (CVPR), 2018

An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents

F. Such, V. Madhavan, R. Liu, R. Wang, P. Castro, Y. Li, L. Schubert, M. Bellemare, J. Clune, J. Lehman
Much human and computational effort has aimed to improve how deep reinforcement learning algorithms perform on benchmarks such as the Atari Learning Environment. Comparatively less effort has focused on understanding what has been learned by such methods, and investigating and comparing the representations learned by different families of reinforcement learning (RL) algorithms. […] [PDF]
2018

Faster Neural Networks Straight from JPEG

L. Gueguen, A. Sergeev, B. Kadlec, R. Liu, J. Yosinski
The simple, elegant approach of training convolutional neural networks (CNNs) directly from RGB pixels has enjoyed overwhelming empirical success. But can more performance be squeezed out of networks by using different input representations? In this paper we propose and explore a simple idea: train CNNs directly on the blockwise discrete cosine transform (DCT) coefficients computed and available in the middle of the JPEG codec. […] [PDF]
Advances in Neural Information Processing Systems (NeurIPS), 2018
