
Zoubin Ghahramani

Zoubin Ghahramani is Chief Scientist of Uber and a world leader in the field of machine learning, significantly advancing the state of the art in algorithms that can learn from data. He is known in particular for fundamental contributions to probabilistic modeling and Bayesian approaches to machine learning systems and AI. Zoubin also maintains his roles as Professor of Information Engineering at the University of Cambridge and Deputy Director of the Leverhulme Centre for the Future of Intelligence. He was one of the founding directors of the Alan Turing Institute (the UK's national institute for Data Science and AI), and is a Fellow of St John's College, Cambridge, and of the Royal Society.

Engineering Blog Articles

Uber AI in 2019: Advancing Mobility with Artificial Intelligence


Artificial intelligence powers many of the technologies and services underpinning Uber’s platform, allowing engineering and data science teams to make informed decisions that help improve user experiences for products across our lines of business. 

At the forefront of this effort

Announcing the 2020 Uber AI Residency


Connecting the digital and physical worlds safely and reliably on the Uber platform presents exciting technological challenges and opportunities. For Uber, artificial intelligence (AI) is essential to developing systems that are capable of optimized, automated decision making at scale.


Introducing the Uber Research Publications Site

Zoubin Ghahramani is Uber’s Chief Scientist and the Head of AI.

The ease and simplicity of Uber’s platform is built on fundamental advances in science and technology. Teams across Uber are committed to developing the most advanced scientific techniques in

First Uber Science Symposium: Discussing the Next Generation of RL, NLP, ConvAI, and DL


At Uber, hundreds of data scientists, economists, AI researchers and engineers, product analysts, behavioral scientists, and other practitioners leverage scientific methods to solve challenges on our platform. From modeling and experimentation to data analysis, algorithm development, and fundamental research, the

Announcing the 2019 Uber AI Residency


By Theofanis Karaletsos, Ersin Yumer, Raquel Urtasun, and Zoubin Ghahramani on behalf of Uber AI Labs and Uber ATG

Connecting the digital and physical worlds safely and reliably on the Uber platform presents exciting technological challenges and opportunities. For Uber,

Introducing the Uber AI Residency


Uber AI Labs and Uber ATG Toronto are excited to announce the Uber AI Residency, an intensive one-year research training program slated to begin this summer.  

Uber has invested substantially in machine learning and artificial intelligence, with groups around

Welcoming Peter Dayan to Uber AI Labs


Zoubin Ghahramani is Uber’s Chief Scientist and head of Uber AI Labs, Uber’s research arm dedicated to artificial intelligence (AI) and machine learning. Below, Ghahramani introduces AI Labs’ newest team member, award-winning neuroscientist Peter Dayan.  

We are thrilled to

Research Papers

Learning Continuous Treatment Policy and Bipartite Embeddings for Matching with Heterogeneous Causal Effects

W. Y. Zou, S. Shyam, M. Mui, M. Wang, J. Pedersen, Z. Ghahramani
Causal inference methods are widely applied in the fields of medicine, policy, and economics. Central to these applications is the estimation of treatment effects to make decisions. Current methods make binary yes-or-no decisions based on the treatment effect of a single outcome dimension. These methods are unable to capture continuous space treatment policies with a measure of intensity. [...] [PDF]

Probabilistic Meta-Representations Of Neural Networks

T. Karaletsos, P. Dayan, Z. Ghahramani
Existing Bayesian treatments of neural networks are typically characterized by weak prior and approximate posterior distributions according to which all the weights are drawn independently. Here, we consider a richer prior distribution in which units in the network are represented by latent variables, and the weights between units are drawn conditionally on the values of the collection of those variables. [...] [PDF]
UAI 2018 Uncertainty In Deep Learning Workshop (UDL), 2018

Functional Programming for Modular Bayesian Inference

A. Ścibior, O. Kammar, Z. Ghahramani
We present an architectural design of a library for Bayesian modelling and inference in modern functional programming languages. The novel aspect of our approach is the modular implementation of existing state-of-the-art inference algorithms. Our design relies on three inherently functional features: higher-order functions, inductive data-types, and support for either type-classes or an expressive module system. [...] [PDF]

Discovering Interpretable Representations for Both Deep Generative and Discriminative Models

T. Adel, Z. Ghahramani, A. Weller
Interpretability of representations in both deep generative and discriminative models is highly desirable. Current methods jointly optimize an objective combining accuracy and interpretability. However, this may reduce accuracy, and is not applicable to already trained models. We propose two interpretability frameworks. First, we provide an interpretable lens for an existing model. We use a generative model which takes as input the representation in an existing (generative or discriminative) model, weakly supervised by limited side information. [...] [PDF]
International Conference on Machine Learning (ICML), 2018

Variational Bayesian dropout: pitfalls and fixes

J. Hron, A. Matthews, Z. Ghahramani
Dropout, a stochastic regularisation technique for training of neural networks, has recently been reinterpreted as a specific type of approximate inference algorithm for Bayesian neural networks. The main contribution of the reinterpretation is in providing a theoretical framework useful for analysing and extending the algorithm [...] [PDF]
International Conference on Machine Learning (ICML), 2018

Gaussian Process Behaviour in Wide Deep Neural Networks

Alexander G. de G. Matthews, Mark Rowland, Jiri Hron, Richard E. Turner, Zoubin Ghahramani
Whilst deep neural networks have shown great empirical success, there is still much work to be done to understand their theoretical properties. In this paper, we study the relationship between random, wide, fully connected, feedforward networks with more than one hidden layer and Gaussian processes with a recursive kernel definition. [...] [PDF]
International Conference on Learning Representations (ICLR), 2018

The Mirage of Action-Dependent Baselines in Reinforcement Learning

G. Tucker, S. Bhupatiraju, S. Gu, R. Turner, Z. Ghahramani, S. Levine
Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance. Several recent papers extend the baseline to depend on both the state and action and suggest that this significantly reduces variance and improves sample efficiency without introducing bias into the gradient estimates. [...] [PDF]
International Conference on Machine Learning (ICML), 2018

Weakly supervised collective feature learning from curated media

Y. Mukuta, A. Kimura, D. Adrian, Z. Ghahramani
The current state-of-the-art in feature learning relies on the supervised learning of large-scale datasets consisting of target content items and their respective category labels. However, constructing such large-scale fully-labeled datasets generally requires painstaking manual effort. One possible solution to this problem is to employ community-contributed text tags as weak labels; however, the concepts underlying a single text tag strongly depend on the users. [...] [PDF]
AAAI Conference on Artificial Intelligence (AAAI), 2018

Variational Gaussian Dropout is not Bayesian

J. Hron, A. Matthews, Z. Ghahramani
Gaussian multiplicative noise is commonly used as a stochastic regularisation technique in training of deterministic neural networks. A recent paper reinterpreted the technique as a specific algorithm for approximate inference in Bayesian neural networks; several extensions ensued. [...] [PDF]
Bayesian Deep Learning Workshop @ NeurIPS, 2017

Lost Relatives of the Gumbel Trick

M. Balog, N. Tripuraneni, Z. Ghahramani, A. Weller
The Gumbel trick is a method to sample from a discrete probability distribution, or to estimate its normalizing partition function. The method relies on repeatedly applying a random perturbation to the distribution in a particular way, each time solving for the most likely configuration. [...] [PDF]
International Conference on Machine Learning (ICML), 2017
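
The sampling and partition-function uses of the Gumbel trick described in the abstract above can be illustrated with a minimal sketch. This is the textbook Gumbel-max construction, not code from the paper; the function names (`sample_gumbel`, `gumbel_max_sample`, `estimate_log_partition`) are hypothetical, and the estimator assumes standard Gumbel(0, 1) noise, whose mean is the Euler-Mascheroni constant.

```python
import math
import random

# Mean of a standard Gumbel(0, 1) variable (Euler-Mascheroni constant).
EULER_GAMMA = 0.5772156649015329


def sample_gumbel(rng):
    """Draw one standard Gumbel(0, 1) variate via the inverse CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_max_sample(log_weights, rng):
    """Sample index i with probability proportional to exp(log_weights[i]).

    Each unnormalized log-weight is perturbed with independent Gumbel noise,
    and the index of the largest perturbed value is returned.
    """
    perturbed = [lw + sample_gumbel(rng) for lw in log_weights]
    return max(range(len(perturbed)), key=perturbed.__getitem__)


def estimate_log_partition(log_weights, rng, num_samples=1000):
    """Estimate log Z, where Z = sum_i exp(log_weights[i]).

    The maximum of the perturbed log-weights is Gumbel-distributed with
    location log Z, so its mean minus EULER_GAMMA estimates log Z.
    """
    total = 0.0
    for _ in range(num_samples):
        total += max(lw + sample_gumbel(rng) for lw in log_weights)
    return total / num_samples - EULER_GAMMA


if __name__ == "__main__":
    rng = random.Random(0)
    log_w = [math.log(1.0), math.log(2.0), math.log(7.0)]  # Z = 10
    print("sampled index:", gumbel_max_sample(log_w, rng))
    print("estimated log Z:", estimate_log_partition(log_w, rng, 5000))
    print("exact log Z:", math.log(10.0))
```

In this toy case the exact partition function is trivial to compute; the trick is of interest when finding the most likely perturbed configuration is much cheaper than summing over all configurations.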

A birth-death process for feature allocation

K. Palla, D. Knowles, Z. Ghahramani
We propose a Bayesian nonparametric prior over feature allocations for sequential data, the birth-death feature allocation process (BDFP). The BDFP models the evolution of the feature allocation of a set of N objects across a covariate (e.g. time) by creating and deleting features. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

Automatic Discovery of the Statistical Types of Variables in a Dataset

I. Valera, Z. Ghahramani
A common practice in statistics and machine learning is to assume that the statistical data types (e.g., ordinal, categorical or real-valued) of variables, and usually also the likelihood model, are known. However, as the availability of real-world data increases, this assumption becomes too restrictive. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

General Latent Feature Modeling for Data Exploration Tasks

I. Valera, M. Pradier, Z. Ghahramani
This paper introduces a general Bayesian nonparametric latent feature model suitable to perform automatic exploratory analysis of heterogeneous datasets, where the attributes describing each object can be either discrete, continuous or mixed variables. The proposed model presents several important properties. [...] [PDF]
ICML Workshop on Human Interpretability in Machine Learning (ICML), 2017

Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning

S. Gu, T. Lillicrap, R. Turner, Z. Ghahramani, B. Schölkopf, S. Levine
Off-policy model-free deep reinforcement learning methods using previously collected data can improve sample efficiency over on-policy policy gradient techniques. On the other hand, on-policy algorithms are often more stable and easier to use. [...] [PDF]
Advances in Neural Information Processing Systems (NeurIPS), 2017

Deep Bayesian Active Learning with Image Data

Y. Gal, R. Islam, Z. Ghahramani
Even though active learning forms an important pillar of machine learning, deep learning tools are not prevalent within it. Deep learning poses several difficulties when used in an active learning setting. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

Bayesian inference on random simple graphs with power law degree distributions

J. Lee, C. Heaukulani, Z. Ghahramani, L. James, S. Choi
We present a model for random simple graphs with a degree distribution that obeys a power law (i.e., is heavy-tailed). To attain this behavior, the edge probabilities in the graph are constructed from Bertoin-Fujita-Roynette-Yor (BFRY) random variables, which have been recently utilized in Bayesian statistics for the construction of power law models in several applications. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic

S. Gu, T. Lillicrap, Z. Ghahramani, R. Turner, S. Levine
Model-free deep reinforcement learning (RL) methods have been successful in a wide variety of simulated domains. However, a major obstacle facing deep RL in the real world is their high sample complexity. [...] [PDF]
International Conference on Learning Representations (ICLR), 2017

Magnetic Hamiltonian Monte Carlo

N. Tripuraneni, M. Rowland, Z. Ghahramani, R. Turner
Hamiltonian Monte Carlo (HMC) exploits Hamiltonian dynamics to construct efficient proposals for Markov chain Monte Carlo (MCMC). In this paper, we present a generalization of HMC which exploits non-canonical Hamiltonian dynamics. [...] [PDF]
International Conference on Machine Learning (ICML), 2017