Zoubin Ghahramani

Zoubin Ghahramani is Chief Scientist of Uber and a world leader in the field of machine learning, significantly advancing the state-of-the-art in algorithms that can learn from data. He is known in particular for fundamental contributions to probabilistic modeling and Bayesian approaches to machine learning systems and AI. Zoubin also maintains his roles as Professor of Information Engineering at the University of Cambridge and Deputy Director of the Leverhulme Centre for the Future of Intelligence. He was one of the founding directors of the Alan Turing Institute (the UK's national institute for Data Science and AI), and is a Fellow of St John's College Cambridge and of the Royal Society.

Engineering Blog Articles

Introducing the Uber Research Publications Site

Uber's Chief Scientist announces the launch of the Uber Research Publications Site, a portal for showcasing our contributions to the research community.

First Uber Science Symposium: Discussing the Next Generation of RL, NLP, ConvAI, and DL

The Uber Science Symposium featured talks from members of the broader scientific community about the latest innovations in RL, NLP, and other fields.

Announcing the 2019 Uber AI Residency

The Uber AI Residency is a 12-month training program for academics and professionals interested in becoming an AI researcher with Uber AI Labs or Uber ATG.

Introducing the Uber AI Residency

Interested in accelerating your career by tackling some of Uber’s most challenging AI problems? Apply for the Uber AI Residency, a research fellowship dedicated to fostering the next generation of AI talent.

Welcoming Peter Dayan to Uber AI Labs

Arriving now: Uber's Chief Scientist Zoubin Ghahramani introduces Uber AI Labs' newest team member, award-winning neuroscientist Peter Dayan.

Research Papers

Probabilistic Meta-Representations Of Neural Networks

T. Karaletsos, P. Dayan, Z. Ghahramani
Existing Bayesian treatments of neural networks are typically characterized by weak prior and approximate posterior distributions according to which all the weights are drawn independently. Here, we consider a richer prior distribution in which units in the network are represented by latent variables, and the weights between units are drawn conditionally on the values of the collection of those variables. [...] [PDF]
UAI 2018 Uncertainty In Deep Learning Workshop (UDL), 2018

Functional Programming for Modular Bayesian Inference

A. Ścibior, O. Kammar, Z. Ghahramani
We present an architectural design of a library for Bayesian modelling and inference in modern functional programming languages. The novel aspect of our approach is the modular implementation of existing state-of-the-art inference algorithms. Our design relies on three inherently functional features: higher-order functions, inductive data-types, and support for either type-classes or an expressive module system. [...] [PDF]

Discovering Interpretable Representations for Both Deep Generative and Discriminative Models

T. Adel, Z. Ghahramani, A. Weller
Interpretability of representations in both deep generative and discriminative models is highly desirable. Current methods jointly optimize an objective combining accuracy and interpretability. However, this may reduce accuracy, and is not applicable to already trained models. We propose two interpretability frameworks. First, we provide an interpretable lens for an existing model. We use a generative model which takes as input the representation in an existing (generative or discriminative) model, weakly supervised by limited side information. [...] [PDF]
International Conference on Machine Learning (ICML), 2018

Variational Bayesian dropout: pitfalls and fixes

J. Hron, A. Matthews, Z. Ghahramani
Dropout, a stochastic regularisation technique for training of neural networks, has recently been reinterpreted as a specific type of approximate inference algorithm for Bayesian neural networks. The main contribution of the reinterpretation is in providing a theoretical framework useful for analysing and extending the algorithm [...] [PDF]
International Conference on Machine Learning (ICML), 2018

Gaussian Process Behaviour in Wide Deep Neural Networks

Alexander G. de G. Matthews, Mark Rowland, Jiri Hron, Richard E. Turner, Zoubin Ghahramani
Whilst deep neural networks have shown great empirical success, there is still much work to be done to understand their theoretical properties. In this paper, we study the relationship between random, wide, fully connected, feedforward networks with more than one hidden layer and Gaussian processes with a recursive kernel definition. [...] [PDF]
International Conference on Learning Representations (ICLR), 2018

The Mirage of Action-Dependent Baselines in Reinforcement Learning

G. Tucker, S. Bhupatiraju, S. Gu, R. Turner, Z. Ghahramani, S. Levine
Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance. Several recent papers extend the baseline to depend on both the state and action and suggest that this significantly reduces variance and improves sample efficiency without introducing bias into the gradient estimates. [...] [PDF]
International Conference on Machine Learning (ICML), 2018

Weakly supervised collective feature learning from curated media

Y. Mukuta, A. Kimura, D. Adrian, Z. Ghahramani
The current state-of-the-art in feature learning relies on the supervised learning of large-scale datasets consisting of target content items and their respective category labels. However, constructing such large-scale fully-labeled datasets generally requires painstaking manual effort. One possible solution to this problem is to employ community-contributed text tags as weak labels; however, the concepts underlying a single text tag strongly depend on the users. [...] [PDF]
AAAI Conference on Artificial Intelligence (AAAI), 2018

Variational Gaussian Dropout is not Bayesian

J. Hron, A. Matthews, Z. Ghahramani
Gaussian multiplicative noise is commonly used as a stochastic regularisation technique in training of deterministic neural networks. A recent paper reinterpreted the technique as a specific algorithm for approximate inference in Bayesian neural networks; several extensions ensued. [...] [PDF]
Bayesian Deep Learning Workshop @ NeurIPS, 2017

Lost Relatives of the Gumbel Trick

M. Balog, N. Tripuraneni, Z. Ghahramani, A. Weller
The Gumbel trick is a method to sample from a discrete probability distribution, or to estimate its normalizing partition function. The method relies on repeatedly applying a random perturbation to the distribution in a particular way, each time solving for the most likely configuration. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

A birth-death process for feature allocation

K. Palla, D. Knowles, Z. Ghahramani
We propose a Bayesian nonparametric prior over feature allocations for sequential data, the birth-death feature allocation process (BDFP). The BDFP models the evolution of the feature allocation of a set of N objects across a covariate (e.g. time) by creating and deleting features. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

Automatic Discovery of the Statistical Types of Variables in a Dataset

I. Valera, Z. Ghahramani
A common practice in statistics and machine learning is to assume that the statistical data types (e.g., ordinal, categorical or real-valued) of variables, and usually also the likelihood model, are known. However, as the availability of real-world data increases, this assumption becomes too restrictive. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

General Latent Feature Modeling for Data Exploration Tasks

I. Valera, M. Pradier, Z. Ghahramani
This paper introduces a general Bayesian nonparametric latent feature model suitable to perform automatic exploratory analysis of heterogeneous datasets, where the attributes describing each object can be either discrete, continuous or mixed variables. The proposed model presents several important properties. [...] [PDF]
ICML Workshop on Human Interpretability in Machine Learning (ICML), 2017

Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning

S. Gu, T. Lillicrap, R. Turner, Z. Ghahramani, B. Schölkopf, S. Levine
Off-policy model-free deep reinforcement learning methods using previously collected data can improve sample efficiency over on-policy policy gradient techniques. On the other hand, on-policy algorithms are often more stable and easier to use. [...] [PDF]
Advances in Neural Information Processing Systems (NeurIPS), 2017

Deep Bayesian Active Learning with Image Data

Y. Gal, R. Islam, Z. Ghahramani
Even though active learning forms an important pillar of machine learning, deep learning tools are not prevalent within it. Deep learning poses several difficulties when used in an active learning setting. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

Bayesian inference on random simple graphs with power law degree distributions

J. Lee, C. Heaukulani, Z. Ghahramani, L. James, S. Choi
We present a model for random simple graphs with a degree distribution that obeys a power law (i.e., is heavy-tailed). To attain this behavior, the edge probabilities in the graph are constructed from Bertoin-Fujita-Roynette-Yor (BFRY) random variables, which have been recently utilized in Bayesian statistics for the construction of power law models in several applications. [...] [PDF]
International Conference on Machine Learning (ICML), 2017

Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic

S. Gu, T. Lillicrap, Z. Ghahramani, R. Turner, S. Levine
Model-free deep reinforcement learning (RL) methods have been successful in a wide variety of simulated domains. However, a major obstacle facing deep RL in the real world is their high sample complexity. [...] [PDF]
International Conference on Learning Representations (ICLR), 2017

Magnetic Hamiltonian Monte Carlo

N. Tripuraneni, M. Rowland, Z. Ghahramani, R. Turner
Hamiltonian Monte Carlo (HMC) exploits Hamiltonian dynamics to construct efficient proposals for Markov chain Monte Carlo (MCMC). In this paper, we present a generalization of HMC which exploits non-canonical Hamiltonian dynamics. [...] [PDF]
International Conference on Machine Learning (ICML), 2017
