Pathwise Derivatives for Multivariate Distributions

    Abstract

    We exploit the link between the transport equation and derivatives of expectations to construct efficient pathwise gradient estimators for multivariate distributions. We focus on two main threads. First, we use null solutions of the transport equation to build adaptive control variates that yield gradient estimators with reduced variance. Second, we consider the case of multivariate mixture distributions. In particular, we show how to compute pathwise derivatives for mixtures of multivariate Normal distributions with arbitrary means and diagonal covariances. We demonstrate in a variety of experiments in the context of variational inference that our gradient estimators can outperform other methods, especially in high dimensions.
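    To ground the terminology, the following sketch illustrates the basic pathwise (reparameterization) gradient estimator for a single diagonal Gaussian — the standard building block that the paper extends to mixtures and equips with control variates. This is not the paper's method, just a minimal illustration: for z ~ N(mu, sigma²) we write z = mu + sigma·eps with eps ~ N(0, 1), so gradients with respect to mu and sigma can be pushed through the sampling path.

    ```python
    import numpy as np

    def pathwise_gradients(f_grad, mu, sigma, num_samples=100_000, seed=0):
        """Monte Carlo pathwise gradients of E_{z ~ N(mu, sigma^2)}[f(z)]
        with respect to mu and sigma, using z = mu + sigma * eps.
        `f_grad` is the derivative of the test function f."""
        rng = np.random.default_rng(seed)
        eps = rng.standard_normal(num_samples)
        z = mu + sigma * eps
        g = f_grad(z)
        # d/dmu E[f(z)] = E[f'(z)] and d/dsigma E[f(z)] = E[f'(z) * eps]
        return g.mean(), (g * eps).mean()

    # Example: f(z) = z^2, so E[f(z)] = mu^2 + sigma^2,
    # giving exact gradients 2*mu (w.r.t. mu) and 2*sigma (w.r.t. sigma).
    grad_mu, grad_sigma = pathwise_gradients(lambda z: 2.0 * z, mu=1.5, sigma=0.5)
    ```

    The variance of such estimators is exactly what the control variates in the paper target, and the mixture case is harder because a mixture has no single smooth reparameterization of this form.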

    Authors

    Martin Jankowiak, Theofanis Karaletsos

    Conference

    AISTATS 2019

    Full Paper

    ‘Pathwise Derivatives for Multivariate Distributions’ (PDF)

    Uber AI

    Martin Jankowiak
    Martin is a former particle physicist whose interest in data and modeling goes back to the Large Hadron Collider. After physics stints at Stanford and Heidelberg, he became a Research Scientist at the Center for Urban Science and Progress at NYU with a focus on applied machine learning research. Martin then joined a small machine learning start-up (Geometric Intelligence), with the happy end result that he joined Uber AI Labs in March 2017.
    Theofanis Karaletsos
    Theofanis took his first steps as a machine learner at the Max Planck Institute for Intelligent Systems, in collaboration with Microsoft Research Cambridge, with work focused on unsupervised knowledge extraction from unstructured data, such as generative modeling of images and phenotyping for biology. He then moved to Memorial Sloan Kettering Cancer Center in New York, where he worked on machine learning in the context of cancer therapeutics. He joined the small AI startup Geometric Intelligence in 2016 and with his colleagues formed the new Uber AI Labs. Theofanis' research interests are focused on rich probabilistic modeling, approximate inference, and probabilistic programming. His main passion is structured models, examples of which are spatio-temporal processes, models of image formation, deep probabilistic models, and the tools needed to make them work on real data. His past in the life sciences has also made him keenly interested in model interpretability and uncertainty quantification, non-traditional learning settings such as weakly supervised learning, and model criticism.