
Piero Molino

Piero is a Staff Research Scientist in the Hazy research group at Stanford University. He is a former founding member of Uber AI where he created Ludwig, worked on applied projects (COTA, Graph Learning for Uber Eats, Uber’s Dialogue System) and published research on NLP, Dialogue, Visualization, Graph Learning, Reinforcement Learning and Computer Vision.

Engineering Blog Articles

Ludwig v0.3 Introduces Hyperparameter Optimization, Transformers and TensorFlow 2 support

In February 2019, Uber released Ludwig, an open source, code-free deep learning (DL) toolbox that gives non-programmers and advanced machine learning (ML) practitioners alike the power to develop models for a variety of DL tasks. With use cases spanning text

Meta-Graph: Few-Shot Link Prediction Using Meta-Learning


This article is based on the paper “Meta-Graph: Few Shot Link Prediction via Meta Learning” by Joey Bose, Ankit Jain, Piero Molino, and William L. Hamilton.

Many real-world data sets are structured as graphs, and as such, machine

Controlling Text Generation with Plug and Play Language Models


This article is based on the paper “Plug and Play Language Models: A Simple Approach to Controlled Text Generation” by Sumanth Dathathri, Andrea Madotto, Janice Lan, Jane Hung, Eric Frank, Piero Molino, Jason Yosinski, and Rosanne Liu.


Food Discovery with Uber Eats: Using Graph Learning to Power Recommendations

The Uber Eats app serves as a portal to more than 320,000 restaurant-partners in over 500 cities globally across 36 countries. In order to make the user experience more seamless and easy-to-navigate, we show users the dishes, restaurants, and cuisines

Ludwig v0.2 Adds New Features and Other Improvements to its Deep Learning Toolbox


Uber released Ludwig, our open source, code-free deep learning toolbox, in February 2019, introducing the world to one of the easiest ways to get started building machine learning models. The simplicity and the declarative nature of Ludwig’s model definition

Introducing Ludwig, a Code-Free Deep Learning Toolbox

Over the last decade, deep learning models have proven highly effective at performing a wide variety of machine learning tasks in vision, speech, and language. At Uber we are using these models for a variety of tasks, including customer support

An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution


Uber uses convolutional neural networks in many domains that could potentially involve coordinate transforms, from designing self-driving vehicles to automating street sign detection to build maps and maximizing the efficiency of spatial movements in the Uber Marketplace.

In deep learning,

COTA: Improving Uber Customer Care with NLP & Machine Learning


To facilitate the best end-to-end experience possible for users, Uber is committed to making customer support easier and more accessible. Working toward this goal, Uber’s Customer Obsession team leverages five different customer-agent communication channels powered by an in-house platform that

Research Papers

Plug and Play Language Models: A Simple Approach to Controlled Text Generation

S. Dathathri, A. Madotto, J. Lan, J. Hung, E. Frank, P. Molino, J. Yosinski, R. Liu
Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g. switching topic or sentiment) is difficult without modifying the model architecture or fine-tuning on attribute-specific data and entailing the significant cost of retraining. We propose a simple alternative: the Plug and Play Language Model (PPLM) for controllable language generation, which combines a pretrained LM with one or more simple attribute classifiers that guide text generation without any further training of the LM. [PDF]
International Conference on Learning Representations (ICLR), 2020

Flexibly-Structured Model for Task-Oriented Dialogues

L. Shu, P. Molino, M. Namazifar, H. Xu, B. Liu, H. Zheng, G. Tur
This paper proposes a novel end-to-end architecture for task-oriented dialogue systems. It is based on a simple and practical yet very effective sequence-to-sequence approach, where language understanding and state tracking tasks are modeled jointly with a structured copy-augmented sequential decoder and a multi-label decoder for each slot. The policy engine and language generation tasks are modeled jointly following that. [...] [PDF]

Collaborative Multi-Agent Dialogue Model Training Via Reinforcement Learning

A. Papangelis, Y.-C. Wang, P. Molino, G. Tur
We present the first complete attempt at concurrently training conversational agents that communicate only via self-generated language. Using DSTC2 as seed data, we trained natural language understanding (NLU) and generation (NLG) networks for each agent and let the agents interact online. [...] [PDF]
Special Interest Group on Discourse and Dialogue (SIGDIAL), 2019

Manifold: A Model-Agnostic Framework for Interpretation and Diagnosis of Machine Learning Models

J. Zhang, Y. Wang, P. Molino, L. Li, D. Ebert
Interpretation and diagnosis of machine learning models have gained renewed interest in recent years with breakthroughs in new approaches. We present Manifold, a framework that utilizes visual analysis techniques to support interpretation, debugging, and comparison of machine learning models in a more transparent and interactive manner. [...] [PDF]
IEEE Visualization (IEEE VIS), 2018

COTA: Improving the Speed and Accuracy of Customer Support through Ranking and Deep Networks

P. Molino, H. Zheng, Y.-C. Wang
For a company looking to provide delightful user experiences, it is of paramount importance to take care of any customer issues. This paper proposes COTA, a system to improve speed and reliability of customer support for end users through automated ticket classification and answers selection for support representatives. [...] [PDF]
ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2018

An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution

R. Liu, J. Lehman, P. Molino, F. P. Such, E. Frank, A. Sergeev, J. Yosinski
Few ideas have enjoyed as large an impact on deep learning as convolution. For any problem involving pixels or spatial representations, common intuition holds that convolutional neural networks may be appropriate. In this paper we show a striking counterexample to this intuition via the seemingly trivial coordinate transform problem, which simply requires learning a mapping between coordinates in (x,y) Cartesian space and one-hot pixel space. [...] [PDF]
Advances in Neural Information Processing Systems (NeurIPS), 2018

Incorporating the Structure of the Belief State in End-to-End Task-Oriented Dialogue Systems

L. Shu, P. Molino, M. Namazifar, B. Liu, H. Xu, H. Zheng, G. Tur
End-to-end trainable networks try to overcome error propagation, lack of generalization and overall brittleness of traditional modularized task-oriented dialogue system architectures. Most proposed models expand on the sequence-to-sequence architecture. Some of them don’t track belief state, which makes it difficult to interact with ever-changing knowledge bases, while the ones that explicitly track the belief state do it with classifiers. The use of classifiers suffers from the out-of-vocabulary words problem, making these models hard to use in real-world applications with ever-changing knowledge bases. We propose Structured Belief Copy Network (SBCN), a novel end-to-end trainable architecture that allows for interaction with external symbolic knowledge bases and solves the out-of-vocabulary problem at the same time. [...] [PDF]
Conversational Intelligence Challenge at Conference on Neural Information Processing Systems (ConvAI @ NeurIPS), 2018

Characterizing how Visual Question Answering models scale with the world

E. Bingham, P. Molino, P. Szerlip, F. Obermeyer, N. Goodman
Detecting differences in generalization ability between models for visual question answering tasks has proven to be surprisingly difficult. We propose a new statistic, asymptotic sample complexity, for model comparison, and construct a synthetic data distribution to compare a strong baseline CNN-LSTM model to a structured neural network with powerful inductive biases. [...] [PDF]
Visually Grounded Interaction and Language Workshop (ViGIL @ NeurIPS), 2017
