
Uber’s Fulfillment Platform: Ground-up Re-architecture to Accelerate Uber’s Go/Get Strategy

Introduction to Fulfillment at Uber Uber’s mission is to help our consumers effortlessly go anywhere and get anything in thousands of cities worldwide. At its core, we capture a consumer’s intent and fulfill it by matching it with the right set of providers.  Fulfillment is the “act or process of delivering a product or service to a customer.” The Fulfillment organization...

Containerizing Apache Hadoop Infrastructure at Uber

Introduction As Uber’s business grew, we scaled our Apache Hadoop (referred to as ‘Hadoop’ in this article) deployment to 21000+ hosts in 5 years, to support the various analytical and machine learning use cases. We built a team with varied expertise to address the challenges we faced running Hadoop on bare-metal: host lifecycle management, deployment and automation, Hadoop core development,...

‘Orders Near You’ and User-Facing Analytics on Real-Time Geospatial Data

Introduction By its nature, Uber’s business is highly real-time and contingent upon geospatial data. PBs of data are continuously being collected from our drivers, riders, restaurants, and eaters. Real-time analytics over this geospatial data could provide powerful insights. In this blog, we will highlight the Orders near you feature from the Uber Eats app, illustrating one example of how Uber generates...

Analyzing Customer Issues to Improve User Experience

Introduction The primary goal for customer support is to ensure users’ issues are addressed and resolved in a timely and effective manner. The kinds of issues users face and what they say in their support interactions provide a lot of information about the product experience, any technical or operational gaps, and even their general sentiment towards the product or company....

Customer Support Automation Platform at Uber

High Level Overview of the Problem Introduction If you’ve used any online/digital service, chances are that you are familiar with what a typical customer service experience entails: you send a message (usually email aliased) to the company’s support staff, fill out a form, expect some back and forth with a customer service representative (CSR), and hopefully have your issue resolved. This...

Tuning Model Performance

Introduction Uber uses machine learning (ML) models to power critical business decisions. An ML model goes through many experiment iterations before making it to production. During the experimentation phase, data scientists or machine learning engineers explore adding features, tuning parameters, and running offline analysis or backtesting. We enhanced the platform to reduce the human toil and time in this stage,...

Elastic Distributed Training with XGBoost on Ray

Introduction Since we productionized distributed XGBoost on Apache Spark™ at Uber in 2017, XGBoost has powered a wide spectrum of machine learning (ML) use cases at Uber, spanning from optimizing marketplace dynamic pricing policies for Freight, improving estimated time of arrival (ETA) prediction, and fraud detection and prevention, to content discovery and recommendation for Uber Eats. However, as Uber has scaled, we have...

Continuous Integration and Deployment for Machine Learning Online Serving and Models

Introduction At Uber, we have witnessed a significant increase in machine learning adoption across various organizations and use-cases over the last few years. Our machine learning models are empowering a better customer experience, helping prevent safety incidents, and ensuring market efficiency, all in real time. The figure above is a high level view of CI/CD for models and service binary. One...

Efficient and Reliable Compute Cluster Management at Scale

Introduction Uber relies on a containerized microservice architecture. Our need for computational resources has grown significantly over the years, as a consequence of the business’s growth. It is now an important goal to increase the efficiency of our computing resources. Broadly speaking, the efficiency efforts in compute cluster management involve scheduling more workloads on the same number of machines. This approach...

Handling Flaky Unit Tests in Java

Introduction to Flaky Tests Unit testing forms the bedrock of any Continuous Integration (CI) system. It warns software engineers of bugs in newly-implemented code and regressions in existing code, before it is merged. This ensures increased software reliability. It also improves overall developer productivity, as bugs are caught early in the software development lifecycle. Hence, building a stable and reliable...

The Evolution of Data Science Workbench

In October 2017, we published an article introducing Data Science Workbench (DSW), our custom, all-in-one toolbox for data science, complex geospatial analytics, and exploratory machine learning. It centralizes everything required to perform data preparation, ad-hoc analyses, model prototyping, workflow scheduling, dashboarding, and collaboration in a single-pane, web-based graphical user interface.  In this article, we reflect on the evolution of DSW...

Scaling of Uber’s API gateway

As a recap from the last article, Uber’s API Gateway provides an interface and acts as a single point of access for all of our back-end services to expose features and data to Mobile and 3rd party partners. Two major components for a system like API Gateway are configuration management and runtime. The runtime component is responsible for authenticating,...

Fraud Detection: Using Relational Graph Learning to Detect Collusion

As Uber grew in popularity and scale among legitimate customers, it also attracted the attention of financial criminals in cyberspace. One type of fraudulent behavior is collusion, a cooperative fraud action among users. For example, users collude by taking fake trips with stolen credit cards, resulting in chargebacks (bank-initiated refunds for credit card purchases). In this...

The Architecture of Uber’s API gateway

API gateways have become an integral part of microservices architecture in recent years. An API gateway provides a single point of entry for all our apps and provides an interface to access data, logic, or functionality from back-end microservices. It also provides a centralized place to implement many high-level responsibilities, including routing, protocol conversion, rate limiting, load shedding, header enrichment...

Introducing Orbit, An Open Source Package for Time Series Inference and Forecasting

Orbit is a general interface for Bayesian time series modeling. The goal of the Orbit development team is to create a tool that is easy to use, flexible, interpretable, and high performing (fast computation). Under the hood, Orbit uses probabilistic programming languages (PPLs), including but not limited to Stan and Pyro, for posterior approximation (i.e., MCMC sampling, SVI). Below...

pprof++: A Go Profiler with Hardware Performance Monitoring

Motivation for a Better Go Profiler Golang is the lifeblood of thousands of Uber’s back-end services, running on millions of CPU cores. Understanding our CPU bottlenecks is critical, both for reducing service latencies and for making our compute fleet efficient. The scale at which Uber operates demands in-depth insights into code and its microarchitectural implications. While the built-in Go profiler is...

Optimal Feature Discovery: Better, Leaner Machine Learning Models Through Information Theory

Introduction  Suppose you own a production ML model that already works reasonably well. You know that adding relevant and diverse sources of signal to your model is a sure way to boost performance, but finding new features that actually improve performance can be a slow and tedious process of trial and error.  At the start of your search, you might look...

Automating Merchant Live Monitoring with Real-Time Analytics: Charon

At Uber, live monitoring and automation of Ops is critical to preserve marketplace health, maintain reliability, and gain efficiency in markets. By virtue of the word “live”, this monitoring needs to show what is happening now, with prompt access to fresh data, and the ability to recommend appropriate actions based on that data. Uber’s data platform provides the...

Freight Pricing with a Controlled Markov Decision Process

Intro Uber Freight was launched in 2017 to revolutionize the business of matching shippers and carriers in the huge and inefficient freight trucking industry (around $800B annual spend in the US). We believe, and have demonstrated, that a technology-first freight broker and marketplace can provide better opportunities to carriers, and superior outcomes to shippers and communities alike.  One of the wasteful...

Flipr: Making Changes Quickly and Safely at Scale

Introduction Uber’s many software systems require a high volume of changes every day. Because of our systems’ size and complexity, it is a significant challenge to implement these changes without unintended consequences, ultimately slowing down developer productivity. Flipr is a big part of Uber’s solution to this problem. Flipr is a tool that we created for dynamic configuration management,...
