Skip to footer

Optimal Feature Discovery: Better, Leaner Machine Learning Models Through Information Theory

Introduction  Suppose you own a production ML model that already works reasonably well. You know that adding relevant and diverse sources of signal to your model is a sure way to boost performance, but finding new features that actually improve performance can be a slow and tedious process of trial and error.  At the start of your search, you might look...

Automating Merchant Live Monitoring with Real-Time Analytics: Charon

At Uber, live monitoring and automation of Ops is critical to preserve marketplace health, maintain reliability, and gain efficiency in markets. By the virtue of the word “live”, this monitoring needs to show what is happening now, with prompt access to fresh data, and the ability to recommend appropriate actions based on that data. Uber’s data platform provides the...

Freight Pricing with a Controlled Markov Decision Process

Intro Uber Freight was launched in 2017 to revolutionize the business of matching shippers and carriers in the huge and inefficient freight trucking industry (around $800B annual spend in the US). We believe, and have demonstrated, that a technology-first freight broker and marketplace can provide better opportunities to carriers, and superior outcomes to shippers and communities alike.  One of the wasteful...

Flipr: Making Changes Quickly and Safely at Scale

Introduction Uber’s many software systems require a high volume of changes every day. Because of our systems’ size and complexity, it is a significant challenge to implement these changes without unintended consequences, ultimately slowing down developer productivity. Flipr is a big part of Uber’s solution to solving this problem. Flipr is a tool that we created for dynamic configuration management,...

Uber’s Journey Toward Better Data Culture From First Principles

Data powers Uber Uber has revolutionized how the world moves by powering billions of rides and deliveries connecting millions of riders, businesses, restaurants, drivers, and couriers. At the heart of this massive transportation platform is Big Data and Data Science that powers everything that Uber does, such as better pricing and matching, fraud detection, lowering ETAs, and experimentation. Petabytes of...

Navigating to the Technical Program Management and Learning Team

Spread across 4 continents, the Technical Strategy, Program Management, and Learning team is composed of Technical Program Managers (TPMs), Technical Writers, Technical Strategists, and Technical Training Program Managers. Uber TPMs play a critical role in executing high-impact, company-wide initiatives and continuously improving processes to increase the effectiveness of our Product and Engineering organizations. On the Learning side, Program Managers and Technical Writers increase...

Elastic Deep Learning with Horovod on Ray

Introduction In 2017, we introduced Horovod, an open source framework for scaling deep learning training across hundreds of GPUs in parallel.  At the time, most of the deep learning use cases at Uber were related to the research and development of self-driving vehicles, while in Michelangelo, the vast majority of production machine learning models were tree models based on XGBoost. Now...

Applying Machine Learning in Internal Audit with Sparsely Labeled Data

As machine learning continues to evolve, transforming the various industries it touches, it has only begun to inform the world of audit. As a data scientist and former CPA Auditor, I can understand why this is the case. By nature, auditing is a field that focuses on the fine details and investigates any exceptions, while machine learning typically seeks...

How Uber Deals with Large iOS App Size

The App Size Problem Uber’s iOS mobile Apps for Rider, Driver, and Eats are large in size. The choice of Swift as our primary programming language, our fast-paced development environment and feature additions, layered software and its dependencies, and statically linked platform libraries result in large app binaries. Reducing application size is critical to our customer experience. Moreover, Apple’s app-download-size...

Evolving Schemaless into a Distributed SQL Database

Introduction In 2016 we published blog posts (I, II) about Schemaless - Uber Engineering’s Scalable Datastore. We went over the design of Schemaless as well as explained the reasoning behind developing it. In this post today we are going to talk about the evolution of Schemaless into a general-purpose transactional database called Docstore.  Docstore is a general-purpose multi-model database that provides...

Fast and Reliable Schema-Agnostic Log Analytics Platform

At Uber, we provide a centralized, reliable, and interactive logging platform that empowers engineers to work quickly and confidently at scale. The logs are tagged with a rich set of contextual key value pairs, with which engineers can slice and dice their data to surface abnormal or interesting patterns that can guide product improvement. Right now, the platform is...

Uber’s Real-time Data Intelligence Platform At Scale: Improving Gairos Scalability/Reliability

Background Real-time data (# of ride requests, # of drivers available, weather, game) enables operations teams to make informed decisions like surge pricing, maximum dispatch ETA calculating, and demand/supply forecasting about our services that improve user experiences on the Uber platform. While batched data can provide powerful insights by identifying medium-term and long-term trends, Uber services can combine streaming data...

The Journey Towards Metric Standardization

At Uber, business metrics are vital for discovering insights about how we perform, gauging the impact of new products, and optimizing the decision making process. The use cases for metrics can range from an operations member diagnosing a fares issue at the trip level to a machine learning model for dynamic pricing that shapes a balanced and robust marketplace...

Disaster Recovery for Multi-Region Kafka at Uber

Apache Kafka at Uber Uber has one of the largest deployments of Apache Kafka in the world, processing trillions of messages and multiple petabytes of data per day. As Figure 1 shows, today we position Apache Kafka as a cornerstone to Uber’s technology stack and build a complex ecosystem on top of it to empower a large number of different...

Uber’s Real-Time Push Platform

Uber builds multi-sided marketplaces handling millions of trips every day across the globe. We strive to build real-time experiences for all our users. The nature of real time marketplaces make them very lively. Over the course of a trip, there are multiple participants that can modify and view the state of an ongoing trip and need real-time updates. This creates...

No Code Workflow Orchestrator for Building Batch & Streaming Pipelines at Scale

Data-In-Motion @ Uber At Uber, several petabytes of data move across and within various platforms every day. We power this data movement by a strong backbone of data pipelines. Whether it’s ingesting the data from millions of Uber trips or transforming the ingested data for analytical and machine learning models, it all runs through these pipelines. To put it in...

Horovod v0.21: Optimizing Network Utilization with Local Gradient Aggregation and Grouped Allreduce

We originally open-sourced Horovod in 2017, and since then it has grown to become the standard solution in industry for scaling deep learning training to hundreds of GPUs.  With Horovod, you can reduce training times from days or weeks to hours or minutes by adding just a few lines of Python code to an existing TensorFlow, PyTorch, or Apache...

Turning Metadata Into Insights with Databook

Every day in over 10,000 cities around the world, millions of people rely on Uber to travel, order food, and ship cargo. Our apps and services are available in over 69 countries and run 24 hours a day. At our global scale, these activities generate large amounts of logging & operational data that runs through our systems in real-time....

Meet the 2020 Safety Engineering Interns: COVID Edition

About the Safety team & What we do Uber is dedicated to keeping people safe on the road. The Safety and Insurance Engineering team is at the core of Uber’s business. We work to redefine what it takes to be safe on the roads at a global scale. Our technology enables us to focus on rider safety before, during, and...

Operating Apache Pinot @ Uber Scale

Introduction Uber has a complex marketplace consisting of riders, drivers, eaters, restaurants and so on. Operating that marketplace at a global scale requires real-time intelligence and decision making. For instance, identifying delayed Uber Eats orders or abandoned carts helps to enable our community operations team to take corrective action. Having a real-time dashboard of different events such as consumer demand,...

Popular Articles