Better Load Balancing: Real-Time Dynamic Subsetting
Overview
Subsetting is a common technique used in load balancing for large-scale distributed systems. In this blog post, we will briefly introduce Uber’s current service...
One Stone, Three Birds: Finer-Grained Encryption @ Apache Parquet™
Overview
Data access restrictions, retention, and encryption at rest are fundamental security controls. This blog explains how we have built and utilized open-sourced Apache Parquet™'s...
Introducing Ballast: An Adaptive Load Test Framework
As Uber's architecture has grown to encompass thousands of interdependent microservices, we need to test our mission-critical components at max load in order to...
Cadence Multi-Tenant Task Processing
Introduction
Cadence is a multi-tenant orchestration framework that helps developers at Uber to write fault-tolerant, long-running applications, also known as workflows. It scales horizontally to...
How Uber Migrated Financial Data from DynamoDB to Docstore
Introduction
Each day, Uber moves millions of people around the world and delivers tens of millions of food and grocery orders. This generates a large...
Introducing uGroup: Uber’s Consumer Management Framework
Background
Apache Kafka® is widely used across Uber’s multiple business lines. Take the example of an Uber ride: When a user opens up the Uber app,...
Building Uber’s Fulfillment Platform for Planet-Scale using Google Cloud Spanner
Introduction
The Fulfillment Platform is a foundational Uber domain that enables the rapid scaling of new verticals. The platform handles billions of database transactions each...
Real-Time Exactly-Once Ad Event Processing with Apache Flink, Kafka, and Pinot
Uber recently launched a new capability: Ads on UberEats. With this new ability came new challenges that needed to be solved at Uber, such...
Jellyfish: Cost-Effective Data Tiering for Uber’s Largest Storage System
Problem
Uber deploys a few storage technologies to store business data based on their application model. One such technology is called Schemaless, which enables the...
Streaming Real-Time Analytics with Redis, AWS Fargate, and Dash Framework
Introduction
Uber’s GSS (Global Scaled Solutions) team runs scaled programs for diverse products and businesses, including but not limited to Eats, Rides, and Freight. The...
How Uber Achieves Operational Excellence in the Data Quality Experience
Uber delivers efficient and reliable transportation across the global marketplace, which is powered by hundreds of services, machine learning models, and tens of thousands...
Uber’s Fulfillment Platform: Ground-up Re-architecture to Accelerate Uber’s Go/Get Strategy
Introduction to Fulfillment at Uber
Uber’s mission is to help our consumers effortlessly go anywhere and get anything in thousands of cities worldwide. At its...
Containerizing Apache Hadoop Infrastructure at Uber
Introduction
As Uber’s business grew, we scaled our Apache Hadoop (referred to as ‘Hadoop’ in this article) deployment to 21,000+ hosts in 5 years, to...
Customer Support Automation Platform at Uber
High Level Overview of the Problem
Introduction
If you’ve used any online/digital service, chances are that you are familiar with what a typical customer service experience...
Elastic Distributed Training with XGBoost on Ray
Introduction
Since we productionized distributed XGBoost on Apache Spark™ at Uber in 2017, XGBoost has powered a wide spectrum of machine learning (ML) use cases...
Efficient and Reliable Compute Cluster Management at Scale
Introduction
Uber relies on a containerized microservice architecture. Our need for computational resources has grown significantly over the years as a consequence of the business’s growth....
Handling Flaky Unit Tests in Java
Introduction to Flaky Tests
Unit testing forms the bedrock of any Continuous Integration (CI) system. It warns software engineers of bugs in newly-implemented code and...
Scaling of Uber’s API gateway
As a recap from the last article, Uber’s API Gateway provides an interface and acts as a single point of access for all of...
The Architecture of Uber’s API gateway
API gateways have become an integral part of microservices architecture in recent years. An API gateway provides a single point of entry for all our...
Flipr: Making Changes Quickly and Safely at Scale
Introduction
Uber’s many software systems require a high volume of changes every day. Because of our systems’ size and complexity, it is a significant challenge...