Skip to footer

Articles

How We Halved Go Monorepo CI Build Time

Painting the Picture Before 2021, Uber engineers would have to take quite a taxing journey to make a code change to the Go Monorepo. First,...

Enabling Offline Inferences at Uber Scale

Introduction At Uber we use data from user support interactions to identify gaps in our products and create better, more delightful experiences for our users....

Uber’s Real-Time Document Check

Introduction Justification for Identity Verification Latin America is a rich cultural region, known for its world-renowned gastronomy, its abundant biodiversity, and its welcoming population. However, socio-economic...

Data Race Patterns in Go

Uber has adopted Golang (Go for short) as a primary programming language for developing microservices. Our Go monorepo consists of about 50 million lines...

Better Load Balancing: Real-Time Dynamic Subsetting

Overview Subsetting is a common technique used in load balancing for large-scale distributed systems. In this blog post, we will briefly introduce Uber’s current service...

Dynamic Data Race Detection in Go Code

Uber has extensively adopted Go as a primary programming language for developing microservices. Our Go monorepo consists of about 50 million lines of code...

Presto® on Apache Kafka® At Uber Scale

Uber’s goal is to ignite opportunity by setting the world in motion, and big data is a very important part of that. Presto® and...

Securing Kafka® Infrastructure at Uber

Background Uber has one of the largest deployments of Apache Kafka® in the world. It empowers a large number of real-time workflows at Uber, including pub-sub...

Uber’s Emergency Button and The Technologies Behind It

Safety has long been a top priority at Uber, as Uber’s CEO Dara Khosrowshahi wrote in ‘Raising the Bar on Safety’ in September 2018....

Avoiding CPU Throttling in a Containerized Environment

At Uber, all stateful workloads run on a common containerized platform across a large fleet of hosts. Stateful workloads include MySQL®, Apache Cassandra®, ElasticSearch®,...

One Stone, Three Birds: Finer-Grained Encryption @ Apache Parquet™

Overview  Data access restrictions, retention, and encryption at rest are fundamental security controls. This blog explains how we have built and utilized open-sourced Apache Parquet™'s...

Introducing Ballast: An Adaptive Load Test Framework

As Uber's architecture has grown to encompass thousands of interdependent microservices, we need to test our mission-critical components at max load in order to...

DeepETA: How Uber Predicts Arrival Times Using Deep Learning

At Uber, magical customer experiences depend on accurate arrival time predictions (ETAs). We use ETAs to calculate fares, estimate pickup times, match riders to...

Project RADAR: Intelligent Early Fraud Detection System with Humans in the Loop

Introduction Uber is a worldwide marketplace of services, processing thousands of monetary transactions every second. As a marketplace, Uber takes on all of the risks...

Cost Efficiency @ Scale in Big Data File Format

  Background Our Apache Hadoop® based data platform ingests hundreds of petabytes of analytical data with minimum latency and stores it in a data lake built...

Capacity Recommendation Engine: Throughput and Utilization Based Predictive Scaling

Introduction Capacity is a key component of reliability. Uber's services require enough resources in order to handle daily peak traffic and to support our different...

Cadence Multi-Tenant Task Processing

Introduction Cadence is a multi-tenant orchestration framework that helps developers at Uber to write fault-tolerant, long-running applications, also known as workflows. It scales horizontally to...

CRISP: Critical Path Analysis for Microservice Architectures

Uber’s backend is an exemplar of microservice architecture. Each microservice is a small, individually deployable program performing a specific business logic (operation). The microservice...

How Uber Migrated Financial Data from DynamoDB to Docstore

Introduction Each day, Uber moves millions of people around the world and delivers tens of millions of food and grocery orders. This generates a large...

Introducing uGroup: Uber’s Consumer Management Framework

Background Apache Kafka® is widely used across Uber’s multiple business lines. Take the example of an Uber ride: When a user opens up the Uber app,...