Open Source

Introducing Shadower: A Minimalistic Load Testing Tool

Introduction: Shadower is a load testing tool that allows us to provide load testing as a service to any microservice at Uber. Shadower started as a...

How We Halved Go Monorepo CI Build Time

Painting the Picture: Before 2021, Uber engineers would have to take quite a taxing journey to make a code change to the Go Monorepo. First,...

Enabling Offline Inferences at Uber Scale

Introduction: At Uber we use data from user support interactions to identify gaps in our products and create better, more delightful experiences for our users....

Uber’s Real-Time Document Check

Introduction: Justification for Identity Verification. Latin America is a rich cultural region, known for its world-renowned gastronomy, its abundant biodiversity, and its welcoming population. However, socio-economic...

Data Race Patterns in Go

Uber has adopted Golang (Go for short) as a primary programming language for developing microservices. Our Go monorepo consists of about 50 million lines...

Dynamic Data Race Detection in Go Code

Uber has extensively adopted Go as a primary programming language for developing microservices. Our Go monorepo consists of about 50 million lines of code...

Presto® on Apache Kafka® At Uber Scale

Uber’s goal is to ignite opportunity by setting the world in motion, and big data is a very important part of that. Presto® and...

Securing Kafka® Infrastructure at Uber

Background: Uber has one of the largest deployments of Apache Kafka® in the world. It empowers a large number of real-time workflows at Uber, including pub-sub...

Uber’s Emergency Button and The Technologies Behind It

Safety has long been a top priority at Uber, as Uber’s CEO Dara Khosrowshahi wrote in ‘Raising the Bar on Safety’ in September 2018....

Avoiding CPU Throttling in a Containerized Environment

At Uber, all stateful workloads run on a common containerized platform across a large fleet of hosts. Stateful workloads include MySQL®, Apache Cassandra®, ElasticSearch®,...

One Stone, Three Birds: Finer-Grained Encryption @ Apache Parquet™

Overview: Data access restrictions, retention, and encryption at rest are fundamental security controls. This blog explains how we have built and utilized open-sourced Apache Parquet™'s...

Introducing Ballast: An Adaptive Load Test Framework

As Uber's architecture has grown to encompass thousands of interdependent microservices, we need to test our mission-critical components at max load in order to...

Introducing Carbon Feed for Earners: The One-Stop Info Shop

After launching the Driver App in 2018 to over 2 million earners worldwide, we added content and functionality at a rapid pace. Although this...

DeepETA: How Uber Predicts Arrival Times Using Deep Learning

At Uber, magical customer experiences depend on accurate arrival time predictions (ETAs). We use ETAs to calculate fares, estimate pickup times, match riders to...

Project RADAR: Intelligent Early Fraud Detection System with Humans in the Loop

Introduction: Uber is a worldwide marketplace of services, processing thousands of monetary transactions every second. As a marketplace, Uber takes on all of the risks...

Cost Efficiency @ Scale in Big Data File Format

Background: Our Apache Hadoop® based data platform ingests hundreds of petabytes of analytical data with minimum latency and stores it in a data lake built...

Capacity Recommendation Engine: Throughput and Utilization Based Predictive Scaling

Introduction: Capacity is a key component of reliability. Uber's services require enough resources in order to handle daily peak traffic and to support our different...

The New Version of Orbit (v1.1) is Released: The Improvements, Design Changes, and Exciting...

Introduction: The previous post gave an overview of Orbit, a Python package developed by Uber in order to perform Bayesian time-series analysis and forecasting. This...

How We Saved 70K Cores Across 30 Mission-Critical Services (Large-Scale, Semi-Automated Go GC Tuning...

Introduction: As part of Uber engineering’s wide efforts to reach profitability, recently our team was focused on reducing cost of compute capacity by improving efficiency....

Cadence Multi-Tenant Task Processing

Introduction: Cadence is a multi-tenant orchestration framework that helps developers at Uber to write fault-tolerant, long-running applications, also known as workflows. It scales horizontally to...