Skip to footer

General Engineering

Uber’s Emergency Button and The Technologies Behind It

Safety has long been a top priority at Uber, as Uber’s CEO Dara Khosrowshahi wrote in ‘Raising the Bar on Safety’ in September 2018....

Avoiding CPU Throttling in a Containerized Environment

At Uber, all stateful workloads run on a common containerized platform across a large fleet of hosts. Stateful workloads include MySQL®, Apache Cassandra®, ElasticSearch®,...

One Stone, Three Birds: Finer-Grained Encryption @ Apache Parquet™

Overview  Data access restrictions, retention, and encryption at rest are fundamental security controls. This blog explains how we have built and utilized open-sourced Apache Parquet™'s...

Introducing Ballast: An Adaptive Load Test Framework

As Uber's architecture has grown to encompass thousands of interdependent microservices, we need to test our mission-critical components at max load in order to...

Cost Efficiency @ Scale in Big Data File Format

  Background Our Apache Hadoop® based data platform ingests hundreds of petabytes of analytical data with minimum latency and stores it in a data lake built...

CRISP: Critical Path Analysis for Microservice Architectures

Uber’s backend is an exemplar of microservice architecture. Each microservice is a small, individually deployable program performing a specific business logic (operation). The microservice...

How Uber Migrated Financial Data from DynamoDB to Docstore

Introduction Each day, Uber moves millions of people around the world and delivers tens of millions of food and grocery orders. This generates a large...

Introducing uGroup: Uber’s Consumer Management Framework

Background Apache Kafka® is widely used across Uber’s multiple business lines. Take the example of an Uber ride: When a user opens up the Uber app,...

Improving HDFS I/O Utilization for Efficiency

Scaling our data infrastructure with lower hardware costs while maintaining high performance and service reliability has been no easy feat. To accommodate the exponential...

Building Uber’s Fulfillment Platform for Planet-Scale using Google Cloud Spanner

  Introduction The Fulfillment Platform is a foundational Uber domain that enables the rapid scaling of new verticals. The platform handles billions of database transactions each...

Real-Time Exactly-Once Ad Event Processing with Apache Flink, Kafka, and Pinot

Uber recently launched a new capability: Ads on UberEats. With this new ability came new challenges that needed to be solved at Uber, such...

YAML Generator for Funnel YAML Files: Streamlining the Mobile Data Workflow Process

At Uber, real-time mobile analytics events—generated by button taps, page views, and more—form the backbone of the mobile data workflow process. To process these events,...

Jellyfish: Cost-Effective Data Tiering for Uber’s Largest Storage System

Problem Uber deploys a few storage technologies to store business data based on their application model. One such technology is called Schemaless, which enables the...

Streaming Real-Time Analytics with Redis, AWS Fargate, and Dash Framework

Introduction Uber’s GSS (Global Scaled Solutions) team runs scaled programs for diverse products and businesses, including but not limited to Eats, Rides, and Freight. The...

Enabling Seamless Kafka Async Queuing with Consumer Proxy

Uber has one of the largest deployments of Apache Kafka in the world, processing trillions of messages and multiple petabytes of data per day....

Building Scalable Streaming Pipelines for Near Real-Time Features

Background Uber is committed to providing reliable services to customers across our global markets. To achieve this, we heavily rely on machine learning (ML) to...

Eats Safety Team On-Call Overview

Introduction Our engineers have the responsibility of ensuring a consistent and positive experience for our riders, drivers, eaters, and delivery/restaurant partners. Ensuring such an experience requires...

Unifying Support Content to Enable More Empathetic and Personalized Customer Support Experiences

Introduction  Content quality is critical to the support experienced by Uber’s customers. Consider an Eater who reached out for help to cancel a very delayed...

Efficiently Managing the Supply and Demand on Uber’s Big Data Platform

With Uber’s business growth and the fast adoption of big data and AI, Big Data scaled to become our most costly infrastructure platform. To...

Cost-Efficient Open Source Big Data Platform at Uber

As Uber’s business has expanded, the underlying pool of data that powers it has grown exponentially, and thus ever more expensive to process. When...