Skip to footer

General Engineering

Introducing uGroup: Uber’s Consumer Management Framework

Background Apache Kafka® is widely used across Uber’s multiple business lines. Take the example of an Uber ride: When a user opens up the Uber app,...

Improving HDFS I/O Utilization for Efficiency

Scaling our data infrastructure with lower hardware costs while maintaining high performance and service reliability has been no easy feat. To accommodate the exponential...

Building Uber’s Fulfillment Platform for Planet-Scale using Google Cloud Spanner

  Introduction The Fulfillment Platform is a foundational Uber domain that enables the rapid scaling of new verticals. The platform handles billions of database transactions each...

Real-Time Exactly-Once Ad Event Processing with Apache Flink, Kafka, and Pinot

Uber recently launched a new capability: Ads on UberEats. With this new ability came new challenges that needed to be solved at Uber, such...

YAML Generator for Funnel YAML Files: Streamlining the Mobile Data Workflow Process

At Uber, real-time mobile analytics events—generated by button taps, page views, and more—form the backbone of the mobile data workflow process. To process these events,...

Jellyfish: Cost-Effective Data Tiering for Uber’s Largest Storage System

Problem Uber deploys a few storage technologies to store business data based on their application model. One such technology is called Schemaless, which enables the...

Streaming Real-Time Analytics with Redis, AWS Fargate, and Dash Framework

Introduction Uber’s GSS (Global Scaled Solutions) team runs scaled programs for diverse products and businesses, including but not limited to Eats, Rides, and Freight. The...

Enabling Seamless Kafka Async Queuing with Consumer Proxy

Uber has one of the largest deployments of Apache Kafka in the world, processing trillions of messages and multiple petabytes of data per day....

Building Scalable Streaming Pipelines for Near Real-Time Features

Background Uber is committed to providing reliable services to customers across our global markets. To achieve this, we heavily rely on machine learning (ML) to...

Eats Safety Team On-Call Overview

Introduction Our engineers have the responsibility of ensuring a consistent and positive experience for our riders, drivers, eaters, and delivery/restaurant partners. Ensuring such an experience requires...

Unifying Support Content to Enable More Empathetic and Personalized Customer Support Experiences

Introduction  Content quality is critical to the support experienced by Uber’s customers. Consider an Eater who reached out for help to cancel a very delayed...

Efficiently Managing the Supply and Demand on Uber’s Big Data Platform

With Uber’s business growth and the fast adoption of big data and AI, Big Data scaled to become our most costly infrastructure platform. To...

Cost-Efficient Open Source Big Data Platform at Uber

As Uber’s business has expanded, the underlying pool of data that powers it has grown exponentially, and thus ever more expensive to process. When...

Challenges and Opportunities to Dramatically Reduce the Cost of Uber’s Big Data

Introduction Big data is at the core of Uber’s business. We continue to innovate and provide better experiences for our earners, riders, and eaters by...

How Uber Achieves Operational Excellence in the Data Quality Experience

Uber delivers efficient and reliable transportation across the global marketplace, which is powered by hundreds of services, machine learning models, and tens of thousands...

Uber’s Finance Computation Platform

For a company of our size and scale, robust, accurate, and compliant accounting and analytics are a necessity, ensuring accurate and granular visibility into...

Pinot Real-Time Ingestion with Cloud Segment Storage

Introduction Apache Pinot is an open source data analytics engine (OLAP), which allows users to query data ingested from as recently as a few seconds...

Uber’s Fulfillment Platform: Ground-up Re-architecture to Accelerate Uber’s Go/Get Strategy

Introduction to Fulfillment at Uber Uber’s mission is to help our consumers effortlessly go anywhere and get anything in thousands of cities worldwide. At its...

Containerizing Apache Hadoop Infrastructure at Uber

Introduction As Uber’s business grew, we scaled our Apache Hadoop (referred to as ‘Hadoop’ in this article) deployment to 21000+ hosts in 5 years, to...

‘Orders Near You’ and User-Facing Analytics on Real-Time Geospatial Data

Introduction By its nature, Uber’s business is highly real-time and contingent upon geospatial data. PBs of data are continuously being collected from our drivers, riders,...

Popular Articles