
Uber Data

Turning Metadata Into Insights with Databook

Every day in over 10,000 cities around the world, millions of people rely on Uber to travel, order food, and ship cargo. Our apps...

Operating Apache Pinot @ Uber Scale

Uber has a complex marketplace consisting of riders, drivers, eaters, restaurants and so on. Operating that marketplace at a global scale requires real-time intelligence...

Inside Uber ATG’s Data Mining Operation: Identifying Real Road Scenarios at Scale for Machine...

Uber ATG's self-driving vehicles measure a multitude of possible scenario variations to answer the age-old question: "how does the pedestrian cross the road?"

Monitoring Data Quality at Scale with Statistical Modeling

Uber employs statistical modeling to find anomalies in data and continually monitor data quality.

Building a Backtesting Service to Measure Model Performance at Uber-scale

We built a backtesting service to better assess financial forecast model error rates, facilitating improved forecast performance and decision making.

Women in Data Science at Uber: Moving the World With Data in 2020—and Beyond

In October 2019, Uber hosted our second annual Moving The World With Data meetup, showcasing some of our most interesting data science challenges in 2019.

Designing a Production-Ready Kappa Architecture for Timely Data Stream Processing

We implemented a Kappa architecture at Uber to effectively backfill streaming data at scale, ensuring accurate data in our platform.

Engineering SQL Support on Apache Pinot at Uber

We engineered full SQL support on Apache Pinot to enable quick analysis and reporting on aggregated data, leading to improved experiences on our platform.

Uber Visualization Highlights: Displaying City Street Speed Clusters with SpeedsUp

As part of Uber Visualization's all-team hackathon, we built SpeedsUp, a project using machine learning to process average speeds across a city, cluster the results, and overlay them on a street map.

Uber’s Data Platform in 2019: Transforming Information to Intelligence

In 2019, Uber's Data Platform team leveraged data science to improve the efficiency of our infrastructure, enabling us to compute optimum datastore and hardware usage.

Productionizing Distributed XGBoost to Train Deep Tree Models with Large Data Sets at Uber

We share technical challenges and lessons learned while productionizing and scaling XGBoost to train distributed gradient boosted algorithms at Uber.

Uber Visualization Highlights: How Urban Symphony Adds an Audio Dimension to Visualization

As part of Uber Visualization's all-team hackathon, we built Urban Symphony, an Uber Movement visualization that adds an audio component to traffic speed patterns.

Taking City Visualization into the Third Dimension with Point Clouds, 3D Tiles, and deck.gl

With the release of deck.gl version 7.3, Uber’s open source visualization tool now supports rendering massive geospatial data sets formatted according to the OGC 3D Tiles community standard.

Building a Better Big Data Architecture: Meet Uber’s Presto Team

Uber has embraced Presto, a high performance, distributed SQL query engine, and joined the Presto Foundation. Meet the Uber engineers who contribute to and use Presto on a daily basis.

Science at Uber: Making a Real-world Impact with Data Science

Suzette Puente, Uber Data Science Manager, shares how she applies her graduate work in statistics to forecast traffic patterns and generate better routes.

Less is More: Engineering Data Warehouse Efficiency with Minimalist Design

Data science helps Uber determine which tables in a database should be off-boarded to another source to maximize the efficiency of our data warehouse.

Science at Uber: Powering Uber’s Ridesharing Technologies Through Mapping

Dawn Woodard, Director of Data Science, considers travel time prediction one of Uber's most interesting mapping problems.

Science at Uber: Bringing Research to the Roads

Uber Principal Engineer Waleed Kadous discusses how we assess technologies our teams can leverage to improve the reliability and performance of our platform.

Science at Uber: Building a Data Science Platform at Uber

Uber Director of Data Science Franziska Bell discusses how we created data science platforms at Uber, letting employees of all technical skill levels perform forecasts and analyze data.

Making Apache Spark Effortless for All of Uber

Uber engineers created uSCS, a Spark-as-a-Service solution that helps manage Apache Spark jobs throughout large organizations.

Popular Articles