Uber Data

Taking City Visualization into the Third Dimension with Point Clouds, 3D Tiles, and deck.gl

With the release of deck.gl version 7.3, Uber’s open source visualization tool now supports rendering massive geospatial data sets formatted according to the OGC 3D Tiles community standard.

Building a Better Big Data Architecture: Meet Uber’s Presto Team

Uber has embraced Presto, a high-performance distributed SQL query engine, and joined the Presto Foundation. Meet the Uber engineers who contribute to and use Presto on a daily basis.

Science at Uber: Making a Real-world Impact with Data Science

Suzette Puente, Uber Data Science Manager, shares how she applies her graduate work in statistics to forecast traffic patterns and generate better routes.

Less is More: Engineering Data Warehouse Efficiency with Minimalist Design

Data science helps Uber determine which tables in a database should be off-boarded to another source to maximize the efficiency of our data warehouse.

Science at Uber: Powering Uber’s Ridesharing Technologies Through Mapping

Dawn Woodard, Director of Data Science, considers travel time prediction one of Uber's most interesting mapping problems.

Science at Uber: Bringing Research to the Roads

Uber Principal Engineer Waleed Kadous discusses how we assess technologies our teams can leverage to improve the reliability and performance of our platform.

Science at Uber: Building a Data Science Platform at Uber

Uber Director of Data Science Franziska Bell discusses how we created data science platforms at Uber that let employees of all technical skill levels perform forecasts and analyze data.

Making Apache Spark Effortless for All of Uber

Uber engineers created uSCS, a Spark-as-a-Service solution that helps manage Apache Spark jobs throughout large organizations.

Visualizing City Cores with H3, Uber’s Open Source Geospatial Indexing System

In a selection of presentations delivered at a June 2019 Uber meetup, we discuss how to use H3, our open source hexagonal indexing system, to facilitate the granular mining of large geospatial data sets.

Gaining Insights in a Simulated Marketplace with Machine Learning at Uber

Uber's Marketplace simulation platform leverages ML to rapidly prototype and test new product features and hypotheses in a risk-free environment.

Using Causal Inference to Improve the Uber User Experience

Uber Labs leverages causal inference, a statistical method for better understanding the cause of experiment results, to improve our products and operations analysis.

Power On: Accelerating Uber’s Self-Driving Vehicle Development with Data

A key challenge faced by self-driving vehicles comes during interactions with pedestrians. In our development of self-driving vehicles, the Data Engineering and Data Science teams at Uber ATG (Advanced Technologies Group) contribute to the data processing and analysis that help make these interactions safe.

Second Uber Science Symposium: Exploring Advances in Behavioral Science

On May 3, 2019, Uber’s Applied Behavioral Science team hosted the Behavioral Science Track of the Second Uber Science Symposium, featuring a full day of presentations delivered by leading researchers in the field.

Visualizing Traffic Safety with Uber Movement Data and Kepler.gl

Learn how to use Kepler.gl for data visualization through our tutorial, where we show how easy it is to load multiple datasets into Kepler.gl to visualize traffic safety in Manhattan.

Improving Uber’s Mapping Accuracy with CatchME

CatchMapError (CatchME) is a system that automatically catches errors in Uber's map data using anonymized GPS traces from the driver app.

Consistent Data Partitioning through Global Indexing for Large Apache Hadoop Tables at Uber

Performing updates of individual records in Uber's Apache Hadoop data lake, which holds over 100 petabytes, required building Global Index, a component that manages data bookkeeping and lookups at scale.

Solving Big Data Challenges with Data Science at Uber

Engineers and data scientists at Uber worked together to develop a means of partially replicating Vertica clusters to better scale with our data volume.

DBEvents: A Standardized Framework for Efficiently Ingesting Data into Uber’s Apache Hadoop Data Lake

Uber engineers discuss the development of DBEvents, a change data capture system designed for high data quality and freshness that is capable of operating on a global scale.

Managing Uber’s Data Workflows at Scale

In this article, we discuss Uber's journey toward a unified, multi-tenant, and scalable data workflow management system.

Why Financial Planning is Exciting… At Least for a Data Scientist

In this article, Uber’s Marianne Borzic Ducournau discusses why financial planning at Uber presents unique and challenging opportunities for data scientists.
