Engineering

Uber Case Study: Choosing the Right HDFS File Format for Your Apache Spark Jobs

March 21, 2019 / Global
Figure 1. We ingest the imagery and imagery metadata into Uber data centers and then use Apache Spark to process the imagery and metadata.
Resource         Avro       Parquet    Improvement
Wall Time (sec)  20.76      7.17       290%
Core Time (min)  24.80      1.28       1,938%
Reads (MB)       24,678.4   1,848.5    1,335%

Resource         Avro       Parquet    Improvement
Wall Time (sec)  18.48      6.0        308%
Core Time (min)  1670.00    50.76      3,289%
Reads (MB)       24,678.4   376.6      6,552%
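
The tables do not say how the Improvement column is calculated. It appears to be the Avro figure divided by the Parquet figure, expressed as a percentage, so 290% means the Avro run took roughly 2.9 times as long. The sketch below rests on that assumption and recomputes the ratio from the published numbers. The names `TABLES`, `ratio_pct`, and `recompute` are illustrative and not from the original post. Because the published inputs are rounded, the recomputed values agree with the published column only to within about one percentage point.

```python
# Published figures copied from the two tables: (Avro, Parquet, Improvement %).
TABLES = {
    "first table": {
        "Wall Time (sec)": (20.76, 7.17, 290),
        "Core Time (min)": (24.80, 1.28, 1938),
        "Reads (MB)": (24678.4, 1848.5, 1335),
    },
    "second table": {
        "Wall Time (sec)": (18.48, 6.0, 308),
        "Core Time (min)": (1670.00, 50.76, 3289),
        "Reads (MB)": (24678.4, 376.6, 6552),
    },
}


def ratio_pct(avro: float, parquet: float) -> float:
    """Avro cost as a percentage of Parquet cost (assumed definition)."""
    return avro / parquet * 100


def recompute(tables):
    """Return (table, resource, recomputed %, published %) for every row."""
    rows = []
    for table_name, table in tables.items():
        for resource, (avro, parquet, published) in table.items():
            rows.append((table_name, resource, ratio_pct(avro, parquet), published))
    return rows


for table_name, resource, computed, published in recompute(TABLES):
    print(f"{table_name:13} {resource:16} "
          f"recomputed {computed:8.1f}%   published {published:>5}%")
```

Read this way, the largest gains are in core time and bytes read rather than in wall time.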
Scott Short

Scott Short is a senior software engineer on Uber's Maps Engineering team, based in Boulder, CO.

Posted by Scott Short
