Tag: Hadoop

Uber’s Big Data Platform: 100+ Petabytes with Minute Latency

Responsible for cleaning, storing, and serving over 100 petabytes of analytical data, Uber's Hadoop platform ensures data reliability, scalability, and ease of use with minimal latency.

Scaling Uber’s Apache Hadoop Distributed File System for Growth

Uber's Data Infrastructure team overhauled our approach to scaling our storage infrastructure by adding several new features, including ViewFs, NameNode garbage collection tuning, and an HDFS load management service.

Engineering Data Analytics with Presto and Apache Parquet at Uber

Snap your fingers and presto! How Uber Engineering built a fast, efficient data analytics system with Presto and Parquet.

Redesigning Uber Engineering’s Mobile Content Delivery Ecosystem

How Uber Engineering re-architected the content delivery feed and backend ecosystem of our new driver app to deliver an enhanced user experience.

Hudi: Uber Engineering’s Incremental Processing Framework on Apache Hadoop

Uber Engineering's data processing platform team recently built and open-sourced Hudi, an incremental processing framework that supports our business-critical data pipelines. In this article, we see how Hudi powers a rich data ecosystem where external sources can be ingested into Hadoop in near real-time.

Designing Euclid to Make Uber Engineering Marketing Savvy

In this article, we take a look at Euclid, Uber Engineering's in-house marketing platform built on Hadoop and Spark.

uReplicator: Uber Engineering’s Robust Apache Kafka Replicator

Take a look at uReplicator, Uber’s open-source solution for replicating Apache Kafka data in a robust and reliable manner.

The Uber Engineering Tech Stack, Part II: The Edge and Beyond

The end of a two-part series on the tech stack that Uber Engineering uses to make transportation as reliable as running water, everywhere, for everyone, as of spring 2016.

Streamific, the Ingestion Service for Hadoop Big Data at Uber Engineering

Here we look at Hadoop data ingestion, and how Uber Engineering streams diverse data into a cohesive layer for querying in near real-time using our in-house developed Streamific.
