Tag: Hoodie

Uber’s Big Data Platform: 100+ Petabytes with Minute Latency

Responsible for cleaning, storing, and serving over 100 petabytes of analytical data, Uber's Hadoop platform ensures data reliability, scalability, and ease-of-use with minimal latency.

Scaling Uber’s Apache Hadoop Distributed File System for Growth

Uber's Data Infrastructure team overhauled our approach to scaling our storage infrastructure by incorporating several new features and functionalities, including ViewFs, NameNode garbage collection tuning, and an HDFS load management service.

Engineering Data Analytics with Presto and Apache Parquet at Uber

Snap your fingers and presto! How Uber Engineering built a fast, efficient data analytics system with Presto and Parquet.

Hudi: Uber Engineering’s Incremental Processing Framework on Apache Hadoop

Uber Engineering's data processing platform team recently built and open sourced Hudi, an incremental processing framework that supports our business critical data pipelines. In this article, we see how Hudi powers a rich data ecosystem where external sources can be ingested into Hadoop in near real-time.

Popular Articles