Tag: Apache Hadoop

Uber’s Big Data Platform: 100+ Petabytes with Minute Latency

Responsible for cleaning, storing, and serving over 100 petabytes of analytical data, Uber's Hadoop platform ensures data reliability, scalability, and ease-of-use with minimal latency.
Marmaray logo

Marmaray: An Open Source Generic Data Ingestion and Dispersal Framework and Library for Apache...

Today we introduce Marmaray, an open source framework allowing data ingestion and dispersal for Apache Hadoop, realizing our vision of any-sync-to-any-source functionality, including data format validation.

Scaling Uber’s Apache Hadoop Distributed File System for Growth

Uber's Data Infrastructure team overhauled our approach to scaling our storage infrastructure by incorporating several new features and functionalities, including ViewFs, NameNode garbage collection tuning, and an HDFS load management service.

Popular Articles