Tag: HDFS

Databook: Turning Big Data into Knowledge with Metadata at Uber

Databook, Uber's in-house platform for surfacing and exploring contextual metadata, makes dataset discovery and exploration easier for teams across the company.

Scaling Uber’s Apache Hadoop Distributed File System for Growth

Uber's Data Infrastructure team overhauled our approach to scaling our storage infrastructure by incorporating several new features and functionalities, including ViewFs, NameNode garbage collection tuning, and an HDFS load management service.

Meet Michelangelo: Uber’s Machine Learning Platform

Uber Engineering introduces Michelangelo, our machine learning-as-a-service system that enables teams to easily build, deploy, and operate ML solutions at scale.

Engineering Data Analytics with Presto and Apache Parquet at Uber

Snap your fingers and presto! How Uber Engineering built a fast, efficient data analytics system with Presto and Parquet.

Redesigning Uber Engineering’s Mobile Content Delivery Ecosystem

How Uber Engineering re-architected the content delivery feed and backend ecosystem of our new driver app to deliver an enhanced user experience.

Hudi: Uber Engineering’s Incremental Processing Framework on Apache Hadoop

Uber Engineering's data processing platform team recently built and open sourced Hudi, an incremental processing framework that supports our business critical data pipelines. In this article, we see how Hudi powers a rich data ecosystem where external sources can be ingested into Hadoop in near real-time.

Popular Articles