The Apache Hudi team at Uber reflects on the open source project's history as it graduates to a Top-Level Project under the Apache Software Foundation.
Uber's Maps Collection and Reporting (MapCARs) team shares best practices for choosing the optimal HDFS file format for use with Apache Spark.
Databook, Uber's in-house platform for surfacing and exploring contextual metadata, makes dataset discovery and exploration easier for teams across the company.
Uber's Data Infrastructure team overhauled our approach to scaling our storage infrastructure by incorporating several new features, including ViewFs, NameNode garbage collection tuning, and an HDFS load management service.
Uber Engineering introduces Michelangelo, our machine learning-as-a-service system that enables teams to easily build, deploy, and operate ML solutions at scale.
Snap your fingers and presto! How Uber Engineering built a fast, efficient data analytics system with Presto and Parquet.
How Uber Engineering re-architected the content delivery feed and backend ecosystem of our new driver app to deliver an enhanced user experience.
Uber Engineering's data processing platform team recently built and open sourced Hudi, an incremental processing framework that supports our business-critical data pipelines. In this article, we see how Hudi powers a rich data ecosystem where external sources can be ingested into Hadoop in near real-time.