Data / ML, Engineering

Databook: Turning Big Data into Knowledge with Metadata at Uber

August 3, 2018 / Global
Figure 1. Databook is Uber’s in-house platform that surfaces and manages metadata about internal data locations and owners.
Figure 2. The Databook architecture takes in metadata from Vertica, Hive, and other storage systems, stores it in its back-end databases, and outputs the data using RESTful APIs.
Figure 3. Databook comprises two application layers: data collection crawlers and a request serving layer.
Figure 4. In Databook, metadata lineage/freshness is collected for each table.
Figure 5. Databook persists cluster-agnostic metadata to all tables.
Figure 6. Databook persists cluster-agnostic metadata to all tables.
Figure 7. Databook links cluster-specific and cluster-agnostic metadata during reads.
Figure 8. Databook lets users search by different dimensions, including name, owner, and column.
Luyao Li

Luyao Li is a senior software engineer at Uber. He is an enthusiast of building reliable, scalable and performant systems. When not coding, he enjoys hiking around the Bay Area.

Kaan Onuk

Kaan Onuk is a software engineer on Uber’s Data Knowledge Platform team. When he’s not getting meta about metadata, he can be found exploring new hiking trails in Northern California, eating Mediterranean food, or discussing GDPR.

Lauren Tindal

Lauren Tindal is a product manager on Uber’s Data Knowledge Platform team. She appreciates her large collection of plants, California burritos, and of course, metadata.

Posted by Luyao Li, Kaan Onuk, Lauren Tindal