Data / ML, Engineering

Managing Uber’s Data Workflows at Scale

February 28, 2019 / Global
Figure 1: Our initial Piper architecture was based on Airflow, an open source workflow authoring, scheduling, and monitoring solution.
Figure 2: User code is executed in all system components, which can negatively impact Piper’s availability and performance.
Figure 3: With Piper, pipeline definitions can be thought of as two representations: a serialized metadata representation and a fully executable workflow representation.
Figure 4: User code no longer needs to be loaded in the scheduler or web servers. With Piper, the workflow serializer provides isolation by extracting metadata from user code.
Figure 5: Designed to be highly available, decomposed, and fully distributed, our Piper architecture supports multiple active schedulers while eliminating single points of failure.
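
To make the idea in Figures 3 and 4 concrete, here is a minimal sketch of keeping two representations of a pipeline: an executable one that carries user code, and a serialized metadata one that a scheduler or web server can read without loading that code. This is not Piper's actual implementation. The names `Task`, `Pipeline`, `serialize_pipeline`, and `runnable_tasks`, and the JSON layout, are all hypothetical and chosen only for illustration.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Task:
    """Executable representation of one step: it carries user code."""
    task_id: str
    run: Callable[[], None]          # user code, only needed on workers
    upstream: List[str] = field(default_factory=list)


@dataclass
class Pipeline:
    """Executable representation of a whole workflow (hypothetical shape)."""
    pipeline_id: str
    schedule: str
    tasks: List[Task]


def serialize_pipeline(pipeline: Pipeline) -> str:
    """Extract metadata only; the callables are never written out."""
    metadata = {
        "pipeline_id": pipeline.pipeline_id,
        "schedule": pipeline.schedule,
        "tasks": [
            {"task_id": t.task_id, "upstream": sorted(t.upstream)}
            for t in pipeline.tasks
        ],
    }
    return json.dumps(metadata, sort_keys=True)


def runnable_tasks(serialized: str, done: Set[str]) -> List[str]:
    """Scheduler-side view: decide what can run from metadata alone."""
    meta = json.loads(serialized)
    return [
        t["task_id"]
        for t in meta["tasks"]
        if t["task_id"] not in done
        and all(u in done for u in t["upstream"])
    ]
```

In this sketch only the serializer step ever touches the `run` callables. Everything downstream of it works from the JSON string, which is the isolation property the captions describe: a bug or slow import in user code cannot affect a component that never loads it.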
Alex Kira

Alex Kira is an engineering tech lead on Uber’s Data Workflow Management Team. His team provides the data infrastructure platform for thousands of engineers, data scientists, and city ops, empowering them to own and manage their data pipelines.

Posted by Alex Kira