
Peloton: Uber’s Unified Resource Scheduler for Diverse Cluster Workloads

October 30, 2018 / Global
Figure 1: We divide our cluster workloads into four categories, each with its own attributes and resources.
Figure 2: We can compare the architectures of the four major cluster schedulers in six functional areas.
Figure 3: Covering job lifecycles, task placement, and task preemption, Peloton met our needs better than other available cluster schedulers.
Figure 4: Peloton features an active-active architecture with multiple Mesos clusters.
Figure 5: In deployment at scale, Peloton can manage multiple Mesos clusters and schedule jobs across them.
Figure 6: We implemented hierarchical max-min fairness for resource management in Peloton to share resources between different organizations.
Figure 7: Peloton manages resource pools for individual teams within an organization, using Reservation, Limit, and Share as controls.
Figure 8: With elastic resource sharing between Team 1 and Team 2’s resource pools, one resource pool can borrow resources from the other when its demand exceeds its guaranteed reservation.
Figure 9: When Peloton runs an Apache Spark job, it uses its Spark driver to schedule, prioritize, and launch executor tasks.
Figure 10: In an example where Peloton is running a distributed TensorFlow job using Horovod, Peloton can run all the tasks using Mesos and provide a mechanism for them to discover each other.

Figure 11: As we increase the number of NVIDIA Pascal GPUs, TensorFlow modified to use Horovod runs more efficiently on Peloton than standard distributed TensorFlow, for both the Inception V3 and ResNet-101 models.
Figure 12: We could migrate Mesos workloads to Kubernetes using Peloton to make it more cloud friendly.
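
The resource pool controls named in Figures 6 through 8 (Reservation, Limit, and Share) can be illustrated with a small sketch. The code below is not Peloton's implementation: it is a simplified, single-level, single-resource version of weighted max-min fairness, and the `Pool` class, the `allocate` function, and the sample numbers are all hypothetical. It assumes that reservations fit within total capacity, that every pool's reservation is no greater than its limit, and that every share is positive.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    name: str
    reservation: float  # guaranteed minimum for the pool
    limit: float        # hard cap the pool may never exceed
    share: float        # weight used to divide spare capacity
    demand: float       # what the pool currently asks for


def allocate(capacity, pools):
    """Hypothetical single-level allocator (illustration only)."""
    # Step 1: every pool first receives its demand, up to its reservation.
    alloc = {p.name: min(p.demand, p.reservation) for p in pools}
    # A pool can never hold more than it demands or more than its limit.
    cap = {p.name: min(p.demand, p.limit) for p in pools}
    spare = capacity - sum(alloc.values())

    # Step 2: hand out spare capacity in proportion to share. Pools that
    # reach their cap drop out and their unused offer is redistributed.
    active = [p for p in pools if alloc[p.name] < cap[p.name]]
    while spare > 1e-9 and active:
        total_share = sum(p.share for p in active)
        handed_out = 0.0
        still_hungry = []
        for p in active:
            offer = spare * p.share / total_share
            take = min(offer, cap[p.name] - alloc[p.name])
            alloc[p.name] += take
            handed_out += take
            if cap[p.name] - alloc[p.name] > 1e-9:
                still_hungry.append(p)
        spare -= handed_out
        if len(still_hungry) == len(active):
            break  # nobody hit a cap, so all spare capacity was consumed
        active = still_hungry
    return alloc


# Team 1 wants more than its reservation; Team 2 wants less than its own.
teams = [
    Pool("team1", reservation=40, limit=100, share=1, demand=70),
    Pool("team2", reservation=40, limit=100, share=1, demand=20),
]
result = allocate(100, teams)
```

In this example Team 1 ends up with 70 units: its 40 reserved units plus 30 borrowed from capacity that Team 2 is not using, which is the elastic sharing behavior that Figure 8 describes. A hierarchical version would apply the same step recursively from each organization down to its teams.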
Mayank Bansal

Mayank Bansal is a staff engineer on Uber's Big Data team.

Posted by Leslie Williams, Mayank Bansal
