Gaining Insights in a Simulated Marketplace with Machine Learning at Uber

At Uber, we use marketplace algorithms to connect drivers and riders. Before the algorithms roll out globally, Uber fully tests and evaluates them to create an optimal user experience that maps to our core marketplace principles.

To make product testing safer and easier, the Uber Marketplace Simulation team built a simulation platform that hosts a simulated world with driver-partners and riders, mimicking scenarios in the real world. Leveraging an agent-based discrete event simulator, the platform allows Uber Marketplace engineers and data scientists to rapidly prototype and test new features and hypotheses in a risk-free environment.

The simulator leverages historical data to run simulations with two important components: Marketplace services and user behavior models. Marketplace services leverage these insights to make dispatch decisions. The user behavior model uses simulation context to determine riders’ and drivers’ decisions, such as trip cancellations and navigation choices.

To make model deployment and training easier and more accurate, Uber’s Marketplace Simulation team built a machine learning (ML) framework that can seamlessly build, train, and serve the models on the simulation platform. Since it improves simulation accuracy and realism, ML plays an important role in predicting user behavior on our simulation platform. In fact, most simulated user behavior on our platform is model-based and trained with anonymized, aggregated historical data. Simulation also helps develop and improve ML models. New models can onboard to our simulation platform and iterate quickly before rolling out to production services.

Our team designed an ML model training framework to enable users to quickly build and deploy models on our simulation platform, while simultaneously improving our ML models through simulation.

Simulation platform machine learning framework

ML lies at the heart of a growing number of important features in the simulation platform. Before we introduced the ML framework to the simulation platform, ML development, training, and serving had consisted mainly of ad-hoc, non-reusable solutions. For example, ML developers implemented ML models directly in a simulator and trained the model on the fly when the simulator was running.

These one-off solutions quickly accumulated complexity within the simulator as developers implemented multiple models and added model versioning. The simulator became difficult to maintain and these different models took large amounts of RAM and CPU resources for training, which led simulator performance to degrade. In these types of scenarios, having custom top-to-bottom training and serving code for each model becomes untenable.

The Simulation team set out to build a single, efficient, and reusable framework for most ML training and workload serving  on our simulation platform. Our framework offers a mix of open source systems and components built in-house. It operates on four layers:

  • A Python ML library consisting of APIs and modules from which to construct an ML model on the simulation platform
  • An automatic training pipeline to train models and save them into a binary
  • A back-end service that manages all models’ metadata
  • A centralized module in the simulator that loads selected models into the simulation

In our simulation ML framework training workflow, the training pipeline imports a Python ML library, fetches and processes raw data from Apache Hive using Apache Spark, and trains the models automatically based on historical time series data. Once the pipeline finishes training with this data, it sends its metadata to the simulation database and checkpoint model instance, then saves the binary file in Uber’s storage service.

In our model-serving workflow, when our simulation back-end service receives requests to create simulations, it fetches related ML models’ metadata based on user settings in the requests and uses metadata to download model checkpoints to a local disk. After that, the simulation back-end service starts running simulators, which has a model factory. The model factory then imports its Python ML library and scans the checkpoint in the local disk, where it instantiates the related models and outputs the instances to the simulation core flow.

Figures 1 and 2 below show the machine learning framework’s training and serving workflow:  

Figure 1. The Automatic Training Pipeline (center) fetches raw data from Hive using Spark (above) and uses the Simulation ML Models (left) to save data to the Storage Service (right) and Simulation Database (bottom).
Figure 2. To create a simulation, the program first requests a simulation (top left). Next, it fetches backend service model metadata and inputs it in the database. The database then fetches checkpoints based on this metadata from its Storage Service and downloads checkpoints to its disc. These checkpoints then combine with Simulation ML Models (upper center) in the simulator, where the Model Factor instantiates models into the core flow.

 

In this framework, we largely decoupled our training and serving workflows. This change provides the flexibility model developers need to constantly evolve their models. With this system, if developers want to change the user behavior models, all they have to do is update the implementations in the Python ML library. The framework will then automatically trigger the training pipeline and populate the latest model to all simulators.

By removing model training workloads from the simulator, we’ve simplified the complexity of the simulator to significantly improve its RAM and CPU usage. This change also increases the simulator’s throughput. Combined, these simplifications conserve hardware resources and improve system efficiency.  

The Marketplace Simulation team has used this framework to integrate several user behavior and other ML models onto the simulation platform in a standardized way.

How supply movement models improve marketplace simulation

As suggested by recent research on ride-sharing simulations, driver movement is a major factor in simulation accuracy and realism. On Uber’s Marketplace Simulation Platform, experimental Marketplace algorithms led to counterfactual matches, causing driver movement to be different than that reflected in real-world historical data. For example, a driver in the simulation may match with a different rider than they did in reality. In this case, the simulator could not simply reuse the historical driver movements to simulate the route, the length of the ride, and the drop-off location. In order to remedy this problem, we knew we needed to build a model that accurately simulated driver movement behaviors. With a proper movement model, we could enhance rider behavior models (e.g., the rider cancellation model) and reduce the noise in matching and pricing algorithms, which we do to ensure that our marketplace principles are upheld.

Accurately simulating each driver’s movement is a very challenging task because the simulated world runs on limited historical data and lacks many factors that can influence user behavior, such as traffic information, weather, and geographical data.

Fortunately, for most Marketplace algorithms, a single driver’s movement does not significantly impact the algorithm results. These algorithms leverage aggregated distributions of drivers rather than individual drivers’ motions, particularly for rider-driver matching and pricing algorithms. For this reason, rather than trying to perfectly simulate each and every driver’s movements, we built a model to simulate driver distributions accurately.

To achieve our objective in a simulated world, we divided online driver movement behavior into two states: on-trip and off-trip. We created a hybrid model based on different driver behaviors in these two states, as shown in Figure 3, below, to better forecast user demand in Uber’s marketplace:

Figure 3. Our Driver Movement Hybrid model begins with a driver request simulating its movement and goes through a series of branching “Yes/No” options to reach a conclusion, send driver information to the Routing Engine, estimate the driving speed, and move the simulated driver through the route.

 

We applied a rule-based model for on-trip drivers. When a simulated driver accepts a trip, the model uses a routing engine to direct drivers along their routes to a predetermined destination.

Simulating off-trip driver distribution is much more difficult. As previously mentioned, the simulated world often does not accurately reproduce historical driver movement. For this reason, we use an ML model alongside historical data to predict aggregated off-trip driver movement. Using production-grade algorithms, this model simulates the open drivers’ distribution as close to historical distribution as possible. We trained a tree-based stochastic model to accomplish our goal of simulating off-trip driver distribution accurately. Figures 4, 5, and 6, below, provide examples of how this model works:  

Figure 4. One of the decision tree structures of our stochastic model for simulating off-trip driver distribution starts with “Is the driver in rush hour” at the top and uses a series of “Yes/No” branches to produce leaf nodes at the bottom.

 

When the simulation wants to predict an off-trip driver’s destination, it fetches the related leaf node from the tree model using the open driver’s information (e.g. location, timestamp, etc.), as shown in Figure 4. The leaf node contains a transition matrix that predicts the simulated driver’s future movements. In our simulation platform, we partition areas of the earth into identifiable grid cells using H3, our open source hexagonal, hierarchical spatial index. By using these identifiable grid cells, we defined our transition matrix as in Figure 5, below:

Figure 5. This probability table (with three rows and three columns labeled vertically and horizontally Grid Cell 1, Grid Cell 2, and Grid Cell 3) shows probability values for various grid cells, which represent locations on Earth.

 

The values in the transition matrix describe the probability that a driver will move to Grid Cell Y given that the driver is currently in Grid Cell X.

By using this type of transition matrix, the simulation can predict which cell the open driver is likely to move to next. After that, the simulation randomly selects a location inside that cell and assigns it as the destination of the open driver. Figure 6, below, demonstrates this process:

Figure 6. This flowchart begins at a leaf node, uses a map to explain how the simulation fetches an identifiable grid cell to pinpoint a driver, moves to another highlighted map to show how the simulation calculates driver movement probabilities with a transition matrix, and ends with a third map to show how the simulation selects a location as the driver’s destination. This is our tree-based stochastic model for predicting driver destination.

 

Once the simulator predicts an open driver’s destination, it estimates the driver’s route and speed, then moves the driver to the destination. One key difference between on-trip and off-trip drivers is that open drivers might change their status (go offline/be dispatched) while driving to their destinations.

Using the hybrid driver movement model, our Marketplace Simulation Platform gains an accurate approximation of driver distribution in the real world. We have compared the distributions with real-world distributions using the same algorithms. Figure 7, below, shows the peak hour driver distribution comparison:

Figure 7. The map on the left highlights areas where drivers are distributed in the real world. The map on the right simulates driver distribution in the same area. Due to the accuracy of our simulation, the maps are nearly identical.

 

How simulation helps us experiment with matching algorithms

Uber’s platform connects riders and drivers in real-time by using matching algorithms that optimize for various factors, including ETA, routing, user locations, trip preferences, and price.

Based on recent research, ride-sharing programs often use large-scale combination algorithms to optimize user matching. However, as ride-sharing needs diversify, these algorithms must take  more and more factors into account. To accommodate the increasing complexity of matching, our team integrated machine learning (ML) models into the optimization algorithms to increase simulation throughputs. We used a combination of a recommendation system and maximum bipartite matching. We’ve depicted our entire workflow in Figure 8, below:

Figure 8. The top half of this image represents the recommendation system, which narrows down and ranks drivers through a few steps. The flowchart for the recommendation system begins by finding all possible drivers for each rider, narrowing driver options down to hundreds, generating links, narrowing driver options down to dozens, then, when there are fewer than ten, ranking them. Once this process is complete, an arrow shows these recommendations moving into the matching phase. At first, the algorithm may link multiple riders to a single driver, but with our maximum bipartite matching algorithm, we pair up individual drivers with single riders.

 

In our matching solution, the recommendation system suggests a few drivers per rider with scores for each pair. Based on the recommendation results, we construct a bipartite graph with weighted edges, which matches one driver with a single rider. We used the maximum bipartite matching algorithm to solve this problem efficiently.

It was a challenge to develop the recommendation system based only on rider and driver locations. Without other context, building a recommendation system using traditional collaborative filtering, factorization machines, or deep neural networks is impossible due to lack of features. In order to solve this problem, we looked to large-scale social network recommendations, which have encountered the same challenges and devised effective solutions for them (as covered by Future Generation Computer Systems, Cornell University, and the International Workshop on Languages and Compilers for Parallel Computing).

After conducting comprehensive research, we decided to design our recommendation system using one of the solutions social networks have implemented: graph representation learning and link prediction models. As depicted in Figure 8, we constructed a graph that depicts vertices as riders and drivers based on rider and driver locations. We created the edges of the graph based on the routes between riders and drivers using some rule constraints, such as driver ETA and distance between the rider and the driver. After the simulator constructs the graph, we can predict the score for each edge based on historical dispatch results.

However, the information for each node and edge is very limited in the constructed graph, since it only contains the locations and routes. Inspired by graph representation learning methods such as GraphSAGE, we aggregated information from neighbors to extract enough features for each node, as shown in Figure 9, below:  

Figure 9. A larger outer circle of rider nodes uses aggregated feature information from neighbors to direct rider nodes to driver nodes within a smaller circle, which is then aggregated again to narrow the selection to one rider node.

 

These features reflect the nearby network structure information for each node. In addition, we normalized some continuous features and embedded them to the feature vector to make our model more generic, emulating recent research.  

The model we eventually settled on to predict the scores for each pair is a Gradient Boosting Tree with dropouts. Since a significant portion of the training data results in negative pairs (in which the driver and rider were not matched), we downsample the negative pairs and upsample the positive pairs (in which the driver and rider were matched) before we train the model.

After training the model, we use it to predict top-k potential drivers for each rider based on its pairs’ scores. In this case, according to the predicted results, each rider would be linked to multiple drivers, and a driver could also be linked with multiple riders. We then construct a bipartite graph based on links and apply a maximum bipartite matching algorithm to solve the bipartite matching problem and achieve minimum average driver ETA.

Moving forward

By progressively integrating ML models into the simulation platform using our ML framework, we can test more user behaviors and experimental marketplace algorithms and features in simulation. This system helps Uber accelerate its development process and deliver safe, reliable transportation products.

In the future, we intend to enhance the simulation ML framework with the following features:

  • Hyperparameter automatic tuning. ML models contain many hyperparameters, such as learning rate and tree depth. Tuning these hyperparameters is time-consuming and error-prone. We plan to integrate hyperparameter tuning tools into our model training pipeline to amplify model developer productivity and make engineers’ jobs easier.
  • Online model performance monitoring. Most of the ML models on the simulation platform are directly affected by environmental factors like time and location. Over time, model performance degrades because the environment (e.g., traffic patterns) in the simulations changes. To keep our models accurate throughout these shifts, our models need to be re-trained with new data. An online model performance monitoring system could monitor the model serving performance and trigger the training pipeline once the performance is lower than a certain threshold.
  • Distributed machine learning serving. Uber’s engineers are integrating an increasing number of ML models into the simulation platform. As such, our current framework has encountered computing resource starvation, which causes latency for predictions to increase significantly. Therefore, the simulation throughput becomes worse. A high performance distributed ML serving component could help solve this issue.
  • Configurable machine learning model development. Most ML models in Simulation Platform share common components, such as data processing logic and underlying machine learning algorithms. To simplify the model development process for most users, we plan to build a configuration driven model development framework. It could construct behavior models based on simulation users’ input configurations.

These are just a few of our plans for further improving the simulation ML platform and ensuring it runs smoothly.

If you are interested in shaping the future of transportation with us, consider applying for a role at Uber!

Adam Hotz and Qianyi Wang also contributed to this article.

Comments