Redesigning Uber Engineering’s Mobile Content Delivery Ecosystem
Supporting a quick and efficient in-app communication channel for people who drive with Uber is critical to our business. If we are unable to effectively communicate messages on the app, it can prevent drivers from receiving important information. In 2015, the Driver Experience team at Uber introduced a new driver app to improve the user experience, including the deployment of a new, more effective content delivery ecosystem.
In this article, we discuss the technical challenges we encountered—and solutions we developed—while building a new content feed and corresponding backend ecosystem for the app.
To Feed or Not to Feed?
When designing a new app, the most immediate challenge is determining the best way to structure content for users. In general, Uber recognizes three models of interaction used to group user content:
- Urgent: Content which requires immediate action from the user, such as notifications about document expiration.
- Non-urgent: Content which can either be acknowledged or dismissed. This model works similarly to an email inbox in the sense that incoming messages are accessible for some period of time and can wait for users to read and process them at their convenience. Most of these notifications are confirmations of payments, recent ratings, and feature promotions.
- Permanent: Educational material or ongoing statistical data about a user’s profile, such as total rating and weekly earnings. Users can view this content anytime in a dedicated section of the app’s user interface (UI).
Combining these three paradigms into a single solution means pursuing goals that are largely orthogonal to one another. In our search for a solution, we realized that taking the feed concept as we know it from social networks and adding a few modifications would allow us to meet our objective. In the next section, we look at how each of these interaction models is reflected in our new feed’s behavior.
Categories of Interaction Models
We place a message feed at the bottom of the main screen and let its messages slide up as an overlay, so all incoming updates on the new driver app take the form of popups. While this deals with confirmations neatly, promotions and requests for action (RFAs) have a more complex lifecycle.
A confirmation disappears once the content is acknowledged, because its sole purpose is to inform the user about something that they may want to act on. A promotion stays on the screen longer than a confirmation to allow the user more time to digest its content, only disappearing from the feed if it is acknowledged by the user with a tap; it might even link to the permanent content as well, which is available to the user in a separate, static section of the app. RFAs follow an even more complex pattern by appearing, disappearing, and resurfacing until users complete an action.
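The lifecycle rules above can be sketched as a small visibility check. This is a minimal illustration, not the app's actual implementation: the `FeedItem` fields, the `hidden_until` timestamp used to model an RFA that disappears and resurfaces, and the `is_visible` function are all hypothetical names. For simplicity, the sketch treats confirmations and promotions identically once they are acknowledged and does not model how long each stays on screen.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    CONFIRMATION = "confirmation"
    PROMOTION = "promotion"
    RFA = "request_for_action"


@dataclass
class FeedItem:
    kind: Kind
    acknowledged: bool = False   # user tapped or dismissed the card
    action_done: bool = False    # only meaningful for RFAs
    hidden_until: float = 0.0    # assumed: an RFA stays hidden until this time


def is_visible(item: FeedItem, now: float) -> bool:
    """Return True while the item should still be offered to the feed."""
    if item.kind is Kind.RFA:
        # An RFA can disappear for a while, but resurfaces until completed.
        return not item.action_done and now >= item.hidden_until
    # Confirmations and promotions leave the feed once acknowledged.
    return not item.acknowledged
```

Keeping the policy in one function per item, rather than in the feed itself, mirrors the idea that each piece of content carries its own lifecycle.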
If the content displayed is not urgent, it is placed in the feed according to its “freshness” and relevancy to the user. This strategy works well for a variety of updates, such as information on upcoming events in the driver’s city or feedback from previous customers.
For the new app, permanent content is primarily a static set of links that leads to other content. While testing this new interface, we discovered that we could model such content as a separate, statically ranked feed presented in separate tabs of the application. For instance, earnings-related information such as incentives, referrals, and financial stats can be presented as a static list of items in a separate tab served by a feed system.
Interaction Model Implications
When designing the new feed, we assessed what requirements these interaction models put on the whole system and their resulting architectural implications. By equipping the application with multiple types of feeds, we were able to build each feed separately. This is managed by giving each feed type its own configurable ranking strategy and each piece of content within a given feed its own lifecycle policy.
Using this approach, tabs of the application which contain static content are served by a unique, statically ranked feed. In this scenario, ranking rarely varies by time, but it might be affected by such conditions as a driver’s current region or level of experience. Each item in these static feeds has a nearly permanent life span, at least until there is a request to change or remove it.
Conversely, dynamic feeds have a configurable and complex ranking strategy able to push items to the top of the feed based on urgency, relevance, and other criteria.
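One way to picture a configurable ranking strategy per feed type is a lookup from feed type to ranking function. The sketch below is an assumption-laden illustration: the `Item` fields, the feed type names `earnings_tab` and `home`, and the specific sort keys are invented for the example and are not taken from Uber's system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Item:
    item_id: str
    position: int = 0        # fixed slot, used by statically ranked feeds
    urgency: int = 0         # higher means more urgent
    created_at: float = 0.0  # epoch seconds; newer is "fresher"


Ranker = Callable[[List[Item]], List[Item]]


def static_rank(items: List[Item]) -> List[Item]:
    """Static feeds keep a fixed, author-defined order."""
    return sorted(items, key=lambda i: i.position)


def dynamic_rank(items: List[Item]) -> List[Item]:
    """Dynamic feeds put urgent items first, then the freshest ones."""
    return sorted(items, key=lambda i: (-i.urgency, -i.created_at))


# Hypothetical configuration: each feed type names its own strategy.
RANKERS: Dict[str, Ranker] = {"earnings_tab": static_rank, "home": dynamic_rank}


def rank_feed(feed_type: str, items: List[Item]) -> List[Item]:
    return RANKERS[feed_type](items)
```

A region-aware or experience-aware static ranking would simply be another entry in the same table.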
Now, after reviewing the product requirements from the driver’s side of the feed, let us take a look at the larger landscape of communication requirements for the content delivery system.
The Broader Content Delivery Ecosystem
A content delivery system like Uber’s does not exist in isolation and is heavily influenced by content creation and ingestion mechanisms. In order to provide a smooth and functional user experience, an entire ecosystem needs to be built from scratch. In this section, we discuss how we approached this process at Uber.
At first glance, this feed might resemble that of a social network, but there is a key difference: the content is coming from Uber exclusively and not from drivers. Rather than sending their content into the “wilds” of the data center to be indexed, ranked independently, and forgotten, authors actively control the targeting, customization, and management of their content. In this sense, Uber’s content delivery system has more similarities with an advertising content delivery platform than it does with a social network because the interactions are one-sided and may require or prompt a response from the user. This difference, in turn, affects the way content must be ingested and selected by the feed ecosystem.
We currently utilize three primary targeting strategies on Uber’s driver app: event-based targeting, real-time matching, and bulk messaging. For bulk messaging, we invented a whole new set of tools and services tailored to Uber’s needs. Each of the backend services in this broader system deals specifically with targeting; however, the content is offered to our users via a cohesive frontend experience.
Targeted content is produced in two ways: the first and simplest way is by using in-house UI tools to manually define a set of rules or campaigns, and the second and more complex approach is to generate content with a separate service. Services which provide content typically monitor metrics or events within our system and use some business logic to perform additional evaluation steps. In both cases, the infrastructure shown in the following diagram is utilized:
Whenever a driver produces an event on the mobile app, such as crossing a geofence or finishing a trip, there might be a need to push a real-time message to their feed. This comes into play when dealing with the confirmation and notification use cases we already described. To push this notification, we set up a targeting filter which waits for a specific event to occur; if triggered, this event causes a message to be sent to the driver’s feed. An event can be triggered by any action, such as a trip request, tapping a button in the app, completion of a background payment calculation, or even a driver crossing a particular geofence.
At Uber, we use a data-rich platform for dealing with events. Almost everything happening on the mobile device, including geocoordinate changes and user interactions, is reported to the gateway service and sent to our Kafka stream, ready to be consumed by downstream services. The targeting infrastructure primarily utilizes a Samza-based rules engine configurable by UI tools, which allow content creators to filter events by certain rules, such as geofence penetration, to produce a message and deliver it to users.
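The filter-then-message idea behind event-based targeting can be sketched without any streaming infrastructure. In the example below, a plain Python list stands in for the event stream, and the `Rule` class, the event field names, and `apply_rules` are hypothetical; it does not use or imitate the Samza or Kafka APIs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, str]


@dataclass
class Rule:
    """A hypothetical targeting rule: a predicate plus a message template."""
    name: str
    matches: Callable[[Event], bool]
    template: str  # may reference event fields, e.g. "{geofence}"


def apply_rules(events: Iterable[Event], rules: List[Rule]) -> Iterator[Dict[str, str]]:
    """Emit one feed message for every (event, rule) pair that matches."""
    for event in events:
        for rule in rules:
            if rule.matches(event):
                yield {
                    "driver_id": event["driver_id"],
                    "rule": rule.name,
                    "text": rule.template.format(**event),
                }


airport_rule = Rule(
    name="airport_entry",
    matches=lambda e: e["type"] == "geofence_enter" and e["geofence"] == "airport",
    template="You entered the {geofence} zone.",
)

# A plain list stands in for the event stream.
events = [
    {"driver_id": "d1", "type": "trip_end", "geofence": ""},
    {"driver_id": "d2", "type": "geofence_enter", "geofence": "airport"},
]
messages = list(apply_rules(events, [airport_rule]))
```

Here only the second event matches, so a single message addressed to `d2` is produced.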
Bulk messaging is a good match for targeted campaigns, such as cohort-specific promotions and announcements. In this model, messages can be sent in bulk by running Hive, Hadoop, or Spark jobs offline to select cohorts and then push customized messages to user feeds.
At Uber, almost all events produced by mobile devices are extracted, transformed, and loaded into a Hadoop Distributed File System (HDFS) and aggregated. This allows data to be accessed offline via Hive queries or Spark jobs for more complex cohort selection. Using this technology, content creators can message users based on criteria which require the aggregation of events, such as passing a threshold of declining activity, an increase in complaints from riders, or simply the announcement of a new initiative to a particular group of drivers based on their driving history. In most cases, a Hive-based engine is used to query HDFS tables and export the results into a queuing system such as Cherami or Kafka. After results are queued, they can be consumed by the feed system and sent to mobile devices.
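As a rough illustration of the bulk path, the sketch below aggregates events offline, selects a cohort that crosses a threshold, and places one message per driver on a queue. Everything here is a stand-in: an in-memory list replaces the HDFS tables, `Counter` replaces the Hive or Spark aggregation, Python's `queue.Queue` replaces Cherami or Kafka, and the event type `rider_complaint` is invented for the example.

```python
from collections import Counter
from queue import Queue
from typing import Dict, Iterable, List


def select_cohort(events: Iterable[Dict[str, str]], event_type: str, threshold: int) -> List[str]:
    """Return drivers with at least `threshold` events of `event_type`."""
    counts = Counter(e["driver_id"] for e in events if e["type"] == event_type)
    return sorted(driver for driver, n in counts.items() if n >= threshold)


def enqueue_messages(cohort: List[str], text: str, queue: Queue) -> None:
    """Push one message per cohort member onto the queue."""
    for driver_id in cohort:
        queue.put({"driver_id": driver_id, "text": text})


# Hypothetical aggregated event log.
events = [
    {"driver_id": "d1", "type": "rider_complaint"},
    {"driver_id": "d1", "type": "rider_complaint"},
    {"driver_id": "d2", "type": "rider_complaint"},
    {"driver_id": "d2", "type": "trip_end"},
]
cohort = select_cohort(events, "rider_complaint", threshold=2)
outbox = Queue()
enqueue_messages(cohort, "Tips for improving your ratings", outbox)
```

The feed system would then consume the queue at its own pace, which decouples slow offline jobs from delivery.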
In some targeting cases, certain criteria must be evaluated when the driver turns on the app and starts accepting rides. These are cases in which content will depend on dynamic parameters such as location and the installed app version. Situations like these are rather rare, since most of a user’s information can be sent proactively by the app and processed offline for further consumption by bulk-targeting engines. There are exceptions, however, in which a change in a user or application state has to be acknowledged by a message in real time, for instance, in the case of new feature rollouts.
Nowadays, many software companies host some sort of experimentation platform (XP) which allows them to roll out new features to a configurable percentage of users who are selected by certain criteria such as country, city, locale, and device version. The availability of such a feature is announced to the user in the form of a message sent through the feed system. In these cases, it is critical that the cohort messaged about this new feature is precisely the same as the one receiving the feature. Similarly, the same XP should be able to produce an answer to both of these questions: “should the feature be enabled for the user?” and “should a message about this new feature be shown to the user?” At Uber, we built our own real-time matcher to address this use case.
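The requirement that both questions receive the same answer can be met by routing them through a single deterministic membership check. The sketch below uses hash-based bucketing, which is a common technique and an assumption here; the article does not describe how Uber's matcher assigns users. The experiment name, rollout percentage, and function names are hypothetical.

```python
import hashlib

EXPERIMENT = "new_earnings_tab"  # hypothetical experiment name
ROLLOUT_PERCENT = 20             # hypothetical rollout size


def in_rollout(experiment: str, user_id: str, percent: int) -> bool:
    """Deterministically map a user to a bucket in [0, 100) and compare."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


def feature_enabled(user_id: str) -> bool:
    return in_rollout(EXPERIMENT, user_id, ROLLOUT_PERCENT)


def should_announce(user_id: str) -> bool:
    # Same check as feature_enabled, so the two cohorts cannot drift apart.
    return in_rollout(EXPERIMENT, user_id, ROLLOUT_PERCENT)
```

Because the bucket depends only on the experiment name and user ID, raising the percentage later only adds users to the cohort and never removes existing ones.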
Feed System Technology Decisions
There were a number of decisions we needed to make when it came to determining the most effective technological makeup for our feed.
There are two ways to store content in the standard content delivery ecosystem: pushed and pre-evaluated for each individual user (fan-out-on-write) or indexed, filtered, and ranked on request (fan-out-on-read). The targeting options described above have a direct implication on the style with which the feed system handles content. The push of customized targeted content tends to force the feed system design to follow a fan-out-on-write pattern. This is primarily for two reasons:
- Real-time Delivery: For urgent messages, like when a driver has to be warned about travel conditions, a message is sent to their feed in the very same moment it is triggered (e.g., when a user hits a geofence border or an event occurs).
- Efficiency: Even in cases when content does not have to be delivered in real time, fan-out-on-read requires polling the targeting engines, since they contain the targeting logic. When cohort sizes are small, most polls will result in empty responses, which leads to inefficient utilization of hardware.
Due to the existence of real-time match targeting, a fan-out-on-read strategy has to be supported as well. As a result, Uber’s feed system supports both strategies at the same time, described in additional detail in our next section.
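The difference between the two strategies comes down to when targeting is evaluated. The following sketch contrasts them with two toy classes; the class names, the per-user inbox, and the predicate-based campaigns are illustrative assumptions rather than a description of Uber's services.

```python
from typing import Callable, Dict, Iterable, List, Tuple

User = Dict[str, str]


class WriteFanOut:
    """Targeting runs once at publish time; reads are a plain lookup."""

    def __init__(self) -> None:
        self._inbox: Dict[str, List[str]] = {}

    def publish(self, message: str, cohort: Iterable[str]) -> None:
        for user_id in cohort:
            self._inbox.setdefault(user_id, []).append(message)

    def read(self, user_id: str) -> List[str]:
        return list(self._inbox.get(user_id, []))


class ReadFanOut:
    """Content is stored once; targeting runs on every read."""

    def __init__(self) -> None:
        self._campaigns: List[Tuple[str, Callable[[User], bool]]] = []

    def publish(self, message: str, predicate: Callable[[User], bool]) -> None:
        self._campaigns.append((message, predicate))

    def read(self, user: User) -> List[str]:
        return [m for m, matches in self._campaigns if matches(user)]
```

With `ReadFanOut`, every read evaluates every campaign even when nothing matches, which is the empty-poll cost described above; with `WriteFanOut`, users outside the cohort cost nothing.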
Real-Time Push vs. Periodic Polling
As a result of Uber’s combined fan-out strategies, our feed operates in either a store-and-notify or store-and-push mode depending on the feed configuration, while also reaching out to specific engines and services configured for fan-out-on-read behavior, as demonstrated below:
The feed back end determines the flow of content delivery. Since mobile devices cannot be assumed to keep state, the backend service has to be able to deliver all feed content at any time; the mobile app requests feed content whenever a device needs to get the latest state.
Once a new piece of content is pushed by an internal service, the feed back end stores it in a database for a specified period of time and notifies mobile devices through a push mechanism. Triggered by the push, a device requests a fresh feed from the back end. At this moment, the feed back end retrieves stored items from the database and calls its fan-out-on-read services to obtain additional items in real time. After the aggregation and ranking step, items are returned to the mobile device.
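The store, notify, fetch, aggregate, and rank steps can be put together in a few lines. This is a minimal sketch under stated assumptions: a dictionary stands in for the database, `notify` is any callable standing in for the push mechanism, `readers` are callables standing in for fan-out-on-read services, and the `priority` and `expires_at` fields are hypothetical.

```python
import time
from typing import Any, Callable, Dict, List, Optional

Item = Dict[str, Any]


class FeedBackend:
    def __init__(self, notify: Callable[[str], None],
                 readers: List[Callable[[str], List[Item]]]) -> None:
        self._store: Dict[str, List[Item]] = {}  # stands in for the database
        self._notify = notify                    # stands in for the push channel
        self._readers = readers                  # fan-out-on-read services

    def push(self, user_id: str, item: Item, ttl_seconds: float,
             now: Optional[float] = None) -> None:
        """Store the item for a limited time, then nudge the device."""
        now = time.time() if now is None else now
        stored = dict(item, expires_at=now + ttl_seconds)
        self._store.setdefault(user_id, []).append(stored)
        self._notify(user_id)

    def fetch(self, user_id: str, now: Optional[float] = None) -> List[Item]:
        """Merge unexpired stored items with real-time items, then rank."""
        now = time.time() if now is None else now
        items = [i for i in self._store.get(user_id, []) if i["expires_at"] > now]
        for reader in self._readers:
            items.extend(reader(user_id))
        return sorted(items, key=lambda i: i["priority"], reverse=True)
```

Because `fetch` always rebuilds the full feed, a device that has lost its local state can recover simply by requesting the feed again.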
In our new content delivery ecosystem, the feed back end stores active feed content in a database. We selected this model to increase the performance of subsequent feed fetches and remove content when it should no longer appear in the feed. Similarly, we chose Cassandra for our storage needs over other technologies, because of its ability to handle data center replication and data modeling.
Data Center Replication
Mobile application requests from riders and driver-partners are routed to data centers worldwide based on geo-proximity. However, some of our internal content-producing services are only active in one of these data centers (e.g. where a Hive job is running) and are not privy to driver locations. When a producer pushes content to the feed service, the content must be made available to users hosted across data centers to facilitate its possible rollout regardless of driver location. To ensure that users can access feed content at all times—even during network failovers—user traffic may be routed to an alternate data center. Cassandra’s data center replication abilities made it an easy choice for us when designing our new content delivery ecosystem.
The feed content for a single driver typically comes from multiple internal services, and these services can push content at any rate and at any time. To avoid race conditions, we need content-level granularity when writing content to our backend datastore. The feed service must have access to all saved content when serving a feed to a mobile client; in order to achieve this, we need user-level granularity for reading data.
Cassandra’s partitioned row-oriented data model meets these requirements perfectly. Choosing user ID as the partition key allows us to fetch all available items for a user in a single request. This type of query has a constant time complexity over the number of item types. By choosing user ID, item type, and item ID as a compound primary key, each item is essentially a row in the data store. When multiple content providers across the org push content for a user simultaneously, Cassandra writes to different rows to avoid any race conditions.
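The effect of this key layout can be imitated in memory to show why writes from different producers do not collide while a read still needs only one lookup. The sketch below is not Cassandra and does not use a Cassandra driver; `FeedTable` and its methods are hypothetical, with the outer dictionary key playing the role of the partition key (user ID) and the inner tuple playing the role of the remaining primary key columns (item type, item ID).

```python
from typing import Any, Dict, List, Tuple

RowKey = Tuple[str, str]  # (item_type, item_id)


class FeedTable:
    def __init__(self) -> None:
        # user_id -> {(item_type, item_id): payload}
        self._partitions: Dict[str, Dict[RowKey, Any]] = {}

    def upsert(self, user_id: str, item_type: str, item_id: str, payload: Any) -> None:
        """Write exactly one row; other rows in the partition are untouched."""
        self._partitions.setdefault(user_id, {})[(item_type, item_id)] = payload

    def items_for(self, user_id: str) -> List[Tuple[str, str, Any]]:
        """Read the whole partition for a user in one lookup, ordered by key."""
        rows = self._partitions.get(user_id, {})
        return [(t, i, rows[(t, i)]) for (t, i) in sorted(rows)]
```

Two producers writing different items for the same driver touch different rows, so neither overwrites the other, and a later write to the same (item type, item ID) replaces only that row.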
Content Creation and Design
A typical feed system on a social network renders content differently at the presentation layer for customization purposes. In order to reduce the amount of presentational code, companies usually attempt to standardize their presentation and use only a few different designs. A presentation layer (e.g., frontend or mobile code) then converts each update type to one of the UI designs.
At Uber, we learned this on the job. With our growing number of feed content types across our products, extending and maintaining a presentation layer for each new category became a significant burden for the whole feed system for two reasons:
- Code dealing with developing new types of content has to be added and rolled out to mobile devices for every new content type shipped due to subtle differences in the design, such as hiding some UI element or adding an extra button. Consequently, standardization and consolidation of designs was not a reality for us given the hypergrowth of our services.
- Even if the same design code can be reused for a new type of content, it still requires writing new presentational code to input new data into this design. The finished product is barely reusable, and as a result, new types of content cannot be added in real time. This strategy was also not feasible for our use case.
In addition to creating a series of content designs that can be plugged and chugged depending on content type, we rolled out a number of targeting engines with the ability to author messages in a particular design. We toyed with the idea of having each engine offer its own fixed set of designs, but not surprisingly, some designs could not be reused across targeting engines or had to be duplicated, which proved time-consuming.
Presenting: Uber’s Content Delivery Ecosystem
Over time, we realized that we needed to build a new tool to better create and roll out customized content designs so that the presentation component of our feed system could use a single, standard approach to content rendering.
On the mobile side of this model, similar designs can be split into reusable components that a rendering engine can then dynamically combine at run time, allowing the content creator to select a particular configuration of visual design during the content creation process.
Content on the feed is authored in a UI tool that allows content creators to stitch together a particular design for their message from reusable blocks such as title, image, list, or button set, among other items, as well as fill out the actual language of the content in each block. This process enables us to inject template placeholders for data coming from targeting engines, user profiles, and other sources.
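A block-based design with placeholders might look like the following sketch. The block types, field names, and `$placeholder` syntax are assumptions made for illustration (the syntax is simply that of Python's standard `string.Template`), and `render` is a hypothetical helper, not the tool's real API.

```python
from string import Template
from typing import Dict, List

Block = Dict[str, str]


def render(blocks: List[Block], data: Dict[str, str]) -> List[Block]:
    """Fill the placeholders in every block field except the block type."""
    rendered = []
    for block in blocks:
        out = {}
        for key, value in block.items():
            out[key] = value if key == "type" else Template(value).substitute(data)
        rendered.append(out)
    return rendered


# A hypothetical design stitched together from two reusable blocks.
design = [
    {"type": "title", "text": "Nice work, $first_name!"},
    {"type": "button", "label": "View earnings", "link": "earnings/$week"},
]
card = render(design, {"first_name": "Ana", "week": "2017-03"})
```

The design stays the same for every driver; only the data supplied by a targeting engine or user profile changes between renders.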
By allowing the same design templates to be used with multiple targeting engines, we enabled rich and consistent content customization capabilities across the entire platform. In addition, this decision allowed us to eliminate the need for mobile code modifications for most new designs and helped drive the standardization of the app’s look and feel by defining a fixed set of blocks. New services producing updates for users no longer need to deal with presentational aspects, such as translations of feed content into other languages, and can focus on targeting the right driver cohorts for specific content. With this block-based approach to design, Uber’s mobile content creation tool also makes it possible to craft new content and translate it in minutes, a process which previously took anywhere from several hours to a matter of days.
This new, easy-to-use mobile content delivery ecosystem seamlessly integrates with many of Uber’s widely used services and targeting tools, allowing both engineers and customer support to promptly create and deliver relevant information to drivers, and by extension, our riders.
If this type of work excites you, consider applying for a role on our Driver Experience team.
Alex Forsythe, Denis Haenikel, and Minjie Zha are software engineers on Uber’s Driver Experience team. Jiaduo He and Yujia Luo, also software engineers on the Driver Experience team, contributed to this article.