Octopus to the Rescue: the Fascinating World of Inter-App Communications at Uber Engineering
You might think that a team made up of folks formerly from Google, Facebook, Microsoft, Yahoo, etc. would have seen it all when it comes to building tools and infrastructure for mobile and web applications.
As the team in question, we certainly thought so. But soon after getting started, we encountered a fresh challenge while building UI testing frameworks for our mobile org here at Uber Engineering. Do you think you know what roadblock we encountered?
What makes testing the Uber mobile apps significantly different from, say, testing Google Maps?
Have you used our Uber rider app? Are you familiar with the typical flow from requesting your ride to getting off at your destination? I expect that many of you reading this post might answer yes to these questions. Now let’s see how many of you can answer the next one.
What’s the typical flow for the Uber driver app? If you’re not a driver, you may be unaware that there even is a driver app. Even fewer of you know what it looks like. (Perhaps just those of you who peeked at the driver’s phone while in an Uber!) Now back to our question: What makes testing the Uber mobile apps significantly different from testing an application such as Google Maps? Now can you guess the answer?
The Rider and the Driver apps are tied to each other.
The rider’s experience depends highly on a myriad of factors: the choice of drivers available, the ride request acceptance, the ETA to pickup, and the route to the rider, for instance. In addition, drivers and riders may contact each other. This cross-app interaction gets more involved when you consider uberPOOL, where there’s a second rider and an additional pickup and dropoff point. So, our framework needs to run end-to-end tests involving multiple apps and cross-app interaction.
Trip Flow: Our Uber Unique Problem
Imagine a basic scenario for us, like an end-to-end trip flow: (1) a rider requests a ride and a driver accepts it; (2) the driver goes to pick up the rider and the trip begins; (3) upon arrival at the destination, the trip ends, and the rider and driver rate each other. The diagram below shows signals sent between apps as the test state changes throughout this scenario:
Now imagine a more complicated interaction, like a matched uberPOOL trip. For this scenario, one driver and at least two riders will be involved. How do we synchronize the interactions between three different test targets so they can cooperate seamlessly?
Recall that even in the basic rider-driver scenario where two test targets are involved, the tests are running in two different processes. We needed to figure out how to get tests running on different devices/emulators to communicate with each other.
When we were first looking into the Appium test automation framework, we proposed an initial solution: configure the rider and driver tests to start up their respective apps with different ports, so that they can communicate with each other. After trying it out, however, we observed that the extra overhead of the client/server communication between the test case and the Appium HTTP server made things slow and a bit flaky. Tests sometimes failed for no obvious reason, possibly due to the network connection between client and server, and the errors were hard to diagnose and troubleshoot. The more we looked into the problem, the more certain we were that a more robust solution was required.
Introducing Octopus: Our Uber-Useful Solution
Try to think back to taking Operating Systems theory in college and studying cross-process communication. We knew that if we could implement a dependable signaling mechanism, the problem would be solved. So in Spring 2015 we created a solution for both iOS and Android in our platform-agnostic test runner, which we call Octopus.
Using the previous rider-driver example, Octopus starts up two simulators/emulators. It launches the driver app and runs the driver test on one, while launching the rider app and running the rider test on the other. In addition, it handles the interprocess communication between the two tests.
Thus, the two tests collaborate to test the end-to-end driver/rider trip flow using signaling. A signal is as simple as a string, and its meaning is determined by the logic inside the test case. (In this case, the rider and driver components work together to make sense of the signals.) When a test running on any simulator/emulator writes a signal, it specifies the destination target where the signal should be sent. Octopus is responsible for delivering the signal to the specified destination, e.g. from one test target to another.
Octopus provides two methods related to signaling:
readSignal: Reads the next signal from the current signal channel. Each test case is tied to one signal channel (identified by a string, the signal name) from which it can read signals.
writeSignal: Writes a signal to a signal channel so that the test target tied to that signal channel can get the signal using readSignal.
(One way to understand this is to imagine that every simulator/emulator has an inbox and an outbox. The outbox contains messages to be delivered to another simulator. For example, if the driver’s outbox says rider1 driver_online, Octopus copies driver_online into the rider1 simulator’s inbox. Then rider1 can start requesting the ride.)
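The inbox/outbox picture can be sketched in a few lines of Python. This is only an illustrative in-memory model, not Octopus's implementation: the SignalRouter class and the snake_case method names are hypothetical, and the sketch assumes that reading from an empty inbox returns None rather than blocking, which the post does not specify.

```python
from collections import defaultdict, deque


class SignalRouter:
    """Hypothetical in-memory stand-in for Octopus's signal delivery."""

    def __init__(self):
        # One FIFO inbox per signal channel (channel name -> queued signals).
        self._inboxes = defaultdict(deque)

    def write_signal(self, channel, signal):
        """Place `signal` in the inbox of the test target tied to `channel`."""
        self._inboxes[channel].append(signal)

    def read_signal(self, channel):
        """Return the next signal for `channel`, or None if the inbox is empty."""
        inbox = self._inboxes[channel]
        return inbox.popleft() if inbox else None


router = SignalRouter()

# Driver test: tell the first rider that the driver is online.
router.write_signal("rider1", "driver_online")

# Rider test: read from its own channel before requesting a ride.
first = router.read_signal("rider1")   # the signal written above
second = router.read_signal("rider1")  # inbox is now empty
```

In a real run the two tests live in separate processes on separate simulators/emulators, so the delivery step in the middle is exactly the part Octopus has to implement per platform.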
The implementations of readSignal and writeSignal are totally different on Android and iOS:
On Android, there is no direct way of accessing the test host (the machine which is running the test) from the test target (i.e. an emulator or Android device). So the communication has to be relayed by the test host using adb commands (adb is a command line tool that lets you communicate with an emulator instance or connected Android-powered device).
The test host periodically checks each test target for an outgoing signal. Whenever a test target writes a signal, a file /data/tmp/<signal_name>.out is created on that test target, and the test host gets the signal instruction from this file: Octopus reads the file and copies its contents to the test target the signal is intended for.
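A host-side relay along these lines might look like the sketch below. It is a guess at the shape of the mechanism, not Octopus's code. Several details are assumptions: that the .out file is named after the writing test's channel, that each line has the form "<destination> <signal>" (as in the rider1 driver_online example), and that signals are delivered to a hypothetical /data/tmp/<destination>.in file. The relay_once and adb helper names are also made up; only the adb -s <serial> shell invocation is standard adb usage.

```python
import shlex
import subprocess


def adb(serial, *args):
    """Run an adb command against one device/emulator and return its stdout."""
    result = subprocess.run(
        ["adb", "-s", serial, *args],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def relay_once(targets, run=adb):
    """Do one polling pass over every test target's outbox.

    `targets` maps a signal channel name (e.g. "driver") to an adb serial.
    Returns the (destination, signal) pairs delivered during this pass.
    """
    delivered = []
    for name, serial in targets.items():
        outbox = f"/data/tmp/{name}.out"  # assumed naming: writer's channel
        content = run(serial, "shell", f"cat {outbox} 2>/dev/null").strip()
        if not content:
            continue  # nothing to send from this target
        run(serial, "shell", f"rm -f {outbox}")
        for line in content.splitlines():
            parts = line.split(None, 1)  # assumed format: "<destination> <signal>"
            if len(parts) != 2 or parts[0] not in targets:
                continue  # skip malformed lines and unknown destinations
            dest, signal = parts
            inbox = f"/data/tmp/{dest}.in"  # hypothetical inbox path
            run(targets[dest], "shell", f"echo {shlex.quote(signal)} >> {inbox}")
            delivered.append((dest, signal))
    return delivered
```

The host would call relay_once in a loop with a short sleep, for example with targets = {"driver": "emulator-5554", "rider1": "emulator-5556"}. Note that this simple version can lose a signal written between the cat and the rm; a more careful relay would rename the file before reading it.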
On iOS, the UIAHost object in UI Automation provides a method called performTaskWithPathArgumentsTimeout, which executes a command (e.g. a shell script) on the test host and collects its exit code and output. With this functionality, iOS test targets (simulators or iOS devices) are able to access the test host directly. The implementation of signaling is therefore simpler than the one on Android.
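Because the iOS test can run a command on the host directly, the signaling logic can live in a small host-side script that the test invokes through performTaskWithPathArgumentsTimeout. The sketch below shows one way such a script could work. Everything in it is an assumption made for illustration: the file-per-channel layout under a temporary directory, the write/read subcommands, and the exit codes are not described in the post.

```python
import os
import tempfile

# Hypothetical location on the test host; one file per signal channel.
SIGNAL_DIR = os.path.join(tempfile.gettempdir(), "octopus_signals")


def write_signal(channel, signal, root=SIGNAL_DIR):
    """Append `signal` to the inbox file of `channel`."""
    os.makedirs(root, exist_ok=True)
    with open(os.path.join(root, channel), "a") as f:
        f.write(signal + "\n")


def read_signal(channel, root=SIGNAL_DIR):
    """Pop and return the oldest signal for `channel`, or None if there is none."""
    path = os.path.join(root, channel)
    if not os.path.exists(path):
        return None
    with open(path) as f:
        lines = f.read().splitlines()
    if not lines:
        return None
    with open(path, "w") as f:
        f.writelines(line + "\n" for line in lines[1:])
    return lines[0]


def main(argv, root=SIGNAL_DIR):
    """Entry point: `write <channel> <signal>` or `read <channel>`.

    Returns 0 on success, 1 if there was nothing to read, 2 on bad usage.
    As a script, this would be wired up with: raise SystemExit(main(sys.argv[1:]))
    """
    if len(argv) == 3 and argv[0] == "write":
        write_signal(argv[1], argv[2], root)
        return 0
    if len(argv) == 2 and argv[0] == "read":
        signal = read_signal(argv[1], root)
        if signal is None:
            return 1
        print(signal)  # the test target would see this as the task's output
        return 0
    return 2
```

The test target would pass the script path and arguments such as ["write", "rider1", "driver_online"] to performTaskWithPathArgumentsTimeout and inspect the returned exit code and output. The read-then-rewrite step is not safe against concurrent writers, so a real version would need file locking or an atomic rename.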
Octopus in Action
In a test world with one Uber vehicle and two riders, what does a simulated matched uberPOOL trip look like? Follow along with the slideshow below to see Octopus in action in our uberPOOL example, from the login stage on the first slide, to the first drop-off on the last slide. The far left simulator is the driver app; the middle and right are the rider apps:
In the simulation, you see Rider 1 (middle) request a ride. Because this is a synthetic environment, once the driver (left) accepts the request, Rider 2 (right) sees no available cars. We’re showing an uberPOOL, though, so once the ride starts, the driver is again available for a matched trip. Rider 2’s location and destination fit the criteria, so Rider 2 can now request the same vehicle. The driver picks both riders up and they are dropped off in the order that makes sense for their trip. Finally, the driver rates the individual riders at the conclusion of the trip.
If you find this blog interesting, follow @UberEng to receive updates about the cool tools that the Mobile Test Infra Team is building. Excited to work on this infrastructure? Do you like building tools that push boundaries? Come join us and make a splash! There’s an ocean of possibilities to work on.