Engineering

The Billion Data Point Challenge: Building a Query Engine for High Cardinality Time Series Data

December 10, 2018 / Global
Figure 1. Our metrics query engine handles around 2,500 queries per second. Above, we map queries over a seven-day period.
Figure 2. Our metrics engine returns about 8.5 billion data points per second. Above, we map data point returns over a seven-day period.
Figure 3. Our metrics query engine handles approximately 3.5 Gbps. Above, we map network traffic over a seven-day period.
Figure 4. M3’s query engine architecture runs through parse, execution, and data retrieval phases.
Figure 5. The memory footprint of sequential execution (Approach 1) is much greater than that of lazy execution (Approach 2).
Figure 6. A block structure allows us to work in parallel on different storage blocks, which greatly improves our computation speed.
Figure 7. Without a downsampling algorithm, the results are the most accurate but take the longest to load.
Figure 8. The averaging algorithm is fast to load, but hides anomalies, such as the data point circled in the graph.
Figure 9. The LTTB algorithm captures anomalies such as the one shown in the graph and loads quickly.
Figure 10. A typical M3QL query showing the failure rate of a certain endpoint.
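
The downsampling trade-off described in Figures 7 through 9 can be illustrated with a small, self-contained sketch. It compares plain bucket averaging against Largest-Triangle-Three-Buckets (LTTB) on a flat series with a single spike. This is a generic version of both techniques written for illustration, not M3's implementation; the function names `bucket_average` and `lttb`, the `threshold` parameter, and the sample series are all assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (timestamp, value)


def bucket_average(points: List[Point], threshold: int) -> List[Point]:
    """Downsample by splitting the series into `threshold` buckets and averaging each."""
    n = len(points)
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    if threshold >= n:
        return list(points)
    out = []
    for i in range(threshold):
        bucket = points[i * n // threshold:(i + 1) * n // threshold]
        out.append((sum(x for x, _ in bucket) / len(bucket),
                    sum(y for _, y in bucket) / len(bucket)))
    return out


def lttb(points: List[Point], threshold: int) -> List[Point]:
    """Downsample with Largest-Triangle-Three-Buckets, keeping real data points."""
    n = len(points)
    if threshold < 3:
        raise ValueError("threshold must be at least 3")
    if threshold >= n:
        return list(points)
    inner = threshold - 2          # buckets between the first and last point
    sampled = [points[0]]          # the first point is always kept
    a = 0                          # index of the most recently selected point
    for i in range(inner):
        start = 1 + i * (n - 2) // inner
        end = 1 + (i + 1) * (n - 2) // inner
        # Average of the next bucket (or the final point) is the third triangle vertex.
        nxt = points[end:min(1 + (i + 2) * (n - 2) // inner, n)]
        avg_x = sum(x for x, _ in nxt) / len(nxt)
        avg_y = sum(y for _, y in nxt) / len(nxt)
        ax, ay = points[a]
        best, best_area = start, -1.0
        for j in range(start, end):
            px, py = points[j]
            # Twice the triangle area; the constant factor does not affect the argmax.
            area = abs((ax - avg_x) * (py - ay) - (ax - px) * (avg_y - ay))
            if area > best_area:
                best, best_area = j, area
        sampled.append(points[best])
        a = best
    sampled.append(points[-1])     # the last point is always kept
    return sampled


# Hypothetical series: 100 flat points with one anomaly at t=50.
series = [(float(t), 1.0) for t in range(100)]
series[50] = (50.0, 100.0)

averaged = bucket_average(series, 10)
lttb_points = lttb(series, 10)
```

In this sketch, averaging dilutes the spike into its bucket's mean, while LTTB keeps the original anomalous point because it forms the largest triangle with its neighbors. Both outputs contain the same number of points, so the rendering cost is comparable.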
Benjamin Raskin

Benjamin Raskin is a software engineer on Uber's Observability Engineering team.

Nikunj Aggarwal

Nikunj Aggarwal is a senior software engineer on Uber's Observability Engineering team.

Posted by Benjamin Raskin, Nikunj Aggarwal
