Tag: Big Data

Sessionizing Uber Trips in Real Time

Handling Uber's many data flows required modeling the data associated with a specific task, such as a rider trip, as a state machine. The state machine lets engineers focus on only the events needed to complete a trip.

Peloton: Uber’s Unified Resource Scheduler for Diverse Cluster Workloads

Uber developed Peloton to help us balance resource use, elastically share resources, and plan for future capacity needs.

Uber’s Big Data Platform: 100+ Petabytes with Minute Latency

Responsible for cleaning, storing, and serving over 100 petabytes of analytical data, Uber's Hadoop platform ensures data reliability, scalability, and ease of use with minimal latency.

Scaling Uber’s Apache Hadoop Distributed File System for Growth

Uber's Data Infrastructure team overhauled our approach to scaling our storage infrastructure by incorporating several new features and functionalities, including ViewFs, NameNode garbage collection tuning, and an HDFS load management service.

Meet Michelangelo: Uber’s Machine Learning Platform

Uber Engineering introduces Michelangelo, our machine learning-as-a-service system that enables teams to easily build, deploy, and operate ML solutions at scale.

Visualize Data Sets on the Web with Uber Engineering’s deck.gl Framework

In this article, we discuss deck.gl, an open source, WebGL-powered framework specifically designed for exploring and visualizing data sets at scale.

The Uber Engineering Tech Stack, Part II: The Edge and Beyond

The conclusion of a two-part series on the tech stack that, as of spring 2016, Uber Engineering uses to make transportation as reliable as running water, everywhere, for everyone.

Streamific, the Ingestion Service for Hadoop Big Data at Uber Engineering

Here we look at Hadoop data ingestion and how Uber Engineering streams diverse data into a cohesive layer for querying in near real time using Streamific, our in-house ingestion service.
