Tag: HIVE

Marmaray: An Open Source Generic Data Ingestion and Dispersal Framework and Library for Apache...

Today we introduce Marmaray, an open source framework allowing data ingestion and dispersal for Apache Hadoop, realizing our vision of any-source-to-any-sink functionality, including data format validation.

Databook: Turning Big Data into Knowledge with Metadata at Uber

Databook, Uber's in-house platform for surfacing and exploring contextual metadata, makes dataset discovery and exploration easier for teams across the company.

Scaling Uber’s Apache Hadoop Distributed File System for Growth

Uber's Data Infrastructure team overhauled our approach to scaling our storage infrastructure by incorporating several new features and functionalities, including ViewFs, NameNode garbage collection tuning, and an HDFS load management service.

Queryparser, an Open Source Tool for Parsing and Analyzing SQL

Written in Haskell, Queryparser is Uber Engineering's open source tool for parsing and analyzing SQL queries that makes it easy to identify foreign-key relationships in large data warehouses.

Turbocharging Analytics at Uber with our Data Science Workbench

Uber Engineering's data science workbench (DSW) is an all-in-one toolbox that leverages aggregate data for interactive analytics and machine learning.

Engineering Restaurant Manager, our UberEATS Analytics Dashboard

The UberEATS Restaurant Manager gives restaurant partners insight into their business by measuring customer satisfaction, sales, and service quality.

Engineering Uber Predictions in Real Time with ELK

Uber Engineering architected a real-time trip features prediction system using an open source RESTful search engine built with Elasticsearch, Logstash, and Kibana (ELK).

Engineering Data Analytics with Presto and Apache Parquet at Uber

Snap your fingers and presto! How Uber Engineering built a fast, efficient data analytics system with Presto and Parquet.

Building an Intelligent Experimentation Platform with Uber Engineering

Composed of a staged rollout tool and an intelligent analytics tool, Uber Engineering's experimentation platform can stably deploy new features at scale across our apps. In this article, we discuss the challenges and opportunities we faced when building this product.

Redesigning Uber Engineering’s Mobile Content Delivery Ecosystem

How Uber Engineering re-architected the content delivery feed and backend ecosystem of our new driver app to deliver an enhanced user experience.

Hudi: Uber Engineering’s Incremental Processing Framework on Apache Hadoop

Uber Engineering's data processing platform team recently built and open sourced Hudi, an incremental processing framework that supports our business critical data pipelines. In this article, we see how Hudi powers a rich data ecosystem where external sources can be ingested into Hadoop in near real-time.

Designing Euclid to Make Uber Engineering Marketing Savvy

In this article, we take a look at Euclid, Uber Engineering's in-house marketing platform built on Hadoop and Spark.

The Uber Engineering Tech Stack, Part II: The Edge and Beyond

The end of a two-part series on the tech stack that Uber Engineering uses to make transportation as reliable as running water, everywhere, for everyone, as of spring 2016.

Streamific, the Ingestion Service for Hadoop Big Data at Uber Engineering

Here we look at Hadoop data ingestion, and how Uber Engineering streams diverse data into a cohesive layer for querying in near real-time using Streamific, which we developed in-house.