Uber-Text: A Large-Scale Dataset for Optical Character Recognition from Street-Level Imagery

July 1, 2017 / Global

Abstract

Optical Character Recognition (OCR) has advanced considerably in recent years thanks to the resurgence of deep learning. State-of-the-art models, however, are mainly trained on datasets of constrained scenes, and detecting and recognizing text in real-world images remains a technical challenge. In this paper, we introduce Uber-Text, a large-scale OCR dataset that contains (1) streetside images with text region polygons and their corresponding transcriptions, (2) 9 categories of text, including business name text, street name text, and street number text, (3) over 110k images, and (4) 4.84 text instances per image on average. We demonstrate the difficulty of the task and the dataset by evaluating prevalent methods on it; the results underscore the significance of the dataset and motivate future work in this field.