Uber-Text: A Large-Scale Dataset for Optical Character Recognition from Street-Level Imagery

    Abstract

    Optical Character Recognition (OCR) approaches have advanced rapidly in recent years thanks to the resurgence of deep learning. State-of-the-art models are mainly trained on datasets consisting of constrained scenes, and detecting and recognizing text in real-world images remains a technical challenge. In this paper, we introduce Uber-Text, a large-scale OCR dataset that contains (1) streetside images with their text region polygons and the corresponding transcriptions, (2) 9 categories indicating business name text, street name text, street number text, etc., (3) over 110k images, and (4) 4.84 text instances per image on average. We demonstrate the difficulty of the task and the dataset by evaluating prevalent methods, which shows the significance of the dataset and motivates future work in this field.

    Authors

    Ying Zhang, Lionel Gueguen, Ilya Zharkov, Peter Zhang, Keith Seifert, Ben Kadlec

    Conference

    CVPR 2017

    Full Paper

    ‘Uber-Text: A Large-Scale Dataset for Optical Character Recognition from Street-Level Imagery’ (PDF)

    Uber ATG
