Uber-Text: A Large-Scale Dataset for Optical Character Recognition from Street-Level Imagery

Abstract

Optical character recognition (OCR) has advanced considerably in recent years thanks to the resurgence of deep learning. State-of-the-art models, however, are mainly trained on datasets of constrained scenes, and detecting and recognizing text in real-world images remains a technical challenge. In this paper, we introduce Uber-Text, a large-scale OCR dataset that contains (1) streetside images with text region polygons and their corresponding transcriptions, (2) 9 categories indicating business name text, street name text, street number text, etc., (3) over 110k images, and (4) 4.84 text instances per image on average. We show the difficulty of the task and the dataset by evaluating prevalent methods, which demonstrates the significance of the dataset and motivates future work in this field.
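To make the annotation structure described in the abstract concrete, here is a minimal Python sketch of how one annotated image might be represented in memory. It is not the dataset's actual file format: the class names, field names, category strings, and the helper `mean_instances_per_image` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TextInstance:
    # Vertices of the text region polygon as (x, y) pixel coordinates.
    polygon: List[Tuple[float, float]]
    # Transcription of the text inside the polygon.
    transcription: str
    # Category label; these strings are hypothetical, not the dataset's own.
    category: str


@dataclass
class AnnotatedImage:
    image_id: str
    instances: List[TextInstance]


def mean_instances_per_image(images: List[AnnotatedImage]) -> float:
    """Average number of text instances per image (0.0 for an empty list)."""
    if not images:
        return 0.0
    return sum(len(img.instances) for img in images) / len(images)


# Toy data, invented for illustration only.
sample = [
    AnnotatedImage(
        image_id="img_0001",
        instances=[
            TextInstance([(10, 10), (90, 10), (90, 40), (10, 40)], "CAFE", "business_name"),
            TextInstance([(5, 60), (70, 60), (70, 80), (5, 80)], "MAIN ST", "street_name"),
            TextInstance([(100, 20), (130, 20), (130, 45), (100, 45)], "221", "street_number"),
        ],
    ),
    AnnotatedImage(
        image_id="img_0002",
        instances=[
            TextInstance([(0, 0), (50, 0), (50, 20), (0, 20)], "OPEN", "business_name"),
        ],
    ),
]

print(mean_instances_per_image(sample))
```

The same per-image average, computed over the full dataset, is the kind of statistic the abstract reports as 4.84 text instances per image.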

Authors

Ying Zhang, Lionel Gueguen, Ilya Zharkov, Peter Zhang, Keith Seifert, Ben Kadlec

Conference

CVPR 2017

Full Paper

‘Uber-Text: A Large-Scale Dataset for Optical Character Recognition from Street-Level Imagery’ (PDF)

Uber ATG
