Conference item

Deep features for text spotting

Abstract:
The goal of this work is text spotting in natural images. This is divided into two sequential tasks: detecting word regions in the image, and recognizing the words within these regions. We make the following contributions: first, we develop a Convolutional Neural Network (CNN) classifier that can be used for both tasks. The CNN has a novel architecture that enables efficient feature sharing (by using a number of layers in common) for text detection, character case-sensitive and insensitive classification, and bigram classification. It exceeds the state-of-the-art performance for all of these. Second, we make a number of technical changes over the traditional CNN architectures, including no downsampling for a per-pixel sliding window, and multi-mode learning with a mixture of linear models (maxout). Third, we have a method of automated data mining of Flickr that generates word and character level annotations. Finally, these components are used together to form an end-to-end, state-of-the-art text spotting system. We evaluate the text-spotting system on two standard benchmarks, the ICDAR Robust Reading data set and the Street View Text data set, and demonstrate improvements over the state-of-the-art on multiple measures. © 2014 Springer International Publishing.
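Illustrative note: the abstract mentions "a mixture of linear models (maxout)". The sketch below shows the generic maxout idea only, where each output unit takes the maximum over several linear functions of the input. It is not the authors' implementation; the function names (`linear`, `maxout_unit`, `maxout_layer`) and the toy weights are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence, Tuple

# One linear piece is a (weights, bias) pair; a maxout unit owns several pieces.
Piece = Tuple[Sequence[float], float]


def linear(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Compute the affine function w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def maxout_unit(pieces: Sequence[Piece], x: Sequence[float]) -> float:
    """Return the largest response among the unit's linear pieces."""
    return max(linear(w, b, x) for w, b in pieces)


def maxout_layer(units: Sequence[Sequence[Piece]], x: Sequence[float]) -> List[float]:
    """Apply every maxout unit to the same input vector."""
    return [maxout_unit(pieces, x) for pieces in units]


# Hypothetical toy parameters (not taken from the paper).
units = [
    [([1.0, 0.0], 0.0), ([-1.0, 0.0], 0.0)],  # max(x0, -x0), i.e. |x0|
    [([0.0, 1.0], 0.0), ([0.0, 0.0], 0.0)],   # max(x1, 0), i.e. a rectifier on x1
]

output = maxout_layer(units, [-2.0, -3.0])
print(output)
```

With two pieces per unit, the toy weights reproduce an absolute value and a rectifier, which illustrates how taking a maximum over linear models yields a piecewise-linear activation.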
Publication status:
Published
Peer review status:
Peer reviewed

Files:
Publisher copy:
10.1007/978-3-319-10593-2_34

Authors


More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Research group:
Visual Geometry Group
Role:
Author
More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Research group:
Visual Geometry Group
Oxford college:
New College
Role:
Author
ORCID:
0000-0003-1374-2858
More by this author
Institution:
University of Oxford
Division:
MPLS
Department:
Engineering Science
Research group:
Visual Geometry Group
Oxford college:
Brasenose College
Role:
Author
ORCID:
0000-0002-8945-8573


Publisher:
Springer Nature
Host title:
Computer Vision - ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part IV
Pages:
512-528
Series:
Lecture Notes in Computer Science
Series number:
8692
Publication date:
2014-08-14
Acceptance date:
2014-06-17
Event title:
13th European Conference, Computer Vision - ECCV 2014
Event location:
Zurich, Switzerland
Event start date:
2014-09-06
Event end date:
2014-09-12
DOI:
10.1007/978-3-319-10593-2_34
EISSN:
1611-3349
ISSN:
0302-9743
EISBN:
9783319105932
ISBN:
9783319105925


Language:
English
Pubs id:
484281
Local pid:
pubs:484281
Deposit date:
2024-07-15
