An Overview of the Tesseract OCR Engine

Google (United States)

Indexed in Crossref

Abstract

The Tesseract OCR engine, which was the HP Research Prototype in the UNLV Fourth Annual Test of OCR Accuracy, is described in a comprehensive overview. Emphasis is placed on aspects that are novel or at least unusual in an OCR engine, in particular the line finding, the features/classification methods, and the adaptive classifier.

Citation impact

Total citations: 2,217
FWCI: 19.34
Percentile: 100%
References: 13

Authors

1

Topics & keywords

Keywords
  • Optical character recognition
  • Computer science
  • Artificial intelligence
  • Classifier (UML)
  • Information retrieval
  • Natural language processing
  • Pattern recognition (psychology)

Funding