A Detailed Analysis of Optical Character Recognition Technology

Karez Hamad; Mehmet Kaya

doi:10.18100/ijamec.270374

Abstract

In many fields, there is high demand for transferring the information in printed or handwritten documents and images to computer storage so that it can later be reused by means of computers. One simple way to store such information is to scan the documents and save them as image files. However, it is very difficult to read or query text or other information from these image files when the information needs to be reused. Therefore, a technique to automatically retrieve and store information, in particular text, from image files is needed. Optical character recognition (OCR) is an active research area that attempts to develop computer systems able to extract and process text from images automatically. The objective of OCR is to convert any form of text or text-containing document, such as handwritten text or printed or scanned text images, into an editable digital format for further processing. OCR thus enables a machine to recognize text in such documents automatically. Several major challenges need to be recognized and handled in order to achieve successful automation. The font characteristics of the characters in paper documents and the quality of the images are only some of these challenges. Because of them, characters may sometimes not be recognized correctly by the computer system. In this paper we investigate OCR from four perspectives. First, we give a detailed overview of the challenges that may emerge in the OCR stages. Second, we review the general phases of an OCR system: pre-processing, segmentation, normalization, feature extraction, classification and post-processing. Third, we highlight developments and the main applications and uses of OCR. Finally, we briefly discuss the history of OCR. This discussion provides a comprehensive review of the state of the art of the field.

Keywords

OCR, OCR Challenges, OCR Phases, OCR Applications, OCR History

Details

Primary Language

English

Subjects

Engineering

Journal Section

Conference Paper

Authors

Karez Hamad
FIRAT UNIV
Türkiye

Mehmet Kaya
ADIYAMAN ÜNİVERSİTESİ
Türkiye

Publication Date

December 1, 2016

Submission Date

November 29, 2016

Acceptance Date

December 1, 2016

Published in Issue

Year 2016 Number: Special Issue-1

DOI

https://doi.org/10.18100/ijamec.270374

IZ

https://izlik.org/JA98TE89SX

Cite

APA

Hamad, K., & Kaya, M. (2016). A Detailed Analysis of Optical Character Recognition Technology. International Journal of Applied Mathematics Electronics and Computers, Special Issue-1, 244-249. https://doi.org/10.18100/ijamec.270374
