A Detailed Analysis of Optical Character Recognition Technology

Karez Abdulwahhab Hamad [1], Mehmet Kaya [2]


In many fields, there is high demand for transferring the information in printed or handwritten documents and images into computer storage so that it can later be reused by computer. A simple way to do this is to scan the documents and store them as image files. However, when the information needs to be reused, it would be very difficult to read or query the text or other information in these image files. A technique for automatically retrieving and storing information, in particular text, from image files is therefore needed. Optical character recognition (OCR) is an active research area that aims to develop computer systems able to extract and process text from images automatically. The objective of OCR is to convert any form of text or text-containing document, such as handwritten text or printed or scanned text images, into an editable digital format for further processing. OCR thus enables a machine to recognize the text in such documents automatically. Several major challenges must be recognized and handled to achieve successful automation. The font characteristics of the characters in paper documents and the quality of the images are only some of the recent challenges. Because of these challenges, characters are sometimes not recognized correctly by the computer system. In this paper we investigate OCR from four perspectives. First, we give a detailed overview of the challenges that may emerge in the stages of OCR. Second, we review the general phases of an OCR system: pre-processing, segmentation, normalization, feature extraction, classification, and post-processing. Third, we highlight developments and the main applications and uses of OCR. Finally, we give a brief history of OCR. Together, this discussion provides a comprehensive review of the state of the art in the field.

OCR, OCR Challenges, OCR Phases, OCR Applications, OCR History
Subjects Engineering
Journal Section Research Article
Authors

Author: Karez Abdulwahhab Hamad
Institution: FIRAT UNIV
Country: Turkey


Author: Mehmet Kaya
Institution: ADIYAMAN ÜNİVERSİTESİ
Country: Turkey


Dates

Publication Date : December 1, 2016

Bibtex @article{ijamec270374, journal = {International Journal of Applied Mathematics Electronics and Computers}, eissn = {2147-8228}, publisher = {Selcuk University}, year = {2016}, pages = {244--249}, doi = {10.18100/ijamec.270374}, title = {A Detailed Analysis of Optical Character Recognition Technology}, author = {Hamad, Karez and Kaya, Mehmet} }
APA Hamad, K., & Kaya, M. (2016). A Detailed Analysis of Optical Character Recognition Technology. International Journal of Applied Mathematics Electronics and Computers, (Special Issue-1), 244-249. DOI: 10.18100/ijamec.270374
MLA Hamad, K., Kaya, M. "A Detailed Analysis of Optical Character Recognition Technology". International Journal of Applied Mathematics Electronics and Computers (2016): 244-249 <https://dergipark.org.tr/en/pub/ijamec/issue/25619/270374>