Research Article

Handwritten Amharic Character Recognition System Using Convolutional Neural Networks

Year 2019, Volume: 14 Issue: 2, 71 - 87, 20.04.2019

Abstract

Amharic is an official language of the federal government of the Federal Democratic Republic of Ethiopia. Accordingly, a large volume of handwritten Amharic documents is held in libraries, information centres, museums, and offices. Digitizing these documents makes it possible to apply already available language technologies to local information needs and development. Converting them has several advantages: (i) preserving and transmitting the history of the country, (ii) saving storage space, (iii) enabling proper handling of documents, and (iv) improving retrieval of information through the internet and other applications. Handwritten Amharic character recognition is a challenging task because of inconsistency within a single writer's hand, variability in writing styles across writers, the relatively large number of characters in the script, high interclass similarity, structural complexity, and degradation of documents from various causes. To recognize handwritten Amharic characters, a novel method based on deep neural networks is used. Such networks have recently shown exceptional performance in various pattern recognition and machine learning applications, but have not previously been attempted for the Ethiopic script. The convolutional neural network (CNN) model is evaluated on our database of 132,500 handwritten Amharic character samples. Common handwriting recognition systems based on machine learning combine a feature extractor with a classifier. Deep learning techniques currently show promising improvements on machine-learning-based classification tasks. The proposed CNN model achieves an accuracy of 91.83% on the training data and 90.47% on the validation data.
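
The abstract contrasts a hand-built feature extractor plus classifier with a CNN, in which both parts are learned together. The sketch below is only an illustration of that idea, not the architecture evaluated in the paper, which the abstract does not describe. It runs one forward pass through a single convolution, a ReLU, 2x2 max pooling, and a softmax classifier using NumPy. The 32x32 input size, the 5x5 kernel, the 10 output classes, and the random weights are all assumptions chosen for illustration.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Single-channel 'valid' cross-correlation, the core operation of a CNN layer."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow), dtype=float)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    """Rectified linear unit: negative responses are set to zero."""
    return np.maximum(x, 0.0)


def max_pool2x2(x):
    """Non-overlapping 2x2 max pooling; an odd trailing row or column is dropped."""
    h = x.shape[0] // 2 * 2
    w = x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - np.max(z))
    return e / e.sum()


def forward(image, kernel, weights, bias):
    """Feature extraction (conv -> ReLU -> pool) followed by a linear classifier."""
    features = max_pool2x2(relu(conv2d_valid(image, kernel)))
    return softmax(weights @ features.ravel() + bias)


# Hypothetical sizes: none of these values come from the paper.
rng = np.random.default_rng(0)
n_classes = 10                                   # placeholder class count
image = rng.random((32, 32))                     # stand-in for a grayscale character image
kernel = rng.standard_normal((5, 5))             # one untrained 5x5 filter
weights = rng.standard_normal((n_classes, 14 * 14)) * 0.01  # 28x28 conv map pools to 14x14
bias = np.zeros(n_classes)

probs = forward(image, kernel, weights, bias)
predicted_class = int(np.argmax(probs))
```

In a trained network the kernel and classifier weights would be fitted to labelled character images rather than drawn at random, and accuracy figures such as those reported above would be the fraction of samples whose `predicted_class` matches the label.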



Details

Primary Language: English
Subjects: Engineering
Section: Articles
Authors

Fetulhak Abdurahman 0000-0002-5670-0319

Publication Date: April 20, 2019
Published in Issue: Year 2019, Volume: 14, Issue: 2

Cite

APA Abdurahman, F. (2019). HANDWRITTEN AMHARIC CHARACTER RECOGNITION SYSTEM USING CONVOLUTIONAL NEURAL NETWORKS. Engineering Sciences, 14(2), 71-87.
AMA Abdurahman F. HANDWRITTEN AMHARIC CHARACTER RECOGNITION SYSTEM USING CONVOLUTIONAL NEURAL NETWORKS. Engineering Sciences. April 2019;14(2):71-87.
Chicago Abdurahman, Fetulhak. “HANDWRITTEN AMHARIC CHARACTER RECOGNITION SYSTEM USING CONVOLUTIONAL NEURAL NETWORKS”. Engineering Sciences 14, no. 2 (April 2019): 71-87.
EndNote Abdurahman F (01 April 2019) HANDWRITTEN AMHARIC CHARACTER RECOGNITION SYSTEM USING CONVOLUTIONAL NEURAL NETWORKS. Engineering Sciences 14 2 71–87.
IEEE F. Abdurahman, “HANDWRITTEN AMHARIC CHARACTER RECOGNITION SYSTEM USING CONVOLUTIONAL NEURAL NETWORKS”, Engineering Sciences, vol. 14, no. 2, pp. 71–87, 2019.
ISNAD Abdurahman, Fetulhak. “HANDWRITTEN AMHARIC CHARACTER RECOGNITION SYSTEM USING CONVOLUTIONAL NEURAL NETWORKS”. Engineering Sciences 14/2 (April 2019), 71-87.
JAMA Abdurahman F. HANDWRITTEN AMHARIC CHARACTER RECOGNITION SYSTEM USING CONVOLUTIONAL NEURAL NETWORKS. Engineering Sciences. 2019;14:71–87.
MLA Abdurahman, Fetulhak. “HANDWRITTEN AMHARIC CHARACTER RECOGNITION SYSTEM USING CONVOLUTIONAL NEURAL NETWORKS”. Engineering Sciences, vol. 14, no. 2, 2019, pp. 71-87.
Vancouver Abdurahman F. HANDWRITTEN AMHARIC CHARACTER RECOGNITION SYSTEM USING CONVOLUTIONAL NEURAL NETWORKS. Engineering Sciences. 2019;14(2):71-87.