Online Turkish Handwriting Recognition Using Synthetic Data
Abstract
We present a recognition system for online Turkish handwriting that is trained with synthetically generated data and transfer learning. Training deep networks requires large amounts of data, but a sufficiently large collection of Turkish handwriting samples is not available. We therefore synthesize data for pretraining and then adapt the system to the target dataset by fine-tuning. We generate words from the isolated-character collection of a large English handwriting dataset. We first train the system on this synthetic data and then fine-tune it with Turkish handwriting samples from a smaller dataset. Evaluated on 2,041 samples of isolated Turkish words, the final system's character recognition rate rises from an initial 61% to 88% after fine-tuning. The system performs similarly on the synthetic data and on the Turkish test data, which suggests that the synthetic data closely resembles the real data. These results indicate that synthetic data generation can be a solution to the data scarcity problem in online Turkish handwriting recognition.
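The abstract does not describe how words are assembled from isolated characters, so the following is only a minimal sketch of one plausible approach: placing each character's pen strokes left to right along a shared baseline. The data layout (a stroke as a list of `(x, y)` points), the function name `synthesize_word`, the `gap` parameter, and the toy coordinates are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]


def synthesize_word(word: str,
                    glyphs: Dict[str, List[Stroke]],
                    gap: float = 0.2) -> List[Stroke]:
    """Concatenate isolated-character strokes left to right into one word.

    `glyphs` maps a character to the strokes of one handwriting sample
    (hypothetical format). `gap` is the horizontal space between characters.
    """
    strokes: List[Stroke] = []
    cursor = 0.0  # x position where the next character starts
    for ch in word:
        char_strokes = glyphs[ch]  # raises KeyError if no sample exists
        xs = [x for stroke in char_strokes for x, _ in stroke]
        left, right = min(xs), max(xs)
        # Shift the character so its left edge sits at the cursor.
        for stroke in char_strokes:
            strokes.append([(x - left + cursor, y) for x, y in stroke])
        cursor += (right - left) + gap
    return strokes


# Toy glyphs with made-up coordinates, for illustration only.
toy_glyphs = {
    "i": [[(0.0, 0.0), (0.0, 1.0)], [(0.0, 1.5), (0.0, 1.6)]],
    "l": [[(2.0, 0.0), (2.0, 2.0)]],
}
word = synthesize_word("il", toy_glyphs, gap=0.5)
```

A realistic generator would presumably also sample different writers per character and vary spacing and scale; those choices are left out here because the abstract gives no detail on them.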
Details
Primary Language
English
Subjects
Engineering
Section
Research Article
Authors
E. F. Bilgin Taşdemir
Publication Date
December 31, 2021
Submission Date
December 22, 2021
Acceptance Date
January 2, 2022
Published in Issue
Year 2021, Issue: 32
APA
Bilgin Taşdemir, E. F. (2021). Online Turkish Handwriting Recognition Using Synthetic Data. Avrupa Bilim ve Teknoloji Dergisi, 32, 649-656. https://doi.org/10.31590/ejosat.1039846
Cited By
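РОЗПІЗНАВАННЯ РУКОПИСНИХ УКРАЇНСЬКИХ ЛІТЕР ТА ЦИФР З ВИКОРИСТАННЯМ СИНТЕТИЧНОГО НАБОРУ ДАНИХ ТА ЗГОРТКОВИХ НЕЙРОННИХ МЕРЕЖ [Recognition of handwritten Ukrainian letters and digits using a synthetic dataset and convolutional neural networks]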
Grail of Science
https://doi.org/10.36074/grail-of-science.23.12.2022.36
Research and evaluation of the efficiency of handwritten character recognition methods using convulsional neural networks
Reporter of the Priazovskyi State Technical University. Section: Technical sciences
https://doi.org/10.31498/2225-6733.47.2023.299989