Research Article

Text Recognition in Natural Images Using Segmentation

Year 2022, Volume: 10 Issue: 5, 42 - 51, 26.12.2022
https://doi.org/10.29130/dubited.1107625

Abstract

Optical character recognition (OCR) is a method for recognizing words or phrases in scanned images. Developed through years of research, it detects text in scanned images with great success, but it does not give the desired results on natural images, so special approaches are needed to detect text in them. This study used the Otsu and Maximally Stable Extremal Regions (MSER) image segmentation methods to detect text regions in natural images. Image segmentation divides an image into meaningful regions so that it can be analyzed better. The Otsu method determines the most appropriate threshold value for the image and divides the image into two classes, foreground and background, according to this threshold. The MSER method, on the other hand, excludes non-text regions and encloses regions thought to be text in bounding boxes. The study aimed to determine the text areas in 20 natural images selected from the ICDAR 2013 data set using the Otsu and MSER methods. After segmentation, OCR was applied to the images to recognize the text, and the accuracy rates of the two methods were compared.
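To make the Otsu step concrete, the following is a minimal sketch of histogram-based thresholding, assuming an 8-bit grayscale image held in a NumPy array. It is not the authors' implementation; the function names `otsu_threshold` and `binarize` are hypothetical, and the rule "pixels brighter than the threshold are foreground" is an assumption that would be inverted for dark text on a light background. MSER is not sketched here because it would normally rely on an external computer vision library.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return the gray level t that maximizes the between-class variance.

    `gray` is assumed to be a uint8 array (values 0..255).
    Class 0 holds pixels <= t, class 1 holds pixels > t.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()

    levels = np.arange(256)
    omega = np.cumsum(prob)          # probability of class 0 for each t
    mu = np.cumsum(prob * levels)    # cumulative first moment up to t
    mu_total = mu[-1]                # mean gray level of the whole image

    # Between-class variance: (mu_total * omega - mu)^2 / (omega * (1 - omega)).
    # It is left at zero where one of the classes would be empty.
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)
    valid = denom > 0
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]

    return int(np.argmax(sigma_b))


def binarize(gray: np.ndarray, threshold: int) -> np.ndarray:
    """Split the image into foreground (True) and background (False)."""
    return gray > threshold


# Synthetic example: a dark background (50) with a bright block (200).
image = np.full((10, 10), 50, dtype=np.uint8)
image[:, 5:] = 200

t = otsu_threshold(image)
mask = binarize(image, t)
```

In a pipeline like the one described above, the resulting mask (or the thresholded image) would then be passed to an OCR engine; which engine and which settings were used is not stated in the abstract.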

Turkish title: Bölütleme Kullanarak Doğal Görüntülerde Metin Tanıma


Details

Primary Language: Turkish
Subjects: Engineering
Section: Articles
Authors

Yeliz Şenkaya 0000-0001-6527-6313

Çetin Kurnaz 0000-0003-3436-899X

Publication Date: December 26, 2022
Published in Issue: Year 2022 Volume: 10 Issue: 5

Cite

APA Şenkaya, Y., & Kurnaz, Ç. (2022). Bölütleme Kullanarak Doğal Görüntülerde Metin Tanıma. Duzce University Journal of Science and Technology, 10(5), 42-51. https://doi.org/10.29130/dubited.1107625
AMA Şenkaya Y, Kurnaz Ç. Bölütleme Kullanarak Doğal Görüntülerde Metin Tanıma. DÜBİTED. December 2022;10(5):42-51. doi:10.29130/dubited.1107625
Chicago Şenkaya, Yeliz, and Çetin Kurnaz. “Bölütleme Kullanarak Doğal Görüntülerde Metin Tanıma”. Duzce University Journal of Science and Technology 10, no. 5 (December 2022): 42-51. https://doi.org/10.29130/dubited.1107625.
EndNote Şenkaya Y, Kurnaz Ç (01 December 2022) Bölütleme Kullanarak Doğal Görüntülerde Metin Tanıma. Duzce University Journal of Science and Technology 10 5 42–51.
IEEE Y. Şenkaya and Ç. Kurnaz, “Bölütleme Kullanarak Doğal Görüntülerde Metin Tanıma”, DÜBİTED, vol. 10, no. 5, pp. 42–51, 2022, doi: 10.29130/dubited.1107625.
ISNAD Şenkaya, Yeliz - Kurnaz, Çetin. “Bölütleme Kullanarak Doğal Görüntülerde Metin Tanıma”. Duzce University Journal of Science and Technology 10/5 (December 2022), 42-51. https://doi.org/10.29130/dubited.1107625.
JAMA Şenkaya Y, Kurnaz Ç. Bölütleme Kullanarak Doğal Görüntülerde Metin Tanıma. DÜBİTED. 2022;10:42–51.
MLA Şenkaya, Yeliz and Çetin Kurnaz. “Bölütleme Kullanarak Doğal Görüntülerde Metin Tanıma”. Duzce University Journal of Science and Technology, vol. 10, no. 5, 2022, pp. 42-51, doi:10.29130/dubited.1107625.
Vancouver Şenkaya Y, Kurnaz Ç. Bölütleme Kullanarak Doğal Görüntülerde Metin Tanıma. DÜBİTED. 2022;10(5):42-51.