Review

American Sign Language Recognition using YOLOv4 Method

Year 2022, Volume 6, Issue 1, pp. 61–65, 20.07.2022

Abstract

Sign language is a means of communication used by people who cannot speak or hear (deaf and mute), so not everyone is able to understand it. To facilitate communication between hearing people and deaf or mute people, many systems have been developed that translate the gestures and signs of sign language into words. The aim of this research is to train a model that can detect and recognize hand gestures and signs and translate them into letters, numbers, and words using the You Only Look Once (YOLO) method, from images or video and even in real time. YOLO is an object detection and recognition method built on convolutional neural networks (CNNs) and is characterized by both accuracy and speed. For this research we created a dataset of 8,000 images divided into 40 classes; for each class, 200 images were taken with different backgrounds and under different lighting conditions, which allows the model to distinguish each sign regardless of lighting intensity or image clarity. After training the model on this dataset repeatedly, the experiments on image data gave very good results: mean average precision (mAP) = 98.01%, average loss = 1.3, recall = 0.96, and F1 score = 0.96; on video, the model achieved the same accuracy at 28.9 frames per second (fps).
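To make the pipeline described in the abstract concrete, the sketch below shows how a trained YOLOv4 model can be run on webcam frames with OpenCV's DNN module, drawing a labeled box for each detected sign. This is a minimal illustration under stated assumptions, not the authors' released code: the file names asl-yolov4.cfg, asl-yolov4.weights, and asl.names are hypothetical placeholders, and the 416x416 input size and confidence/NMS thresholds are common YOLOv4 defaults rather than values reported in the paper.

```python
# Minimal YOLOv4 inference sketch using OpenCV's DNN module.
# Assumption: "asl-yolov4.cfg", "asl-yolov4.weights", and "asl.names"
# (one label per line for the 40 classes) are placeholder file names.
import cv2

net = cv2.dnn.readNetFromDarknet("asl-yolov4.cfg", "asl-yolov4.weights")
model = cv2.dnn_DetectionModel(net)
# YOLOv4 expects RGB input scaled to [0, 1]; 416x416 is a common input size.
model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

with open("asl.names") as f:
    classes = [line.strip() for line in f]

cap = cv2.VideoCapture(0)  # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # confThreshold / nmsThreshold are typical defaults, not paper values.
    class_ids, scores, boxes = model.detect(
        frame, confThreshold=0.5, nmsThreshold=0.4)
    for cid, score, box in zip(class_ids, scores, boxes):
        x, y, w, h = box
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, f"{classes[int(cid)]}: {float(score):.2f}",
                    (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    cv2.imshow("ASL recognition", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```

As a consistency check on the reported metrics: with the harmonic-mean definition F1 = 2PR / (P + R), recall R = 0.96 and F1 = 0.96 together imply precision P = F1·R / (2R − F1) = 0.96, so the reported scores are mutually consistent.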

Supporting Institution

hora

Project Number

60

Acknowledgments

Thanks.


Details

Primary Language: English
Subjects: Engineering
Section: Articles
Authors

Ali Al-shaheen (ORCID: 0000-0002-9668-9556)

Mesut Çevik

Alzubair Alqaraghulı

Project Number: 60
Publication Date: July 20, 2022
Submission Date: June 17, 2022
Published Issue: Year 2022, Volume 6, Issue 1

Cite

IEEE: A. Al-shaheen, M. Çevik, and A. Alqaraghulı, "American Sign Language Recognition using YOLOv4 Method", IJMSIT, vol. 6, no. 1, pp. 61–65, 2022.