Research Article

LIP READING USING CNN FOR TURKISH NUMBERS

Year 2022, Volume 5, Issue 2, 155 - 160, 31.12.2022
https://doi.org/10.46238/jobda.1100903

Abstract

Recently, lip reading has become one of the most important research areas in artificial intelligence. In this study, lip reading was performed for the Turkish language using convolutional neural networks (CNNs). For this purpose, participants were asked to record videos of spoken numbers (61 videos), and 9 more videos were collected from YouTube. The dataset covers 20 numbers. Only the video stream was used; the audio was removed completely. Because the dataset was small, an attempt was made to enlarge it using different methods. The model was trained on the training set and achieved an accuracy of 56.25% on the test set.
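The abstract does not say which methods were used to enlarge the small dataset. As a minimal sketch only, assuming each video is a list of grayscale frames stored as nested Python lists, the following shows three common ways a clip could be multiplied: mirroring, a brightness shift, and temporal subsampling. The function names, the frame layout, the brightness range, and the label value are all hypothetical and are not taken from the paper.

```python
import random

# Assumption for illustration: a "video" is a list of frames, and each frame
# is a 2D list of grayscale pixel values in the range [0, 255].

def flip_horizontal(video):
    """Mirror every frame left-to-right."""
    return [[row[::-1] for row in frame] for frame in video]

def shift_brightness(video, delta):
    """Add delta to every pixel, clamping the result to [0, 255]."""
    return [[[max(0, min(255, p + delta)) for p in row] for row in frame]
            for frame in video]

def drop_frames(video, keep_every=2):
    """Temporal subsampling: keep every keep_every-th frame."""
    return video[::keep_every]

def augment(video, label, rng):
    """Return the original clip plus three variants, all with the same label."""
    delta = rng.randint(-30, 30)  # hypothetical brightness range
    return [
        (video, label),
        (flip_horizontal(video), label),
        (shift_brightness(video, delta), label),
        (drop_frames(video), label),
    ]

# Tiny example: a 4-frame clip of 2x3 pixels with a hypothetical class label.
clip = [[[10 * f + c + 3 * r for c in range(3)] for r in range(2)]
        for f in range(4)]
samples = augment(clip, label=7, rng=random.Random(0))
```

Each variant keeps the label of the source clip, so one recorded video yields several training samples. Whether mirroring or frame dropping is appropriate depends on the data; they are shown here only as plausible options.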

There are 12 citations in total.

Details

Primary Language English
Journal Section Original Scientific Articles
Authors

Hadı Pourmousa 0000-0001-6713-5872

Üstün Özen 0000-0002-7595-4306

Publication Date December 31, 2022
Published in Issue Year 2022

Cite

APA Pourmousa, H., & Özen, Ü. (2022). LIP READING USING CNN FOR TURKISH NUMBERS. Journal of Business in The Digital Age, 5(2), 155-160. https://doi.org/10.46238/jobda.1100903
AMA Pourmousa H, Özen Ü. LIP READING USING CNN FOR TURKISH NUMBERS. JOBDA. December 2022;5(2):155-160. doi:10.46238/jobda.1100903
Chicago Pourmousa, Hadı, and Üstün Özen. “LIP READING USING CNN FOR TURKISH NUMBERS”. Journal of Business in The Digital Age 5, no. 2 (December 2022): 155-60. https://doi.org/10.46238/jobda.1100903.
EndNote Pourmousa H, Özen Ü (December 1, 2022) LIP READING USING CNN FOR TURKISH NUMBERS. Journal of Business in The Digital Age 5 2 155–160.
IEEE H. Pourmousa and Ü. Özen, “LIP READING USING CNN FOR TURKISH NUMBERS”, JOBDA, vol. 5, no. 2, pp. 155–160, 2022, doi: 10.46238/jobda.1100903.
ISNAD Pourmousa, Hadı - Özen, Üstün. “LIP READING USING CNN FOR TURKISH NUMBERS”. Journal of Business in The Digital Age 5/2 (December 2022), 155-160. https://doi.org/10.46238/jobda.1100903.
JAMA Pourmousa H, Özen Ü. LIP READING USING CNN FOR TURKISH NUMBERS. JOBDA. 2022;5:155–160.
MLA Pourmousa, Hadı and Üstün Özen. “LIP READING USING CNN FOR TURKISH NUMBERS”. Journal of Business in The Digital Age, vol. 5, no. 2, 2022, pp. 155-60, doi:10.46238/jobda.1100903.
Vancouver Pourmousa H, Özen Ü. LIP READING USING CNN FOR TURKISH NUMBERS. JOBDA. 2022;5(2):155-60.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.