Research Article

Lip Reading Using Convolutional Neural Networks with and without Pre-Trained Models

Volume: 7 Issue: 2, 30 April 2019

Abstract

Lip reading has recently become a popular research topic, and there is an extensive literature on lip reading within human action recognition. Deep learning methods are frequently used in this area. In this paper, lip reading from video data is performed using self-designed convolutional neural networks (CNNs). For this purpose, both the standard and an augmented version of the AvLetters dataset are used in the training and testing stages. To optimize network performance, the mini-batch size parameter is also tuned and its effect is investigated. Additionally, experiments are performed using the pre-trained AlexNet and GoogleNet CNNs. Detailed experimental results are presented.

Details

Primary Language

English

Subjects

Electrical Engineering

Section

Research Article

Authors

Tayyip Ozcan
Alper Basturk

Publication Date

30 April 2019

Submission Date

7 November 2018

Acceptance Date

3 April 2019

Published in Issue

Year 2019 Volume: 7 Issue: 2

Cite

APA
Ozcan, T., & Basturk, A. (2019). Lip Reading Using Convolutional Neural Networks with and without Pre-Trained Models. Balkan Journal of Electrical and Computer Engineering, 7(2), 195-201. https://doi.org/10.17694/bajece.479891
AMA
1. Ozcan T, Basturk A. Lip Reading Using Convolutional Neural Networks with and without Pre-Trained Models. Balkan Journal of Electrical and Computer Engineering. 2019;7(2):195-201. doi:10.17694/bajece.479891
Chicago
Ozcan, Tayyip, and Alper Basturk. 2019. “Lip Reading Using Convolutional Neural Networks with and without Pre-Trained Models”. Balkan Journal of Electrical and Computer Engineering 7 (2): 195-201. https://doi.org/10.17694/bajece.479891.
EndNote
Ozcan T, Basturk A (01 April 2019) Lip Reading Using Convolutional Neural Networks with and without Pre-Trained Models. Balkan Journal of Electrical and Computer Engineering 7 2 195–201.
IEEE
[1] T. Ozcan and A. Basturk, “Lip Reading Using Convolutional Neural Networks with and without Pre-Trained Models”, Balkan Journal of Electrical and Computer Engineering, vol. 7, no. 2, pp. 195–201, Apr. 2019, doi: 10.17694/bajece.479891.
ISNAD
Ozcan, Tayyip - Basturk, Alper. “Lip Reading Using Convolutional Neural Networks with and without Pre-Trained Models”. Balkan Journal of Electrical and Computer Engineering 7/2 (01 April 2019): 195-201. https://doi.org/10.17694/bajece.479891.
JAMA
1. Ozcan T, Basturk A. Lip Reading Using Convolutional Neural Networks with and without Pre-Trained Models. Balkan Journal of Electrical and Computer Engineering. 2019;7:195–201.
MLA
Ozcan, Tayyip, and Alper Basturk. “Lip Reading Using Convolutional Neural Networks with and without Pre-Trained Models”. Balkan Journal of Electrical and Computer Engineering, vol. 7, no. 2, April 2019, pp. 195-201, doi:10.17694/bajece.479891.
Vancouver
1. Tayyip Ozcan, Alper Basturk. Lip Reading Using Convolutional Neural Networks with and without Pre-Trained Models. Balkan Journal of Electrical and Computer Engineering. 01 April 2019;7(2):195-201. doi:10.17694/bajece.479891

All articles published by BAJECE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit and adapt the work, provided the original work and source are appropriately cited.