Research Article

Speech recognition based on convolutional neural networks and MFCC algorithm

Volume: 1, Issue: 1, 15 January 2021

Abstract

This paper proposes an automatic speech recognition system based on convolutional neural networks (CNNs) and Mel-frequency cepstral coefficients (MFCC). We investigated several deep model architectures with various hyperparameter options, such as dropout rate and learning rate. The dataset used in this paper was taken from the Kaggle TensorFlow Speech Recognition Challenge. Each audio file in the dataset contains a single word and is one second long. The words in the dataset correspond to 30 categories, with one category for background noise. The dataset contains 64,721 files, which were separated into 51,088 for the training set, 6,798 for the validation set, and 6,835 for the testing set. We evaluated three models with different hyperparameter configurations in order to select the one with the highest accuracy. The highest accuracy achieved is 88.21%.
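As a quick sanity check on the split reported above, the following minimal Python sketch sums the three subset sizes and computes each subset's share of the data. The counts are taken from the abstract; the names `SPLITS`, `REPORTED_TOTAL`, and `split_shares` are hypothetical and introduced only for illustration. It is not part of the authors' pipeline.

```python
# Split sizes as reported in the abstract (names below are illustrative only).
SPLITS = {"training": 51_088, "validation": 6_798, "testing": 6_835}
REPORTED_TOTAL = 64_721


def split_shares(splits):
    """Return each split's share of the summed file count, in percent."""
    total = sum(splits.values())
    return {name: 100.0 * count / total for name, count in splits.items()}


total = sum(SPLITS.values())
shares = split_shares(SPLITS)

# Confirm that the three subsets add up to the reported dataset size.
print(f"total files: {total} (matches reported total: {total == REPORTED_TOTAL})")

# Print the size and percentage share of each subset.
for name, pct in shares.items():
    print(f"{name:<10} {SPLITS[name]:>6} files  {pct:5.2f}%")
```

The three subsets add up exactly to the reported 64,721 files, which corresponds to a split of roughly 79% training, 10.5% validation, and 10.5% testing.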

Details

Primary Language

English

Subjects

Artificial Intelligence

Section

Research Article

Publication Date

15 January 2021

Submission Date

12 July 2020

Acceptance Date

25 November 2020

Published in Issue

Year 2021, Volume: 1, Issue: 1

Cite

APA
Mahmood, A., & Köse, U. (2021). Speech recognition based on convolutional neural networks and MFCC algorithm. Advances in Artificial Intelligence Research, 1(1), 6-12. https://izlik.org/JA77DD48TH
AMA
1. Mahmood A, Köse U. Speech recognition based on convolutional neural networks and MFCC algorithm. Adv. Artif. Intell. Res. 2021;1(1):6-12. https://izlik.org/JA77DD48TH
Chicago
Mahmood, Arzo, and Utku Köse. 2021. “Speech recognition based on convolutional neural networks and MFCC algorithm”. Advances in Artificial Intelligence Research 1 (1): 6-12. https://izlik.org/JA77DD48TH.
EndNote
Mahmood A, Köse U (01 January 2021) Speech recognition based on convolutional neural networks and MFCC algorithm. Advances in Artificial Intelligence Research 1 1 6–12.
IEEE
[1] A. Mahmood and U. Köse, “Speech recognition based on convolutional neural networks and MFCC algorithm”, Adv. Artif. Intell. Res., vol. 1, no. 1, pp. 6–12, Jan. 2021, [Online]. Available: https://izlik.org/JA77DD48TH
ISNAD
Mahmood, Arzo - Köse, Utku. “Speech recognition based on convolutional neural networks and MFCC algorithm”. Advances in Artificial Intelligence Research 1/1 (01 January 2021): 6-12. https://izlik.org/JA77DD48TH.
JAMA
1. Mahmood A, Köse U. Speech recognition based on convolutional neural networks and MFCC algorithm. Adv. Artif. Intell. Res. 2021;1:6–12.
MLA
Mahmood, Arzo, and Utku Köse. “Speech recognition based on convolutional neural networks and MFCC algorithm”. Advances in Artificial Intelligence Research, vol. 1, no. 1, January 2021, pp. 6-12, https://izlik.org/JA77DD48TH.
Vancouver
1. Arzo Mahmood, Utku Köse. Speech recognition based on convolutional neural networks and MFCC algorithm. Adv. Artif. Intell. Res. [Internet]. 01 January 2021;1(1):6-12. Available from: https://izlik.org/JA77DD48TH

Advances in Artificial Intelligence Research is an open access journal which means that the content is freely available without charge to the user or his/her institution. All papers are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, which allows users to distribute, remix, adapt, and build upon the material in any medium or format for non-commercial purposes only, and only so long as attribution is given to the creator.