A Comparison of Isolated Word Recognition Performances for Machine Learning and Hybrid Subspace Classifiers

Serkan Keser

doi:10.51764/smutgd.1338977

Research Article

Year 2023, Volume: 6, Issue: 2, 235-249, 31.12.2023

Abstract

Environmental background noise is one of the essential factors affecting recognition rates in speech recognition. This study performs speaker-independent isolated word recognition on a speech database containing different noise types, which makes it possible to assess how noisy speech signals affect the recognition performance of classifiers. The classifiers used were K-Nearest Neighbors (KNN), Fisher Linear Discriminant Analysis-KNN (FLDA-KNN), the Discriminative Common Vector Approach (DCVA), Support Vector Machines (SVM), a Convolutional Neural Network (CNN), and a Recurrent Neural Network with Long Short-Term Memory (RNN-LSTM). MFCC and PLP coefficients were used as feature vectors. The DCVA classifier is tested in depth for isolated word recognition for the first time in the literature. For the KNN, FLDA-KNN, and DCVA classifiers, recognition was carried out with various distance measures. In addition, new (DCVA)PCA and (FLDA-KNN)PCA classifiers were designed as hybrid algorithms using Principal Component Analysis (PCA), and they achieved better recognition results than the DCVA and FLDA-KNN classifiers. In the experiments, RNN-LSTM achieved the highest recognition rate, 93.22%. The highest recognition rates of the other classifiers, CNN, KNN, DCVA, (DCVA)PCA, SVM, FLDA-KNN, and (FLDA-KNN)PCA, were 87.56%, 86.51%, 74.23%, 79%, 77.78%, 71.37%, and 84.90%, respectively.
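To make the role of the distance measure and of PCA concrete, the following is a minimal sketch and not the author's implementation. It assumes that PCA is applied as a projection step before a plain KNN classifier; the abstract does not say how PCA is combined with DCVA or FLDA-KNN, so this placement is an assumption. The 13-dimensional synthetic vectors only stand in for MFCC or PLP features, and all function names (`pca_fit`, `pca_transform`, `pairwise_distance`, `knn_predict`) are hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal directions of X."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project X onto the principal directions found by pca_fit."""
    return (X - mean) @ components.T

def pairwise_distance(A, B, metric="euclidean"):
    """Distance from every row of A to every row of B."""
    if metric == "euclidean":
        diff = A[:, None, :] - B[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=2))
    if metric == "manhattan":
        diff = A[:, None, :] - B[None, :, :]
        return np.abs(diff).sum(axis=2)
    if metric == "cosine":
        a = A / np.linalg.norm(A, axis=1, keepdims=True)
        b = B / np.linalg.norm(B, axis=1, keepdims=True)
        return 1.0 - a @ b.T
    raise ValueError(f"unknown metric: {metric}")

def knn_predict(X_train, y_train, X_test, k=3, metric="euclidean"):
    """Majority vote among the k nearest training vectors."""
    d = pairwise_distance(X_test, X_train, metric)
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Synthetic stand-in for per-word feature vectors (assumption: 3 words,
# 13 coefficients each); real MFCC/PLP features would replace X and y.
rng = np.random.default_rng(0)
n_classes, n_per_class, n_features = 3, 40, 13
centers = 10.0 * np.eye(n_classes, n_features)
X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

order = rng.permutation(len(y))
train, test = order[:90], order[90:]

# Fit PCA on the training split only, then project both splits.
mean, components = pca_fit(X[train], n_components=2)
Z_train = pca_transform(X[train], mean, components)
Z_test = pca_transform(X[test], mean, components)

results = {}
for metric in ("euclidean", "manhattan", "cosine"):
    raw = knn_predict(X[train], y[train], X[test], k=3, metric=metric)
    reduced = knn_predict(Z_train, y[train], Z_test, k=3, metric=metric)
    results[metric] = (np.mean(raw == y[test]), np.mean(reduced == y[test]))
    print(metric, results[metric])
```

Each entry of `results` holds the accuracy without and with the PCA projection for one distance measure. On real noisy speech features the ranking of the measures may differ from this toy data, which is why comparing several of them is informative.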

Keywords

Noisy Speech Signals, Hybrid Subspace Classifiers, Machine Learning Classifiers, PLP, MFCC

Turkish title: Makine Öğrenimi ve Hibrit Altuzay Sınıflandırıcılar için Yalıtılmış Kelime Tanıma Performanslarının Karşılaştırılması

There are 47 references in total.

Details

Primary Language	English
Subjects	Circuits and Systems
Section	Articles
Authors	Serkan Keser (ORCID: 0000-0001-8435-0507)
Early View Date	30 December 2023
Publication Date	31 December 2023
Submission Date	7 August 2023
Acceptance Date	4 October 2023
Published in Issue	Year 2023, Volume: 6, Issue: 2

Cite

APA	Keser, S. (2023). A Comparison of Isolated Word Recognition Performances for Machine Learning and Hybrid Subspace Classifiers. Sürdürülebilir Mühendislik Uygulamaları Ve Teknolojik Gelişmeler Dergisi, 6(2), 235-249. https://doi.org/10.51764/smutgd.1338977

This work is licensed under a Creative Commons Attribution 4.0 International License.