Research Article

Speaker Recognition and Speaker Verification by Comparison of MFCC and LBP Methods

Year 2022, Volume: 15 Issue: 2, 104 - 109, 15.12.2022
https://doi.org/10.54525/tbbmd.1083707

Abstract

Speaker recognition, or speaker identification, is the automatic identification of a speaker by analyzing parameters of the audio signal. A human voice is highly characteristic of its owner. For this reason, in this study a dataset of 46 different people reciting Surah Yasin was collected from YouTube in order to determine which person is speaking. Features were extracted from the audio files using MFCC and LBP. The feature vectors were tested with various classification algorithms, yielding 35.10% accuracy for MFCC and 90.74% for LBP. For speaker verification, LBP achieved 100% classification accuracy.
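The article does not publish its implementation, but the LBP half of the pipeline can be illustrated compactly. The sketch below is a minimal 1D local binary pattern feature extractor in NumPy: each audio sample is compared with its neighbours to form a binary code, and the normalised histogram of codes serves as a fixed-length feature vector for a classifier. The radius, code length, and histogram normalisation here are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def lbp_1d(signal, radius=4):
    """Compute a 1D local binary pattern code for each sample.

    Each sample is compared with its `radius` left and `radius` right
    neighbours; a neighbour >= centre contributes a 1-bit, yielding a
    (2*radius)-bit code per interior sample.
    """
    n = len(signal)
    centres = signal[radius:n - radius]
    codes = np.zeros(n - 2 * radius, dtype=np.int32)
    bit = 0
    for offset in list(range(-radius, 0)) + list(range(1, radius + 1)):
        neighbours = signal[radius + offset:n - radius + offset]
        codes |= (neighbours >= centres).astype(np.int32) << bit
        bit += 1
    return codes

def lbp_histogram(signal, radius=4):
    """Normalised histogram of 1D-LBP codes: a fixed-length feature vector."""
    codes = lbp_1d(signal, radius)
    hist = np.bincount(codes, minlength=2 ** (2 * radius)).astype(float)
    return hist / hist.sum()

# Example on a synthetic "audio" frame (stand-in for a real recording)
rng = np.random.default_rng(0)
frame = rng.standard_normal(16000)          # 1 s of noise at 16 kHz
features = lbp_histogram(frame, radius=4)   # 256-dimensional feature vector
```

With radius 4 the codes are 8-bit, so every utterance maps to a 256-dimensional histogram regardless of its duration, which is what makes the representation directly usable by standard classifiers.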




Details

Primary Language Turkish
Subjects Engineering
Journal Section Articles (Research)
Authors

Emrah Aydemir 0000-0002-8380-7891

Early Pub Date December 3, 2022
Publication Date December 15, 2022
Published in Issue Year 2022 Volume: 15 Issue: 2

Cite

APA Aydemir, E. (2022). MFCC ve LBP Yöntemlerinin Karşılaştırılması ile Konuşmacı Tanıma ve Konuşmacı Doğrulama. Türkiye Bilişim Vakfı Bilgisayar Bilimleri Ve Mühendisliği Dergisi, 15(2), 104-109. https://doi.org/10.54525/tbbmd.1083707

Article Acceptance

Articles are submitted online after user registration/login.

The acceptance process of the articles sent to the journal consists of the following stages:

1. Each submitted article is first sent to at least two referees.

2. Referee appointments are made by the journal editors. The journal's referee pool contains approximately 200 referees, classified by their areas of interest. Each referee is sent an article on a subject within their expertise, and referees are selected so as to avoid any conflict of interest.

3. The authors' names are withheld from the articles sent to the referees.

4. Referees are given guidance on how to evaluate an article and are asked to fill in the evaluation form shown below.

5. Articles that receive positive opinions from both referees undergo a similarity check by the editors; the similarity score is expected to be below 25%.

6. A paper that has passed all stages is reviewed by the editor for language and presentation, and necessary corrections and improvements are made. The authors are notified if necessary.


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.