Research Article

Year 2022, Volume: 9 Issue: 16, 156 - 164, 14.04.2022
https://doi.org/10.54365/adyumbd.1038766

DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE

Abstract

The most fundamental problem in automatic speech recognition is not the development of a domain-specific system but the development of a system with a large vocabulary. Automatic speech recognition systems developed to have a large vocabulary should be tested with a large-vocabulary test dataset. For this reason, an automatic speech recognition test corpus was prepared within the scope of this study. The corpus contains speech from 20 different domains together with the text files corresponding to that speech. The test procedure presented in the study was also applied to different Turkish automatic speech recognition systems with large vocabularies. The resulting word error rates ranged between 14% and 21%. The large-vocabulary test corpus and the test procedure provide guidance for demonstrating the performance of automatic speech recognition systems more clearly in future studies.
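
The results above are reported as word error rate (WER). The following is a minimal sketch of how WER is commonly computed, assuming the standard definition: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a recognizer output, divided by the number of reference words. This page does not describe the study's exact scoring or text normalization, so the function name `word_error_rate` and the whitespace tokenization are illustrative assumptions, not the author's procedure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Assumption: words are separated by whitespace and compared exactly;
    a real evaluation would usually normalize case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One of four reference words is missing from the hypothesis.
print(word_error_rate("bugün hava çok güzel", "bugün hava güzel"))  # 0.25
```

Under this definition, a reported range of 14% to 21% corresponds to roughly 14 to 21 word-level errors per 100 reference words.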

There are 17 citations in total.

Details

Primary Language: English
Subjects: Engineering
Journal Section: Articles
Authors: Saadin Oyucu (0000-0003-3880-3039)
Publication Date: April 14, 2022
Submission Date: December 20, 2021
Published in Issue: Year 2022, Volume: 9 Issue: 16

Cite

APA Oyucu, S. (2022). DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi, 9(16), 156-164. https://doi.org/10.54365/adyumbd.1038766
AMA Oyucu S. DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi. April 2022;9(16):156-164. doi:10.54365/adyumbd.1038766
Chicago Oyucu, Saadin. “DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE”. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi 9, no. 16 (April 2022): 156-64. https://doi.org/10.54365/adyumbd.1038766.
EndNote Oyucu S (April 1, 2022) DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi 9 16 156–164.
IEEE S. Oyucu, “DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE”, Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi, vol. 9, no. 16, pp. 156–164, 2022, doi: 10.54365/adyumbd.1038766.
ISNAD Oyucu, Saadin. “DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE”. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi 9/16 (April 2022), 156-164. https://doi.org/10.54365/adyumbd.1038766.
JAMA Oyucu S. DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi. 2022;9:156–164.
MLA Oyucu, Saadin. “DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE”. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi, vol. 9, no. 16, 2022, pp. 156-64, doi:10.54365/adyumbd.1038766.
Vancouver Oyucu S. DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi. 2022;9(16):156-64.