DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE

Saadin Oyucu

doi:10.54365/adyumbd.1038766

Research Article

TÜRKÇE KONUŞMA TANIMA SİSTEMİ İÇİN GENİŞ KELİME DAĞARCIĞINA SAHİP TEST VERİ KÜMESİNİN GELİŞTİRİLMESİ VE YENİ BİR TEST PROSEDÜRÜ

Year 2022, Volume 9, Issue 16, 156 - 164, 14.04.2022

Cited By: 2

Abstract

The most fundamental problem in automatic speech recognition is not building a domain-specific system but building one with a large vocabulary. Automatic speech recognition systems developed to have a large vocabulary should be tested with a test dataset that also has a large vocabulary. For this reason, an automatic speech recognition test dataset was prepared within the scope of this study. The prepared test dataset contains speech from 20 different domains and the text files corresponding to that speech. The test procedure presented in the study was also tested on different large-vocabulary Turkish automatic speech recognition systems. The resulting word error rates were observed to vary between 14% and 21%. The large-vocabulary test dataset and the test procedure serve as a guide for demonstrating the success of automatic speech recognition systems more clearly in future studies.

Keywords

Speech recognition, Turkish speech recognition, Speech dataset, Turkish speech dataset, Test dataset

DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE

Abstract

The most fundamental problem in automatic speech recognition is not the development of a domain-specific system but the development of a system with a large vocabulary. Automatic speech recognition systems developed with a large vocabulary should be tested with a large-vocabulary test dataset. For this reason, an automatic speech recognition test corpus was prepared within the scope of the study. The prepared test corpus includes speech from 20 different domains and the corresponding text files. The test procedure presented in the study was also applied to Turkish automatic speech recognition systems with a large vocabulary. The observed word error rates ranged between 14% and 21%. The large-vocabulary test corpus and the test procedure are intended to guide future studies in reporting the success of automatic speech recognition systems more clearly.
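For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a word error rate can be computed per domain from reference and recognized transcripts. It uses the standard definition (word-level edit distance divided by the number of reference words) and is not the article's test procedure, which the abstract does not detail. The function names, the domain labels, and the sample sentences are hypothetical, and no text normalization is applied.

```python
from typing import Dict, List, Tuple


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance: substitutions + deletions + insertions."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(pairs: List[Tuple[str, str]]) -> float:
    """WER over (reference, hypothesis) pairs, pooled over all reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    words = sum(len(r.split()) for r, _ in pairs)
    if words == 0:
        raise ValueError("reference transcripts contain no words")
    return errors / words


def wer_by_domain(samples: Dict[str, List[Tuple[str, str]]]) -> Dict[str, float]:
    """Compute one WER value per domain (domain names are illustrative)."""
    return {domain: wer(pairs) for domain, pairs in samples.items()}


# Hypothetical sample data: two domains, one utterance each.
samples = {
    "news": [("bugün hava çok güzel", "bugün hava güzel")],
    "sports": [("takım maçı kazandı", "takım maçı kazandı")],
}
per_domain = wer_by_domain(samples)
print(per_domain)
```

Pooling errors over all reference words in a domain, rather than averaging per-utterance rates, keeps short utterances from dominating the result; either choice is a design decision that a test procedure should state explicitly.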

Keywords

Speech recognition, Turkish speech recognition, speech corpus, test corpus, Turkish speech corpus

There are 17 citations in total.

Details

Primary Language	English
Subjects	Engineering
Journal Section	Research Article
Authors	Saadin Oyucu 0000-0003-3880-3039
Publication Date	April 14, 2022
Submission Date	December 20, 2021
Published in Issue	Year 2022

Cite

APA	Oyucu, S. (2022). DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi, 9(16), 156-164. https://doi.org/10.54365/adyumbd.1038766
AMA	Oyucu S. DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi. April 2022;9(16):156-164. doi:10.54365/adyumbd.1038766
Chicago	Oyucu, Saadin. “DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE”. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi 9, no. 16 (April 2022): 156-64. https://doi.org/10.54365/adyumbd.1038766.
EndNote	Oyucu S (April 1, 2022) DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi 9 16 156–164.
IEEE	S. Oyucu, “DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE”, Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi, vol. 9, no. 16, pp. 156–164, 2022, doi: 10.54365/adyumbd.1038766.
ISNAD	Oyucu, Saadin. “DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE”. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi 9/16 (April 2022), 156-164. https://doi.org/10.54365/adyumbd.1038766.
JAMA	Oyucu S. DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi. 2022;9:156–164.
MLA	Oyucu, Saadin. “DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE”. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi, vol. 9, no. 16, 2022, pp. 156-64, doi:10.54365/adyumbd.1038766.
Vancouver	Oyucu S. DEVELOPMENT OF TEST CORPUS WITH LARGE VOCABULARY FOR TURKISH SPEECH RECOGNITION SYSTEM AND A NEW TEST PROCEDURE. Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi. 2022;9(16):156-64.

Adıyaman Üniversitesi Mühendislik Bilimleri Dergisi

Cited By

AI adoption in crowdsourcing

Procedia Computer Science

https://doi.org/10.1016/j.procs.2025.01.311
