Sessizliğin Kaldırılması ve Konuşmanın Parçalara Ayrılması İşleminin Türkçe Otomatik Konuşma Tanıma Üzerindeki Etkisi

Saadin Oyucu; Hüseyin Polat; Hayri Sever

doi:10.29130/dubited.560135

Research Article

Year 2020, Volume 8, Issue 1, 334 - 346, 31.01.2020

Cited By: 2

Abstract

Automatic Speech Recognition systems are developed primarily by exploiting
acoustic information. Paired speech and text data are used to derive phoneme
information from the acoustic information. Acoustic models trained with these
data cannot model all of the acoustic information found in real life. For this
reason, certain pre-processing steps must be carried out and the acoustic
information that would lower the performance of automatic speech recognition
systems must be eliminated. In this study, a method is proposed for removing
the silences that occur within speech. The aim of the proposed method is to
eliminate the silence information and to split into segments those utterances
that create long dependencies in the acoustic information. The silence-free,
segmented speech obtained at the end of the developed method was given as input
to a Turkish Automatic Speech Recognition system. At the output of the
Automatic Speech Recognition system, the texts corresponding to the input
speech segments were combined and presented. The experiments showed that
removing silence and splitting speech into segments increases the performance
of Automatic Speech Recognition systems.

Keywords

Automatic speech recognition, Silence removal, Speech segmentation

Thanks

This work was supported by EMFA Yazılım Danışmanlık A.Ş. We thank Emre EVREN, chairman of the board of EMFA Yazılım Danışmanlık A.Ş., for the support.

The Effect of Removal the Silence and Speech Parsing Processes on Turkish Automatic Speech Recognition

Abstract

Automatic Speech Recognition systems are developed mainly from acoustic
information. Paired speech and text data are used to obtain phoneme information
from the acoustic signal. Acoustic models trained on these data cannot model
all of the acoustic information encountered in real life. For this reason,
certain pre-processing steps are needed to eliminate acoustic information that
would reduce the performance of automatic speech recognition systems. In this
study, a method for removing the silences in speech is proposed. The aim of the
proposed method is to eliminate silence and to split utterances that introduce
long dependencies in the acoustic information into segments. The resulting
speech, free of silence and divided into segments, is given as input to a
Turkish Automatic Speech Recognition system. At the output of the system, the
texts corresponding to the input speech segments are combined and presented.
The experiments showed that removing silence and segmenting speech increased
the performance of the Automatic Speech Recognition system.
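
The abstract does not say how silence is detected, so the following is only a minimal sketch of the general pipeline it describes (drop long silences, split the speech into segments, recognize each segment, join the texts). It assumes a simple short-time energy threshold, which is a swapped-in technique and not necessarily the authors' method. The function names, the `threshold` and `min_silence_ms` defaults, and the `recognize` callable standing in for the recognizer are all hypothetical.

```python
import numpy as np


def frame_energies(signal, frame_len):
    """Mean squared amplitude of consecutive, non-overlapping frames."""
    n_frames = len(signal) // frame_len
    frames = np.asarray(signal[: n_frames * frame_len], dtype=float)
    return (frames.reshape(n_frames, frame_len) ** 2).mean(axis=1)


def split_on_silence(signal, sample_rate, frame_ms=20,
                     threshold=1e-4, min_silence_ms=200):
    """Split `signal` at silent stretches of at least `min_silence_ms`.

    A frame counts as silent when its energy is at or below `threshold`
    (an assumed, data-dependent value). Long silences are dropped and act
    as segment boundaries; shorter pauses stay inside a segment.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    is_speech = frame_energies(signal, frame_len) > threshold
    min_silence_frames = max(1, min_silence_ms // frame_ms)

    segments = []
    start = None      # index of the first frame of the open segment
    silent_run = 0    # silent frames seen since the last speech frame
    for i, speech in enumerate(is_speech):
        if speech:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                end = i - silent_run + 1  # one past the last speech frame
                segments.append(signal[start * frame_len: end * frame_len])
                start, silent_run = None, 0
    if start is not None:
        # Close the last segment, trimming any trailing silent frames.
        end = len(is_speech) - silent_run
        segments.append(signal[start * frame_len: end * frame_len])
    return segments


def transcribe_segments(segments, recognize):
    """Apply the (hypothetical) `recognize` callable to each segment
    and join the partial transcripts in their original order."""
    return " ".join(recognize(segment) for segment in segments)
```

In practice `recognize` would wrap a call to the recognizer being evaluated, and the threshold would need tuning to the recording conditions, since a fixed energy threshold is sensitive to background noise.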

Keywords

Automatic speech recognition, Silence removal

There are 43 citations in total.

Details

Primary Language	Turkish
Subjects	Engineering
Journal Section	Articles
Authors	Saadin Oyucu (0000-0003-3880-3039), Hüseyin Polat (0000-0003-4128-2625), Hayri Sever (0000-0002-8261-0675)
Publication Date	January 31, 2020
Published in Issue	Year 2020

Cite

APA	Oyucu, S., Polat, H., & Sever, H. (2020). Sessizliğin Kaldırılması ve Konuşmanın Parçalara Ayrılması İşleminin Türkçe Otomatik Konuşma Tanıma Üzerindeki Etkisi. Duzce University Journal of Science and Technology, 8(1), 334-346. https://doi.org/10.29130/dubited.560135
AMA	Oyucu S, Polat H, Sever H. Sessizliğin Kaldırılması ve Konuşmanın Parçalara Ayrılması İşleminin Türkçe Otomatik Konuşma Tanıma Üzerindeki Etkisi. DÜBİTED. January 2020;8(1):334-346. doi:10.29130/dubited.560135
Chicago	Oyucu, Saadin, Hüseyin Polat, and Hayri Sever. “Sessizliğin Kaldırılması Ve Konuşmanın Parçalara Ayrılması İşleminin Türkçe Otomatik Konuşma Tanıma Üzerindeki Etkisi”. Duzce University Journal of Science and Technology 8, no. 1 (January 2020): 334-46. https://doi.org/10.29130/dubited.560135.
EndNote	Oyucu S, Polat H, Sever H (January 1, 2020) Sessizliğin Kaldırılması ve Konuşmanın Parçalara Ayrılması İşleminin Türkçe Otomatik Konuşma Tanıma Üzerindeki Etkisi. Duzce University Journal of Science and Technology 8 1 334–346.
IEEE	S. Oyucu, H. Polat, and H. Sever, “Sessizliğin Kaldırılması ve Konuşmanın Parçalara Ayrılması İşleminin Türkçe Otomatik Konuşma Tanıma Üzerindeki Etkisi”, DÜBİTED, vol. 8, no. 1, pp. 334–346, 2020, doi: 10.29130/dubited.560135.
ISNAD	Oyucu, Saadin et al. “Sessizliğin Kaldırılması Ve Konuşmanın Parçalara Ayrılması İşleminin Türkçe Otomatik Konuşma Tanıma Üzerindeki Etkisi”. Duzce University Journal of Science and Technology 8/1 (January 2020), 334-346. https://doi.org/10.29130/dubited.560135.
JAMA	Oyucu S, Polat H, Sever H. Sessizliğin Kaldırılması ve Konuşmanın Parçalara Ayrılması İşleminin Türkçe Otomatik Konuşma Tanıma Üzerindeki Etkisi. DÜBİTED. 2020;8:334–346.
MLA	Oyucu, Saadin et al. “Sessizliğin Kaldırılması Ve Konuşmanın Parçalara Ayrılması İşleminin Türkçe Otomatik Konuşma Tanıma Üzerindeki Etkisi”. Duzce University Journal of Science and Technology, vol. 8, no. 1, 2020, pp. 334-46, doi:10.29130/dubited.560135.
Vancouver	Oyucu S, Polat H, Sever H. Sessizliğin Kaldırılması ve Konuşmanın Parçalara Ayrılması İşleminin Türkçe Otomatik Konuşma Tanıma Üzerindeki Etkisi. DÜBİTED. 2020;8(1):334-46.

Düzce Üniversitesi Bilim ve Teknoloji Dergisi

Cited By

Raspbraille: Conversion to Braille Alphabet with Optical Character Recognition and Voice Recognition Algorithm

Hittite Journal of Science and Engineering

https://doi.org/10.17350/HJSE19030000278

Yalıtık Sözcüklü bir Türkçe Konuşma Tanıma Sisteminin Yapay Veri Artırımı ile Tasarımı ve Gerçekleştirimi

Afyon Kocatepe University Journal of Sciences and Engineering

İbrahim USLU

https://doi.org/10.35414/akufemubid.803547