Examination of Energy Based Voice Activity Detection Algorithms for Noisy Speech Signals

Selma Özaydın

doi:10.31590/ejosat.637741

Research Article

Year 2019, Special Issue 2019, pp. 157-163, 31.10.2019

Cited By: 4

Abstract

This paper examines the behavior of two energy-based voice activity detection (VAD) algorithms on noisy input signals. Both detectors use time-domain methods to find speech boundaries, relying on short-time energy (STE) features and/or the zero-crossing rate of the speech signal. In the first stage of both algorithms, the STE is calculated for each speech segment. Energy ratios and threshold values are then used to detect voicing activity. The decision threshold is calculated from the average STE of an initial silence period. The methods are tested on clean speech samples and on noisy speech samples at different SNR levels. The results indicate that both methods keep a reasonable accuracy, with slowly decreasing performance, down to an SNR of about 0 dB. Below 0 dB SNR, both methods lose their effectiveness.
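The pipeline described in the abstract (per-frame STE, a threshold derived from the average STE of an initial silence period, and evaluation at a chosen SNR) can be illustrated with a minimal sketch. This is not the paper's implementation: the frame length, the number of initial silence frames, the threshold multiplier `ratio`, the white Gaussian noise model, and all function names are assumptions made for illustration only.

```python
import math
import random


def frame_signal(x, frame_len):
    """Split x into consecutive non-overlapping frames of frame_len samples."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, frame_len)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def energy_vad(x, frame_len=160, n_init_frames=5, ratio=3.0):
    """Flag frames whose STE exceeds ratio times the average STE of the
    first n_init_frames frames, which are assumed to contain only silence.
    The multiplier `ratio` is an illustrative choice, not a value from the paper."""
    energies = [short_time_energy(f) for f in frame_signal(x, frame_len)]
    init = energies[:n_init_frames]
    threshold = ratio * (sum(init) / len(init))
    return [e > threshold for e in energies]


def add_noise_at_snr(x, snr_db, rng):
    """Add white Gaussian noise so that the SNR over the whole signal is
    approximately snr_db (assumed noise model and SNR definition)."""
    signal_power = sum(s * s for s in x) / len(x)
    sigma = math.sqrt(signal_power / (10 ** (snr_db / 10)))
    return [s + rng.gauss(0.0, sigma) for s in x]


# Synthetic demo: low-level background noise, a 200 Hz tone, background noise again.
rng = random.Random(0)
fs = 8000
background = [rng.gauss(0.0, 0.01) for _ in range(2400)]
tone = [0.5 * math.sin(2 * math.pi * 200 * n / fs) if 800 <= n < 1600 else 0.0
        for n in range(2400)]
signal = [b + t for b, t in zip(background, tone)]
decisions = energy_vad(signal)
print(decisions)
```

On this synthetic input, only the five frames covering the tone exceed the threshold. The zero-crossing rate is computed separately here; how it is combined with the energy decision depends on the specific algorithm and is not shown.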

Keywords

Voice activity detection, Speech analysis, Speech/silence classification, Endpoint detection, Noise measurement

References

R. G. Bachu, S. Kopparthi, B. Adapa and B. D. Barkana (2010), Voiced/Unvoiced Decision for Speech Signals Based on Zero-Crossing Rate and Energy, Advanced Techniques in Computing Sciences and Software Engineering, pp. 279-282, DOI: 10.1007/978-90-481-3660-5_47
K. Sakhnov, E. Verteletskaya and B. Simak (2009), Dynamical Energy-Based Speech/Silence Detector for Speech Enhancement Applications, Proceedings of the World Congress on Engineering 2009, Vol. I, WCE 2009, July 1-3, London, U.K., ISBN: 978-988-17012-5-1
L. R. Rabiner and M. R. Sambur (1975), An algorithm for determining the endpoints of isolated utterances, The Bell System Technical Journal, Vol. 54, Issue 2, Feb. 1975, pp. 297-315, ISSN: 0005-8580, DOI: 10.1002/j.1538-7305.1975.tb02840.x
V. Prasad (2002), Comparison of voice activity detection algorithms for VoIP, Proceedings - International Symposium on Computers and Communications, pp. 62-65, DOI: 10.1109/ISCC.2002.1021726
P. Pollak, P. Sovka and J. Uhlir (1993), Noise Suppression System for a Car, Proc. of the Third European Conference on Speech, Communication and Technology - EUROSPEECH’93, Berlin, Germany, Vol. 5, pp. 1073-1076, Sept.
A. M. Kondoz (1999), Digital Speech, New York: John Wiley and Sons
L. R. Rabiner and R. W. Schafer (2007), Introduction to Digital Speech Processing, Foundations and Trends in Signal Processing, Boston: Now Publishers Inc.
P. Renevey and A. Drygajlo (2001), Entropy based voice activity detection in very noisy conditions, Proc. Eurospeech 2001, pp. 1887-1890

Turkish title: Enerji Tabanlı Konuşma Aktivitesi Belirleme Algoritmalarının Gürültülü Konuşma Sinyalleri için İncelenmesi

Details

Primary Language	English
Subjects	Engineering
Section	Articles
Authors	Selma Özaydın (ORCID: 0000-0002-4613-9441)
Publication Date	October 31, 2019
Published in Issue	Year 2019, Special Issue 2019

Cite

APA	Özaydın, S. (2019). Examination of Energy Based Voice Activity Detection Algorithms for Noisy Speech Signals. Avrupa Bilim ve Teknoloji Dergisi, 157-163. https://doi.org/10.31590/ejosat.637741

Cited By

MATHEMATICAL MODEL OF THE SYSTEM OF ACTIVE PROTECTION AGAINST EAVESDROPPING OF SPEECH INFORMATION ON THE SCRAMBLER GENERATOR

EUREKA: Physics and Engineering

Volodymyr Blintsov

https://doi.org/10.21303/2461-4262.2020.001241

Development of a mathematical model of scrambler-type speech-like interference generator for system of prevent speech information from leaking via acoustic and vibration channels

Technology audit and production reserves

Volodymyr Blintsov

https://doi.org/10.15587/2312-8372.2019.185133

Active Speaker Detection Using Audio, Visual, and Depth Modalities: A Survey

IEEE Access

https://doi.org/10.1109/ACCESS.2024.3426670

Enhancing video salient object detection via SAM-based multimodal energy prompting

Pattern Analysis and Applications

https://doi.org/10.1007/s10044-025-01531-9