Year 2020,
Volume: 22 Issue: 64, 47 - 58, 24.01.2020
Gürhan Bulu
Ahmet Semih Bingöl
Extending the Spectral Envelope of Narrowband Speech Using Neural Networks and Improving Its Quality
In conventional telephone systems, human speech, which has components up to
7-8 kHz, is passed through a low-pass filter with a bandwidth of 3.4 kHz and
sampled at 8 kHz before transmission. Even if coding and similar operations
introduce no loss at all, quality is still lost because the high-frequency
region is filtered out. This loss has little effect on intelligibility, but it
causes a perceptible degradation in speech quality. In this study, the spectral
envelope of the filtered-out high-frequency regions is estimated from the
low-frequency regions of the spectral envelope. An artificial neural network is
used for this operation, which can also be described as spectral envelope
extension. The lost high-frequency regions are then regenerated with the
source-filter model using the extended spectral envelope, with the aim of
improving speech quality. Although the method was developed for
telephone-quality speech, it can also be used for speech with a lower bandwidth
(1.8 kHz).
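
The telephone-band limitation described in the abstract can be illustrated with a minimal sketch. It shows only the channel (low-pass filtering at 3.4 kHz followed by sampling at 8 kHz), not the neural-network extension itself. The 16 kHz wideband rate, the 101-tap Hamming-windowed sinc filter and all function names are assumptions made for this illustration and are not taken from the paper.

```python
import math

FS_WIDE = 16000    # assumed wideband sampling rate in Hz (not stated in the abstract)
CUTOFF_HZ = 3400   # telephone-band cutoff given in the abstract
NUM_TAPS = 101     # assumed FIR length, chosen only for illustration


def lowpass_taps(cutoff_hz, fs, num_taps):
    """Hamming-windowed sinc low-pass FIR, normalised to unit gain at DC."""
    fc = cutoff_hz / fs                  # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def fir_filter(signal, taps):
    """Direct-form FIR convolution; the output has the same length as the input."""
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, t in enumerate(taps):
            if i - j >= 0:
                acc += t * signal[i - j]
        out.append(acc)
    return out


def to_narrowband(wideband):
    """Low-pass at 3.4 kHz, then keep every second sample (16 kHz -> 8 kHz)."""
    filtered = fir_filter(wideband, lowpass_taps(CUTOFF_HZ, FS_WIDE, NUM_TAPS))
    return filtered[::2]


def rms(x):
    """Root-mean-square level of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def tone(freq_hz, num_samples=1600):
    """Unit-amplitude sine at the assumed wideband rate."""
    return [math.sin(2 * math.pi * freq_hz * n / FS_WIDE) for n in range(num_samples)]


low = to_narrowband(tone(1000))    # component inside the telephone band
high = to_narrowband(tone(6000))   # component in the region that is lost

# Skip the first 100 output samples so the filter start-up transient is excluded.
print("1 kHz level after channel:", rms(low[100:]))
print("6 kHz level after channel:", rms(high[100:]))
```

In this sketch the 1 kHz tone passes almost unchanged while the 6 kHz tone is strongly attenuated. That removed region is the part of the spectrum whose envelope the study estimates from the remaining low-frequency envelope.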