Research Article

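Replay Spoofing Attacks on a Mobile Text-Dependent Single-Utterance Speaker Verification Application (original Turkish title: Mobil Metne Bağımlı Tek Cümle Konuşmacı Tanıma Uygulamasında Kayıttan Sahte Doğrulama)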

Year 2017, Volume: 8 Issue: 1, 77 - 88, 01.03.2017

Abstract

In recent years, the rapid growth in the use of mobile devices such as
smartphones has made implementing various technologies on these platforms
an important industry. This growth in the number of mobile applications has
also brought the security of these applications to the fore. Speaker
recognition technology, which automatically determines a speaker's identity
from their voice, can be used to close security gaps in mobile applications
that require protection of personal information.



In a text-dependent single-utterance speaker verification application,
speakers repeat a common passphrase during both enrollment and verification.
Repeating the same text in enrollment and verification improves recognition
performance and also makes the system easier to use. However,
single-utterance applications are extremely vulnerable, especially to replay
spoofing attacks, in which a recording of the genuine speaker is played back
to the system. This study tests the robustness of a text-dependent
single-utterance application against such attacks.



To test the robustness of a single-utterance application to be developed for
mobile devices against replay spoofing attacks, a new speaker recognition
database was created. In this database, 124 speakers (62 female + 62 male)
repeated the designated passphrase in 2 separate sessions. The recordings
were made using 2 different smartphones. Replay spoofing attacks were
simulated with this database.



The Gaussian mixture model (GMM) is among the most commonly used methods in
text-independent applications. Hidden Markov model (HMM) based methods, in
contrast, are preferred in text-dependent applications because they make
better use of articulation information. More recently, the i-vector/PLDA
method was proposed to address the channel mismatch problem and has produced
highly successful results, particularly in text-independent applications.
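To make the GMM approach concrete, the following is a minimal sketch of how a verification score is commonly computed with Gaussian mixture models: the average per-frame log-likelihood ratio between a speaker model and a background model. The abstract does not describe the system's configuration, so the model layout, the function names, and the toy parameters below are all assumptions for illustration and do not reproduce the author's implementation.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log p(x | gmm), where gmm is a list of (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)
    # Log-sum-exp keeps the mixture sum numerically stable.
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def average_llr(frames, speaker_gmm, background_gmm):
    """Mean per-frame log-likelihood ratio of speaker vs. background model."""
    total = sum(
        gmm_log_likelihood(x, speaker_gmm) - gmm_log_likelihood(x, background_gmm)
        for x in frames
    )
    return total / len(frames)


# Toy 2-dimensional models with hypothetical parameters (not from the study).
background = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [3.0, 3.0], [1.0, 1.0])]
speaker = [(0.5, [0.5, 0.2], [1.0, 1.0]), (0.5, [3.4, 2.8], [1.0, 1.0])]

# Hypothetical feature frames close to each model's component means.
near_speaker = [[0.5, 0.2], [3.4, 2.8], [0.6, 0.1]]
near_background = [[0.0, 0.0], [3.0, 3.0], [-0.1, 0.1]]

score_target = average_llr(near_speaker, speaker, background)
score_other = average_llr(near_background, speaker, background)
print(score_target, score_other)
```

A claimed identity would be accepted when the score exceeds a decision threshold. A replayed recording of the genuine speaker tends to score close to a genuine trial under this kind of scoring, which is why replay attacks are hard to reject on the score alone.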



In this study, Gaussian mixture model (GMM), sentence-level hidden Markov
model (HMM), and i-vector/PLDA methods were tested against replay spoofing
attacks in a mobile text-dependent single-utterance application. The
experiments showed that all methods are significantly affected by spoofing
attacks. In our tests, equal error rates were in the 0.5-1% range for
ordinary impostor trials and rose to the 10-25% range with replay spoofing
trials.
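The equal error rate (EER) reported above is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below shows one simple way to estimate it from two lists of scores by sweeping the decision threshold. The function name and the score values are made up for illustration; they are not data or results from the study.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Return (eer, threshold) for two lists of verification scores.

    A trial is accepted when its score is >= threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # False rejection: genuine trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: impostor or replay trials scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of the two rates where they are closest.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Made-up scores for illustration only; they are not results from the study.
genuine = [2.1, 1.8, 2.5, 1.9, 2.3, 2.0, 1.7, 2.4]
impostor = [-1.2, -0.8, -1.5, -0.3, -1.0, -0.6, -1.1, -0.9]
replay = [1.6, 1.9, 2.2, 0.4, 1.75, 0.9, 2.0, 1.2]

eer_impostor, _ = equal_error_rate(genuine, impostor)
eer_replay, _ = equal_error_rate(genuine, replay)
print(f"EER, ordinary impostor trials: {eer_impostor:.2%}")
print(f"EER, replay trials: {eer_replay:.2%}")
```

In this toy data the replayed scores overlap heavily with the genuine scores, so the EER for replay trials is much higher than for ordinary impostor trials. This is the same qualitative effect the study reports, although the toy numbers carry no quantitative meaning.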

There are 28 citations in total.

Details

Primary Language Turkish
Journal Section Articles
Authors

Osman Büyük

Publication Date March 1, 2017
Submission Date June 7, 2016
Published in Issue Year 2017 Volume: 8 Issue: 1

Cite

IEEE O. Büyük, “Mobil Metne Bağımlı Tek Cümle Konuşmacı Tanıma Uygulamasında Kayıttan Sahte Doğrulama”, DUJE, vol. 8, no. 1, pp. 77–88, 2017.
All articles published by DUJE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit, and adapt the work, provided that the original work and its source are appropriately credited.