Mitigating Data Imbalance Problem in Transformer-Based Intent Detection

Osman Büyük; Mustafa Erden; Levent Arslan

doi:10.31590/ejosat.1044812

Research Article

Turkish title: Dönüştürücü Tabanlı Niyet Tespitinde Veri Dengesizliği Etkisinin Azaltılması

Year 2021, Issue 32, pp. 445-450, 31.12.2021

https://izlik.org/JA45TD77DD

Abstract

Two main problems arise when deploying an intent detection application for a new customer. First, the domain-specific data supplied by the customer is usually limited and contains an imbalanced number of examples per class. Second, even when customers want to build applications in similar domains, the intent categories they define usually differ. This makes it difficult to merge the data collected for different customers into a single, larger dataset. In this study, class weights are used in the loss function to reduce the data imbalance problem. The class weights are set inversely proportional to the number of examples in each class, so that classes with few examples in the training data receive more weight. In addition, a transfer learning method with two adaptation stages is tried in order to benefit from the information in datasets collected in similar domains. The experiments show that using the weighted loss function together with the two-stage transfer learning method improves intent detection classification performance significantly. The absolute increase in percent recognition accuracy over the transformer-based baseline system is 2%.
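The class-weighting idea in the abstract can be illustrated with a minimal sketch. The abstract only states that weights are inversely proportional to class size; the exact normalization used here, w_c = N / (K * n_c), the weighted-mean form of the loss, and the function names and toy intent labels are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, inversely proportional to its count.

    Assumed normalization: w_c = N / (K * n_c), where N is the number of
    training examples, K the number of classes and n_c the class count.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class), normalized by the sum of weights.

    probs is a list of dicts mapping class name to predicted probability.
    """
    total = 0.0
    norm = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        total += -w * math.log(p[y])
        norm += w
    return total / norm


# Hypothetical imbalanced training set with three intents.
train_labels = ["balance"] * 6 + ["transfer"] * 2 + ["card_lost"] * 1
weights = inverse_frequency_weights(train_labels)
# weights == {"balance": 0.5, "transfer": 1.5, "card_lost": 3.0}
```

With these weights, a misclassified "card_lost" utterance contributes six times as much to the loss as a misclassified "balance" utterance, which is the intended effect of giving rare classes more influence.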

Keywords

Intent Detection, Deep Learning, Transformers, Data Imbalance, Transfer Learning

Project Number

3189149

Mitigating Data Imbalance Problem in Transformer-Based Intent Detection

Abstract

Two major problems arise when deploying a practical intent detection system for a new customer. First, domain-specific data from the customer may be limited and imbalanced. Second, although different customers may share the same domain, their intent categories may differ. It can therefore be difficult to combine the datasets collected for different customers into a single, larger one. In this paper, we use class weights in the loss computation to alleviate the data imbalance problem. The class weights are defined to be inversely proportional to the frequency of each class in the training set, which gives more influence to less frequently observed classes. We also employ a two-pass fine-tuning procedure to utilize the information in different in-domain datasets. Experimental results show that intent detection performance improves significantly when the weighted loss function is used together with the two-pass transfer learning procedure. The absolute improvement in percent detection accuracy is approximately 2% over a transformer-based baseline.
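One possible reading of the two-pass fine-tuning procedure is sketched below on a toy model. The abstract does not describe the mechanics, so several things here are assumptions: that the first pass adapts a shared encoder on a related in-domain dataset with its own label set, that the classification head is discarded and re-initialized before the second pass on the target customer's data, and that a tiny tanh layer can stand in for the pretrained transformer. All function names and the toy data are hypothetical.

```python
import math
import random


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def new_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def forward(encoder, head, x):
    # Shared encoder: one tanh layer. Head: linear layer followed by softmax.
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in encoder]
    p = softmax([sum(w * hi for w, hi in zip(row, h)) for row in head])
    return h, p


def fine_tune(encoder, head, data, epochs=200, lr=0.2):
    """Plain SGD on cross-entropy; updates encoder and head in place."""
    for _ in range(epochs):
        for x, y in data:
            h, p = forward(encoder, head, x)
            dz = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
            # Gradient w.r.t. hidden activations, taken before the head changes.
            dh = [sum(dz[k] * head[k][j] for k in range(len(head)))
                  for j in range(len(h))]
            for k, row in enumerate(head):
                for j in range(len(row)):
                    row[j] -= lr * dz[k] * h[j]
            for j, row in enumerate(encoder):
                g = dh[j] * (1.0 - h[j] ** 2)  # derivative of tanh
                for i in range(len(row)):
                    row[i] -= lr * g * x[i]


def two_pass_fine_tune(encoder, related, n_related, target, n_target, rng):
    hidden = len(encoder)
    # Pass 1: adapt the encoder on a related dataset with its own intents.
    fine_tune(encoder, new_matrix(n_related, hidden, rng), related)
    # Pass 2: drop the pass-1 head and train a fresh one on the target intents.
    target_head = new_matrix(n_target, hidden, rng)
    fine_tune(encoder, target_head, target)
    return target_head


rng = random.Random(0)
encoder = new_matrix(4, 2, rng)  # stand-in for a pretrained transformer
related = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 1.0], 2)]  # 3 intents
target = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]                    # 2 intents
head = two_pass_fine_tune(encoder, related, 3, target, 2, rng)
predictions = [max(range(2), key=lambda k: forward(encoder, head, x)[1][k])
               for x, _ in target]
```

Replacing the head between passes is what lets the two datasets keep different intent categories while still sharing the encoder, which matches the motivation given in the abstract for not merging the datasets directly.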

Keywords

Intent Detection, Deep Learning, Transformers, Data Imbalance, Transfer Learning

Supporting Institution

The Scientific and Technological Research Council of Turkey (TÜBİTAK)

Acknowledgments

This work was supported by The Scientific and Technological Research Council of Turkey (TÜBİTAK) under project number 3189149.

Details

Primary Language	English
Subjects	Engineering
Section	Research Article
Authors	Osman Büyük (ORCID 0000-0003-1039-3234); Mustafa Erden (ORCID 0000-0002-2661-1200); Levent Arslan (ORCID 0000-0002-6086-8018)
Project Number	3189149
Publication Date	December 31, 2021
DOI	https://doi.org/10.31590/ejosat.1044812
IZ	https://izlik.org/JA45TD77DD
Published in Issue	Year 2021, Issue 32

Cite

APA	Büyük, O., Erden, M., & Arslan, L. (2021). Mitigating Data Imbalance Problem in Transformer-Based Intent Detection. Avrupa Bilim ve Teknoloji Dergisi, 32, 445-450. https://doi.org/10.31590/ejosat.1044812