Detection of Cyberbullying in Turkish Social Media Comments with Deep Learning
Year 2021, Issue: 31, 77-84, 31.12.2021
Gözde Nergiz, Erdinç Avaroğlu
Abstract
Cyberbullying has become a major problem with the development of internet technology and the ease of access to social networks. Cyberbullying, carried out by an individual or a group, is the use of information and communication technologies to harass others. Cyberbullying cases that ended in suicide have made cyberbullying detection important. In this study, cyberbullying detection was performed on Turkish comments collected from Twitter, Instagram, and YouTube, social networks that are widely used today. Classification models were built using deep-learning-based word embedding models, and their success rates were compared. Social media comments were then classified using an LSTM neural network together with the Fasttext model, which achieved the highest success rate.
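The abstract names Fasttext as the best-performing embedding but gives no implementation detail. As a rough illustration only, the sketch below shows the character n-gram idea that fastText-style embeddings are built on: a word is wrapped in boundary markers and split into overlapping substrings, so inflected forms of the same stem share many subword units. The function names, the n-gram range of 3 to 6, and the example words are assumptions made for illustration and are not taken from the study.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return fastText-style character n-grams of `word`.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    produce n-grams distinct from word-internal substrings.
    n_min and n_max are assumed illustrative values.
    """
    marked = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        # Slide a window of width n across the marked word.
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


def shared_ngrams(word_a, word_b, n_min=3, n_max=6):
    """Return the set of n-grams that two words have in common."""
    grams_a = set(char_ngrams(word_a, n_min, n_max))
    grams_b = set(char_ngrams(word_b, n_min, n_max))
    return grams_a & grams_b


if __name__ == "__main__":
    # Hypothetical example: a Turkish noun and its plural form.
    print(char_ngrams("kitap", 3, 3))
    print(sorted(shared_ngrams("kitap", "kitaplar", 3, 3)))
```

With 3-grams only, `kitap` and `kitaplar` share the subwords `<ki`, `kit`, `ita`, and `tap`. Overlap of this kind is one reason subword-based embeddings are often considered a reasonable fit for agglutinative languages such as Turkish, although the study's own configuration may differ.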
Supporting Institution
Mersin Üniversitesi
Project Number
2019-1-TP2-3339
Thanks
This study was supported by the Scientific Research Projects Unit of Mersin Üniversitesi.