Research Article

Detecting Violence Towards Women from Turkish Social Media Posts with Natural Language Processing and Machine Learning

Year 2025, Volume: 8 Issue: 2, 115-129, 29.09.2025
https://doi.org/10.38016/jista.1580712

Abstract

Physical and emotional violence towards women is increasing in Turkey, while the development of mechanisms to prevent it has not kept pace. One of the first steps that advances in artificial intelligence make possible is to detect social media posts that support such violence and then to ban them. This article presents a natural language processing (NLP) study conducted with this goal on Turkish social media. A popular social media platform that has been widely used in Turkey for many years is selected, more than five subject titles are chosen, and the posts under them are scraped and labeled to construct a novel Turkish data collection. After the data are analyzed with various techniques, popular NLP feature extraction techniques such as bag of words are combined with several machine learning models such as Random Forests and Gradient Boosting to detect posts that support violence towards women. According to the findings, Turkish posts containing violence towards women are fewer than those that defend women, and the existing violent posts contain psychological violence and humiliation towards women. Precision, recall, F1, and AUC (Area Under the Curve) metrics are used for model evaluation. According to the results, an AUC of 76% and a recall of 77% can be obtained in detecting violence towards women from social media posts. These findings demonstrate the feasibility of applying sensitive real-life measures on social media in Turkey, such as detecting and blocking emotional violence towards women.
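
The evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. The labels and scores below are invented toy values, not data or results from the study, and the function names `precision_recall_f1` and `roc_auc` are hypothetical helpers written for this illustration. Label 1 is assumed to mean "supports violence towards women" and label 0 everything else; the 0.5 decision threshold is also an assumption.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive post receives a higher
    score than a random negative post; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): true labels and classifier scores for 8 posts.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.6, 0.7, 0.2]
# Assumed decision threshold of 0.5 to turn scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} auc={auc:.3f}")
```

Recall is the share of truly violent posts that the classifier catches, which is why it matters for a moderation setting, while AUC summarizes ranking quality independently of the chosen threshold.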

Supporting Institution

TÜBİTAK 2209A

Project Number

1919B012303496

Acknowledgments

This study was supported by a TÜBİTAK 2209A grant, project number 1919B012303496.



Details

Primary Language: Turkish
Subjects: In-Context Learning, Deep Learning, Data Mining and Knowledge Discovery, Natural Language Processing
Section: Research Article
Authors

Merve Kavut 0009-0003-4452-3938

Amina Dzafic 0009-0008-9328-3949

Ulya Bayram 0000-0002-8150-4053

Project Number: 1919B012303496
Publication Date: 29 September 2025
Submission Date: 6 November 2024
Acceptance Date: 18 May 2025
Published in Issue: Year 2025 Volume: 8 Issue: 2

Cite This Article

APA Kavut, M., Dzafic, A., & Bayram, U. (2025). Türkçe Sosyal Medya Paylaşımlarında Doğal Dil İşleme ve Makine Öğrenmesi ile Kadına Yönelik Şiddet Tespiti. Journal of Intelligent Systems: Theory and Applications, 8(2), 115-129. https://doi.org/10.38016/jista.1580712
AMA Kavut M, Dzafic A, Bayram U. Türkçe Sosyal Medya Paylaşımlarında Doğal Dil İşleme ve Makine Öğrenmesi ile Kadına Yönelik Şiddet Tespiti. jista. September 2025;8(2):115-129. doi:10.38016/jista.1580712
Chicago Kavut, Merve, Amina Dzafic, and Ulya Bayram. “Türkçe Sosyal Medya Paylaşımlarında Doğal Dil İşleme ve Makine Öğrenmesi ile Kadına Yönelik Şiddet Tespiti”. Journal of Intelligent Systems: Theory and Applications 8, no. 2 (September 2025): 115-29. https://doi.org/10.38016/jista.1580712.
EndNote Kavut M, Dzafic A, Bayram U (01 September 2025) Türkçe Sosyal Medya Paylaşımlarında Doğal Dil İşleme ve Makine Öğrenmesi ile Kadına Yönelik Şiddet Tespiti. Journal of Intelligent Systems: Theory and Applications 8 2 115–129.
IEEE M. Kavut, A. Dzafic, and U. Bayram, “Türkçe Sosyal Medya Paylaşımlarında Doğal Dil İşleme ve Makine Öğrenmesi ile Kadına Yönelik Şiddet Tespiti”, jista, vol. 8, no. 2, pp. 115–129, 2025, doi: 10.38016/jista.1580712.
ISNAD Kavut, Merve et al. “Türkçe Sosyal Medya Paylaşımlarında Doğal Dil İşleme ve Makine Öğrenmesi ile Kadına Yönelik Şiddet Tespiti”. Journal of Intelligent Systems: Theory and Applications 8/2 (September 2025), 115-129. https://doi.org/10.38016/jista.1580712.
JAMA Kavut M, Dzafic A, Bayram U. Türkçe Sosyal Medya Paylaşımlarında Doğal Dil İşleme ve Makine Öğrenmesi ile Kadına Yönelik Şiddet Tespiti. jista. 2025;8:115–129.
MLA Kavut, Merve et al. “Türkçe Sosyal Medya Paylaşımlarında Doğal Dil İşleme ve Makine Öğrenmesi ile Kadına Yönelik Şiddet Tespiti”. Journal of Intelligent Systems: Theory and Applications, vol. 8, no. 2, 2025, pp. 115-29, doi:10.38016/jista.1580712.
Vancouver Kavut M, Dzafic A, Bayram U. Türkçe Sosyal Medya Paylaşımlarında Doğal Dil İşleme ve Makine Öğrenmesi ile Kadına Yönelik Şiddet Tespiti. jista. 2025;8(2):115-29.

Zeki Sistemler Teori ve Uygulamaları Dergisi