Research Article

Creating a New Dataset for the Classification of Cyber Bullying

Volume: 3 Number: 2 October 29, 2023

Abstract

People of all ages have quickly entered the online world through today's communication technologies such as phones, tablets, computers and smart devices. As the Internet takes up a larger place in people's lives, social media platforms are diversifying and attracting more users. This growth in the number of users brings problems with it, and one of the most important is cyberbullying. Although cyberbullying can look like ordinary daily dialogue between users or groups, it is encountered more often every day as the information, content and topics shared on social media become more varied. It is therefore necessary to develop platforms that detect bullying using artificial intelligence technologies. One of the biggest difficulties in the text classification problems that arise when developing such platforms is that the algorithm must be trained with labeled data. In this study, to create the dataset needed to train models for detecting cyberbullying, 21 people who actively use social media were selected, including journalists, athletes, scientists, doctors, politicians, comedians, social media personalities and artists. The public Twitter messages (mentions) of these 21 people were compiled. Of the 10500 tweets compiled, 7706 remained in the dataset after repetitive and meaningless messages sent by bot accounts were filtered out. The labeling needed to use the dataset for training and testing classifiers was carried out by three independent annotators who were given preliminary information about cyberbullying (1 = includes cyberbullying, 0 = does not include cyberbullying).
For each message, the label assigned by the majority of the three annotators was accepted as its final class. The dataset was then preprocessed in line with natural language processing principles and made suitable for classification algorithms. The findings obtained from classification with basic classification algorithms are reported. These findings indicate that the dataset is suitable for use in detecting and preventing cyberbullying. Accordingly, it is expected that training specially developed and optimized artificial intelligence algorithms on this dataset will substantially increase the success rate of cyberbullying detection.
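
The filtering and majority-vote labeling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `filter_messages` and `majority_label` are hypothetical, and the filter only drops empty messages and exact repeats (ignoring case and spacing), which is an assumed stand-in for the bot-message filtering whose details the abstract does not give.

```python
from collections import Counter


def filter_messages(messages):
    """Keep the first copy of each message; drop empty ones and repeats.

    Assumption: two messages count as repeats when they match after
    lowercasing and collapsing whitespace. The study's actual filtering
    rules are not described in the abstract.
    """
    seen = set()
    kept = []
    for text in messages:
        key = " ".join(text.lower().split())
        if not key or key in seen:
            continue
        seen.add(key)
        kept.append(text)
    return kept


def majority_label(votes):
    """Return the label most annotators chose (1 = cyberbullying, 0 = not)."""
    if len(votes) % 2 == 0:
        # With binary labels, an odd number of votes rules out ties.
        raise ValueError("an odd number of votes is needed to avoid ties")
    return Counter(votes).most_common(1)[0][0]


# Made-up example data, not taken from the dataset.
sample = ["Nobody likes you", "nobody  likes YOU", "", "Great match today"]
cleaned = filter_messages(sample)    # ["Nobody likes you", "Great match today"]
final = majority_label([1, 0, 1])    # 1
```

With three annotators and two possible labels, at least two votes always agree, so the majority label is always defined.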

Details

Primary Language

English

Subjects

Artificial Intelligence

Journal Section

Research Article

Early Pub Date

October 23, 2023

Publication Date

October 29, 2023

Submission Date

November 17, 2022

Acceptance Date

April 22, 2023

Published in Issue

Year 2023 Volume: 3 Number: 2

APA
Koçak, Ç., Yiğit, T., & Bilen, M. (2023). Creating a New Dataset for the Classification of Cyber Bullying. Advances in Artificial Intelligence Research, 3(2), 45-53. https://doi.org/10.54569/aair.1206144

Advances in Artificial Intelligence Research is an open access journal which means that the content is freely available without charge to the user or his/her institution. All papers are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, which allows users to distribute, remix, adapt, and build upon the material in any medium or format for non-commercial purposes only, and only so long as attribution is given to the creator.
