Application of Grid Search Parameter Optimized Bayesian Logistic Regression Algorithm to Detect Cyberbullying in Turkish Microblog Data
Abstract
Users of social media platforms interact with one another on a massive scale. This communication produces an enormous amount of user data that is worth analyzing from many perspectives. One research area emerging from this data concerns a major security issue known as cyberbullying. Since cyberbullying has been recognized as a source of cybercrime, there is a clear need for a system that detects cyberbullying attacks and their sources from micro-blog texts. Most academic research on this topic has been conducted on English-language data. The originality of this paper is that we develop an accurate cyberbullying detection system for the Turkish language. We used data from Twitter to build a supervised machine learning model based on Bayesian Logistic Regression, whose parameters are tuned with a grid-search algorithm. Since text data produces a high-dimensional training space for machine learning algorithms, we also used the Chi-Squared (CH2) feature selection strategy to obtain the best subset of features. The optimized version of the proposed algorithm, trained on the reduced feature set, produced an F-measure of 0.925. Finally, we compared the results of the proposed algorithm with those of machine learning methods frequently used in the literature; the corresponding results are provided in the related sections.
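The pipeline summarized in the abstract (chi-squared feature selection, a Bayesian logistic regression classifier, and grid search scored by F-measure) can be illustrated with a minimal, standard-library Python sketch. Everything in it is an assumption made for illustration: the placeholder tokens, the tiny training and validation sets, the zero-mean Gaussian prior with a MAP point estimate fitted by gradient descent, and the grid values. It is not the authors' implementation and does not reproduce the reported results.

```python
import math
from itertools import product

def chi2_scores(docs, labels):
    """Chi-squared statistic of each term's presence against a binary label."""
    n, pos = len(docs), sum(labels)
    scores = {}
    for term in sorted({t for doc in docs for t in doc}):
        a = sum(1 for doc, y in zip(docs, labels) if term in doc and y == 1)
        b = sum(1 for doc, y in zip(docs, labels) if term in doc and y == 0)
        c, d = pos - a, (n - pos) - b  # documents without the term, per class
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

def select_top_k(docs, labels, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi2_scores(docs, labels)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

def vectorize(doc, vocab):
    return [1.0 if t in doc else 0.0 for t in vocab]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_map_logreg(X, y, prior_var, steps=500, lr=0.5):
    """MAP weights under an assumed zero-mean Gaussian prior on the weights."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(steps):
        gw = [wj / prior_var for wj in w]  # gradient of the negative log-prior
        gb = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            gw = [g + err * xj for g, xj in zip(gw, xi)]
            gb += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b

def predict(doc, vocab, w, b):
    z = sum(wj * xj for wj, xj in zip(w, vectorize(doc, vocab))) + b
    return 1 if sigmoid(z) >= 0.5 else 0

def f_measure(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def grid_search(train, valid, ks, prior_vars):
    """Try every (k, prior variance) pair; keep the best validation F-measure."""
    (docs_tr, y_tr), (docs_va, y_va) = train, valid
    best = None
    for k, var in product(ks, prior_vars):
        vocab = select_top_k(docs_tr, y_tr, k)
        w, b = fit_map_logreg([vectorize(d, vocab) for d in docs_tr], y_tr, var)
        score = f_measure(y_va, [predict(d, vocab, w, b) for d in docs_va])
        if best is None or score > best[0]:
            best = (score, k, var)
    return best

# Hypothetical placeholder tokens; label 1 = bullying, 0 = not bullying.
train_docs = [{"t_abuse1", "t_you"}, {"t_abuse2", "t_you"}, {"t_abuse1", "t_abuse2"},
              {"t_hello", "t_you"}, {"t_thanks", "t_hello"}, {"t_thanks", "t_you"}]
train_y = [1, 1, 1, 0, 0, 0]
valid_docs = [{"t_abuse1", "t_hello"}, {"t_thanks"}, {"t_abuse2"}, {"t_hello", "t_you"}]
valid_y = [1, 0, 1, 0]

best = grid_search((train_docs, train_y), (valid_docs, valid_y),
                   ks=[2, 4], prior_vars=[0.1, 1.0, 10.0])
print("best F-measure, k, prior variance:", best)
```

In this sketch the grid covers both the number of selected features and the prior variance, so feature selection is repeated inside the search using only the training split. A real experiment would use a proper tokenizer for Turkish text and cross-validation rather than a single validation split.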
Details
Primary Language
English
Subjects
Engineering
Journal Section
Research Article
Authors
Akın Özçift
0000-0003-2840-1917
Türkiye
Deniz Kılınç
0000-0002-2336-8831
Türkiye
Fatma Bozyiğit
*
0000-0002-5898-7464
Türkiye
Publication Date
September 28, 2019
Submission Date
December 12, 2018
Acceptance Date
April 24, 2019
Published in Issue
Year 2019 Volume: 7 Number: 3
Cited By
Detection and cross-domain evaluation of cyberbullying in Facebook activity contents for Turkish
ACM Transactions on Asian and Low-Resource Language Information Processing
https://doi.org/10.1145/3580393
Yapım İşlerinde İhale Parametreleri Kullanılarak Makine Öğrenmesi ile Sözleşme Bedeli Tahmini
Karaelmas Science and Engineering Journal
https://doi.org/10.7212/karaelmasfen.1484595