Cyberbullying is defined as an aggressive, intentional action against a defenseless person by using the Internet, or other electronic contents. Researchers have found that many of the bullying cases have tragically ended in suicides; hence automatic detection of cyberbullying has become important. In this study we show the effects of feature extraction, feature selection, and classification methods that are used, on the performance of automatic detection of cyberbullying. To perform the experiments FormSpring.me dataset is used and the effects of preprocessing methods; several classifiers like C4.5, Naïve Bayes, kNN, and SVM; and information gain and chi square feature selection methods are investigated. Experimental results indicate that the best classification results are obtained when alphabetic tokenization, no stemming, and no stopwords removal are applied. Using feature selection also improves cyberbully detection performance. When classifiers are compared, C4.5 performs the best for the used dataset.
Cyberbullying Preprocessing; Feature selection; Classification
Bölüm | Makaleler |
---|---|
Yazarlar | |
Yayımlanma Tarihi | 15 Nisan 2017 |
Yayımlandığı Sayı | Yıl 2017 Cilt: 21 Sayı: 1 |
e-ISSN :1308-6529
Linking ISSN (ISSN-L): 1300-7688
Dergide yayımlanan tüm makalelere ücretiz olarak erişilebilinir ve Creative Commons CC BY-NC Atıf-GayriTicari lisansı ile açık erişime sunulur. Tüm yazarlar ve diğer dergi kullanıcıları bu durumu kabul etmiş sayılırlar. CC BY-NC lisansı hakkında detaylı bilgiye erişmek için tıklayınız.