Research Article

Distinguishing Fake and Genuine News on Social Media with an AdaBoost Ensemble Learning Model of the Naive Bayes Algorithm

Year 2021, Issue: 28, 459 - 462, 30.11.2021
https://doi.org/10.31590/ejosat.1005577

Abstract

Social media use is rising steadily, and users interact with one another on a large scale. In this context, the circulation or spread of fake news is becoming a real threat to social media users in several respects. Fake news is defined as the presentation of misleading information as if it were true news. On this view, fake news is fabricated news that aims to manipulate public opinion in order to gain a benefit. For example, increasing readership to profit through clickbait is one such aim. Social media users are manipulated through attention-grabbing headlines or web links in order to increase visitor numbers. An automated fake news identification model can therefore be used by social media users to filter web traffic they would otherwise reach inadvertently. To this end, machine learning algorithms are used in the literature as a solution to the fake news problem. In the machine learning literature, improving the performance of base models is critically important. Ensemble learning is one of the main ways to increase model performance. In this study, a set of baseline machine learning algorithms was first built and tested for their ability to identify fake news. An ensemble learning strategy was then used to improve the results further. In other words, the Naïve Bayes Multinomial classifier was obtained as the best fake news predictor, with 96.74% accuracy. This predictive ability was then improved further by applying an AdaBoost ensemble learning strategy, raising performance to 98.2%.

AdaBoost Ensemble Learning on top of Naive Bayes Algorithm to Discriminate Fake and Genuine News from Social Media

Abstract

Social media usage is increasing continuously, and users interact with one another on a huge scale. In this context, the circulation or flood of fake news has become a real threat to social media users from various perspectives. Fake news is defined as the presentation of misleading information as true news. In this view, fake news is fabricated news that aims to manipulate public opinion to obtain a benefit. For example, increasing readership to profit through clickbait is one such aim. Social media users are manipulated through attention-grabbing headlines or web links to increase the number of visitors. Therefore, an automated fake news identification model can be used by social media users to filter inadvertent web traffic. To this end, machine learning algorithms are used in the literature as a solution to the fake news problem. In the machine learning literature, improving the performance of base models is crucial. Ensemble learning is one of the key solutions for enhancing model performance. In this work, we first built a set of baseline machine learning algorithms and tested their ability to identify fake news. We then used an ensemble learning strategy to further improve the results. More precisely, the Naïve Bayes Multinomial classifier was the best baseline fake news predictor, with 96.74% accuracy. We then further improved this prediction ability to 98.2% by applying an AdaBoost ensemble learning strategy.
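The abstract does not state which toolkit, text features, or AdaBoost variant were used, so the following is only a minimal sketch of the general idea: a multinomial Naive Bayes base learner trained on weighted documents, wrapped in a discrete AdaBoost loop. The function names (`train_mnb`, `predict_mnb`, `train_adaboost`, `predict_adaboost`), the add-one smoothing, the two-class stopping rule, and the four-document toy corpus are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def train_mnb(docs, labels, weights, smoothing=1.0):
    """Fit multinomial Naive Bayes on tokenized docs with per-document weights."""
    vocab = {tok for doc in docs for tok in doc}
    class_mass = Counter()  # weighted document count per class
    token_mass = {y: Counter() for y in sorted(set(labels))}  # weighted token counts
    for doc, y, w in zip(docs, labels, weights):
        class_mass[y] += w
        for tok in doc:
            token_mass[y][tok] += w
    total = sum(class_mass.values())
    model = {}
    for y, counts in token_mass.items():
        denom = sum(counts.values()) + smoothing * len(vocab)
        log_like = {t: math.log((counts[t] + smoothing) / denom) for t in vocab}
        model[y] = (math.log(class_mass[y] / total), log_like)
    return model


def predict_mnb(model, doc):
    """Return the class with the highest log-posterior; unseen tokens are ignored."""
    def score(y):
        log_prior, log_like = model[y]
        return log_prior + sum(log_like[t] for t in doc if t in log_like)
    return max(model, key=score)


def train_adaboost(docs, labels, rounds=10):
    """Discrete AdaBoost for two classes with Naive Bayes as the base learner."""
    n = len(docs)
    weights = [1.0 / n] * n  # start from uniform document weights
    ensemble = []
    for _ in range(rounds):
        # Scale weights to sum to n so smoothing behaves as with raw counts.
        model = train_mnb(docs, labels, [w * n for w in weights])
        wrong = [predict_mnb(model, d) != y for d, y in zip(docs, labels)]
        err = sum(w for w, bad in zip(weights, wrong) if bad)
        if err == 0 or err >= 0.5:
            # Perfect fit or no better than chance: stop boosting here.
            if err == 0 or not ensemble:
                ensemble.append((1.0, model))
            break
        alpha = 0.5 * math.log((1 - err) / err)  # vote strength of this round
        ensemble.append((alpha, model))
        # Increase the weight of misclassified documents, decrease the rest.
        weights = [w * math.exp(alpha if bad else -alpha)
                   for w, bad in zip(weights, wrong)]
        norm = sum(weights)
        weights = [w / norm for w in weights]
    return ensemble


def predict_adaboost(ensemble, doc):
    """Weighted vote over all boosted Naive Bayes models."""
    votes = Counter()
    for alpha, model in ensemble:
        votes[predict_mnb(model, doc)] += alpha
    return max(votes, key=votes.get)


# Hypothetical toy corpus, already tokenized; not the dataset used in the paper.
docs = [
    ["shocking", "miracle", "cure"],
    ["click", "shocking", "secret"],
    ["minister", "announces", "budget"],
    ["court", "announces", "verdict"],
]
labels = ["fake", "fake", "real", "real"]

ensemble = train_adaboost(docs, labels)
print(predict_adaboost(ensemble, ["shocking", "secret", "budget"]))
```

On this tiny, cleanly separable corpus the base classifier already labels every training document correctly, so the loop stops after the first round. The reweighting step only comes into play on noisier data, where documents the current model gets wrong receive more weight in the next round.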

There are 16 citations in total.

Details

Primary Language: English
Subjects: Engineering
Journal Section: Articles
Authors: Mehmet Bozuyla (ORCID 0000-0002-7485-6106)
Publication Date: November 30, 2021
Published in Issue: Year 2021, Issue 28

Cite

APA: Bozuyla, M. (2021). AdaBoost Ensemble Learning on top of Naive Bayes Algorithm to Discriminate Fake and Genuine News from Social Media. Avrupa Bilim ve Teknoloji Dergisi, (28), 459–462. https://doi.org/10.31590/ejosat.1005577