AdaBoost Ensemble Learning on top of Naive Bayes Algorithm to Discriminate Fake and Genuine News from Social Media

Mehmet Bozuyla

doi:10.31590/ejosat.1005577

Research Article

Naive Bayes Algoritmasının AdaBoost Topluluk Öğrenme Modeli ile Sosyal Medyada Sahte ve Gerçek Haberlerinin Ayırt Edilmesi

Year 2021, Issue: 28, 459 - 462, 30.11.2021

Cited By: 3

Abstract

Social media usage is increasing continuously, and users interact with one another on a huge scale. In this context, the circulation, or flood, of fake news has become a real threat to social media users from various perspectives. Fake news is defined as the presentation of misleading information as true news. In this view, fake news is fabricated news that aims to manipulate public opinion to obtain a benefit. One such aim is to increase readership in order to profit from clickbait. Social media users are manipulated through attention-grabbing headlines or web links that increase the number of visitors. An automated fake news identification model can therefore be used by social media users to filter out inadvertent web traffic. To this end, machine learning algorithms are used in the literature as a solution to the fake news problem. In the machine learning literature, improving the performance of base models is crucial, and ensemble learning is one of the key ways to enhance model performance. In this work, we first built a set of baseline machine learning algorithms and tested their ability to identify fake news. We then used an ensemble learning strategy to further improve the results. More precisely, the Naïve Bayes Multinomial classifier was the best baseline fake news predictor, with 96.74% accuracy. Applying an AdaBoost ensemble learning strategy on top of it raised the accuracy to 98.2%.
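The idea of boosting a multinomial Naïve Bayes base learner with AdaBoost can be illustrated with a minimal, standard-library Python sketch. This is not the implementation used in the article, whose tooling and dataset are not described here. The function names (`train_nb`, `predict_nb`, `adaboost_nb`, `predict`), the Laplace smoothing value, and the six tokenised toy headlines are all assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, weights, smoothing=1.0):
    """Fit multinomial Naive Bayes, counting each document with its weight."""
    vocab = {tok for doc in docs for tok in doc}
    class_mass = defaultdict(float)     # weighted number of documents per class
    token_mass = defaultdict(Counter)   # weighted token counts per class
    for doc, label, w in zip(docs, labels, weights):
        class_mass[label] += w
        for tok in doc:
            token_mass[label][tok] += w
    total = sum(class_mass.values())
    model = {}
    for label, mass in class_mass.items():
        denom = sum(token_mass[label].values()) + smoothing * len(vocab)
        log_like = {t: math.log((token_mass[label][t] + smoothing) / denom)
                    for t in vocab}
        model[label] = (math.log(mass / total), log_like)
    return model


def predict_nb(model, doc):
    """Return the class with the highest log posterior; unseen tokens are skipped."""
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(log_like.get(tok, 0.0) for tok in doc)
    return max(model, key=score)


def adaboost_nb(docs, labels, rounds=10):
    """Discrete two-class AdaBoost with weighted Naive Bayes as the base learner."""
    n = len(docs)
    weights = [1.0] * n                 # kept normalised so that they sum to n
    ensemble = []                       # (vote weight, model) pairs
    for _ in range(rounds):
        model = train_nb(docs, labels, weights)
        wrong = [predict_nb(model, d) != y for d, y in zip(docs, labels)]
        err = sum(w for w, bad in zip(weights, wrong) if bad) / n
        if err >= 0.5:                  # no better than chance: stop boosting
            if not ensemble:
                ensemble.append((1.0, model))
            break
        err = max(err, 1e-10)           # guard against division by zero
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, model))
        if not any(wrong):              # perfect fit: nothing left to reweight
            break
        # Raise the weight of misclassified documents, lower the rest.
        weights = [w * math.exp(alpha if bad else -alpha)
                   for w, bad in zip(weights, wrong)]
        scale = n / sum(weights)
        weights = [w * scale for w in weights]
    return ensemble


def predict(ensemble, doc):
    """Weighted vote of all base models in the ensemble."""
    votes = defaultdict(float)
    for alpha, model in ensemble:
        votes[predict_nb(model, doc)] += alpha
    return max(votes, key=votes.get)


# Hypothetical toy data: tokenised headlines invented for this sketch.
docs = [
    ["shocking", "secret", "cure", "click"],
    ["you", "wont", "believe", "secret"],
    ["click", "here", "shocking", "trick"],
    ["council", "approves", "budget"],
    ["central", "bank", "holds", "rates"],
    ["council", "budget", "report", "published"],
]
labels = ["fake", "fake", "fake", "real", "real", "real"]

ensemble = adaboost_nb(docs, labels)
train_accuracy = sum(predict(ensemble, d) == y for d, y in zip(docs, labels)) / len(docs)
print(train_accuracy, predict(ensemble, ["shocking", "click", "budget"]))
```

On this tiny, cleanly separable toy set the first Naïve Bayes model already fits the training data, so the loop stops after one round. On realistic data the reweighting step would run for several rounds, each new model concentrating on the documents the previous ones misclassified.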

Keywords

Fake News Identification, Ensemble Learning, Machine Learning, Text Mining, Social Media

Details

Primary Language	English
Subjects	Engineering
Journal Section	Articles
Authors	Mehmet Bozuyla (ORCID: 0000-0002-7485-6106)
Publication Date	November 30, 2021
Published in Issue	Year 2021 Issue: 28

Cite

APA	Bozuyla, M. (2021). AdaBoost Ensemble Learning on top of Naive Bayes Algorithm to Discriminate Fake and Genuine News from Social Media. Avrupa Bilim ve Teknoloji Dergisi, (28), 459-462. https://doi.org/10.31590/ejosat.1005577

Avrupa Bilim ve Teknoloji Dergisi

Cited By

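TÜKETİCİLERİN ONLİNE YEMEK SİPARİŞİ MEMNUNİYETİNİN VERİ MADENCİLİĞİ ALGORİTMALARIYLA SINIFLANDIRILMASI VE PERFORMANSLARININ KARŞILAŞTIRILMASI [Classification of Consumers' Online Food Ordering Satisfaction with Data Mining Algorithms and Comparison of Their Performance]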

International Review of Economics and Management

https://doi.org/10.18825/iremjournal.1478562

Validation and Extraction of Reliable Information Through Automated Scraping and Natural Language Inference

Engineering Applications of Artificial Intelligence

https://doi.org/10.1016/j.engappai.2025.110284

CLASSIFYING LIVER DISEASE WITH BOOSTING MACHINE LEARNING APPROACHES

Eskişehir Osmangazi Üniversitesi Mühendislik ve Mimarlık Fakültesi Dergisi

https://doi.org/10.31796/ogummf.1591951