Fake News Detection on Mainstream Media Using Natural Language Processing

İsa Kulaksız; Ahmet Coşkunçay

doi:10.34248/bsengineering.1527551

Abstract

With recent advances in online journalism, the diversity, abundance, and accessibility of news have increased dramatically. However, the growth of online journalism also raises problems, especially regarding the reliability of the news. Notably, news widely shared on social media during the US presidential election campaign and the UK Brexit referendum drew millions of reactions from the public. This concerning scenario prompted industry and academia to address the pressing issue of fake news. Detecting fake news is a meticulous, time-consuming, and labor-intensive task that requires expert judgment. To mitigate this challenge, this study proposes a linguistic-based model for Turkish fake news detection. The dataset was collected from TRT's RSS service and by web scraping the Teyit.org platform. It contains news titles and summaries related to significant events in Türkiye between 2015 and 2023. The research compares classical machine learning classifiers, namely SVM, Logistic Regression, Random Forest, k-NN, Decision Tree, and Naive Bayes, against a neural sequential learning model, LSTM, using real-world datasets. Furthermore, the research investigates the impact of different word representation techniques, TF-IDF and CountVectorizer, and of hyperparameter optimization on the classification results. The findings show that, with hyperparameter tuning, TF-IDF combined with the SVM model yielded the highest accuracy, 93.12%, and that TF-IDF is more effective than CountVectorizer.
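The difference between the two word representations compared in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy headlines, the function names, and the unsmoothed formula idf = ln(N / df) + 1 are assumptions chosen for illustration, and libraries such as scikit-learn apply additional smoothing and normalization by default.

```python
import math
from collections import Counter

def count_vectors(docs):
    """Raw term counts per document (the idea behind a count-based representation)."""
    return [Counter(doc.lower().split()) for doc in docs]

def tfidf_vectors(docs):
    """TF-IDF weights with idf = ln(N / df) + 1 (one common variant, assumed here)."""
    counts = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for c in counts for term in c)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({
            term: (n / total) * (math.log(n_docs / df[term]) + 1.0)
            for term, n in c.items()
        })
    return vectors

# Hypothetical toy headlines, not taken from the study's dataset.
docs = [
    "minister announces new budget",
    "shocking secret cure announced",
    "minister denies shocking claim",
]

counts = count_vectors(docs)
weights = tfidf_vectors(docs)

# In the first headline both terms occur once, so their counts are equal,
# but TF-IDF gives the rarer term "budget" a higher weight than "minister".
print(counts[0]["budget"], counts[0]["minister"])
print(weights[0]["budget"] > weights[0]["minister"])
```

With raw counts, every term that occurs once in a headline gets the same value. TF-IDF additionally down-weights terms that appear in many documents, which is one plausible reason it can separate classes better on short texts such as titles and summaries.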

Ethical Statement

Ethics committee approval was not required for this study because it did not involve animals or humans.

Thanks

This research is based on a master's thesis.

Details

Primary Language

English

Subjects

Decision Support and Group Support Systems

Journal Section

Research Article

Authors

İsa Kulaksız*
0009-0000-1138-7130
Türkiye

Ahmet Coşkunçay
0000-0002-7411-310X
Türkiye

Publication Date

January 15, 2025

Submission Date

August 7, 2024

Acceptance Date

December 19, 2024

Published in Issue

Year 2025 Volume: 8 Number: 1

DOI

https://doi.org/10.34248/bsengineering.1527551

IZ

https://izlik.org/JA53FU68SF

Cite

APA

Kulaksız, İ., & Coşkunçay, A. (2025). Fake News Detection on Mainstream Media Using Natural Language Processing. Black Sea Journal of Engineering and Science, 8(1), 214-224. https://doi.org/10.34248/bsengineering.1527551

AMA

1. Kulaksız İ, Coşkunçay A. Fake News Detection on Mainstream Media Using Natural Language Processing. BSJ Eng. Sci. 2025;8(1):214-224. doi:10.34248/bsengineering.1527551

Chicago

Kulaksız, İsa, and Ahmet Coşkunçay. 2025. “Fake News Detection on Mainstream Media Using Natural Language Processing”. Black Sea Journal of Engineering and Science 8 (1): 214-24. https://doi.org/10.34248/bsengineering.1527551.

EndNote

Kulaksız İ, Coşkunçay A (January 1, 2025) Fake News Detection on Mainstream Media Using Natural Language Processing. Black Sea Journal of Engineering and Science 8 1 214–224.

IEEE

[1] İ. Kulaksız and A. Coşkunçay, “Fake News Detection on Mainstream Media Using Natural Language Processing”, BSJ Eng. Sci., vol. 8, no. 1, pp. 214–224, Jan. 2025, doi: 10.34248/bsengineering.1527551.

ISNAD

Kulaksız, İsa - Coşkunçay, Ahmet. “Fake News Detection on Mainstream Media Using Natural Language Processing”. Black Sea Journal of Engineering and Science 8/1 (January 1, 2025): 214-224. https://doi.org/10.34248/bsengineering.1527551.

JAMA

1. Kulaksız İ, Coşkunçay A. Fake News Detection on Mainstream Media Using Natural Language Processing. BSJ Eng. Sci. 2025;8:214–224.

MLA

Kulaksız, İsa, and Ahmet Coşkunçay. “Fake News Detection on Mainstream Media Using Natural Language Processing”. Black Sea Journal of Engineering and Science, vol. 8, no. 1, Jan. 2025, pp. 214-24, doi:10.34248/bsengineering.1527551.

Vancouver

1. İsa Kulaksız, Ahmet Coşkunçay. Fake News Detection on Mainstream Media Using Natural Language Processing. BSJ Eng. Sci. 2025 Jan. 1;8(1):214-24. doi:10.34248/bsengineering.1527551

Cited By

LSTM VE BERT MODELLERİ İLE SAHTE HABER TESPİTİ

Uluslararası Sürdürülebilir Mühendislik ve Teknoloji Dergisi

https://doi.org/10.62301/usmtd.1698904