Fake News Detection on Mainstream Media Using Natural Language Processing

İsa Kulaksız; Ahmet Coşkunçay

doi:10.34248/bsengineering.1527551

Research Article

Year 2025, Volume 8, Issue 1, 214 - 224, 15.01.2025

Abstract

With recent advances in online journalism, the diversity, abundance, and accessibility of news have increased sharply. However, this growth also raises concerns, especially about the reliability of news. Notably, news widely shared on social media during the US presidential election campaign and the UK Brexit referendum drew millions of reactions from the public. This concerning scenario prompted industry and academia to address the pressing issue of fake news. Detecting fake news is a meticulous, time-consuming, and labor-intensive task that requires expert judgment. To mitigate this challenge, this study proposes a linguistic-based model for Turkish fake news detection. The dataset was collected from TRT's RSS service and through web scraping of the Teyit.org platform. It contains news titles and summaries related to significant events in Türkiye between 2015 and 2023. The research compares classical machine learning classifiers, including SVM, Logistic Regression, Random Forest, k-NN, Decision Tree, and Naive Bayes, against a neural sequential learning model, LSTM, using real-world datasets. Furthermore, it investigates the impact of different word representation techniques, including TF-IDF and CountVectorizer, and of hyperparameter optimization on the classification results. The findings show that, with hyperparameter tuning, TF-IDF yielded the highest accuracy, 93.12%, with the SVM model, and that TF-IDF is the more effective of the two representations.
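To make the difference between the two word representations concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The toy headlines and the function names `count_vectors` and `tfidf_vectors` are hypothetical. The smoothed IDF formula, ln((1 + n) / (1 + df)) + 1, is the one scikit-learn applies by default; whether the study used those defaults is an assumption.

```python
import math
from collections import Counter

# Hypothetical toy headlines (not taken from the study's dataset).
docs = [
    "minister announces new decision",
    "minister denies claims",
    "minister shares shocking claim",
]

def count_vectors(texts):
    """Bag-of-words counts: one {term: raw count} mapping per document."""
    return [Counter(text.split()) for text in texts]

def tfidf_vectors(texts):
    """TF-IDF weights using the smoothed IDF ln((1 + n) / (1 + df)) + 1."""
    counts = count_vectors(texts)
    n_docs = len(texts)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for c in counts for term in c)
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]

counts = count_vectors(docs)
tfidf = tfidf_vectors(docs)

# "minister" occurs in every headline, "decision" in only one.
print(counts[0]["minister"], counts[0]["decision"])
print(round(tfidf[0]["minister"], 3), round(tfidf[0]["decision"], 3))
```

In the first headline both terms have a raw count of 1, so a count-based representation treats them alike. TF-IDF gives the rarer term "decision" a higher weight (about 1.693) than "minister" (1.0), which appears in every headline and therefore carries little discriminating information.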

Keywords

Fake news detection, Machine learning, Classification, LSTM, NLP

Ethical Statement

Ethics committee approval was not required for this study because it did not involve animals or humans.

Acknowledgements

This research is based on a master's thesis.


Details

Primary Language	English
Subjects	Decision Support and Group Support Systems
Section	Research Articles
Authors	İsa Kulaksız 0009-0000-1138-7130 Ahmet Coşkunçay 0000-0002-7411-310X
Publication Date	January 15, 2025
Submission Date	August 7, 2024
Acceptance Date	December 19, 2024
Published in Issue	Year 2025

Cite

APA	Kulaksız, İ., & Coşkunçay, A. (2025). Fake News Detection on Mainstream Media Using Natural Language Processing. Black Sea Journal of Engineering and Science, 8(1), 214-224. https://doi.org/10.34248/bsengineering.1527551
AMA	Kulaksız İ, Coşkunçay A. Fake News Detection on Mainstream Media Using Natural Language Processing. BSJ Eng. Sci. January 2025;8(1):214-224. doi:10.34248/bsengineering.1527551
Chicago	Kulaksız, İsa, and Ahmet Coşkunçay. “Fake News Detection on Mainstream Media Using Natural Language Processing”. Black Sea Journal of Engineering and Science 8, no. 1 (January 2025): 214-24. https://doi.org/10.34248/bsengineering.1527551.
EndNote	Kulaksız İ, Coşkunçay A (01 January 2025) Fake News Detection on Mainstream Media Using Natural Language Processing. Black Sea Journal of Engineering and Science 8 1 214–224.
IEEE	İ. Kulaksız and A. Coşkunçay, “Fake News Detection on Mainstream Media Using Natural Language Processing”, BSJ Eng. Sci., vol. 8, no. 1, pp. 214–224, 2025, doi: 10.34248/bsengineering.1527551.
ISNAD	Kulaksız, İsa - Coşkunçay, Ahmet. “Fake News Detection on Mainstream Media Using Natural Language Processing”. Black Sea Journal of Engineering and Science 8/1 (January 2025), 214-224. https://doi.org/10.34248/bsengineering.1527551.
JAMA	Kulaksız İ, Coşkunçay A. Fake News Detection on Mainstream Media Using Natural Language Processing. BSJ Eng. Sci. 2025;8:214–224.
MLA	Kulaksız, İsa and Ahmet Coşkunçay. “Fake News Detection on Mainstream Media Using Natural Language Processing”. Black Sea Journal of Engineering and Science, vol. 8, no. 1, 2025, pp. 214-224, doi:10.34248/bsengineering.1527551.
Vancouver	Kulaksız İ, Coşkunçay A. Fake News Detection on Mainstream Media Using Natural Language Processing. BSJ Eng. Sci. 2025;8(1):214-224.
