Research Article

Fake News Detection on Mainstream Media Using Natural Language Processing

Year 2025, Volume: 8 Issue: 1, 214 - 224, 15.01.2025
https://doi.org/10.34248/bsengineering.1527551

Abstract

In light of recent advances in online journalism, the diversity, abundance, and accessibility of news have increased substantially. However, the growth of online journalism also brings problems, especially regarding the reliability of the news. Notably, news widely shared on social media during the US presidential election campaign and the UK Brexit referendum drew millions of reactions from the public. This concerning scenario prompted industry and academia to address the pressing issue of fake news. Detecting fake news is a meticulous, time-consuming, and labor-intensive task that requires expert judgment. To mitigate this challenge, this study proposes a linguistic-based model for Turkish fake news detection. The dataset was collected from TRT's RSS service and through web scraping of the Teyit.org platform. It contains news titles and summaries related to significant events in Türkiye between 2015 and 2023. The research compares classical machine learning classifiers, including SVM, Logistic Regression, Random Forest, k-NN, Decision Tree, and Naive Bayes, against a neural sequential learning model, LSTM, using real-world data. Furthermore, the research investigates the impact of different word representation techniques, namely TF-IDF and CountVectorizer, and of hyperparameter optimization on the classification results. The findings show that, with hyperparameter tuning, TF-IDF combined with the SVM model yielded the highest accuracy, 93.12%, and that TF-IDF is the more effective of the two representations.
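To make the difference between the two word representations concrete, the following is a minimal pure-Python sketch, not the authors' pipeline. It assumes whitespace tokenization, uses three hypothetical English toy headlines in place of the Turkish dataset, and applies the smoothed IDF formula documented for scikit-learn's TfidfTransformer, ln((1 + n) / (1 + df)) + 1, followed by L2 normalization. The function names are illustrative only.

```python
import math
from collections import Counter

# Hypothetical toy headlines (not taken from the study's dataset).
docs = [
    "minister announces new budget",
    "shocking claim minister secretly resigns",
    "new budget vote delayed",
]

def count_vectors(texts):
    """Bag-of-words term counts, the kind of matrix a count vectorizer builds."""
    tokenized = [Counter(t.split()) for t in texts]
    vocab = sorted({w for c in tokenized for w in c})
    rows = [[c[w] for w in vocab] for c in tokenized]
    return vocab, rows

def tfidf_vectors(texts):
    """Reweight counts by smoothed IDF, then L2-normalize each row."""
    vocab, counts = count_vectors(texts)
    n = len(texts)
    # Document frequency: number of texts that contain each vocabulary term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n) / (1 + d)) + 1 for d in df]
    rows = []
    for row in counts:
        weighted = [c * w for c, w in zip(row, idf)]
        norm = math.sqrt(sum(v * v for v in weighted)) or 1.0
        rows.append([v / norm for v in weighted])
    return vocab, rows

vocab, counts = count_vectors(docs)
_, tfidf = tfidf_vectors(docs)

# Compare a common word with a rare one inside the second headline.
i_min, i_shock = vocab.index("minister"), vocab.index("shocking")
print("counts:", counts[1][i_min], counts[1][i_shock])
print("tf-idf:", round(tfidf[1][i_min], 3), round(tfidf[1][i_shock], 3))
```

In the second headline both words occur once, so their counts are identical, while TF-IDF gives the rarer word "shocking" a larger weight than "minister", which appears in two of the three headlines. Either matrix could then be passed to a classifier such as an SVM.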

Ethical Statement

Ethics committee approval was not required for this study because it did not involve animals or humans.

Thanks

This research is based on a master's thesis.

There are 34 citations in total.

Details

Primary Language: English
Subjects: Decision Support and Group Support Systems
Journal Section: Research Articles
Authors:

İsa Kulaksız (ORCID: 0009-0000-1138-7130)

Ahmet Coşkunçay (ORCID: 0000-0002-7411-310X)

Publication Date: January 15, 2025
Submission Date: August 7, 2024
Acceptance Date: December 19, 2024
Published in Issue: Year 2025, Volume 8, Issue 1

Cite

APA Kulaksız, İ., & Coşkunçay, A. (2025). Fake News Detection on Mainstream Media Using Natural Language Processing. Black Sea Journal of Engineering and Science, 8(1), 214-224. https://doi.org/10.34248/bsengineering.1527551
AMA Kulaksız İ, Coşkunçay A. Fake News Detection on Mainstream Media Using Natural Language Processing. BSJ Eng. Sci. January 2025;8(1):214-224. doi:10.34248/bsengineering.1527551
Chicago Kulaksız, İsa, and Ahmet Coşkunçay. “Fake News Detection on Mainstream Media Using Natural Language Processing”. Black Sea Journal of Engineering and Science 8, no. 1 (January 2025): 214-24. https://doi.org/10.34248/bsengineering.1527551.
EndNote Kulaksız İ, Coşkunçay A (January 1, 2025) Fake News Detection on Mainstream Media Using Natural Language Processing. Black Sea Journal of Engineering and Science 8 1 214–224.
IEEE İ. Kulaksız and A. Coşkunçay, “Fake News Detection on Mainstream Media Using Natural Language Processing”, BSJ Eng. Sci., vol. 8, no. 1, pp. 214–224, 2025, doi: 10.34248/bsengineering.1527551.
ISNAD Kulaksız, İsa - Coşkunçay, Ahmet. “Fake News Detection on Mainstream Media Using Natural Language Processing”. Black Sea Journal of Engineering and Science 8/1 (January 2025), 214-224. https://doi.org/10.34248/bsengineering.1527551.
JAMA Kulaksız İ, Coşkunçay A. Fake News Detection on Mainstream Media Using Natural Language Processing. BSJ Eng. Sci. 2025;8:214–224.
MLA Kulaksız, İsa and Ahmet Coşkunçay. “Fake News Detection on Mainstream Media Using Natural Language Processing”. Black Sea Journal of Engineering and Science, vol. 8, no. 1, 2025, pp. 214-24, doi:10.34248/bsengineering.1527551.
Vancouver Kulaksız İ, Coşkunçay A. Fake News Detection on Mainstream Media Using Natural Language Processing. BSJ Eng. Sci. 2025;8(1):214-24.
