Research Article

Classification of News Texts from Different Languages with Machine Learning Algorithms

Volume: 4 Number: 1 June 25, 2023

Abstract

Advances in technology have made the internet one of the most important sources of information today. Although the internet makes it possible to access large amounts of data in a short time, analyzing this data correctly is critical. As the volume of unstructured text data in the digital environment grows, so does the need for text mining to process, analyze, and classify it in a meaningful way. In this study, news texts obtained from online German, Spanish, English, and Turkish news sites were classified into four predetermined categories: world, sports, economy, and politics. The data set, consisting of 4000 news texts, was classified using 41 different machine learning algorithms in the Weka program. The highest classification success rates were obtained with the Naive Bayes Multinomial and Naive Bayes Multinomial Updateable algorithms: 93.5% for German, 93.3% for English, 82.8% for Spanish, and 88.8% for Turkish news texts.
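
As a rough illustration of the multinomial Naive Bayes technique named above, the following minimal sketch trains a word-count classifier with Laplace smoothing in plain Python. It is not the study's Weka configuration: the whitespace tokenizer, the function names `train_multinomial_nb` and `predict`, and the toy corpus are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase whitespace tokenization; a deliberately simple assumption.
    return text.lower().split()


def train_multinomial_nb(docs, labels):
    """Estimate log priors and Laplace-smoothed log likelihoods per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)

    model = {"vocab": vocab, "log_prior": {}, "log_likelihood": {}}
    for label, n_docs in class_counts.items():
        model["log_prior"][label] = math.log(n_docs / len(docs))
        # Add-one smoothing: every vocabulary word gets one extra count.
        denom = sum(word_counts[label].values()) + len(vocab)
        model["log_likelihood"][label] = {
            w: math.log((word_counts[label][w] + 1) / denom) for w in vocab
        }
    return model


def predict(model, text):
    # Words outside the training vocabulary are ignored.
    tokens = [t for t in tokenize(text) if t in model["vocab"]]
    scores = {}
    for label, prior in model["log_prior"].items():
        likelihood = model["log_likelihood"][label]
        scores[label] = prior + sum(likelihood[t] for t in tokens)
    return max(scores, key=scores.get)


# Hypothetical toy corpus, not data from the study.
train_docs = [
    "the team won the match with a late goal",
    "the striker scored twice in the final",
    "the central bank raised interest rates",
    "inflation and markets worry investors",
    "parliament voted on the new election law",
    "the minister announced a coalition government",
    "earthquake relief reaches the border region",
    "united nations envoy visits the conflict zone",
]
train_labels = [
    "sports", "sports", "economy", "economy",
    "politics", "politics", "world", "world",
]

model = train_multinomial_nb(train_docs, train_labels)
print(predict(model, "bank interest rates"))
```

Scores are summed in log space to avoid numerical underflow on long texts. A realistic setup would add language-specific preprocessing and a held-out evaluation, neither of which is shown here.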

Supporting Institution

None.

Project Number

None.

Details

Primary Language

English

Subjects

Computer Software

Journal Section

Research Article

Early Pub Date

June 30, 2023

Publication Date

June 25, 2023

Submission Date

June 8, 2023

Acceptance Date

June 19, 2023

Published in Issue

Year 2023 Volume: 4 Number: 1

APA
Ağduk, S., Aydemir, E., & Polat, A. (2023). Classification of News Texts from Different Languages with Machine Learning Algorithms. Journal of Soft Computing and Artificial Intelligence, 4(1), 29-37. https://doi.org/10.55195/jscai.1311380
AMA
1. Ağduk S, Aydemir E, Polat A. Classification of News Texts from Different Languages with Machine Learning Algorithms. JSCAI. 2023;4(1):29-37. doi:10.55195/jscai.1311380
Chicago
Ağduk, Sidar, Emrah Aydemir, and Ayfer Polat. 2023. “Classification of News Texts from Different Languages With Machine Learning Algorithms”. Journal of Soft Computing and Artificial Intelligence 4 (1): 29-37. https://doi.org/10.55195/jscai.1311380.
EndNote
Ağduk S, Aydemir E, Polat A (June 1, 2023) Classification of News Texts from Different Languages with Machine Learning Algorithms. Journal of Soft Computing and Artificial Intelligence 4 1 29–37.
IEEE
[1] S. Ağduk, E. Aydemir, and A. Polat, “Classification of News Texts from Different Languages with Machine Learning Algorithms”, JSCAI, vol. 4, no. 1, pp. 29–37, June 2023, doi: 10.55195/jscai.1311380.
ISNAD
Ağduk, Sidar - Aydemir, Emrah - Polat, Ayfer. “Classification of News Texts from Different Languages With Machine Learning Algorithms”. Journal of Soft Computing and Artificial Intelligence 4/1 (June 1, 2023): 29-37. https://doi.org/10.55195/jscai.1311380.
JAMA
1. Ağduk S, Aydemir E, Polat A. Classification of News Texts from Different Languages with Machine Learning Algorithms. JSCAI. 2023;4:29–37.
MLA
Ağduk, Sidar, et al. “Classification of News Texts from Different Languages With Machine Learning Algorithms”. Journal of Soft Computing and Artificial Intelligence, vol. 4, no. 1, June 2023, pp. 29-37, doi:10.55195/jscai.1311380.
Vancouver
1. Sidar Ağduk, Emrah Aydemir, Ayfer Polat. Classification of News Texts from Different Languages with Machine Learning Algorithms. JSCAI. 2023 Jun. 1;4(1):29-37. doi:10.55195/jscai.1311380
