Research Article

The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, BoW and Transformer-based methods on the Covid19-FNIR dataset

Year 2025, Volume: 14 Issue: 4, 1447 - 1461, 15.10.2025
https://doi.org/10.28948/ngumuh.1694988

Abstract

This study evaluates the performance of various machine learning (ML) models on a dataset split into 80% training and 20% test data, using Term Frequency-Inverse Document Frequency (TF-IDF) and Bag of Words (BoW) text vectorization. Transformer-based models such as DistilBERT, RoBERTa, and ALBERT were integrated with classical machine learning algorithms and with ensemble methods such as Stacking, Hard Voting, and Soft Voting. Stacking achieved the highest performance with both representations: 92.62% Accuracy and 92.51% F1 with TF-IDF, and 92.29% Accuracy and 92.41% F1 with BoW. Hard Voting with BoW gave the highest Recall (95.23%). Classical models such as Logistic Regression and Support Vector Machine (SVM) performed better with BoW, reaching 90.98% and 90.51% Accuracy, respectively. Overall, TF-IDF produced balanced results, while BoW offered higher Recall and Precision in certain cases. These results underline the importance of both model choice and text representation in reaching optimal classification performance.
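To make the two text representations named in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' pipeline: the whitespace tokenizer, the three toy sentences, and the textbook weighting tf * log(N / df) are all assumptions made for illustration. Library implementations (for example scikit-learn's `TfidfVectorizer`) use smoothed and normalized variants by default, so their numbers would differ.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; a real pipeline would clean more.
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, raw count vectors) for a list of documents."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors) using tf * log(N / df)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    vectors = []
    for row in counts:
        total = sum(row) or 1  # guard against empty documents
        vectors.append([(c / total) * w for c, w in zip(row, idf)])
    return vocab, vectors


# Hypothetical toy corpus, not taken from the Covid19-FNIR dataset.
docs = [
    "masks stop the virus",
    "the virus is a hoax",
    "vaccines stop the virus",
]
vocab, bow = bag_of_words(docs)
_, tfidf = tf_idf(docs)
```

In this sketch BoW keeps the raw count of every term, whereas TF-IDF drives the weight of terms that occur in every document (here "the" and "virus") to zero and gives rarer terms such as "masks" more weight than more common ones such as "stop".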

Project Number

None


The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, BoW and Transformer-based methods on the Covid19-FNIR dataset


Abstract

This study evaluates the performance of various machine learning (ML) models on a dataset split into 80% training and 20% testing, using Term Frequency-Inverse Document Frequency (TF-IDF) and Bag of Words (BoW) text vectorization. Transformer-based models such as DistilBERT, RoBERTa, and ALBERT were integrated with classical ML algorithms and with ensemble methods such as Stacking, Hard Voting, and Soft Voting. Stacking achieved the highest performance with both vectorization methods: 92.62% Accuracy (Acc) and 92.51% F1-score (F1) with TF-IDF, and 92.29% Acc and 92.41% F1 with BoW. Hard Voting with BoW yielded the highest Recall (95.23%). Classical models such as Logistic Regression (LR) and Support Vector Machine (SVM) performed better with BoW, reaching 90.98% and 90.51% Acc, respectively. Overall, TF-IDF produced balanced outcomes, while BoW offered higher Recall and Precision in specific cases. These results highlight the significance of both model and text representation choices in achieving optimal classification performance.
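The difference between Hard Voting and Soft Voting, and the metrics quoted above (Accuracy, Precision, Recall, F1), can be illustrated with a small plain-Python sketch. Everything here is hypothetical: the three "models", their class-1 probabilities, the 0.5 threshold, and the labels are invented for illustration and are not results from the study.

```python
from collections import Counter


def hard_vote(labels):
    """Majority vote over the class labels predicted by each model."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probabilities, threshold=0.5):
    """Average the class-1 probabilities of all models, then threshold."""
    mean = sum(probabilities) / len(probabilities)
    return 1 if mean >= threshold else 0


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical class-1 probabilities from three base models on four samples.
model_probs = [
    [0.90, 0.40, 0.20, 0.60],  # model A
    [0.60, 0.45, 0.10, 0.40],  # model B
    [0.30, 0.95, 0.40, 0.45],  # model C
]
y_true = [1, 1, 0, 0]

per_sample = list(zip(*model_probs))  # regroup probabilities by sample
hard_pred = [hard_vote([1 if p >= 0.5 else 0 for p in probs])
             for probs in per_sample]
soft_pred = [soft_vote(probs) for probs in per_sample]

hard_scores = binary_metrics(y_true, hard_pred)
soft_scores = binary_metrics(y_true, soft_pred)
```

On the second sample two models lean slightly towards class 0 while one is very confident in class 1, so the majority vote and the probability average disagree. This is the kind of case in which the two voting schemes can trade Precision against Recall. Stacking, which trains a meta-learner on the base models' outputs, is not covered by this sketch.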


Details

Primary Language English
Subjects Computer Vision, Natural Language Processing
Section Research Article
Authors

Muhammet Sinan Başarslan 0000-0002-7996-9169

Fatih Bal 0000-0002-7179-1634

Project Number None
Early Pub Date September 30, 2025
Publication Date October 15, 2025
Submission Date May 7, 2025
Acceptance Date September 3, 2025
Published in Issue Year 2025 Volume: 14 Issue: 4

Cite

APA Başarslan, M. S., & Bal, F. (2025). The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, BoW and Transformer-based methods on the Covid19-FNIR dataset. Niğde Ömer Halisdemir Üniversitesi Mühendislik Bilimleri Dergisi, 14(4), 1447-1461. https://doi.org/10.28948/ngumuh.1694988
AMA Başarslan MS, Bal F. The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, BoW and Transformer-based methods on the Covid19-FNIR dataset. NÖHÜ Müh. Bilim. Derg. October 2025;14(4):1447-1461. doi:10.28948/ngumuh.1694988
Chicago Başarslan, Muhammet Sinan, and Fatih Bal. “The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, BoW and Transformer-based methods on the Covid19-FNIR dataset”. Niğde Ömer Halisdemir Üniversitesi Mühendislik Bilimleri Dergisi 14, no. 4 (October 2025): 1447-61. https://doi.org/10.28948/ngumuh.1694988.
EndNote Başarslan MS, Bal F (01 October 2025) The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, BoW and Transformer-based methods on the Covid19-FNIR dataset. Niğde Ömer Halisdemir Üniversitesi Mühendislik Bilimleri Dergisi 14 4 1447–1461.
IEEE M. S. Başarslan and F. Bal, “The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, BoW and Transformer-based methods on the Covid19-FNIR dataset”, NÖHÜ Müh. Bilim. Derg., vol. 14, no. 4, pp. 1447–1461, 2025, doi: 10.28948/ngumuh.1694988.
ISNAD Başarslan, Muhammet Sinan - Bal, Fatih. “The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, BoW and Transformer-based methods on the Covid19-FNIR dataset”. Niğde Ömer Halisdemir Üniversitesi Mühendislik Bilimleri Dergisi 14/4 (October 2025), 1447-1461. https://doi.org/10.28948/ngumuh.1694988.
JAMA Başarslan MS, Bal F. The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, BoW and Transformer-based methods on the Covid19-FNIR dataset. NÖHÜ Müh. Bilim. Derg. 2025;14:1447–1461.
MLA Başarslan, Muhammet Sinan, and Fatih Bal. “The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, BoW and Transformer-based methods on the Covid19-FNIR dataset”. Niğde Ömer Halisdemir Üniversitesi Mühendislik Bilimleri Dergisi, vol. 14, no. 4, 2025, pp. 1447-61, doi:10.28948/ngumuh.1694988.
Vancouver Başarslan MS, Bal F. The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, BoW and Transformer-based methods on the Covid19-FNIR dataset. NÖHÜ Müh. Bilim. Derg. 2025;14(4):1447-61.
