The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, Bow and Transformer-based methods on the Covid19-FNIR dataset

Muhammet Sinan Başarslan; Fatih Bal

doi:10.28948/ngumuh.1694988

Research Article

Year 2025, Volume: 14 Issue: 4, 1447 - 1461, 15.10.2025

Abstract

This study evaluates the performance of various machine learning (ML) models on the Covid19-FNIR dataset, split into 80% training and 20% testing, using Term Frequency-Inverse Document Frequency (TF-IDF) and Bag of Words (BoW) text vectorization. Transformer-based models such as DistilBERT, RoBERTa, and ALBERT were integrated with classical ML algorithms and with ensemble methods such as Stacking, Hard Voting, and Soft Voting. Stacking achieved the highest performance with both representations: 92.62% Accuracy (Acc) and 92.51% F1-score (F1) with TF-IDF, and 92.29% Acc and 92.41% F1 with BoW. Hard Voting with BoW yielded the highest Recall (95.23%). Classical models such as Logistic Regression (LR) and Support Vector Machine (SVM) performed better with BoW, reaching 90.98% and 90.51% Acc, respectively. Overall, TF-IDF produced balanced outcomes, while BoW offered higher Recall and Precision in specific cases. These results highlight the importance of both model and text representation choices in achieving optimal classification performance.
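
The abstract contrasts two text representations (BoW and TF-IDF) and two voting schemes (Hard and Soft Voting). The following is a minimal, standard-library Python sketch of how these differ. It is not the authors' pipeline, which this page does not describe. The three toy documents, the function names, the smoothed IDF variant, and the per-model probabilities are all hypothetical and chosen only for illustration. Stacking, which trains a meta-model on the base models' outputs, is not sketched here.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bow_vectors(docs):
    """Return (vocabulary, raw term-count vectors), one vector per document."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf_vectors(docs):
    """Reweight BoW counts by a smoothed inverse document frequency."""
    vocab, counts = bow_vectors(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    # Assumed smoothed IDF variant: ln((1 + N) / (1 + df)) + 1.
    idf = [math.log((1 + n_docs) / (1 + d)) + 1.0 for d in df]
    return vocab, [[c * w for c, w in zip(row, idf)] for row in counts]


def hard_vote(labels):
    """Hard voting: majority vote over the predicted class labels."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probas):
    """Soft voting: average class probabilities, then take the argmax."""
    classes = probas[0].keys()
    mean = {c: sum(p[c] for p in probas) / len(probas) for c in classes}
    return max(mean, key=mean.get)


# Hypothetical toy corpus (not taken from Covid19-FNIR).
docs = [
    "vaccine causes magnetism claim",
    "vaccine trial results published",
    "vaccine rollout schedule published",
]
vocab, bow = bow_vectors(docs)
_, tfidf = tfidf_vectors(docs)

# Hypothetical class probabilities from three base models for one document.
probas = [
    {"fake": 0.90, "real": 0.10},
    {"fake": 0.45, "real": 0.55},
    {"fake": 0.40, "real": 0.60},
]
labels = [max(p, key=p.get) for p in probas]

i_common, i_rare = vocab.index("vaccine"), vocab.index("magnetism")
print("BoW    :", bow[0][i_common], bow[0][i_rare])
print("TF-IDF :", round(tfidf[0][i_common], 3), round(tfidf[0][i_rare], 3))
print("hard vote:", hard_vote(labels), "| soft vote:", soft_vote(probas))
```

In this toy corpus BoW gives "vaccine" and "magnetism" the same count in the first document, while TF-IDF weights "magnetism" higher because it appears in only one document. The voting example is constructed so that the two schemes disagree: two of three models narrowly prefer "real", but one model's high confidence in "fake" dominates the averaged probabilities.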

Keywords

Fake news, ML, Text representation, Pre-trained

Project Number

None

There are 38 citations in total.

Details

Primary Language	English
Subjects	Machine Vision, Natural Language Processing
Journal Section	Research Articles
Authors	Muhammet Sinan Başarslan (0000-0002-7996-9169); Fatih Bal (0000-0002-7179-1634)
Project Number	None
Early Pub Date	September 30, 2025
Publication Date	October 15, 2025
Submission Date	May 7, 2025
Acceptance Date	September 3, 2025
Published in Issue	Year 2025 Volume: 14 Issue: 4

Cite

APA	Başarslan, M. S., & Bal, F. (2025). The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, Bow and Transformer-based methods on the Covid19-FNIR dataset. Niğde Ömer Halisdemir Üniversitesi Mühendislik Bilimleri Dergisi, 14(4), 1447-1461. https://doi.org/10.28948/ngumuh.1694988
AMA	Başarslan MS, Bal F. The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, Bow and Transformer-based methods on the Covid19-FNIR dataset. NOHU J. Eng. Sci. October 2025;14(4):1447-1461. doi:10.28948/ngumuh.1694988
Chicago	Başarslan, Muhammet Sinan, and Fatih Bal. “The Effect of Text Representation and Model Selection on Classification Performance: A Comprehensive Comparison of TF-IDF, Bow and Transformer-Based Methods on the Covid19-FNIR Dataset”. Niğde Ömer Halisdemir Üniversitesi Mühendislik Bilimleri Dergisi 14, no. 4 (October 2025): 1447-61. https://doi.org/10.28948/ngumuh.1694988.
EndNote	Başarslan MS, Bal F (October 1, 2025) The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, Bow and Transformer-based methods on the Covid19-FNIR dataset. Niğde Ömer Halisdemir Üniversitesi Mühendislik Bilimleri Dergisi 14 4 1447–1461.
IEEE	M. S. Başarslan and F. Bal, “The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, Bow and Transformer-based methods on the Covid19-FNIR dataset”, NOHU J. Eng. Sci., vol. 14, no. 4, pp. 1447–1461, 2025, doi: 10.28948/ngumuh.1694988.
ISNAD	Başarslan, Muhammet Sinan - Bal, Fatih. “The Effect of Text Representation and Model Selection on Classification Performance: A Comprehensive Comparison of TF-IDF, Bow and Transformer-Based Methods on the Covid19-FNIR Dataset”. Niğde Ömer Halisdemir Üniversitesi Mühendislik Bilimleri Dergisi 14/4 (October 2025), 1447-1461. https://doi.org/10.28948/ngumuh.1694988.
JAMA	Başarslan MS, Bal F. The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, Bow and Transformer-based methods on the Covid19-FNIR dataset. NOHU J. Eng. Sci. 2025;14:1447–1461.
MLA	Başarslan, Muhammet Sinan and Fatih Bal. “The Effect of Text Representation and Model Selection on Classification Performance: A Comprehensive Comparison of TF-IDF, Bow and Transformer-Based Methods on the Covid19-FNIR Dataset”. Niğde Ömer Halisdemir Üniversitesi Mühendislik Bilimleri Dergisi, vol. 14, no. 4, 2025, pp. 1447-61, doi:10.28948/ngumuh.1694988.
Vancouver	Başarslan MS, Bal F. The effect of text representation and model selection on classification performance: A comprehensive comparison of TF-IDF, Bow and Transformer-based methods on the Covid19-FNIR dataset. NOHU J. Eng. Sci. 2025;14(4):1447-61.
