Multi-Parametrized Cross Validation Method based on Order, Size, Weight, and Missing Data Robustness: A New Cross Validation Model MP-OSW-CV
Abstract
When no separate test set is available, a classifier's performance is estimated with cross-validation methods. Existing methods focus on parameters such as the number of folds, balanced class distribution, and test set size. However, these methods cannot overcome the bias-variance tradeoff. In this paper, we propose a multi-parametrized cross-validation method based on the order of an instance, test fold size, weight, and missing data robustness (MP-OSW-CV). The method has four parameters: order, size, weight, and missing data. First, it divides the dataset into parts according to data indexes and randomly chooses an equal number of samples from each part, instead of sampling randomly from the whole dataset. Second, the test fold size is varied. The accuracy results obtained with different test sizes are combined into the overall performance using either equal weights or one of two types of inversely proportional weights. Finally, if missing data robustness is to be analyzed, the last parameter determines the training set size after the test fold has been created. The proposed method is compared with conventional methods on several datasets from the UCI ML Repository. MP-OSW-CV generated more representative data splits, leading to more dependable model assessments.
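The splitting and weighting steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `train_fraction` parameter, and the single inverse-weight formula (weight proportional to 1 / test size) are assumptions made for illustration only, since the abstract does not give the exact formulas for its two inverse-weighting schemes.

```python
import random


def order_based_test_indices(n_samples, n_parts, per_part, rng):
    """Split indices 0..n_samples-1 into contiguous parts and draw
    the same number of test indices from each part (hypothetical helper)."""
    bounds = [round(i * n_samples / n_parts) for i in range(n_parts + 1)]
    test = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        # Sample within one index range instead of the whole dataset.
        test.extend(rng.sample(range(lo, hi), per_part))
    return sorted(test)


def make_split(n_samples, n_parts, per_part, train_fraction=1.0, seed=0):
    """Build one train/test split. train_fraction < 1 shrinks the training
    set after the test fold is fixed (assumed reading of the fourth parameter)."""
    rng = random.Random(seed)
    test = order_based_test_indices(n_samples, n_parts, per_part, rng)
    test_set = set(test)
    remaining = [i for i in range(n_samples) if i not in test_set]
    n_train = int(len(remaining) * train_fraction)
    train = sorted(rng.sample(remaining, n_train))
    return train, test


def combine_accuracies(accuracies, test_sizes, scheme="equal"):
    """Combine per-test-size accuracies into one score.
    'inverse' uses weights proportional to 1 / test size (an assumption)."""
    if scheme == "equal":
        weights = [1.0] * len(accuracies)
    elif scheme == "inverse":
        weights = [1.0 / s for s in test_sizes]
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    total = sum(weights)
    return sum(w * a for w, a in zip(weights, accuracies)) / total
```

For example, `make_split(100, 4, 5, train_fraction=0.5)` draws 5 test indices from each of four index ranges and keeps half of the remaining 80 samples for training. A classifier would then be trained and scored once per test size, and the scores passed to `combine_accuracies`.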
Details
Primary Language
English
Subjects
Machine Learning (Other)
Section
Research Article
Authors
Alican Doğan
*
0000-0002-0553-2888
Türkiye
Early View Date
September 23, 2025
Publication Date
March 15, 2026
Submission Date
April 9, 2025
Acceptance Date
July 29, 2025
Published in Issue
Year 2026 Volume: 29 Issue: 2