Research Article

Bootstrap-Driven Feature Weighting For Stable k-NN Performance

Year 2026, Volume: 9, Issue: 1, 78-86, 15.01.2026
https://doi.org/10.34248/bsengineering.1796638
https://izlik.org/JA98CM63NC

Abstract

The k-Nearest Neighbors (k-NN) algorithm continues to be widely applied because of its simplicity, interpretability, and flexibility. However, it often performs poorly when the dataset contains irrelevant features. This article proposes a lightweight, data-driven feature weighting method that uses bootstrap sampling to estimate the stability and predictive relevance of each feature. Mutual information (MI) values are aggregated over resampled subsets, and the resulting feature weights are used to rescale the distance metric in k-NN without any complex model training. Experimental results demonstrate a significant improvement of 11.19% in F1 score under noisy conditions, along with higher accuracy (96.01% vs 94.65%, a 1.36% gain) and reduced performance variance (±5.33% vs ±7.16%) compared to standard k-NN. The proposed method is easy to interpret and can be applied within the structure of conventional k-NN.
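The pipeline described in the abstract — aggregate MI over bootstrap resamples, derive feature weights, then rescale the k-NN distance — can be sketched as follows. This is a minimal illustrative sketch, not the authors' exact implementation: the number of resamples, the sum-to-one weight normalization, and the use of scikit-learn's `mutual_info_classif` and the Iris data with appended noise features are all assumptions made for the example.

```python
# Sketch of bootstrap-driven feature weighting for k-NN.
# Assumptions (not from the paper): n_boot=20, weights normalized
# to sum to 1, sklearn's mutual_info_classif as the MI estimator.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def bootstrap_mi_weights(X, y, n_boot=20, random_state=0):
    """Average mutual information over bootstrap resamples,
    then normalize so the weights sum to 1."""
    rng = np.random.default_rng(random_state)
    n = len(y)
    mi_sum = np.zeros(X.shape[1])
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # bootstrap resample (with replacement)
        mi_sum += mutual_info_classif(X[idx], y[idx], random_state=0)
    weights = mi_sum / n_boot
    return weights / weights.sum()

X, y = load_iris(return_X_y=True)
# Append two irrelevant noise features to mimic the noisy setting.
rng = np.random.default_rng(42)
X_noisy = np.hstack([X, rng.normal(size=(X.shape[0], 2))])

X_tr, X_te, y_tr, y_te = train_test_split(
    X_noisy, y, test_size=0.3, random_state=1, stratify=y)
w = bootstrap_mi_weights(X_tr, y_tr)

# Weighted Euclidean distance is equivalent to plain Euclidean
# distance on features scaled by sqrt(w).
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_tr * np.sqrt(w), y_tr)
acc = knn.score(X_te * np.sqrt(w), y_te)
print(f"weighted k-NN accuracy: {acc:.3f}")
```

Because the noise features carry essentially no mutual information with the class label, they receive near-zero weights and contribute little to the distance, which is the mechanism the abstract credits for the reduced variance.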

Ethical Statement

Since no studies involving humans or animals were conducted, ethical committee approval was not required for this study.

References

  • Abdalla, H. I., & Amer, A. A. (2025). Enhancing data classification using locally informed weighted k-nearest neighbor algorithm. Expert Systems with Applications, 276, 126942. https://doi.org/10.1016/j.eswa.2025.126942
  • Ali, N., Neagu, D., & Trundle, P. (2019). Evaluation of k-nearest neighbour classifier performance for heterogeneous data sets. SN Applied Sciences, 1, 1559. https://doi.org/10.1007/s42452-019-1356-9
  • Altman, N. (1992). An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician, 46, 175–185.
  • Géron, A. (2019). Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow: Concepts, tools, and techniques to build intelligent systems (2nd edn.). O'Reilly Media.
  • Biswas, N., Chakraborty, S., Mullick, S. S., & Das, S. (2018). A parameter independent fuzzy weighted k-nearest neighbor classifier. Pattern Recognition Letters, 101, 80-87. https://doi.org/10.1016/j.patrec.2017.11.003
  • Bolón-Canedo, V., Sánchez-Maroño, N., & Alonso-Betanzos, A. (2015). Recent advances and emerging challenges of feature selection in the context of big data. Knowledge-Based Systems, 86, 33-45. https://doi.org/10.1016/j.knosys.2015.05.014
  • Bommert, A., Sun, X., Bischl, B., Rahnenführer, J., & Lang, M. (2020). Benchmark for filter methods for feature selection in high-dimensional classification data. Computational Statistics & Data Analysis, 143, 106839. https://doi.org/10.1016/j.csda.2019.106839
  • Chen, Y., & Hao, Y. (2017). A feature weighted support vector machine and K-nearest neighbor algorithm for stock market indices prediction. Expert Systems with Applications, 80, 340-355. https://doi.org/10.1016/j.eswa.2017.02.044
  • Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13, 21–27. https://doi.org/10.1109/TIT.1967.1053964
  • Fisher, R. (1936). Iris [Dataset]. UCI Machine Learning Repository.
  • Gul, N., Mashwani, W. K., Aamir, M., Aldahmani, S., & Khan, Z. (2023). Optimal model selection for k-nearest neighbours ensemble via sub-bagging and sub-sampling with feature weighting. Alexandria Engineering Journal, 72, 157-168. https://doi.org/10.1016/j.aej.2023.03.075
  • Halder, R. K., Uddin, M. N., Uddin, M. A., Aryal, S., & Khraisat, A. (2024). Enhancing K-nearest neighbor algorithm: a comprehensive review and performance analysis of modifications. Journal of Big Data, 11(1), 113. https://doi.org/10.1186/s40537-024-00973-y
  • Ircio, J., Lojo, A., Mori, U., & Lozano, J. A. (2020). Mutual information based feature subset selection in multivariate time series classification. Pattern Recognition, 108, 107525.
  • Łukaszuk, T., Krawczuk, J., Żyła, K., & Kęsik, J. (2024). Stability of feature selection in multi-omics data analysis. Applied Sciences, 14, 11103. https://doi.org/10.3390/app142311103
  • Peng, H., Long, F., & Ding, C. (2005). Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(8), 1226-1238. https://doi.org/10.1109/TPAMI.2005.159
  • Popoola, G., Fuhnwi, G., Agbaje, J., & Fesomade, K. (2024). A modified ant colony optimization with KNN for high-dimensional data classification. In: Arai, K. (eds) Intelligent Computing. SAI 2024. Lecture Notes in Networks and Systems, vol 1018. Springer, Cham. https://doi.org/10.1007/978-3-031-62269-4_19
  • Pudjihartono, N., Fadason, T., Kempa-Liehr, A. W., & O'Sullivan, J. M. (2022). A review of feature selection methods for machine learning-based disease risk prediction. Frontiers in Bioinformatics, 2, 927312. https://doi.org/10.3389/fbinf.2022.927312
  • Saeys, Y., Inza, I., & Larranaga, P. (2007). A review of feature selection techniques in bioinformatics. Bioinformatics, 23(19), 2507-2517. https://doi.org/10.1093/bioinformatics/btm344
  • Salman, R., Alzaatreh, A., Sulieman, H., & Faisal, S. (2021). A bootstrap framework for aggregating within and between feature selection methods. Entropy, 23, 200. https://doi.org/10.3390/e23020200
  • Uddin, S., Haque, I., Lu, H., Moni, M. A., & Gide, E. (2022). Comparative performance analysis of K-nearest neighbour (KNN) algorithm and its different variants for disease prediction. Scientific Reports, 12(1), 6256. https://doi.org/10.1038/s41598-022-10358-x
  • Venkateswara, H., Lade, P., Lin, B., Ye, J., & Panchanathan, S. (2015). Efficient approximate solutions to mutual information based global feature selection. IEEE International Conference on Data Mining (pp. 1009–1014). IEEE. https://doi.org/10.1109/ICDM.2015.140
  • Vergara, J. R., & Estévez, P. A. (2014). A review of feature selection methods based on mutual information. Neural Computing and Applications, 24, 175–186. https://doi.org/10.1007/s00521-013-1368-0


Details

Primary Language: English
Subjects: Electrical Engineering (Other)
Section: Research Article
Authors

Noor Baha Aldin 0000-0002-7351-4083

Submission Date: 3 October 2025
Acceptance Date: 7 November 2025
Early View Date: 9 December 2025
Publication Date: 15 January 2026
DOI: https://doi.org/10.34248/bsengineering.1796638
IZ: https://izlik.org/JA98CM63NC
Published Issue: Year 2026, Volume: 9, Issue: 1

How to Cite

APA Baha Aldin, N. (2026). Bootstrap-Driven Feature Weighting For Stable k-NN Performance. Black Sea Journal of Engineering and Science, 9(1), 78-86. https://doi.org/10.34248/bsengineering.1796638
AMA 1.Baha Aldin N. Bootstrap-Driven Feature Weighting For Stable k-NN Performance. BSJ Eng. Sci. 2026;9(1):78-86. doi:10.34248/bsengineering.1796638
Chicago Baha Aldin, Noor. 2026. “Bootstrap-Driven Feature Weighting For Stable k-NN Performance”. Black Sea Journal of Engineering and Science 9 (1): 78-86. https://doi.org/10.34248/bsengineering.1796638.
EndNote Baha Aldin N (01 January 2026) Bootstrap-Driven Feature Weighting For Stable k-NN Performance. Black Sea Journal of Engineering and Science 9 1 78–86.
IEEE [1]N. Baha Aldin, “Bootstrap-Driven Feature Weighting For Stable k-NN Performance”, BSJ Eng. Sci., vol. 9, no. 1, pp. 78–86, Jan. 2026, doi: 10.34248/bsengineering.1796638.
ISNAD Baha Aldin, Noor. “Bootstrap-Driven Feature Weighting For Stable k-NN Performance”. Black Sea Journal of Engineering and Science 9/1 (01 January 2026): 78-86. https://doi.org/10.34248/bsengineering.1796638.
JAMA 1.Baha Aldin N. Bootstrap-Driven Feature Weighting For Stable k-NN Performance. BSJ Eng. Sci. 2026;9:78–86.
MLA Baha Aldin, Noor. “Bootstrap-Driven Feature Weighting For Stable k-NN Performance”. Black Sea Journal of Engineering and Science, vol. 9, no. 1, Jan. 2026, pp. 78-86, doi:10.34248/bsengineering.1796638.
Vancouver 1.Noor Baha Aldin. Bootstrap-Driven Feature Weighting For Stable k-NN Performance. BSJ Eng. Sci. 01 January 2026;9(1):78-86. doi:10.34248/bsengineering.1796638
