Research Article

Bootstrap-Driven Feature Weighting For Stable k-NN Performance

Year 2026, Volume: 9 Issue: 1, 78 - 86, 15.01.2026
https://doi.org/10.34248/bsengineering.1796638
https://izlik.org/JA98CM63NC

Abstract

The k-Nearest Neighbors (k-NN) algorithm continues to be widely applied because of its simplicity, interpretability, and flexibility. However, it often performs poorly when the dataset contains irrelevant features. This article proposes a lightweight, data-driven feature weighting method that uses bootstrap sampling to estimate the stability and predictive relevance of each feature. Mutual information (MI) values are aggregated over resampled subsets, and the resulting feature weights are used to refine the k-NN distance metric without complex model training. Experimental results demonstrate a significant 11.19% improvement in F1 score under noisy conditions, along with higher accuracy (96.01% vs 94.65%, a 1.36% gain) and reduced performance variance (±5.33% vs ±7.16%) compared with standard k-NN. The proposed method is easy to interpret and can be applied within the conventional k-NN framework.
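The pipeline outlined in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes scikit-learn's `mutual_info_classif` as the MI estimator, uses the Iris dataset (cited in the references), and the `n_bootstrap` parameter and normalization scheme are illustrative choices.

```python
# Sketch: aggregate MI over bootstrap resamples, derive feature weights,
# and scale the feature space so k-NN's Euclidean distance reflects relevance.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Standardize so no single feature dominates the distance metric.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Aggregate MI estimates over bootstrap resamples of the training set.
n_bootstrap = 50  # illustrative value, not from the paper
mi_sum = np.zeros(X_train_s.shape[1])
for _ in range(n_bootstrap):
    idx = rng.integers(0, len(X_train_s), size=len(X_train_s))
    mi_sum += mutual_info_classif(X_train_s[idx], y_train[idx], random_state=0)

# Normalize aggregated MI into feature weights that sum to 1.
weights = mi_sum / mi_sum.sum()

# Plain k-NN on the weighted feature space: relevant features
# contribute more to neighbor distances, irrelevant ones less.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train_s * weights, y_train)
acc = knn.score(X_test_s * weights, y_test)
```

Averaging MI over resamples rather than computing it once smooths out sampling noise in the relevance estimates, which is what drives the variance reduction the abstract reports.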

Ethical Statement

Since no studies involving humans or animals were conducted, ethical committee approval was not required for this study.

References

  • Abdalla, H. I., & Amer, A. A. (2025). Enhancing data classification using locally informed weighted k-nearest neighbor algorithm. Expert Systems with Applications, 276, 126942. https://doi.org/10.1016/j.eswa.2025.126942
  • Ali, N., Neagu, D., & Trundle, P. (2019). Evaluation of k-nearest neighbour classifier performance for heterogeneous data sets. SN Applied Sciences, 1, 1559. https://doi.org/10.1007/s42452-019-1356-9
  • Altman, N. (1992). An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician, 46, 175–185.
  • Géron, A. (2019). Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow: Concepts, tools, and techniques to build intelligent systems (2nd ed.). O'Reilly Media.
  • Biswas, N., Chakraborty, S., Mullick, S. S., & Das, S. (2018). A parameter independent fuzzy weighted k-nearest neighbor classifier. Pattern Recognition Letters, 101, 80-87. https://doi.org/10.1016/j.patrec.2017.11.003
  • Bolón-Canedo, V., Sánchez-Maroño, N., & Alonso-Betanzos, A. (2015). Recent advances and emerging challenges of feature selection in the context of big data. Knowledge-Based Systems, 86, 33-45. https://doi.org/10.1016/j.knosys.2015.05.014
  • Bommert, A., Sun, X., Bischl, B., Rahnenführer, J., & Lang, M. (2020). Benchmark for filter methods for feature selection in high-dimensional classification data. Computational Statistics & Data Analysis, 143, 106839. https://doi.org/10.1016/j.csda.2019.106839
  • Chen, Y., & Hao, Y. (2017). A feature weighted support vector machine and K-nearest neighbor algorithm for stock market indices prediction. Expert Systems with Applications, 80, 340-355. https://doi.org/10.1016/j.eswa.2017.02.044
  • Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13, 21–27. https://doi.org/10.1109/TIT.1967.1053964
  • Fisher, R. (1936). Iris [Dataset]. UCI Machine Learning Repository.
  • Gul, N., Mashwani, W. K., Aamir, M., Aldahmani, S., & Khan, Z. (2023). Optimal model selection for k-nearest neighbours ensemble via sub-bagging and sub-sampling with feature weighting. Alexandria Engineering Journal, 72, 157-168. https://doi.org/10.1016/j.aej.2023.03.075
  • Halder, R. K., Uddin, M. N., Uddin, M. A., Aryal, S., & Khraisat, A. (2024). Enhancing K-nearest neighbor algorithm: a comprehensive review and performance analysis of modifications. Journal of Big Data, 11(1), 113. https://doi.org/10.1186/s40537-024-00973-y
  • Ircio, J., Lojo, A., Mori, U., & Lozano, J. A. (2020). Mutual information based feature subset selection in multivariate time series classification. Pattern Recognition, 108, 107525.
  • Łukaszuk, T., Krawczuk, J., Żyła, K., & Kęsik, J. (2024). Stability of feature selection in multi-omics data analysis. Applied Sciences, 14, 11103. https://doi.org/10.3390/app142311103
  • Peng, H., Long, F., & Ding, C. (2005). Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(8), 1226-1238. https://doi.org/10.1109/TPAMI.2005.159
  • Popoola, G., Fuhnwi, G., Agbaje, J., & Fesomade, K. (2024). A modified ant colony optimization with KNN for high-dimensional data classification. In: Arai, K. (eds) Intelligent Computing. SAI 2024. Lecture Notes in Networks and Systems, vol 1018. Springer, Cham. https://doi.org/10.1007/978-3-031-62269-4_19
  • Pudjihartono, N., Fadason, T., Kempa-Liehr, A. W., & O'Sullivan, J. M. (2022). A review of feature selection methods for machine learning-based disease risk prediction. Frontiers in Bioinformatics, 2, 927312. https://doi.org/10.3389/fbinf.2022.927312
  • Saeys, Y., Inza, I., & Larranaga, P. (2007). A review of feature selection techniques in bioinformatics. Bioinformatics, 23(19), 2507-2517. https://doi.org/10.1093/bioinformatics/btm344
  • Salman, R., Alzaatreh, A., Sulieman, H., & Faisal, S. (2021). A bootstrap framework for aggregating within and between feature selection methods. Entropy, 23, 200. https://doi.org/10.3390/e23020200
  • Uddin, S., Haque, I., Lu, H., Moni, M. A., & Gide, E. (2022). Comparative performance analysis of K-nearest neighbour (KNN) algorithm and its different variants for disease prediction. Scientific Reports, 12(1), 6256. https://doi.org/10.1038/s41598-022-10358-x
  • Venkateswara, H., Lade, P., Lin, B., Ye, J., & Panchanathan, S. (2015). Efficient approximate solutions to mutual information based global feature selection. IEEE International Conference on Data Mining (pp. 1009–1014). IEEE. https://doi.org/10.1109/ICDM.2015.140
  • Vergara, J. R., & Estévez, P. A. (2014). A review of feature selection methods based on mutual information. Neural Computing and Applications, 24, 175–186. https://doi.org/10.1007/s00521-013-1368-0


Details

Primary Language English
Subjects Electrical Engineering (Other)
Journal Section Research Article
Authors

Noor Baha Aldin 0000-0002-7351-4083

Submission Date October 3, 2025
Acceptance Date November 7, 2025
Early Pub Date December 9, 2025
Publication Date January 15, 2026
DOI https://doi.org/10.34248/bsengineering.1796638
IZ https://izlik.org/JA98CM63NC
Published in Issue Year 2026 Volume: 9 Issue: 1

Cite

APA Baha Aldin, N. (2026). Bootstrap-Driven Feature Weighting For Stable k-NN Performance. Black Sea Journal of Engineering and Science, 9(1), 78-86. https://doi.org/10.34248/bsengineering.1796638
AMA 1.Baha Aldin N. Bootstrap-Driven Feature Weighting For Stable k-NN Performance. BSJ Eng. Sci. 2026;9(1):78-86. doi:10.34248/bsengineering.1796638
Chicago Baha Aldin, Noor. 2026. “Bootstrap-Driven Feature Weighting For Stable K-NN Performance”. Black Sea Journal of Engineering and Science 9 (1): 78-86. https://doi.org/10.34248/bsengineering.1796638.
EndNote Baha Aldin N (January 1, 2026) Bootstrap-Driven Feature Weighting For Stable k-NN Performance. Black Sea Journal of Engineering and Science 9 1 78–86.
IEEE [1]N. Baha Aldin, “Bootstrap-Driven Feature Weighting For Stable k-NN Performance”, BSJ Eng. Sci., vol. 9, no. 1, pp. 78–86, Jan. 2026, doi: 10.34248/bsengineering.1796638.
ISNAD Baha Aldin, Noor. “Bootstrap-Driven Feature Weighting For Stable K-NN Performance”. Black Sea Journal of Engineering and Science 9/1 (January 1, 2026): 78-86. https://doi.org/10.34248/bsengineering.1796638.
JAMA 1.Baha Aldin N. Bootstrap-Driven Feature Weighting For Stable k-NN Performance. BSJ Eng. Sci. 2026;9:78–86.
MLA Baha Aldin, Noor. “Bootstrap-Driven Feature Weighting For Stable K-NN Performance”. Black Sea Journal of Engineering and Science, vol. 9, no. 1, Jan. 2026, pp. 78-86, doi:10.34248/bsengineering.1796638.
Vancouver 1.Noor Baha Aldin. Bootstrap-Driven Feature Weighting For Stable k-NN Performance. BSJ Eng. Sci. 2026 Jan. 1;9(1):78-86. doi:10.34248/bsengineering.1796638
