Research Article

PWFS: A scalable parallel Python module for wrapper feature selection

Volume: 5 Number: 2 July 31, 2025

Abstract

Feature selection is a crucial step in machine learning and can significantly affect the performance of predictive models. Although various time-efficient algorithms exist, exhaustive search is the only method guaranteed to find the optimal feature subset, and it carries an enormous computational load: beyond a certain feature count, a lifetime would not be enough to complete it. This study proposes a generic, scalable, open-source parallel Python module that finds the best wrapper feature subset in an optimized execution time, especially for reasonable feature counts. The parallel wrapper feature selection module, PWFS, is independent of the machine learning algorithm and works with user-defined methods. The framework aims to maximize the benefit on the machine learning side by improving parallel performance and efficiency. The system design is built on efficient message-passing communication, in which the framework distributes the computational load equally among the parallel agents via feature masking. The module is validated on two workstations, one of which is hyper-threading capable. Hyper-threading yields an overall performance gain of 19.77%. Across various scenarios and experiments, the framework achieves different speedups and efficiencies of up to 96.74%, validating the flexible design of the proposed parallel framework. The source code of the module is available at https://github.com/haeren/parallel-feature-selector and https://pypi.org/project/parallel-feature-selector/.
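
The idea of splitting an exhaustive search equally among parallel agents via feature masking can be illustrated with a minimal sketch. This is not the PWFS API: the function names (`mask_range`, `mask_to_indices`, `best_in_block`, `toy_score`) are hypothetical, the scoring function is a toy stand-in for a user-defined wrapper evaluation, and the workers are simulated in a plain loop rather than with message passing. Each non-empty feature subset is encoded as an integer bitmask, and the range of masks is cut into contiguous blocks whose sizes differ by at most one.

```python
from typing import Callable, List, Tuple


def mask_range(n_features: int, rank: int, size: int) -> range:
    """Contiguous block of non-empty subset masks assigned to one worker."""
    total = (1 << n_features) - 1        # masks 1 .. 2^n - 1 (empty subset skipped)
    base, extra = divmod(total, size)    # block sizes differ by at most one
    start = 1 + rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


def mask_to_indices(mask: int, n_features: int) -> List[int]:
    """Decode a bitmask into the list of selected feature indices."""
    return [i for i in range(n_features) if (mask >> i) & 1]


def best_in_block(
    n_features: int,
    rank: int,
    size: int,
    score: Callable[[List[int]], float],
) -> Tuple[float, int]:
    """Evaluate every mask in this worker's block; return (best score, best mask)."""
    best = (float("-inf"), 0)
    for mask in mask_range(n_features, rank, size):
        s = score(mask_to_indices(mask, n_features))
        if s > best[0]:
            best = (s, mask)
    return best


def toy_score(indices: List[int]) -> float:
    """Hypothetical stand-in for a user-defined wrapper evaluation:
    rewards features 0 and 2, penalizes subset size."""
    useful = {0, 2}
    return len(useful & set(indices)) - 0.1 * len(indices)


# Simulate three workers sequentially (assumption: a real run would place
# each rank in its own process and combine the local results by a reduction).
n_features, size = 4, 3
local_bests = [best_in_block(n_features, r, size, toy_score) for r in range(size)]
best_score, best_mask = max(local_bests)
best_subset = mask_to_indices(best_mask, n_features)
```

With 4 features and 3 workers, the 15 non-empty masks split into three blocks of 5, and the combined result selects features 0 and 2. Because every mask costs one model evaluation, equal block sizes give an equal load only when evaluations take similar time; subsets with more features may be slower to evaluate in practice.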

Details

Primary Language: English
Subjects: High Performance Computing, Machine Learning Algorithms, Data Mining and Knowledge Discovery, Computer Software
Journal Section: Research Article
Publication Date: July 31, 2025
Submission Date: February 14, 2025
Acceptance Date: April 27, 2025
Published in Issue: Year 2025 Volume: 5 Number: 2

APA
Eren, H. A., Okyay, S., & Adar, N. (2025). PWFS: A scalable parallel Python module for wrapper feature selection. Journal of Innovative Engineering and Natural Science, 5(2), 704-719. https://doi.org/10.61112/jiens.1639780


Journal of Innovative Engineering and Natural Science by İdris Karagöz is licensed under CC BY 4.0