Research Article

PYALLFFS: An Open-Source Library for All Filter Feature Selection Methods

Year 2024, Volume: 7 Issue: 5, 971 - 981, 15.09.2024
https://doi.org/10.34248/bsengineering.1467132

Abstract

Feature selection is an important data mining and machine learning technique that improves model performance by identifying the most informative features in a dataset, reducing the risk of overfitting and helping models make faster, more accurate predictions. Pyallffs is a Python library developed to streamline the feature selection process, offering broad method coverage with minimal dependencies. With 19 different filtering methods, pyallffs helps analyze dataset features and determine the most relevant ones. Users can apply the desired filtering methods to their datasets with pyallffs, achieving faster and more effective results in data analytics and machine learning projects. The source code, supplementary materials, and usage guidance are publicly available on GitHub: https://github.com/tohid-yousefi/pyallffs.
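
For readers unfamiliar with the filter paradigm, a filter method scores each feature against the target independently of any downstream model and then keeps the top-ranked features. The short sketch below illustrates that idea with scikit-learn's mutual-information scorer on a public dataset; it is only an illustration of what a filter method computes, not the pyallffs API, whose actual usage is documented in the GitHub repository linked above.

    # Minimal sketch of filter-style feature ranking (illustrative only, not the pyallffs API):
    # score every feature against the target with mutual information, rank, and keep the best.
    import pandas as pd
    from sklearn.datasets import load_breast_cancer
    from sklearn.feature_selection import mutual_info_classif

    data = load_breast_cancer(as_frame=True)
    X, y = data.data, data.target

    scores = mutual_info_classif(X, y, random_state=0)   # one relevance score per feature
    ranking = pd.Series(scores, index=X.columns).sort_values(ascending=False)

    selected = ranking.head(10).index.tolist()           # keep the ten top-ranked features
    print(ranking.head(10))

Each of the 19 filter methods in pyallffs plays the role of the scoring step above, substituting its own relevance criterion for mutual information.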

References

  • Ali Khan S, Hussain A, Basit A, Akram S. 2014. Kruskal-Wallis-based computationally efficient feature selection for face recognition. Sci World J, 2014: 1-6.
  • Ali SI, Shahzad W. 2012. A feature subset selection method based on symmetric uncertainty and ant colony optimization. In: 2012 Inter Conference on Emerging Technologies, 8-9 October, 2012, Islamabad, Pakistan, pp: 1-6.
  • Arauzo-Azofra A, Benitez JM, Castro JL. 2004. A feature set measure based on relief. In: Proceedings of the fifth Inter conference on Recent Advances in Soft Computing, April 27-28, Copenhagen, Denmark pp: 104-109.
  • Battiti R. 1994. Using mutual information for selecting features in supervised neural net learning. IEEE Transact Neural Networks, 4: 537-550.
  • Belkin M, Niyogi P. 2001. Laplacian eigenmaps and spectral techniques for embedding and clustering. Adv Neural Inform Proces Systems, 2001: 14.
  • Beraha M, Metelli AM, Papini M, Tirinzoni A, Restelli M. 2019. Feature selection via mutual information: New theoretical insights. In: 2019 Inter Joint Conference on Neural Networks (IJCNN), 14-19 July 2019, Budapest, Hungary pp: 1-9.
  • Bolón-Canedo V, Sánchez-Marono N, Alonso-Betanzos A, Benítez JM, Herrera F. 2014. A review of microarray datasets and applied feature selection methods. Inform Sci, 282: 111-135.
  • Bryant FB, Satorra A. 2012. Principles and practice of scaled difference chi-square testing. Struct Equation Model: A Multidisciplin J, 3: 372-398.
  • Budak H, Taşabat SE. 2016. A modified t-score for feature selection. Anadolu Univ J Sci Technol A-Applied Sci Engin, 5: 845-852.
  • Carey JJ, Delaney MF. 2010. T-scores and Z-scores. Clinical Rev Bone Mineral Metabol, 8: 113-121.
  • Chandra B, Gupta M. 2011. An efficient statistical feature selection approach for classification of gene expression data. J Biomed Inform, 4: 529-535.
  • Chandrashekar G, Sahin F. 2014. A survey on feature selection methods. Comput Elect Engin, 1: 16-28.
  • Cover TM. 1999. Elements of information theory. John Wiley & Sons, London, UK, pp: 54.
  • Dash M, Liu H. 2003. Consistency-based search in feature selection. Artificial Intel, 1-2: 155-176.
  • Delacre M, Lakens D, Leys C. 2017. Why psychologists should by default use Welch's t-test instead of Student's t-test. Inter Rev Soc Psychol, 1: 92-101.
  • Ding C, Peng H. 2005. Minimum redundancy feature selection from microarray gene expression data. J Bioinform Comput Biol, 2: 185-205.
  • Doquire G, Verleysen M. 2011. Feature selection with mutual information for uncertain data. In: Data Warehousing and Knowledge Discovery: 13th Inter Conference, DaWaK 2011, Toulouse, France, August 29-September 2, pp: 330-341.
  • Esmael B, Arnaout A, Fruhwirth R, Thonhauser G. 2012. A statistical feature-based approach for operations recognition in drilling time series. Inter J Comput Inform Systems Industrial Manage Applicat, 4(6): 100-108.
  • Faulkner KG. 2005. The tale of the T-score: review and perspective. Osteoporosis Inter, 16: 347-352.
  • François D, Rossi F, Wertz V, Verleysen M. 2007. Resampling methods for parameter-free and robust feature selection with mutual information. Neurocomput, 70(7-9): 1276-1288.
  • Goswami S, Chakrabarti A. 2014. Feature selection: A practitioner view. Inter J Inform Technol Comput Sci (IJITCS), 6(11): 66
  • Gu Q, Li Z, Han J. 2012. Generalized fisher score for feature selection. arXiv preprint arXiv:1202.3725.
  • Hall MA, Holmes G. 2000. Benchmarking attribute selection techniques for data mining. IEEE Trans Knowl Data Eng, 15 (2003): 1437-1447.
  • Hall MA, Smith LA. 1998. Practical feature subset selection for machine learning. In: Computer Science Proceedings of the 21st Australasian Computer Science Conference ACSC’98, Perth, 4-6 February, Berlin, Germany, pp: 181-191.
  • He X, Cai D, Niyogi P. 2005. Laplacian score for feature selection. Adv Neural Inform Proces Systems, 2005: 18.
  • He X, Niyogi P. 2003. Locality preserving projections. Adv Neural Inform Proces Systems, 2003: 16.
  • Hernández-Torruco J, Canul-Reich J, Frausto-Solís J, Méndez-Castillo JJ. 2014. Feature selection for better identification of subtypes of Guillain-Barré syndrome. Comput Math Methods Med, 2014: 432109.
  • Kabir MM, Islam MM, Murase K. 2010. A new wrapper feature selection approach using neural network. Neurocomput, 73(16-18): 3273-3283.
  • Kalousis A, Prados J, Hilario M. 2007. Stability of feature selection algorithms: a study on high-dimensional spaces. Knowledge Inform Systems, 12: 95-116.
  • Kannan SS, Ramaraj N. 2010. A novel hybrid feature selection via symmetrical uncertainty ranking based local memetic search algorithm. Knowledge-Based Systems, 23(6): 580-585.
  • Kass GV. 1980. An exploratory technique for investigating large quantities of categorical data. J Royal Stat Soc: Series C (Applied Stat), 29(2): 119-127.
  • Kira K, Rendell LA. 1992. The feature selection problem: Traditional methods and a new algorithm. In: Proceedings of the Tenth National Conference on Artificial intelligence, July 12–16, California, USA, pp: 129-134.
  • Kohavi R, John GH. 1997. Wrappers for feature subset selection. Artificial Intel, 97(1-2): 273-324.
  • Koller D, Sahami M. 1996. Toward optimal feature selection. In: ICML, 292.
  • Kononenko I. 1994. Estimating attributes: Analysis and extensions of RELIEF. In: European Conference on Machine Learning, April 6-8, Catania, Italy, pp: 171-182.
  • Kraskov A, Stögbauer H, Grassberger P. 2004. Estimating mutual information. Physical Rev E, 69(6): 066138.
  • Kullback S, Leibler RA. 1951. On information and sufficiency. Annals Math Stat, 22(1): 79-86.
  • Ladha L, Deepa T. 2011. Feature selection methods and algorithms. Inter J Comput Sci Engin, 3(5): 1787-1797.
  • Liu H, Motoda H, Setiono R, Zhao Z. 2010. Feature selection: An ever evolving frontier in data mining. Feature Select Data Min, 2010: 4-13.
  • Lu D, Weng Q. 2007. A survey of image classification methods and techniques for improving classification performance. Inter J Remote Sensing, 28(5): 823-870.
  • Gao L, Li T, Yao L, Wen F. 2013. Research and application of data mining feature selection based on relief algorithm. Work, 2013: 515.
  • Mani K, Kalpana P. 2016. A review on filter based feature selection. Inter J Innov Res Computer Communicat Engin (IJIRCCE), pp: 2320-9801.
  • Martínez Casasnovas JA, Klaasse A, Nogués Navarro J, Ramos Martín MC. 2008. Comparison between land suitability and actual crop distribution in an irrigation district of the Ebro valley (Spain). Spanish J Agri Res, 6(4): 700-713.
  • Miao J, Niu L. 2016. A survey on feature selection. Procedia Comput Sci, 91: 919-926.
  • Naik A, Rangwala H. 2016. Embedding feature selection for large-scale hierarchical classification. In: 2016 IEEE Inter Conference on Big Data (Big Data), December 5-8, Washington DC, USA, pp: 1212-1221.
  • Nilsson R. 2007. Statistical feature selection: with applications in life science. Institutionen för fysik, kemi och biologi, Berlin, Germany, pp: 54.
  • Novaković J. 2016. Toward optimal feature selection using ranking methods and classification algorithms. Yugoslav J Operat Res, 21: 1.
  • Opitz D, Maclin R. 1999. Popular ensemble methods: An empirical study. J Artific Intel Res, 11: 169-198.
  • Pearl J. 1988. Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann.
  • Peng H, Fan Y. 2015. Direct l(2,p)-norm learning for feature selection. arXiv preprint arXiv:1504.00430.
  • Priyadarsini RP, Valarmathi M, Sivakumari S. 2011. Gain ratio based feature selection method for privacy preservation. ICTACT J Soft Comput, 1(4): 201-205.
  • Radovic M, Ghalwash M, Filipovic N, Obradovic Z. 2017. Minimum redundancy maximum relevance feature selection approach for temporal gene expression data. BMC Bioinform, 18: 1-14.
  • Rossi F, Lendasse A, François D, Wertz V, Verleysen M. 2006. Mutual information for the selection of relevant variables in spectrometric nonlinear modelling. Chemometrics Intel Lab Systems, 80(2): 215-226.
  • Saeys Y, Inza I, Larranaga P. 2007. A review of feature selection techniques in bioinformatics. Bioinformatics, 23(19): 2507-2517.
  • Sedgwick P. 2012. Pearson’s correlation coefficient. BMJ, 2012: 345.
  • Shannon CE. 1948. A mathematical theory of communication. Bell System Technic J, 27(3): 379-423.
  • Shardlow M. 2016. An analysis of feature selection techniques. J Univ Manchester, 2016: 1-7.
  • Shen J, Li L, Wong W-K. 2008. Markov Blanket Feature Selection for Support Vector Machines. AAAI, 2008: 696-701.
  • Singh B, Kushwaha N, Vyas OP. 2014. A feature subset selection technique for high dimensional data using symmetric uncertainty. J Data Analysis Inform Proces, 2(4): 95-105.
  • Suebsing A, Hiransakolwong N. 2009. Feature selection using euclidean distance and cosine similarity for intrusion detection model. In: 2009 First Asian Conference on Intelligent Information and Database Systems, April 1-3, Dong Hoi, Quang Binh, Vietnam, pp: 86-91.
  • Suzuki T, Sugiyama M, Sese J, Kanamori T. 2008. Approximating mutual information by maximum likelihood density ratio estimation. PMLR, 2008: 5-20.
  • Suzuki T, Sugiyama M, Tanaka T. 2009. Mutual information approximation via maximum likelihood estimation of density ratio. In: 2009 IEEE Inter Symposium on Information Theory, 28 June - 3 July, Seoul, Korea, pp: 463-467.
  • Tsamardinos I, Aliferis CF, Statnikov A. 2003. Time and sample efficient discovery of Markov blankets and direct causal relations. In: Proceedings of the ninth ACM SIGKDD Inter Conference on Knowledge Discovery and Data Mining, August 24-27, Washington, DC, USA, pp: 673-678.
  • Tsamardinos I, Aliferis CF, Statnikov AR, Statnikov E. 2003. Algorithms for large scale Markov blanket discovery. FLAIRS, 2003: 376-381.
  • Ugoni A, Walker BF. 1995. The Chi square test: an introduction. COMSIG Rev, 4(3): 61.
  • Urbanowicz RJ, Olson RS, Schmitt P, Meeker M, Moore JH. 2018. Benchmarking relief-based feature selection methods for bioinformatics data mining. J Biomed Inform, 85: 168-188.
  • Vergara JR, Estévez PA. 2014. A review of feature selection methods based on mutual information. Neural Comput Applicat, 24: 175-186.
  • Von Luxburg U. 2007. A tutorial on spectral clustering. Stat Comput, 17: 395-416.
  • Vora S, Yang H. 2017. A comprehensive study of eleven feature selection algorithms and their impact on text classification. In: 2017 Computing Conference, 18-20 July, Kensington, London, UK, pp: 440-449.
  • Welch BL. 1947. The generalization of ‘Student's’ problem when several different population variances are involved. Biometrika, 34(1-2): 28-35.
  • Witten IH, Frank E. 2002. Data mining: practical machine learning tools and techniques with Java implementations. Acm Sigmod Rec, 31(1): 76-77.
  • Witten IH, Frank E, Hall MA, Pal CJ. 2005. Data mining: Practical machine learning tools and techniques, pp: 403-413.
  • Xiang S, Nie F, Meng G, Pan C, Zhang C. 2012. Discriminative least squares regression for multiclass classification and feature selection. IEEE Transact Neural Networks Learn Systems, 23(11): 1738-1754.
  • Yousefi T, Aktaş ÖV. 2024. Predicting Customer Satisfaction with Hybrid Basic Filter-Based Feature Selection Method.
  • Yousefi T, Varlıklar Ö. 2024. Breast cancer prediction with hybrid filter-wrapper feature selection. Inter J Adv Nat Sci Engin Res, 8: 411-419.
  • Zheng A, Casari A. 2018. Feature engineering for machine learning: principles and techniques for data scientists. O'Reilly Media, London, UK, pp: 263.

There are 76 citations in total.

Details

Primary Language English
Subjects Information Systems (Other), Applied Statistics, Applied Mathematics (Other)
Journal Section Research Articles
Authors

Tohid Yousefi 0000-0003-4288-8194

Özlem Varlıklar 0000-0001-6415-0698

Early Pub Date September 5, 2024
Publication Date September 15, 2024
Submission Date April 9, 2024
Acceptance Date September 3, 2024
Published in Issue Year 2024 Volume: 7 Issue: 5

Cite

APA Yousefi, T., & Varlıklar, Ö. (2024). PYALLFFS: An Open-Source Library for All Filter Feature Selection Methods. Black Sea Journal of Engineering and Science, 7(5), 971-981. https://doi.org/10.34248/bsengineering.1467132
AMA Yousefi T, Varlıklar Ö. PYALLFFS: An Open-Source Library for All Filter Feature Selection Methods. BSJ Eng. Sci. September 2024;7(5):971-981. doi:10.34248/bsengineering.1467132
Chicago Yousefi, Tohid, and Özlem Varlıklar. “PYALLFFS: An Open-Source Library for All Filter Feature Selection Methods”. Black Sea Journal of Engineering and Science 7, no. 5 (September 2024): 971-81. https://doi.org/10.34248/bsengineering.1467132.
EndNote Yousefi T, Varlıklar Ö (September 1, 2024) PYALLFFS: An Open-Source Library for All Filter Feature Selection Methods. Black Sea Journal of Engineering and Science 7 5 971–981.
IEEE T. Yousefi and Ö. Varlıklar, “PYALLFFS: An Open-Source Library for All Filter Feature Selection Methods”, BSJ Eng. Sci., vol. 7, no. 5, pp. 971–981, 2024, doi: 10.34248/bsengineering.1467132.
ISNAD Yousefi, Tohid - Varlıklar, Özlem. “PYALLFFS: An Open-Source Library for All Filter Feature Selection Methods”. Black Sea Journal of Engineering and Science 7/5 (September 2024), 971-981. https://doi.org/10.34248/bsengineering.1467132.
JAMA Yousefi T, Varlıklar Ö. PYALLFFS: An Open-Source Library for All Filter Feature Selection Methods. BSJ Eng. Sci. 2024;7:971–981.
MLA Yousefi, Tohid and Özlem Varlıklar. “PYALLFFS: An Open-Source Library for All Filter Feature Selection Methods”. Black Sea Journal of Engineering and Science, vol. 7, no. 5, 2024, pp. 971-8, doi:10.34248/bsengineering.1467132.
Vancouver Yousefi T, Varlıklar Ö. PYALLFFS: An Open-Source Library for All Filter Feature Selection Methods. BSJ Eng. Sci. 2024;7(5):971-8.
