ELKARTEK 2021
Automated machine learning aims to optimize machine learning pipelines automatically, given a dataset, a task type and a target variable. This research analyzes the use of genetic programming to perform automated feature engineering in regression problems. It introduces a methodology that performs feature selection and constructs new features from the original feature set by combining and selecting features in the leaf nodes of the genetic programming tree. A multiple feature generation technique is proposed, in which three different feature sets are tested with linear regression, a Random Forest regressor and a Gradient Boosting regressor. The proposed approach is applied to an industrial process dataset whose target variable is an indicator of process performance. The experimental results show that the method can reduce the cardinality of the original feature set while maintaining the performance of the learning models. Moreover, they show that the newly constructed feature discriminates the target variable better.
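The idea of building features in the leaf nodes of a genetic programming tree can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the operator set, the fitness measure (absolute Pearson correlation with the target), the depth limit and every function name below are assumptions chosen for illustration, and the regressors named in the abstract are not reproduced here.

```python
import math
import random

# Assumed operator set; the operators used in the paper are not stated here.
OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b, "mul": lambda a, b: a * b}

def random_tree(n_features, max_depth, rng):
    """Grow a random expression tree whose leaves are original feature indices."""
    if max_depth == 0 or rng.random() < 0.3:
        return ("x", rng.randrange(n_features))
    op = rng.choice(sorted(OPS))
    return (op, random_tree(n_features, max_depth - 1, rng),
            random_tree(n_features, max_depth - 1, rng))

def evaluate(tree, row):
    """Compute the constructed feature for one data row."""
    if tree[0] == "x":
        return row[tree[1]]
    return OPS[tree[0]](evaluate(tree[1], row), evaluate(tree[2], row))

def used_features(tree):
    """Original features appearing in the leaves: the implicit feature selection."""
    if tree[0] == "x":
        return {tree[1]}
    return used_features(tree[1]) | used_features(tree[2])

def depth(tree):
    return 0 if tree[0] == "x" else 1 + max(depth(tree[1]), depth(tree[2]))

def fitness(tree, X, y):
    """Assumed fitness: absolute Pearson correlation of the new feature with y."""
    f = [evaluate(tree, row) for row in X]
    n = len(y)
    mf, my = sum(f) / n, sum(y) / n
    cov = sum((a - mf) * (b - my) for a, b in zip(f, y))
    sf = math.sqrt(sum((a - mf) ** 2 for a in f))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sf == 0 or sy == 0 else abs(cov / (sf * sy))

def paths(tree, path=()):
    """Yield the path to every node of the tree."""
    yield path
    if tree[0] != "x":
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))

def subtree(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the node at path swapped for new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)

def evolve(X, y, pop_size=30, generations=15, max_depth=5, seed=0):
    """Evolve one constructed feature with crossover, mutation and elitism."""
    rng = random.Random(seed)
    n_features = len(X[0])
    pop = [random_tree(n_features, 3, rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda t: fitness(t, X, y), reverse=True)
        parents = ranked[: pop_size // 2]
        children = [ranked[0]]  # elitism: the best tree always survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: graft a random subtree of b into a random node of a.
            donor = subtree(b, rng.choice(list(paths(b))))
            child = replace(a, rng.choice(list(paths(a))), donor)
            if rng.random() < 0.2:  # mutation: replace a node with a fresh subtree
                child = replace(child, rng.choice(list(paths(child))),
                                random_tree(n_features, 2, rng))
            # Reject oversized children to limit bloat; fall back to parent a.
            children.append(child if depth(child) <= max_depth else a)
        pop = children
    return max(pop, key=lambda t: fitness(t, X, y))

if __name__ == "__main__":
    rng = random.Random(42)
    X = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(100)]
    y = [row[0] * row[1] + 0.05 * rng.gauss(0, 1) for row in X]  # synthetic target
    best = evolve(X, y)
    print("constructed feature:", best)
    print("original features used:", sorted(used_features(best)))
    print("fitness:", fitness(best, X, y))
```

In this sketch the set returned by `used_features` plays the role of the reduced feature set, while `evaluate` produces the new feature that would then be passed to a downstream regressor. The input data is assumed to be scaled to a bounded range so that repeated multiplication stays numerically well behaved.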
Automated machine learning, Feature engineering, Genetic programming, Predictive analytics
Basque Government
This work was supported by the BISUM project under grant ELKARTEK 2021 through the Basque Government.
Primary Language | English |
---|---|
Subjects | Artificial Intelligence |
Section | Research Articles |
Authors | |
Project Number | ELKARTEK 2021 |
Publication Date | June 30, 2023 |
Acceptance Date | February 20, 2023 |
Published in Issue | Year 2023 Volume: 3 Issue: 1 |
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.