ELKARTEK 2021
Automated machine learning aims to optimize machine learning pipelines automatically, given a dataset, a task type, and a target variable. This research analyzes the use of genetic programming for automated feature engineering in regression problems. It introduces a methodology that performs feature selection and constructs new features from the original feature set by combining and selecting features in the leaf nodes of the genetic programming tree. A multiple feature generation technique is proposed, in which three different feature sets are tested with linear regression, a Random Forest regressor, and a Gradient Boosting regressor. The proposed approach is applied to an industrial process dataset whose target variable is an indicator of process performance. The experimental results show that the method reduces the cardinality of the original feature set while maintaining the performance of the learning models. They also show that the newly constructed feature discriminates the target variable better.
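To illustrate the idea of constructing features in a genetic programming tree, the following is a minimal sketch and not the authors' implementation. It assumes a tree whose leaves are indices of original features and whose internal nodes are arithmetic operators, and it uses the absolute Pearson correlation with the target as a stand-in fitness (the abstract does not specify the fitness function). All function names (`random_tree`, `evolve_feature`, `used_features`, and so on) are hypothetical.

```python
import math
import operator
import random


def protected_div(a, b):
    """Division that returns 1.0 when the denominator is (almost) zero."""
    return a / b if abs(b) > 1e-9 else 1.0


OPS = [operator.add, operator.sub, operator.mul, protected_div]


def random_tree(n_features, depth, rng):
    """Grow a random expression tree; a leaf is the index of an original feature."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_features)
    op = rng.choice(OPS)
    return (op,
            random_tree(n_features, depth - 1, rng),
            random_tree(n_features, depth - 1, rng))


def evaluate(tree, row):
    """Compute the constructed feature value for one data row."""
    if isinstance(tree, int):
        return row[tree]
    op, left, right = tree
    return op(evaluate(left, row), evaluate(right, row))


def used_features(tree):
    """Original feature indices that appear in the leaves (the selected subset)."""
    if isinstance(tree, int):
        return {tree}
    _, left, right = tree
    return used_features(left) | used_features(right)


def abs_pearson(xs, ys):
    """Absolute Pearson correlation; 0.0 for constant or non-finite inputs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) * (x - mx) for x in xs)
    syy = sum((y - my) * (y - my) for y in ys)
    if not (sxx > 0 and syy > 0):
        return 0.0
    r = abs(sxy) / math.sqrt(sxx * syy)
    return r if math.isfinite(r) else 0.0


def mutate(tree, n_features, rng, depth=2):
    """Subtree mutation: replace a randomly chosen subtree with a fresh one."""
    if isinstance(tree, int) or rng.random() < 0.3:
        return random_tree(n_features, depth, rng)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, n_features, rng, depth), right)
    return (op, left, mutate(right, n_features, rng, depth))


def evolve_feature(X, y, generations=20, pop_size=30, seed=0):
    """Evolve one constructed feature with elitist selection and mutation only."""
    rng = random.Random(seed)
    n_features = len(X[0])

    def fitness(tree):
        return abs_pearson([evaluate(tree, row) for row in X], y)

    # Seed the population with every original feature, then fill with random trees.
    population = list(range(n_features))
    while len(population) < pop_size:
        population.append(random_tree(n_features, 3, rng))

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]  # the best individuals are kept
        children = [mutate(rng.choice(survivors), n_features, rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children

    best = max(population, key=fitness)
    return best, fitness(best)
```

Calling `evolve_feature(X, y)` on a list of rows `X` and a target list `y` returns the best tree and its fitness, and `used_features(best)` lists the original features that the tree retains. Because the population is seeded with every original feature and the best individuals always survive, the returned fitness is never lower than that of the best single original feature under this stand-in criterion. A fuller treatment would add crossover and would score candidate feature sets with the regression models themselves.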
Basque Government
This work was supported by the BISUM project under grant ELKARTEK 2021 through the Basque Government.
Primary Language | English |
---|---|
Subjects | Artificial Intelligence |
Journal Section | Research Articles |
Authors | |
Project Number | ELKARTEK 2021 |
Publication Date | June 30, 2023 |
Acceptance Date | February 20, 2023 |
Published in Issue | Year 2023 Volume: 3 Issue: 1 |
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.