Comprehensive Benchmarking Analysis for Evaluating Effectiveness of Transfer Learning-based Feature Engineering in AutoML

Merve Sırt; Can Eyüpoğlu

doi:10.52876/jcs.1604889

Research Article

Year 2024, Volume: 9, Issue: 2, pp. 30-37, 01.02.2025

Abstract

This study presents a benchmarking analysis of transfer learning-based feature engineering in Automated Machine Learning (AutoML) systems. It compares three approaches (traditional manual feature engineering, standard AutoML, and transfer learning-enhanced AutoML) across image, text, and tabular data. Experiments were carried out on the CIFAR-10, IMDB Reviews, and Adult Census Income datasets, and each approach was assessed in terms of model performance, training time, and resource utilization. The findings show that transfer learning-enhanced AutoML reduces training time by up to 45% and improves model accuracy by up to 20%, particularly for image and text datasets. In addition, scenarios with high feature reuse rates showed memory utilization improvements of up to 30%. These results underscore the advantages of integrating transfer learning into AutoML systems for optimizing feature engineering.
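As an illustration of the kind of measurement the abstract describes, the following is a minimal sketch of a benchmarking harness that records accuracy, wall-clock training time, and peak memory for one pipeline, plus helpers for the relative comparisons. It is not the authors' code: the names `RunResult`, `benchmark`, `percent_reduction`, and `percent_improvement` are hypothetical, the training and evaluation callables are placeholders, and it assumes the reported percentages are relative to a baseline, which the abstract does not state. Memory is tracked with Python's `tracemalloc`, which only sees allocations made through the Python allocator.

```python
import time
import tracemalloc
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class RunResult:
    """Metrics for one pipeline run (hypothetical container)."""
    name: str
    accuracy: float
    train_seconds: float
    peak_mem_bytes: int


def benchmark(name: str,
              train_fn: Callable[[], Any],
              eval_fn: Callable[[Any], float]) -> RunResult:
    """Time train_fn, record peak traced memory, then score the model."""
    tracemalloc.start()
    start = time.perf_counter()
    model = train_fn()                      # placeholder training step
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return RunResult(name, eval_fn(model), elapsed, peak)


def percent_reduction(baseline: float, candidate: float) -> float:
    """Relative reduction of candidate versus baseline, in percent."""
    return 100.0 * (baseline - candidate) / baseline


def percent_improvement(baseline: float, candidate: float) -> float:
    """Relative increase of candidate over baseline, in percent."""
    return 100.0 * (candidate - baseline) / baseline


if __name__ == "__main__":
    # Toy stand-ins for real pipelines; the returned accuracy is made up.
    result = benchmark("toy", lambda: [0] * 1000, lambda model: 0.5)
    print(result)
```

In a real comparison, each of the three approaches would be wrapped in its own `train_fn` and `eval_fn` and run on the same data split, and the relative helpers would be applied to the resulting `train_seconds`, `peak_mem_bytes`, and `accuracy` values.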

Keywords

AutoML, Transfer Learning, Feature Engineering, Machine Learning Optimization

Details

Primary Language	English
Subjects	Machine Learning (Other), Artificial Intelligence (Other)
Journal Section	Articles
Authors	Merve Sırt (ORCID 0009-0004-6348-2032), Can Eyüpoğlu (ORCID 0000-0002-6133-8617)
Publication Date	February 1, 2025
Submission Date	December 20, 2024
Acceptance Date	December 23, 2024
Published in Issue	Year 2024 Volume: 9 Issue: 2

Cite

APA	Sırt, M., & Eyüpoğlu, C. (2025). Comprehensive Benchmarking Analysis for Evaluating Effectiveness of Transfer Learning-based Feature Engineering in AutoML. The Journal of Cognitive Systems, 9(2), 30-37. https://doi.org/10.52876/jcs.1604889
