TY  - JOUR
T1  - Comprehensive Benchmarking Analysis for Evaluating Effectiveness of Transfer Learning-based Feature Engineering in AutoML
AU  - Sırt, Merve
AU  - Eyüpoğlu, Can
PY  - 2025
DA  - February
Y2  - 2024
DO  - 10.52876/jcs.1604889
JF  - The Journal of Cognitive Systems
JO  - JCS
PB  - İstanbul Technical University
WT  - DergiPark
SN  - 2548-0650
SP  - 30
EP  - 37
VL  - 9
IS  - 2
LA  - en
AB  - This study conducts a comprehensive benchmarking analysis to evaluate the effectiveness of transfer learning-based feature engineering in Automated Machine Learning (AutoML) systems. The research compares traditional manual feature engineering, standard AutoML approaches, and transfer learning-enhanced AutoML across diverse data modalities, including images, text, and tabular data. Experimental evaluations were carried out using CIFAR-10, IMDB Reviews, and Adult Census Income datasets, focusing on assessing each approach in terms of model performance, training time, and resource utilization. The findings reveal that transfer learning-enhanced AutoML significantly reduces training time by up to 45% while improving model accuracy by up to 20%, particularly for image and text datasets. Furthermore, scenarios with high feature reuse rates demonstrated memory utilization improvements of up to 30%. These results underscore the substantial advantages of integrating transfer learning into AutoML systems for optimizing feature engineering processes.
KW  - AutoML
KW  - Transfer Learning
KW  - Feature Engineering
KW  - Machine Learning Optimization
CR  - [1]	X. He, K. Zhao, X. Chu. “AutoML: A Survey of the State-of-the-Art.” Knowledge-Based Systems 212 (January 2021): 106622. https://doi.org/10.1016/j.knosys.2020.106622.
CR  - [2]	K. Chauhan et al., "Automated Machine Learning: The New Wave of Machine Learning," 2020 2nd International Conference on Innovative Mechanisms for Industry Applications (ICIMIA), Bangalore, India, 2020, pp. 205-212, doi: 10.1109/ICIMIA48430.2020.9074859.
CR  - [3]	T. Nagarajah and G. Poravi, "A Review on Automated Machine Learning (AutoML) Systems," 2019 IEEE 5th International Conference for Convergence in Technology (I2CT), Bombay, India, 2019, pp. 1-6, doi: 10.1109/I2CT45611.2019.9033810.
CR  - [4]	M. Baratchi, C. Wang, S. Limmer, J. N. van Rijn, H. Hoos, T. Bäck and M. Olhofer. 2024. “Automated Machine Learning: Past, Present and Future.” Artificial Intelligence Review 57 (5). https://doi.org/10.1007/s10462-024-10726-1.
CR  - [5]	V. K. Harikrishnan, M. Vijarania, and A. Gambhir. “Diabetic Retinopathy Identification Using AutoML.” Computational Intelligence and Its Applications in Healthcare, 2020, 175–88. https://doi.org/10.1016/b978-0-12-820604-1.00012-1.
CR  - [6]	D. Salinas and N. Erickson. 2023. “TabRepo: A Large Scale Repository of Tabular Model Evaluations and its AutoML Applications.” arXiv preprint arXiv:2311.02971.
CR  - [7]	P. Malakar, P. Balaprakash, V. Vishwanath, V. Morozov and K. Kumaran, "Benchmarking Machine Learning Methods for Performance Modeling of Scientific Applications," 2018 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), Dallas, TX, USA, 2018, pp. 33-44, doi: 10.1109/PMBS.2018.8641686.
CR  - [8]	D.V. Anand, Q. Xu, J. Wee. Topological feature engineering for machine learning based halide perovskite materials design. npj Comput Mater 8, 203 (2022). https://doi.org/10.1038/s41524-022-00883-8
CR  - [9]	M. Zöller, H. F. Huber (2021). Benchmark and survey of automated machine learning frameworks. https://arxiv.org/abs/1904.12054
CR  - [10]	Y. Abouelnaga, O. S. Ali, H. Rady, M. Moustafa. (2016). CIFAR-10: KNN-based ensemble of classifiers. https://arxiv.org/abs/1611.04905
CR  - [11]	Z. Yan, J. Zhou, W. Wong. (2021). Near Lossless Transfer Learning for Spiking Neural Networks. AAAI Conference on Artificial Intelligence.
CR  - [12]	R. Istrate, F. Scheidegger, G. Mariani, D. Nikolopoulos, C. Bekas, A. Cristiano Innocenza Malossi. Tapas: Train-less accuracy predictor for architecture search. In Proceedings of the AAAI conference on artificial intelligence, pp. 3927–3934, 2019.
CR  - [13]	Y. You and J. Demmel, "Runtime Data Layout Scheduling for Machine Learning Dataset," 2017 46th International Conference on Parallel Processing (ICPP), Bristol, UK, 2017, pp. 452-461, doi: 10.1109/ICPP.2017.54.
CR  - [14]	P. M. Radiuk. Impact of training set batch size on the performance of convolutional neural networks for diverse datasets. (2017)
CR  - [15]	M. S. Başarslan, F. Kayaalp, “Sentiment analysis with ensemble and machine learning methods in multi-domain datasets”, TUJE, vol. 7, no. 2, pp. 141–148, 2023, doi: 10.31127/tuje.1079698.
CR  - [16]	Weblink-1: https://archive.ics.uci.edu/dataset/691/cifar+10
CR  - [17]	Weblink-2: https://www.tensorflow.org/datasets/catalog/imdb_reviews
CR  - [18]	Weblink-3: https://archive.ics.uci.edu/dataset/2/adult
UR  - https://doi.org/10.52876/jcs.1604889
L1  - https://dergipark.org.tr/en/download/article-file/4456103
ER  -