TY - JOUR
T1 - RLSE Aktivasyon Fonksiyonu Tasarımının Derin Sinir Ağlarının Performansındaki Etkisi
TT - The Impact of RLSE Activation Function Design on the Performance of Deep Neural Networks
AU - Özeloğlu, İsmihan Gül
AU - Akman Aydın, Eda
AU - Barışçı, Necaattin
PY - 2025
DA - November
Y2 - 2025
DO - 10.2339/politeknik.1601441
JF - Politeknik Dergisi
PB - Gazi University
WT - DergiPark
SN - 2147-9429
SP - 1
EP - 1
LA - tr
AB - Aktivasyon fonksiyonu, derin sinir ağlarının performansı üzerinde kritik etkisi olan bir bileşendir. Bu çalışmada, derin sinir ağlarında yüksek sınıflandırma doğruluğu ve düşük kayıp elde etmek için yeni bir aktivasyon fonksiyonu önerilmektedir. Önerilen RLSE (ReLU-LIP-Sigmoid-ELU kombinasyonu) aktivasyon fonksiyonu ile kaybolan gradyan sorununun ve ölmekte olan ReLU probleminin üstesinden gelinmesi hedeflenmektedir. RLSE aktivasyon fonksiyonunun performansı MNIST ve Fashion-MNIST veri kümeleri üzerinde değerlendirilmiş ve literatürde bulunan yeni geliştirilmiş aktivasyon fonksiyonlarıyla karşılaştırılmıştır. RLSE aktivasyon fonksiyonunun kullanılması ile, bu çalışmada tasarlanan Evrişimsel Sinir Ağı (ESA) mimarisinde MNIST veri kümesi için %99,04 ve Fashion-MNIST veri kümesi için %90,40 doğruluk oranları elde edilmiştir. Sonuçlar, RLSE aktivasyon fonksiyonunun diğer aktivasyon fonksiyonlarından daha iyi performans gösterdiğini ortaya koymaktadır.
KW - Aktivasyon fonksiyonu
KW - Derin sinir ağları
KW - Evrişimsel sinir ağı
N2 - The activation function is a critical component that significantly impacts the performance of deep neural networks. In this study, a novel activation function, RLSE (a combination of ReLU-LIP-Sigmoid-ELU), is proposed to achieve high classification accuracy and low loss in deep neural networks. The RLSE activation function aims to address the vanishing gradient problem and the dying ReLU issue. The performance of the RLSE activation function has been evaluated on the MNIST and Fashion-MNIST datasets and compared with recently developed activation functions in the literature. Using the RLSE activation function, the Convolutional Neural Network (CNN) architecture designed in this study achieved accuracy rates of 99.04% for the MNIST dataset and 90.40% for the Fashion-MNIST dataset. The results demonstrate that the RLSE activation function outperforms other activation functions.
CR - [1] Sengupta, S., Basak, S., Saikia, P., Paul, S., Tsalavoutis, V., Atiah, F., & Peters, A., “A review of deep learning with special emphasis on architectures, applications and recent trends”, Knowledge-Based Systems, 194: 105596, (2020).
CR - [2] Sarker, I. H., “Machine learning: Algorithms, real-world applications and research directions”, SN Computer Science, 2(3): 160, (2021).
CR - [3] Cong, S., & Zhou, Y., “A review of convolutional neural network architectures and their optimizations”, Artificial Intelligence Review, 56(3): 1905-1969, (2023).
CR - [4] Zheng, Y., Gao, Z., Wang, Y., & Fu, Q., “MOOC dropout prediction using FWTS-CNN model based on fused feature weighting and time series”, IEEE Access, 8: 225324-225335, (2020).
CR - [5] Dubey, S. R., Singh, S. K., & Chaudhuri, B. B., “Activation functions in deep learning: A comprehensive survey and benchmark”, Neurocomputing, 503: 92-108, (2022).
CR - [6] Krichen, M., “Convolutional neural networks: A survey”, Computers, 12(8): 151, (2023).
CR - [7] Nair, V., & Hinton, G. E., “Rectified linear units improve restricted Boltzmann machines”, in: Proc. International Conference on Machine Learning, Haifa, Israel, 807-814, (2010).
CR - [8] Glorot, X., Bordes, A., & Bengio, Y., “Deep sparse rectifier neural networks”, in: Proc. International Conference on Artificial Intelligence and Statistics, Ft. Lauderdale, FL, USA, (2011).
CR - [9] Apicella, A., Donnarumma, F., Isgrò, F., & Prevete, R., “A survey on modern trainable activation functions”, Neural Networks, 138: 14-32, (2021).
CR - [10] Clevert, D.-A., Unterthiner, T., & Hochreiter, S., “Fast and accurate deep network learning by exponential linear units (ELUs)”, arXiv [cs.LG], (2015).
CR - [11] Klambauer, G., Unterthiner, T., Mayr, A., & Hochreiter, S., “Self-normalizing neural networks”, arXiv [cs.LG], (2017).
CR - [12] Ramachandran, P., Zoph, B., & Le, Q. V., “Swish: a self-gated activation function”, arXiv preprint arXiv:1710.05941, 7(1): 5, (2017).
CR - [13] Hendrycks, D., & Gimpel, K., “Gaussian error linear units (GELUs)”, arXiv preprint arXiv:1606.08415, (2016).
CR - [14] Lu, L., et al., “Dying ReLU and initialization: Theory and numerical examples”, arXiv preprint arXiv:1903.06733, (2019).
CR - [15] Elfwing, S., Uchibe, E., & Doya, K., “Sigmoid-weighted linear units for neural network function approximation in reinforcement learning”, Neural Networks, 107: 3-11, (2018).
CR - [16] Venkatappareddy, P., Culli, J., Srivastava, S., & Lall, B., “A Legendre polynomial based activation function: An aid for modeling of max pooling”, Digital Signal Processing, 115: 103093, (2021).
CR - [17] Carini, A., et al., “Legendre nonlinear filters”, Signal Processing, 109: 84-94, (2015).
CR - [18] Jahan, I., et al., “Self-gated rectified linear unit for performance improvement of deep neural networks”, ICT Express, 9(3): 320-325, (2023).
CR - [19] Kiliçarslan, S., & Celik, M., “RSigELU: A nonlinear activation function for deep neural networks”, Expert Systems with Applications, 174: 114805, (2021).
CR - [20] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P., “Gradient-based learning applied to document recognition”, Proceedings of the IEEE, 86(11): 2278-2324, (1998).
CR - [21] Xiao, H., Rasul, K., & Vollgraf, R., “Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms”, arXiv preprint arXiv:1708.07747, (2017).
CR - [22] Zhao, X., Wang, L., Zhang, Y., Han, X., Deveci, M., & Parmar, M., “A review of convolutional neural networks in computer vision”, Artificial Intelligence Review, 57(4): 99, (2024).
CR - [23] Fan, C. L., “Multiscale Feature Extraction by Using Convolutional Neural Network: Extraction of Objects from Multiresolution Images of Urban Areas”, ISPRS International Journal of Geo-Information, 13(1): 5, (2023).
CR - [24] Taye, M. M., “Theoretical understanding of convolutional neural network: Concepts, architectures, applications, future directions”, Computation, 11(3): 52, (2023).
CR - [25] Zhao, L., & Zhang, Z., “A improved pooling method for convolutional neural networks”, Scientific Reports, 14(1): 1589, (2024).
CR - [26] Elizar, E., Zulkifley, M. A., Muharar, R., Zaman, M. H. M., & Mustaza, S. M., “A review on multiscale-deep-learning applications”, Sensors, 22(19): 7384, (2022).
CR - [27] Dubey, S. R., Singh, S. K., & Chaudhuri, B. B., “Activation functions in deep learning: A comprehensive survey and benchmark”, Neurocomputing, 503: 92-108, (2022).
CR - [28] Özdemir, C., “Avg-topk: A new pooling method for convolutional neural networks”, Expert Systems with Applications, 223: 119892, (2023).
CR - [29] Zafar, A., Aamir, M., Mohd Nawi, N., Arshad, A., Riaz, S., Alruban, A., ... & Almotairi, S., “A comparison of pooling methods for convolutional neural networks”, Applied Sciences, 12(17): 8643, (2022).
CR - [30] Jeczmionek, E., & Kowalski, P. A., “Flattening layer pruning in convolutional neural networks”, Symmetry, 13(7): 1147, (2021).
CR - [31] Ullah, U., Jurado, A. G. O., Gonzalez, I. D., & Garcia-Zapirain, B., “A fully connected quantum convolutional neural network for classifying ischemic cardiopathy”, IEEE Access, 10: 134592-134605, (2022).
UR - https://doi.org/10.2339/politeknik.1601441
L1 - https://dergipark.org.tr/en/download/article-file/4440290
ER -
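
The abstracts in this record describe RLSE as a combination of ReLU, LIP, Sigmoid and ELU aimed at the vanishing-gradient and dying-ReLU problems, but the record does not give its closed form. The sketch below is a hedged illustration only of what a piecewise activation in that general family can look like in NumPy; the function name piecewise_activation, the branch choices (a sigmoid-gated identity for positive inputs, ELU for negative inputs) and the alpha parameter are assumptions for illustration, not the authors' actual RLSE definition.

# Minimal sketch of a combined piecewise activation (NOT the paper's RLSE).
import numpy as np

def elu(x, alpha=1.0):
    # ELU branch: x for x > 0, alpha*(exp(x) - 1) otherwise; keeps a nonzero
    # gradient for negative inputs, which is what mitigates "dying" units.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def piecewise_activation(x, alpha=1.0):
    # Hypothetical combination: sigmoid-gated identity (SiLU/Swish-like) on the
    # positive side, ELU on the negative side.
    return np.where(x > 0, x * sigmoid(x), elu(x, alpha))

# Example: negative inputs map to small nonzero values rather than exactly zero.
print(piecewise_activation(np.linspace(-3.0, 3.0, 7)))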