TY - JOUR
T1 - The Impact of Irrationals on the Range of Arctan Activation Function for Deep Learning Models
TT - Derin Öğrenme Modelleri için İrrasyonellerin Arctan Aktivasyon Fonksiyonunun Aralığı Üzerindeki Etkisi
AU - Pervan Akman, Nergis
AU - Tümer Sivri, Talya
AU - Berkol, Ali
PY - 2025
DA - February
Y2 - 2025
DO - 10.19072/ijet.1394093
JF - International Journal of Engineering Technologies IJET
JO - IJET
PB - İstanbul Gelisim University
WT - DergiPark
SN - 2149-0104
SP - 89
EP - 101
VL - 9
IS - 3
LA - en
AB - Deep learning has been applied in numerous areas, significantly impacting applications that address real-life challenges. Its success across a wide range of domains is partly attributed to activation functions, which introduce non-linearity into neural networks and enable them to effectively model complex relationships in data. Activation functions remain a key area of focus for artificial intelligence researchers aiming to enhance neural network performance. This paper comprehensively explains and compares various activation functions, with particular emphasis on the arc tangent and its specific variations. The primary focus is on evaluating the impact of these activation functions in two different contexts: a multiclass classification problem applied to the Reuters Newswire dataset and a time-series prediction problem involving the energy trade value of Türkiye. Experimental results demonstrate that variations of the arc tangent function leveraging irrational numbers such as π (pi), the golden ratio (ϕ), and Euler's number (e), together with a self-arctan formulation, yield promising outcomes. The findings suggest that different variations perform optimally for specific tasks: arctan ϕ achieves superior results in multiclass classification problems, while arctan e is more effective in time-series prediction challenges.
KW - Deep neural networks
KW - Activation functions
KW - Multiclass classification
KW - Time-series prediction
KW - Reuters data
KW - Energy trade value data
N2 - In challenging real-life applications, deep learning models have demonstrated significant success across many fields. A substantial part of this success rests on activation functions, which, through their non-linear structures, enable neural networks to effectively model complex relationships in data. Activation functions remain an important focus for artificial intelligence researchers aiming to improve neural network performance. This paper comprehensively explains and compares various activation functions, with particular emphasis on the arctangent and its specific variations. The main focus is the evaluation of the effects of these activation functions in two different contexts: a multiclass classification problem applied to the Reuters Newswire dataset and a time-series prediction problem involving the energy trade value of Türkiye. Experimental results show that variations of the arctangent function using irrational numbers such as π (pi), the golden ratio (ϕ), and Euler's number (e), together with the newly formulated self-arctan formulation, yield noteworthy results. The findings suggest that different variations perform best for specific tasks: arctan ϕ achieves superior results in multiclass classification problems, while arctan e is more effective in time-series prediction problems.
CR - [1] T.T. Sivri, N.P. Akman, A. Berkol, “Multiclass Classification Using Arctangent Activation Function and Its Variations”, 2022 14th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), Ploiesti, pp. 1–6, 30 June – 01 July 2022.
CR - [2] T.T. Sivri, N.P. Akman, A. Berkol, C. Peker, “Web Intrusion Detection Using Character Level Machine Learning Approaches with Upsampled Data”, Annals of Computer Science and Information Systems, DOI: 10.15439/2022F147, Vol. 32, pp. 269–274.
CR - [3] J. Kamruzzaman, “Arctangent Activation Function to Accelerate Backpropagation Learning”, IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol. E85-A, pp. 2373–2376, October 2002.
CR - [4] S. Sharma, S. Sharma, A. Athaiya, “Activation Functions in Neural Networks”, International Journal of Engineering Applied Sciences and Technology, Vol. 4, pp. 310–316, April 2020.
CR - [5] D. Paul, G. Sanap, S. Shenoy, D. Kalyane, K. Kalia, R.K. Tekade, “Artificial intelligence in drug discovery and development”, Drug Discovery Today, DOI: 10.1016/j.drudis.2020.10.010, Vol. 26, No. 1, pp. 80–93.
CR - [6] B. Kisačanin, “Deep Learning for Autonomous Vehicles”, 2017 IEEE 47th International Symposium on Multiple-Valued Logic (ISMVL), Novi Sad, p. 142, 22–24 May 2017.
CR - [7] D.W. Otter, J.R. Medina, J.K. Kalita, “A Survey of the Usages of Deep Learning for Natural Language Processing”, IEEE Transactions on Neural Networks and Learning Systems, DOI: 10.1109/TNNLS.2020.2979670, Vol. 32, No. 2, pp. 604–624.
CR - [8] R. Miotto, F. Wang, S. Wang, X. Jiang, J.T. Dudley, “Deep learning for healthcare: review, opportunities and challenges”, Briefings in Bioinformatics, Vol. 19, pp. 1236–1246, November 2018.
CR - [9] S.I. Lee, S.J. Yoo, “Multimodal deep learning for finance: integrating and forecasting international stock markets”, The Journal of Supercomputing, DOI: 10.1007/s11227-019-03101-3, Vol. 76, pp. 8294–8312.
CR - [10] Keras Team, “Reuters Newswire Classification Dataset”, Keras Documentation. Retrieved December 2022, from https://keras.io/api/datasets/reuters/
CR - [11] T. Kim, T. Adali, “Complex backpropagation neural network using elementary transcendental activation functions”, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing Proceedings, Utah, pp. 1281–1284, 07–11 May 2001.
CR - [12] G. Cybenko, “Approximation by superpositions of a sigmoidal function”, Mathematics of Control, Signals and Systems, DOI: 10.1007/BF02551274, Vol. 2, No. 4, pp. 303–314.
CR - [13] K. Hornik, M. Stinchcombe, H. White, “Multilayer feedforward networks are universal approximators”, Neural Networks, Vol. 2, pp. 359–366, 1989.
CR - [14] K.-I. Funahashi, “On the approximate realization of continuous mappings by neural networks”, Neural Networks, Vol. 2, pp. 183–192, 1989.
CR - [15] A.R. Barron, “Universal approximation bounds for superpositions of a sigmoidal function”, IEEE Transactions on Information Theory, DOI: 10.1109/18.256500, Vol. 39, No. 3, pp. 930–945.
CR - [16] M. Leshno, V.Y. Lin, A. Pinkus, S. Schocken, “Multilayer feedforward networks with a nonpolynomial activation function can approximate any function”, Neural Networks, Vol. 6, pp. 861–867, 1993.
CR - [17] D. Misra, “Mish: A Self Regularized Non-Monotonic Activation Function”, British Machine Vision Conference, 7–10 September 2020.
CR - [18] R. Parhi, R.D. Nowak, “The Role of Neural Network Activation Functions”, IEEE Signal Processing Letters, DOI: 10.1109/LSP.2020.3027517, Vol. 27, pp. 1779–1783.
CR - [19] N. Kulathunga, N.R. Ranasinghe, D. Vrinceanu, Z. Kinsman, L. Huang, Y. Wang, “Effects of the Nonlinearity in Activation Functions on the Performance of Deep Learning Models”, CoRR, 2020.
CR - [20] Y. LeCun, B. Boser, J.S. Denker, D. Henderson, R.E. Howard, W. Hubbard, L.D. Jackel, “Backpropagation applied to handwritten zip code recognition”, Neural Computation, DOI: 10.1162/neco.1989.1.4.541, Vol. 1, No. 4, pp. 541–551.
CR - [21] O. Sharma, “A new activation function for deep neural network”, 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, pp. 84–86, 14–16 February 2019.
CR - [22] S.S. Liew, M. Khalil-Hani, R. Bakhteri, “Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems”, Neurocomputing, Vol. 216, pp. 718–734, December 2016.
CR - [23] M.Ö. Efe, “Novel neuronal activation functions for Feedforward Neural Networks”, Neural Processing Letters, DOI: 10.1007/s11063-008-9082-0, Vol. 28, No. 2, pp. 63–79.
CR - [24] E.N. Skoundrianos, S.G. Tzafestas, “Modelling and FDI of Dynamic Discrete Time Systems Using a MLP with a New Sigmoidal Activation Function”, Journal of Intelligent and Robotic Systems, DOI: 10.1023/B:JINT.0000049175.78893.2f, Vol. 41, No. 1, pp. 19–36.
CR - [25] S.-L. Shen, N. Zhang, A. Zhou, Z.-Y. Yin, “Enhancement of neural networks with an alternative activation function tanhLU”, Expert Systems with Applications, Vol. 199, Art. no. 117181, August 2022.
CR - [26] M.F. Augusteijn, T.P. Harrington, “Evolving transfer functions for artificial neural networks”, Neural Computing & Applications, DOI: 10.1007/s00521-003-0393-9, Vol. 13, No. 1, pp. 38–46.
CR - [27] J.M. Benitez, J.L. Castro, I. Requena, “Are artificial neural networks black boxes?”, IEEE Transactions on Neural Networks, DOI: 10.1109/72.623216, Vol. 8, No. 5, pp. 1156–1164.
CR - [28] J. Jin, J. Zhu, J. Gong, W. Chen, “Novel activation functions-based ZNN models for fixed-time solving dynamic Sylvester equation”, Neural Computing and Applications, DOI: 10.1007/s00521-022-06905-2, Vol. 34, No. 17, pp. 14297–14315.
CR - [29] X. Wang, H. Ren, A. Wang, “Smish: A Novel Activation Function for Deep Learning Methods”, Electronics, Vol. 11, Art. no. 540, February 2022.
CR - [30] Trade Value - Day Ahead Market - Electricity Markets | EPIAS Transparency Platform. (n.d.). Retrieved December 20, 2022, from https://seffaflik.epias.com.tr/transparency/piyasalar/gop/islem-hacmi.xhtml
UR - https://doi.org/10.19072/ijet.1394093
L1 - https://dergipark.org.tr/en/download/article-file/3551453
ER -
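Note on the activation variants named in the abstract: the record does not give their closed forms, so the sketch below is only a plausible reading, not the authors' definitive formulation. It assumes each variant scales the output of arctan by the irrational constant (which changes the activation's range, matching the title's emphasis on range) and that "self-arctan" means x·arctan(x); the 10,000-dimensional input and 46-class output merely mirror the usual Keras Reuters Newswire setup [10], and the layer width is arbitrary.

import numpy as np
import tensorflow as tf

PHI = (1 + np.sqrt(5)) / 2  # golden ratio, one of the irrational constants in the abstract

def make_arctan_variant(c):
    # Assumed form: scale the output of arctan by c, so the activation's
    # range widens from (-pi/2, pi/2) to (-c*pi/2, c*pi/2).
    def activation(x):
        return c * tf.math.atan(x)
    return activation

arctan_pi  = make_arctan_variant(np.pi)  # assumed reading of "arctan pi"
arctan_phi = make_arctan_variant(PHI)    # assumed reading of "arctan phi"
arctan_e   = make_arctan_variant(np.e)   # assumed reading of "arctan e"

def self_arctan(x):
    # Assumed reading of the "self-arctan formulation": the input
    # multiplied by its own arctangent.
    return x * tf.math.atan(x)

# Toy multiclass model mirroring the Keras Reuters setup (46 topics,
# bag-of-words inputs); sizes are illustrative assumptions.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10000,)),
    tf.keras.layers.Dense(64, activation=arctan_phi),
    tf.keras.layers.Dense(46, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

Output scaling is only one candidate: scaling the input instead, as arctan(c·x), would leave the range at (-pi/2, pi/2) and change only the slope near zero, which is why the title's focus on the range motivates the output-scaled guess here. The paper [1] should be consulted for the exact formulations.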