Research Article


The Impact of Irrationals on the Range of Arctan Activation Function for Deep Learning Models

Year 2024, Volume: 9, Issue: 3, 89–101, 01.02.2025
https://doi.org/10.19072/ijet.1394093

Abstract

Deep learning has been applied in numerous areas, significantly impacting applications that address real-life challenges. Its success across a wide range of domains is partly attributed to activation functions, which introduce non-linearity into neural networks and enable them to model complex relationships in data effectively. Activation functions remain a key area of focus for artificial intelligence researchers aiming to enhance neural network performance. This paper comprehensively explains and compares various activation functions, with particular emphasis on the arctangent and its variations. The primary focus is on evaluating the impact of these activation functions in two different contexts: a multiclass classification problem applied to the Reuters Newswire dataset and a time-series prediction problem involving the energy trade value of Türkiye. Experimental results demonstrate that variations of the arctangent function leveraging irrational numbers such as π (pi), the golden ratio (ϕ), and Euler's number (e), together with a newly formulated self-arctan variant, yield promising outcomes. The findings suggest that different variations perform best on specific tasks: arctan ϕ achieves superior results in multiclass classification problems, while arctan e is more effective in time-series prediction.

Supporting Institution

ASELSAN

References

  • [1] T.T. Sivri, N.P. Akman, A. Berkol, “Multiclass Classification Using Arctangent Activation Function and Its Variations”, 2022 14th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), Ploiesti, pp. 1–6, 30 June – 1 July 2022.
  • [2] T.T. Sivri, N.P. Akman, A. Berkol, C. Peker, “Web Intrusion Detection Using Character Level Machine Learning Approaches with Upsampled Data”, Annals of Computer Science and Information Systems, DOI: 10.15439/2022F147, Vol. 32, pp. 269–274.
  • [3] J. Kamruzzaman, “Arctangent Activation Function to Accelerate Backpropagation Learning”, IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol. E85-A, pp. 2373–2376, October 2002.
  • [4] S. Sharma, S. Sharma, A. Athaiya, “Activation Functions in Neural Networks”, International Journal of Engineering Applied Sciences and Technology, Vol. 4, pp. 310–316, April 2020.
  • [5] D. Paul, G. Sanap, S. Shenoy, D. Kalyane, K. Kalia, R.K. Tekade, “Artificial intelligence in drug discovery and development”, Drug Discovery Today, DOI: 10.1016/j.drudis.2020.10.010, Vol. 26, No. 1, pp. 80–93.
  • [6] B. Kisačanin, “Deep Learning for Autonomous Vehicles”, 2017 IEEE 47th International Symposium on Multiple-Valued Logic (ISMVL), Novi Sad, p. 142, 22–24 May 2017.
  • [7] D.W. Otter, J.R. Medina, J.K. Kalita, “A Survey of the Usages of Deep Learning for Natural Language Processing”, IEEE Transactions on Neural Networks and Learning Systems, DOI: 10.1109/TNNLS.2020.2979670, Vol. 32, No. 2, pp. 604–624.
  • [8] R. Miotto, F. Wang, S. Wang, X. Jiang, J.T. Dudley, “Deep learning for healthcare: review, opportunities and challenges”, Briefings in Bioinformatics, Vol. 19, pp. 1236–1246, November 2018.
  • [9] S.I. Lee, S.J. Yoo, “Multimodal deep learning for finance: integrating and forecasting international stock markets”, The Journal of Supercomputing, DOI: 10.1007/s11227-019-03101-3, Vol. 76, pp. 8294–8312.
  • [10] Keras Team, “Reuters Newswire Classification Dataset”, Keras Documentation. Retrieved December 2022, from https://keras.io/api/datasets/reuters/
  • [11] T. Kim, T. Adali, “Complex backpropagation neural network using elementary transcendental activation functions”, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing Proceedings, Utah, pp. 1281–1284, 7–11 May 2001.
  • [12] G. Cybenko, “Approximation by superpositions of a sigmoidal function”, Mathematics of Control, Signals and Systems, DOI: 10.1007/BF02551274, Vol. 2, No. 4, pp. 303–314.
  • [13] K. Hornik, M. Stinchcombe, H. White, “Multilayer feedforward networks are universal approximators”, Neural Networks, Vol. 2, pp. 359–366, 1989.
  • [14] K.-I. Funahashi, “On the approximate realization of continuous mappings by neural networks”, Neural Networks, Vol. 2, pp. 183–192, 1989.
  • [15] A.R. Barron, “Universal approximation bounds for superpositions of a sigmoidal function”, IEEE Transactions on Information Theory, DOI: 10.1109/18.256500, Vol. 39, No. 3, pp. 930–945.
  • [16] M. Leshno, V.Y. Lin, A. Pinkus, S. Schocken, “Multilayer feedforward networks with a nonpolynomial activation function can approximate any function”, Neural Networks, Vol. 6, pp. 861–867, 1993.
  • [17] D. Misra, “Mish: A Self Regularized Non-Monotonic Activation Function”, British Machine Vision Conference, 7–10 September 2020.
  • [18] R. Parhi, R.D. Nowak, “The Role of Neural Network Activation Functions”, IEEE Signal Processing Letters, DOI: 10.1109/LSP.2020.3027517, Vol. 27, pp. 1779–1783.
  • [19] N. Kulathunga, N.R. Ranasinghe, D. Vrinceanu, Z. Kinsman, L. Huang, Y. Wang, “Effects of the Nonlinearity in Activation Functions on the Performance of Deep Learning Models”, CoRR, 2020.
  • [20] Y. LeCun, B. Boser, J.S. Denker, D. Henderson, R.E. Howard, W. Hubbard, L.D. Jackel, “Backpropagation applied to handwritten zip code recognition”, Neural Computation, DOI: 10.1162/neco.1989.1.4.541, Vol. 1, No. 4, pp. 541–551.
  • [21] O. Sharma, “A new activation function for deep neural network”, 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, pp. 84–86, 14–16 February 2019.
  • [22] S.S. Liew, M. Khalil-Hani, R. Bakhteri, “Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems”, Neurocomputing, Vol. 216, pp. 718–734, December 2016.
  • [23] M.Ö. Efe, “Novel neuronal activation functions for Feedforward Neural Networks”, Neural Processing Letters, DOI: 10.1007/s11063-008-9082-0, Vol. 28, No. 2, pp. 63–79.
  • [24] E.N. Skoundrianos, S.G. Tzafestas, “Modelling and FDI of Dynamic Discrete Time Systems Using a MLP with a New Sigmoidal Activation Function”, Journal of Intelligent and Robotic Systems, DOI: 10.1023/B:JINT.0000049175.78893.2f, Vol. 41, No. 1, pp. 19–36.
  • [25] S.-L. Shen, N. Zhang, A. Zhou, Z.-Y. Yin, “Enhancement of neural networks with an alternative activation function tanhLU”, Expert Systems with Applications, Vol. 199, Article 117181, August 2022.
  • [26] M.F. Augusteijn, T.P. Harrington, “Evolving transfer functions for artificial neural networks”, Neural Computing & Applications, DOI: 10.1007/s00521-003-0393-9, Vol. 13, No. 1, pp. 38–46.
  • [27] J.M. Benitez, J.L. Castro, I. Requena, “Are artificial neural networks black boxes?”, IEEE Transactions on Neural Networks, DOI: 10.1109/72.623216, Vol. 8, No. 5, pp. 1156–1164.
  • [28] J. Jin, J. Zhu, J. Gong, W. Chen, “Novel activation functions-based ZNN models for fixed-time solving dynamic Sylvester equation”, Neural Computing and Applications, DOI: 10.1007/s00521-022-06905-2, Vol. 34, No. 17, pp. 14297–14315.
  • [29] X. Wang, H. Ren, A. Wang, “Smish: A Novel Activation Function for Deep Learning Methods”, Electronics, Vol. 11, Article 540, February 2022.
  • [30] “Trade Value - Day Ahead Market - Electricity Markets”, EPIAS Transparency Platform. Retrieved December 20, 2022, from https://seffaflik.epias.com.tr/transparency/piyasalar/gop/islem-hacmi.xhtml

Details

Primary Language English
Subjects Neural Networks
Journal Section Articles
Authors

Talya Tümer Sivri 0000-0003-1813-5539

Nergis Pervan Akman 0000-0003-3241-6812

Ali Berkol 0000-0002-3056-1226

Early Pub Date February 10, 2025
Publication Date February 1, 2025
Submission Date November 21, 2023
Acceptance Date February 1, 2025
Published in Issue Year 2024 Volume: 9 Issue: 3

Cite

APA Tümer Sivri, T., Pervan Akman, N., & Berkol, A. (2025). The Impact of Irrationals on the Range of Arctan Activation Function for Deep Learning Models. International Journal of Engineering Technologies IJET, 9(3), 89-101. https://doi.org/10.19072/ijet.1394093
AMA Tümer Sivri T, Pervan Akman N, Berkol A. The Impact of Irrationals on the Range of Arctan Activation Function for Deep Learning Models. IJET. February 2025;9(3):89-101. doi:10.19072/ijet.1394093
Chicago Tümer Sivri, Talya, Nergis Pervan Akman, and Ali Berkol. “The Impact of Irrationals on the Range of Arctan Activation Function for Deep Learning Models”. International Journal of Engineering Technologies IJET 9, no. 3 (February 2025): 89-101. https://doi.org/10.19072/ijet.1394093.
EndNote Tümer Sivri T, Pervan Akman N, Berkol A (February 1, 2025) The Impact of Irrationals on the Range of Arctan Activation Function for Deep Learning Models. International Journal of Engineering Technologies IJET 9 3 89–101.
IEEE T. Tümer Sivri, N. Pervan Akman, and A. Berkol, “The Impact of Irrationals on the Range of Arctan Activation Function for Deep Learning Models”, IJET, vol. 9, no. 3, pp. 89–101, 2025, doi: 10.19072/ijet.1394093.
ISNAD Tümer Sivri, Talya et al. “The Impact of Irrationals on the Range of Arctan Activation Function for Deep Learning Models”. International Journal of Engineering Technologies IJET 9/3 (February 2025), 89-101. https://doi.org/10.19072/ijet.1394093.
JAMA Tümer Sivri T, Pervan Akman N, Berkol A. The Impact of Irrationals on the Range of Arctan Activation Function for Deep Learning Models. IJET. 2025;9:89–101.
MLA Tümer Sivri, Talya et al. “The Impact of Irrationals on the Range of Arctan Activation Function for Deep Learning Models”. International Journal of Engineering Technologies IJET, vol. 9, no. 3, 2025, pp. 89-101, doi:10.19072/ijet.1394093.
Vancouver Tümer Sivri T, Pervan Akman N, Berkol A. The Impact of Irrationals on the Range of Arctan Activation Function for Deep Learning Models. IJET. 2025;9(3):89-101.

Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)