Research Article

A Framework for Parametric Model Selection in Time Series Problems

Year 2023, Volume: 9 Issue: 4, 82 - 91, 31.12.2023

Abstract

People make plans for the future to simplify their lives, and these plans are essential for preparing for forthcoming challenges. Forecasting methods therefore take precedence in anticipating and planning for future events, and time series data stands out as a pivotal type of information for predicting the future. This research introduces a framework that selects the optimal model among classical artificial neural networks for time series forecasting. The classical artificial neural networks considered are the LSTM, CNN, and DNN models. To determine the best-fitting model, the framework employs several parameters, including the dataset, model depth, loss functions, a minimum success rate for model performance, the number of epochs, and optimization algorithms. Users can adjust these parameters to suit their own problems. By default, the framework evaluates seven loss functions and five optimization algorithms during model selection, and the mean absolute error is used as the metric for comparing model performance. To validate the framework, Brent crude oil prices were used as the dataset in a series of tests covering a total of 9000 daily price points. The dataset was partitioned into 80% for training and 20% for testing, and each candidate model was trained for 50 epochs. In the test scenarios, the price on the eighth day was predicted from the prices of the preceding seven days, yielding a mean absolute error of 1.1239657. The results showed that a two-layer LSTM model trained with the Adadelta optimization algorithm and the mean squared error loss function was the most successful configuration.
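As a rough illustration of the selection procedure described above, the minimal sketch below enumerates candidate LSTM, CNN, and DNN models over a small grid of depths, loss functions, and optimizers, trains each one on seven-day windows of a price series, and keeps the configuration with the lowest mean absolute error on the test split. It assumes TensorFlow/Keras; the synthetic price series, the reduced candidate lists, and all function names are illustrative placeholders rather than the authors' implementation.

# Minimal sketch of the parametric model-selection loop, assuming TensorFlow/Keras.
# The data and candidate lists are placeholders, not the framework's actual code.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

WINDOW = 7  # predict day 8 from the preceding 7 days

def make_windows(series, window=WINDOW):
    # Slice a 1-D price series into (samples, window, 1) inputs and next-day targets.
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X[..., np.newaxis], y

def build_model(kind, depth, units=32):
    # Build an LSTM, CNN, or DNN regressor of the requested depth.
    m = models.Sequential([tf.keras.Input(shape=(WINDOW, 1))])
    for i in range(depth):
        if kind == "lstm":
            m.add(layers.LSTM(units, return_sequences=(i < depth - 1)))
        elif kind == "cnn":
            m.add(layers.Conv1D(units, kernel_size=3, padding="same", activation="relu"))
        else:  # dnn
            m.add(layers.Dense(units, activation="relu"))
    if kind != "lstm":
        m.add(layers.Flatten())
    m.add(layers.Dense(1))
    return m

# Reduced candidate space for brevity; the paper's defaults are 7 losses and 5 optimizers.
losses = ["mse", "mae", "mape"]
optimizers = ["adam", "adadelta", "sgd"]

# Placeholder random-walk series standing in for the 9000-point Brent dataset; 80/20 split.
prices = (np.cumsum(np.random.randn(1200)) + 60.0).astype("float32")
split = int(len(prices) * 0.8)
X_train, y_train = make_windows(prices[:split])
X_test, y_test = make_windows(prices[split:])

best = None
for kind in ["lstm", "cnn", "dnn"]:
    for depth in [1, 2]:
        for loss in losses:
            for opt in optimizers:
                model = build_model(kind, depth)
                model.compile(optimizer=opt, loss=loss, metrics=["mae"])
                # 50 epochs as in the paper; reduce for a quicker trial run.
                model.fit(X_train, y_train, epochs=50, batch_size=32, verbose=0)
                mae = model.evaluate(X_test, y_test, verbose=0)[1]  # MAE as the yardstick
                if best is None or mae < best[0]:
                    best = (mae, kind, depth, loss, opt)

print("best configuration (MAE, model, depth, loss, optimizer):", best)

In the full framework, this search would also honor the user-supplied minimum success rate and iterate over the complete default set of seven loss functions and five optimization algorithms described above.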



Details

Primary Language English
Subjects Computer Software
Journal Section Research Articles
Authors

Muhammed Abdulhamid Karabıyık (ORCID: 0000-0001-7927-8790)

Publication Date December 31, 2023
Submission Date November 17, 2023
Acceptance Date December 4, 2023
Published in Issue Year 2023 Volume: 9 Issue: 4

Cite

IEEE M. A. Karabıyık, “A Framework for Parametric Model Selection in Time Series Problems”, GJES, vol. 9, no. 4, pp. 82–91, 2023.

Gazi Journal of Engineering Sciences (GJES) publishes open access articles under a Creative Commons Attribution 4.0 International License (CC BY).