Research Article
Year 2025, Volume: 43 Issue: 1, 199 - 212, 28.02.2025

Performance of imputation techniques: A comprehensive simulation study using the transformer model

Abstract

This study addresses the critical challenge of handling missing data in time series analysis, which is essential for maintaining the accuracy and reliability of financial forecasting and other predictive models. It assesses the performance of various imputation techniques and estimation methods. The purpose of using imputed data is to enhance the robustness and accuracy of time series analyses, especially when dealing with incomplete datasets. Eight imputation methods were evaluated to identify the most effective approach: last observation carried forward, next observation carried backward, mean imputation, linear interpolation, seasonal decomposition, moving average, regression imputation, and Kalman filtering. The Transformer model, Autoregressive Integrated Moving Average (ARIMA), and Generalized Autoregressive Conditional Heteroskedasticity (GARCH) methods were compared on both complete and imputed datasets. Monte Carlo simulations and an application were conducted on generated and real datasets with different proportions of missing data to assess the performance of these methods. The findings suggest that imputation techniques such as mean imputation, a conventional method, and Kalman filtering can significantly enhance the accuracy of time series models, particularly when combined with newer models like the Transformer. In contrast, last observation carried forward, seasonal decomposition, and moving average did not provide better results in any scenario. Both the synthetic simulation data and the real application data showed that the Transformer model outperformed the traditional methods on the complete (original) dataset and on the datasets generated through imputation at different rates. The results of the real-data application support the simulation findings. In addition, the application results show that mean imputation performs well when a low proportion of the data is imputed, while Kalman filtering is more successful when a high proportion of missing data is imputed. This work goes beyond previous studies by systematically comparing a wide range of imputation methods within a unified framework that incorporates both traditional and modern time series models. It presents a comprehensive evaluation of estimation techniques and imputation strategies for time series analysis and explores appropriate combinations of the two.
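
As an illustration of the kind of comparison described in the abstract, the following minimal sketch applies five of the eight listed imputation methods to a toy series with values removed at random, then scores each method on the removed positions only. It is not the author's implementation: the AR(1) generator, the function names `impute`, `simulate`, and `rmse_on_gaps`, the window size, and the missing rate are all assumptions chosen for illustration. Seasonal decomposition, regression imputation, and Kalman filtering are omitted to keep the example short.

```python
import numpy as np
import pandas as pd


def impute(series: pd.Series, method: str, window: int = 5) -> pd.Series:
    """Fill NaNs in `series` with one of five simple strategies (illustrative)."""
    if method == "locf":            # last observation carried forward
        filled = series.ffill()
    elif method == "nocb":          # next observation carried backward
        filled = series.bfill()
    elif method == "mean":          # mean of the observed values
        filled = series.fillna(series.mean())
    elif method == "linear":        # linear interpolation between neighbours
        filled = series.interpolate(method="linear")
    elif method == "moving_average":
        # Centered rolling mean computed from the observed values in the window.
        rolling = series.rolling(window, min_periods=1, center=True).mean()
        filled = series.fillna(rolling)
    else:
        raise ValueError(f"unknown method: {method}")
    # Gaps a method cannot reach (e.g. a leading NaN under LOCF) are
    # filled from the other direction so the result has no NaNs.
    return filled.ffill().bfill()


def simulate(n: int = 300, missing_rate: float = 0.1, seed: int = 0):
    """Generate a toy AR(1) series (an assumption) and delete values at random."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(size=n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = 0.8 * y[t - 1] + noise[t]
    truth = pd.Series(y)
    mask = rng.random(n) < missing_rate   # True where a value is removed
    observed = truth.mask(mask)           # removed values become NaN
    return truth, observed, mask


def rmse_on_gaps(truth: pd.Series, filled: pd.Series, mask: np.ndarray) -> float:
    """Root mean squared error restricted to the positions that were removed."""
    err = (truth - filled)[mask]
    return float(np.sqrt(np.mean(err ** 2)))


if __name__ == "__main__":
    truth, observed, mask = simulate(missing_rate=0.1)
    for name in ["locf", "nocb", "mean", "linear", "moving_average"]:
        score = rmse_on_gaps(truth, impute(observed, name), mask)
        print(f"{name:>15}: RMSE on imputed points = {score:.3f}")
```

Repeating the loop over several seeds and several values of `missing_rate` gives a rough Monte Carlo comparison. The ranking it produces depends on the toy generator and should not be read as a reproduction of the results reported in the study.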

Details

Primary Language English
Subjects Building Technology
Journal Section Research Articles
Authors

İsmail Yenilmez 0000-0002-3357-3898

Publication Date February 28, 2025
Submission Date May 6, 2024
Acceptance Date October 3, 2024
Published in Issue Year 2025 Volume: 43 Issue: 1

Cite

Vancouver Yenilmez İ. Performance of imputation techniques: A comprehensive simulation study using the transformer model. SIGMA. 2025;43(1):199-212.