Applying Machine Learning Prediction Methods to COVID-19 Data
Year 2022, 11-21, 28.06.2022
Adnan Keçe, Yiğit Alişan, Faruk Serin
Abstract
The Coronavirus (COVID-19) epidemic emerged in China and has caused many problems, such as loss of life and the deterioration of social and economic structures. Understanding and predicting the course of the epidemic is therefore very important. In this study, the SEIR model and two machine learning methods, long short-term memory (LSTM) and support vector machine (SVM), were used to predict the numbers of Susceptible, Exposed, Infected, and Recovered individuals for COVID-19. For this purpose, COVID-19 data for Egypt and South Korea provided by Johns Hopkins University were used. The results of the methods were compared using the mean absolute percentage error (MAPE). In total, 79% of the MAPE values were between 0 and 10. The comparisons show that, although LSTM provided the best results, all three methods were successful in predicting the number of cases, the number of deaths, and the peaks and dimensions of the epidemic.
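As a rough illustration of the two building blocks named in the abstract, the sketch below implements the textbook SEIR equations with a simple forward-Euler step and a standard MAPE function. It is not the authors' implementation: the function names, the daily step size, and the parameter values (`beta`, `sigma`, `gamma`, and the initial populations) are assumptions chosen only for demonstration, and no parameter fitting is attempted here.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Points whose actual value is zero are skipped, because the
    percentage error is undefined there (an assumption of this sketch).
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when all actual values are zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


def seir_step(s, e, i, r, beta, sigma, gamma, dt=1.0):
    """Advance the classic SEIR compartments by one forward-Euler step.

    beta  : transmission rate (S -> E)
    sigma : incubation rate   (E -> I)
    gamma : recovery rate     (I -> R)
    """
    n = s + e + i + r
    new_exposed = beta * s * i / n * dt
    new_infected = sigma * e * dt
    new_recovered = gamma * i * dt
    return (
        s - new_exposed,
        e + new_exposed - new_infected,
        i + new_infected - new_recovered,
        r + new_recovered,
    )


def simulate_seir(s0, e0, i0, r0, beta, sigma, gamma, days):
    """Return a list of (S, E, I, R) tuples for day 0 through `days`."""
    state = (s0, e0, i0, r0)
    history = [state]
    for _ in range(days):
        state = seir_step(*state, beta, sigma, gamma)
        history.append(state)
    return history


# Hypothetical parameters and population, for illustration only.
history = simulate_seir(
    s0=999_000.0, e0=900.0, i0=100.0, r0=0.0,
    beta=0.5, sigma=0.2, gamma=0.1, days=60,
)
infected_curve = [state[2] for state in history]
```

In a comparison like the one described, a simulated curve such as `infected_curve` would be scored against the reported series with `mape(reported, infected_curve)`; lower values indicate a closer fit.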