Missing Data Imputation for Solar Radiation by Deep Neural Network
Year 2022, 548 - 555, 07.05.2022
Eyyup Ensar Başakın, Mehmet Özger
Abstract
The quality of observations is a fundamental issue in the natural sciences: accurate and complete data are required to produce satisfactory estimates. Several factors impair measurement quality, such as a broken or miscalibrated device and errors in reading the measurements. This study therefore aims primarily to impute the missing values in measured solar radiation data. A Deep Neural Network (DNN) was used to handle the missing data and was benchmarked against classical approaches, namely Mean Imputation (MI), one of the most frequently adopted data imputation methods in the pertinent literature, Linear Interpolation (LI), and Spline Interpolation (SI). Overall, the DNN outperformed its counterparts in handling missing values, providing greater accuracy than the classical methods according to various performance metrics. The proposed approach is expected to contribute to the body of knowledge and to give interested researchers a useful overview by filling an important gap in the pertinent literature.
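To make the benchmarking idea concrete, the following minimal sketch implements two of the classical baselines named in the abstract, Mean Imputation (MI) and Linear Interpolation (LI), and scores them on artificially hidden values. It is an illustration only, not the authors' code: the synthetic sine-shaped series, the gap pattern, the function names, and the choice of RMSE and Nash-Sutcliffe efficiency (NSE) as metrics are all assumptions made for this example. Spline interpolation and the DNN itself are left out to keep the sketch dependency-free.

```python
import math


def mean_impute(series):
    """Replace every missing value (None) with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]


def linear_impute(series):
    """Fill gaps by linear interpolation between the nearest observed neighbours.

    Gaps at the start or end of the series take the nearest observed value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def rmse(obs, sim):
    """Root mean square error between observed and imputed values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect match, values <= 0 are no better than the mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


# Hypothetical daily series with a seasonal cycle (arbitrary units, not real data).
truth = [15.0 + 10.0 * math.sin(2 * math.pi * d / 365.0) for d in range(365)]

# Hide every 10th day to mimic missing measurements (assumed gap pattern).
missing = sorted(range(5, 365, 10))
gappy = [None if d in missing else v for d, v in enumerate(truth)]

# Score each method only on the values that were hidden.
held_out = [truth[d] for d in missing]
for name, method in [("MI", mean_impute), ("LI", linear_impute)]:
    filled = method(gappy)
    estimates = [filled[d] for d in missing]
    print(name, "RMSE:", round(rmse(held_out, estimates), 3),
          "NSE:", round(nse(held_out, estimates), 3))
```

Scoring only the hidden positions is the important design choice here: the observed values are untouched by every method, so including them would inflate all scores equally and mask the differences between imputation strategies.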
Thanks
We would like to thank the Meteorological General Institution and the Turkish Statistical Institution for providing meteorological and wheat yield data, respectively.
References
- Awawdeh, S., Faris, H., & Hiary, H. (2022). EvoImputer: An evolutionary approach for missing data imputation and feature selection in the context of supervised learning. Knowledge-Based Systems, 236, 107734. https://doi.org/10.1016/j.knosys.2021.107734
- Başakın, E. E., & Ekmekcioğlu, Ö. (2021). Letter to the Editor “Estimation of global solar radiation data based on satellite-derived atmospheric parameters over the urban area of Mashhad, Iran.” Environmental Science and Pollution Research, 28(15), 19530–19532. https://doi.org/10.1007/s11356-021-13201-4
- Başakın, E. E., Ekmekcioğlu, Ö., Özger, M., Altınbaş, N., & Şaylan, L. (2021). Estimation of measured evapotranspiration using data-driven methods with limited meteorological variables. Italian Journal of Agrometeorology, 2021(1), 63–80. https://doi.org/10.36253/ijam-1055
- Coutinho, E. R., da Silva, R. M., Madeira, J. G. F., Coutinho, P. R. de O. dos S., Boloy, R. A. M., & Delgado, A. R. S. (2018). Application of artificial neural networks (ANNs) in the gap filling of meteorological time series. Revista Brasileira de Meteorologia, 33(2), 317–328. https://doi.org/10.1590/0102-7786332013
- Demir, V., Uray, E., Orhan, O., Yavariabdi, A., & Kusetogullari, H. (2021). Trend Analysis of Ground-Water Levels and The Effect of Effective Soil Stress Change: The Case Study of Konya Closed Basin. European Journal of Science and Technology, 24, 515–522. https://doi.org/10.31590/ejosat.916026
- Gill, M. K., Asefa, T., Kaheil, Y., & McKee, M. (2007). Effect of missing data on performance of learning algorithms for hydrologic predictions: Implications to an imputation technique. Water Resources Research, 43(7), 1–12. https://doi.org/10.1029/2006WR005298
- Hamzah, F. B., Hamzah, F. M., Razali, S. F. M., & Samad, H. (2021). A comparison of multiple imputation methods for recovering missing data in hydrological studies. Civil Engineering Journal (Iran), 7(9), 1608–1619. https://doi.org/10.28991/cej-2021-03091747
- Heck, K., Coltman, E., Schneider, J., & Helmig, R. (2020). Influence of Radiation on Evaporation Rates: A Numerical Analysis. Water Resources Research, 56(10). https://doi.org/10.1029/2020WR027332
- Hunziker, S., Gubler, S., Calle, J., Moreno, I., Andrade, M., Velarde, F., Ticona, L., Carrasco, G., Castellón, Y., Oria, C., Croci-Maspoli, M., Konzelmann, T., Rohrer, M., & Brönnimann, S. (2017). Identifying, attributing, and overcoming common data quality issues of manned station observations. International Journal of Climatology, 37(11), 4131–4145. https://doi.org/10.1002/joc.5037
- LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
- Nash, J. E., & Sutcliffe, J. V. (1970). River flow forecasting through conceptual models part I: A discussion of principles. Journal of Hydrology, 10, 282–290.
- Nikroo, L., Kompani-Zare, M., Sepaskhah, A. R., & Fallah Shamsi, S. R. (2010). Groundwater depth and elevation interpolation by kriging methods in Mohr Basin of Fars province in Iran. Environmental Monitoring and Assessment, 166(1–4), 387–407. https://doi.org/10.1007/s10661-009-1010-x
- Ratolojanahary, R., Houé Ngouna, R., Medjaher, K., Junca-Bourié, J., Dauriac, F., & Sebilo, M. (2019). Model selection to improve multiple imputation for handling high rate missingness in a water quality dataset. Expert Systems with Applications, 131, 299–307. https://doi.org/10.1016/j.eswa.2019.04.049
- Saplioglu, K., & Kucukerdem, T. S. (2018). Estimation of missing streamflow data using ANFIS models and determination of the number of datasets for ANFIS: The case of Yeşilırmak River. Applied Ecology and Environmental Research, 16(3), 3583–3594. https://doi.org/10.15666/aeer/1603_35833594
- Schneider, T. (2001). Analysis of incomplete climate data: Estimation of mean values and covariance matrices and imputation of missing values. Journal of Climate, 14(5), 853–871. https://doi.org/10.1175/1520-0442(2001)014<0853:AOICDE>2.0.CO;2
- Stisen, S., & Tumbo, M. (2015). Interpolation of daily raingauge data for hydrological modelling in data sparse regions using pattern information from satellite data. Hydrological Sciences Journal, 60(11), 1911–1926. https://doi.org/10.1080/02626667.2014.992789
Turkish title: Eksik Solar Radyasyon Verilerinin Derin Sinir Ağları ile Tamamlanması (Completion of Missing Solar Radiation Data with Deep Neural Networks)