Research Article

Kayıp IoT Verilerinin Makine Öğrenmesi Teknikleri ile Tahmini

Year 2022, Volume: 9 Issue: 4, 1388 - 1397, 31.12.2022
https://doi.org/10.31202/ecjse.1135485

Abstract

Nesnelerin İnterneti (IoT) tabanlı endüstriyel uygulamalardan toplanan veriler her geçen gün giderek artmaktadır. Bununla birlikte, IoT cihazlarındaki arıza ve iletişim kopukluğu sebebi ile toplanan veriler gürültülü, belirsiz ve eksik olabilmektedir. Bu problemler, veri üretimi, kalitesi, işlenmesi ve analizleri için kritik bir konu haline gelmiştir. Bu çalışma kapsamında kullanılan veri setleri, Sivas Numune Hastanesi tıbbi atıkları evsel atığa dönüştüren su nötralizatör sisteminden gerçek zamanlı toplanarak oluşturulmuştur. Hastanelerde bulunan tıbbi sıvı atıklar kanalizasyona aktarılmadan önce nötralizasyon cihazları ile pH değişikliği yoluyla kimyasal nötralizasyon işlemine maruz bırakılmaktadır. Bu anlamda, tıbbi atık nötralizasyon sistemindeki pH değerlerinin gözlemlenmesi çevrenin korunması adına oldukça önemlidir. Bu kapsamda, farklı miktarlarda eksiltilerek oluşturulan iki veri seti ile kayıp pH verilerinin tahmini için Lineer Regresyon (LR), Destek Vektör Makineleri (DVM), K-En Yakın Komşuluk (KNN), Rastgele Orman (RO), Karar Ağacı (KA) ve Adaboost olmak üzere altı farklı makine öğrenmesi algoritması kullanılmıştır. Makine öğrenmesi algoritmalarının değerlendirilmesinde ortalama mutlak hata (Mean Absolute Error, MAE), ortalama karesel hata (Mean Squared Error, MSE) ve kök ortalama kare hata (Root Mean Square Error, RMSE) performans metrikleri kullanılmıştır. Gerçekleştirilen çalışma sonucunda iki farklı veri seti içinde DVM algoritmasının daha başarılı olduğu gözlemlenmiştir. Yapılan değerlendirme sonucu, makine öğrenmesi algoritmalarının kayıp pH verilerinin tahmininde oldukça başarılı olduğunu göstermektedir.

Missing IoT Data Prediction with Machine Learning Techniques

Abstract

The amount of data collected from industrial applications based on the Internet of Things (IoT) grows every day. However, because of failures and communication disconnections in IoT devices, the collected data can be noisy, uncertain, and incomplete. These problems have become a critical issue for data production, quality, processing, and analysis. The datasets used in this study were collected in real time from the water neutralizer system of Sivas Numune Hospital, which converts medical waste into household waste. Before being discharged to the sewer, medical liquid waste in hospitals undergoes chemical neutralization, in which neutralization devices adjust its pH. Monitoring pH values in the medical waste neutralization system is therefore important for environmental protection. To predict the missing pH data, six machine learning algorithms were applied to two datasets created by removing different amounts of data: linear regression (LR), support vector machines (SVM), k-nearest neighbors (KNN), random forest (RF), decision tree (DT), and AdaBoost. Mean absolute error (MAE), mean squared error (MSE), and root mean square error (RMSE) were used as performance metrics to evaluate the algorithms. The SVM algorithm was observed to be more successful than the other algorithms on both datasets. The evaluation indicates that machine learning algorithms are quite successful at predicting missing pH data.
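The evaluation setup described in the abstract (remove some readings, predict them, score the predictions with MAE, MSE, and RMSE) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not the authors' pipeline: the pH series is synthetic, the time index is the only feature, a hand-written k-nearest-neighbors regressor with `k=3` stands in for the six algorithms, and the names `knn_predict`, `evaluate`, and `run_demo` are hypothetical.

```python
import math
import random


def knn_predict(train_x, train_y, query_x, k=3):
    """Average the targets of the k training points closest to query_x."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - query_x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def evaluate(y_true, y_pred):
    """Return MAE, MSE, and RMSE for paired true and predicted values."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}


def run_demo(missing_fraction=0.2, seed=0):
    """Mask part of a synthetic pH log, predict it back, and score the result."""
    rng = random.Random(seed)
    # Assumption: a slow oscillation around pH 7 plus small sensor noise.
    times = list(range(200))
    ph = [7.0 + 0.5 * math.sin(t / 15.0) + rng.gauss(0, 0.05) for t in times]

    # Remove a fraction of the readings to imitate lost sensor data.
    missing = set(rng.sample(times, int(missing_fraction * len(times))))
    train_x = [t for t in times if t not in missing]
    train_y = [ph[t] for t in train_x]
    test_x = sorted(missing)

    # Predict each removed reading from its nearest surviving neighbors in time.
    predictions = [knn_predict(train_x, train_y, t) for t in test_x]
    return evaluate([ph[t] for t in test_x], predictions)


if __name__ == "__main__":
    # Two runs with different amounts removed, mirroring the two datasets.
    for fraction in (0.1, 0.3):
        print(fraction, run_demo(missing_fraction=fraction))
```

Because RMSE is the square root of MSE, it is expressed in pH units like MAE, while MSE penalizes large individual errors more heavily. Any of the regressors named in the abstract could replace `knn_predict` without changing the scoring step.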

References

  • Dubey, A., & Rasool, A. (2019). Data Mining based Handling Missing Data. Proceedings of the 3rd International Conference on I-SMAC IoT in Social, Mobile, Analytics and Cloud, I-SMAC 2019, 483–489.
  • Gond, V. K., Dubey, A., & Rasool, A. (2021). A Survey of Machine Learning-Based Approaches for Missing Value Imputation. Proceedings of the 3rd International Conference on Inventive Research in Computing Applications, ICIRCA 2021, 841–846.
  • Ma, J., Cheng, J. C. P., Ding, Y., Lin, C., Jiang, F., Wang, M., & Zhai, C. (2020). Transfer learning for long-interval consecutive missing values imputation without external features in air pollution time series. Advanced Engineering Informatics, 44, 101092.
  • Qin, Y., Sheng, Q. Z., Falkner, N. J. G., Dustdar, S., Wang, H., & Vasilakos, A. V. (2016). When things matter: A survey on data-centric internet of things. Journal of Network and Computer Applications, 64, 137–153.
  • Guzel, M., Kok, I., Akay, D., & Ozdemir, S. (2020). ANFIS and Deep Learning based missing sensor data prediction in IoT. Concurrency and Computation: Practice and Experience, 32(2), e5400.
  • Zainal Abidin, N., Ritahani Ismail, A., & Emran, N. A. (2018). Performance Analysis of Machine Learning Algorithms for Missing Value Imputation. International Journal of Advanced Computer Science and Applications (IJACSA), 9(6).
  • Global Health Observatory. (n.d.). https://www.who.int/data/gho
  • Raja, P. S., & Thangavel, K. (2020). Missing value imputation using unsupervised machine learning techniques. Soft Computing, 24(6), 4361–4392. https://doi.org/10.1007/s00500-019-04199-6
  • UCI Machine Learning Repository. (n.d.). https://archive.ics.uci.edu/ml/index.php
  • Liu, Y., Dillon, T., Yu, W., Rahayu, W., & Mostafa, F. (2020). Missing Value Imputation for Industrial IoT Sensor Data with Large Gaps. IEEE Internet of Things Journal, 7(8), 6855–6867.
  • WHO Coronavirus (COVID-19) Dashboard. (n.d.). https://covid19.who.int/
  • Sivas Belediye Meclisinin Kasım Ayı Toplantısı 28/11/2014 Tarihli Birleşiminde Aldığı Karar. (n.d.). https://www.sivas.bel.tr/Files/ATIKSU_YONETMELiiii.pdf
  • Freedman, D. A. (2009). Statistical models: Theory and practice.
  • Kaur, P., Kumar, R., & Kumar, M. (2019). A healthcare monitoring system using random forest and internet of things (IoT). Multimedia Tools and Applications, 78(14), 19905–19916.
  • Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
  • Ani, R., Krishna, S., Anju, N., Sona, A. M., & Deepa, O. S. (2017). IoT based patient monitoring and diagnostic prediction tool using ensemble classifier. 2017 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2017, 2017-January, 1588–1593.
  • Xu, M., Watanachaturaporn, P., Varshney, P. K., & Arora, M. K. (2005). Decision tree regression for soft classification of remote sensing data. Remote Sensing of Environment, 97(3), 322–336.
  • Jhaveri, S., Khedkar, I., Kantharia, Y., & Jaswal, S. (2019). Success prediction using random forest, catboost, xgboost and adaboost for kickstarter campaigns. Proceedings of the 3rd International Conference on Computing Methodologies and Communication, ICCMC 2019, 1170–1173.
There are 18 citations in total.

Details

Primary Language Turkish
Subjects Engineering
Journal Section Articles
Authors

Fatma Azizoğlu 0000-0003-0892-8711

Emre Ünsal 0000-0001-6042-0742

Publication Date December 31, 2022
Submission Date June 25, 2022
Acceptance Date September 7, 2022
Published in Issue Year 2022 Volume: 9 Issue: 4

Cite

IEEE F. Azizoğlu and E. Ünsal, “Kayıp IoT Verilerinin Makine Öğrenmesi Teknikleri ile Tahmini”, El-Cezeri Journal of Science and Engineering, vol. 9, no. 4, pp. 1388–1397, 2022, doi: 10.31202/ecjse.1135485.
El-Cezeri is licensed to the public under a Creative Commons Attribution 4.0 license.