TY - JOUR
T1 - Makine Öğrenmesi Yöntemleri Kullanılarak Hanehalkı Toplam Enerji Harcamaları Tahmini
TT - Prediction of Household Total Energy Expenditures Using Machine Learning Methods
AU - Kesriklioglu, Esma
AU - Oktay, Erkan
PY - 2023
DA - January
JF - Turkish Research Journal of Academic Social Science
JO - TURAJAS
PB - Hüdaverdi BİRCAN
WT - DergiPark
SN - 2667-4491
SP - 110
EP - 118
VL - 5
IS - 2
LA - tr
KW - machine learning
KW - household energy expenditure
KW - deep learning
AB - As countries develop and their economies grow, the range of consumption habits is expanding uncontrollably. With rising living standards, household energy consumption has inevitably driven a significant increase in energy demand in recent years. Concern is growing about the energy use of households, which are major energy consumers worldwide. Few studies have investigated the suitability of machine learning methods for predicting total household energy expenditure. To fill this gap, this study compares different machine learning methods for regression-based prediction of total household energy expenditure, with the aim of identifying the method that provides the best predictive performance. The study uses the Household Budget Survey 2019 data set obtained from the Turkish Statistical Institute (TÜİK). Consumption data from 11,521 households were analyzed. Guided by a literature review and expert opinion, variables directly or indirectly related to household energy expenditure were constructed. The prepared variables were passed through data preprocessing, feature selection, modeling, prediction, and performance evaluation stages in the open-source RapidMiner software. Several regression-based machine learning approaches were used to predict total household energy expenditure. In the modeling phase, deep learning (DL), gradient boosted trees (GBT), random forest (RF), k-nearest neighbors (KNN), and decision tree (DT) models were used. As a result, the DL model showed the best performance, with the highest R2 (0.99) and the lowest RMSE (5.5).
The results of the analysis show that the DL method yields more accurate predictions of total household energy expenditure.
UR - https://dergipark.org.tr/tr/pub/turajas/article/1163420
L1 - https://dergipark.org.tr/tr/download/article-file/2602007
ER -