The Estimation of Turkey's Energy Demand Through Artificial Neural Networks and Support Vector Regression Methods

Energy demand and consumption are central to the development and progress of countries. Energy demand is rising rapidly, especially in developing countries. Energy policies must be set correctly to sustain the industrial sector and to direct investment. Forecasting energy demand in the near and long term therefore matters for the strategy that countries will follow. In this study, a model to estimate electrical energy consumption was developed using monthly electricity data for Turkey between January 2016 and March 2020, together with other data affecting consumption. Artificial neural networks (ANN) and support vector regression (SVR) were used as methods. The model used 15 independent variables as inputs and estimated Turkey's energy consumption as the dependent variable. The correlation coefficient, coefficient of determination, MAE, MSE, RMSE, and MAPE were used to measure success and error rates, and both models were found to have acceptable error values and estimation success rates. According to the results, the ANN method was more successful than the SVR method.


Introduction
Countries have to consume more energy to continue their progress and to reach the status of developed countries. This indicates that energy demand will remain as important in the coming years as it is today. The need for energy is increasing day by day for the development and progress of countries (Bayramoğlu, Pabuçcu, & Boz, 2017).
Forecasting energy demand is very important in terms of making sustainable and correct investments. Accurate planning will both provide economic benefits and reduce dependence on other countries.
One of the methods frequently used for estimating energy demand and consumption is support vector regression. Kaytez et al. used SVR to predict Turkey's electricity consumption (Kavaklioglu, 2011), Wang et al. to predict China's hydropower consumption (Wang, Yu, Tang, & Wang, 2011), Jung et al. to predict building energy consumption (Jung, Kim, & Heo, 2015), and Yang et al. to analyse the data affecting electricity consumption.
In this study, a model consisting of 15 independent variables was designed to estimate electricity consumption in Turkey, and artificial neural networks and support vector regression were used to estimate it. Data such as hydraulic, imported coal, hard coal, lignite, natural gas, sun, wind, geothermal, biomass, asphaltite, fuel oil, electricity imports, Turkey's temperature average, Turkey's monthly population, the number of days worked, and the number of vacation days between 2016-2020 were used as the independent variables.
Alphanumeric Journal Volume 8, Issue 1, 2020

Artificial neural networks (ANNs)
ANNs, which mimic the way the human brain works, have an important place among artificial intelligence technologies. ANN methodology has many important features, such as learning from data, making generalizations, and working with an unlimited number of variables (Ataseven, 2013). As seen in Figure 1, the artificial neural network model consists of three layers: an input layer, a hidden layer and an output layer (Khashei & Bijari, 2010). The input layer is the first layer and receives external data into the network. It consists of the parameters that affect the problem, so the number of neurons in the input layer is set by the number of parameters. The output layer transmits information to the outside. The hidden layer is located between the input layer and the output layer. Neurons in the hidden layer have no connections with the external environment: they only receive data from the input layer and send data to the output layer. A network may have one or more hidden layers between the input layer and the output layer, from which the results are obtained. Each neuron in a layer is connected to all neurons in the next layer with different numerical weights (wi,j). A weight indicates the effect of a neuron in the preceding layer on a neuron in the next layer. Positive weight values indicate reinforcement and negative values indicate inhibition (Asilkan & IRMAK, 2009; Gallant & Gallant, 1993).
ANNs are classified under two architectural structures, "feed-forward" and "feedback", depending on the way the cells are connected (Slaughter & Hobson, 2006). Feed-forward networks have a structure in which data are transferred only forward, from input units to output units. The outputs of neurons in one layer are given as inputs to the next layer through weights. There is no connection between neurons in the same layer or to the previous layer. Feedback networks have a structure in which data can flow not only forward but also backwards (Asilkan & IRMAK, 2009). During training, the network obtains the information used to change the weights by comparing the output value it generates for a given input with the desired value. Training continues until the difference between the output value and the desired value is less than a predetermined error value. When the error falls below this value, all weights are fixed and the training process ends (Çuhadar, Güngör, & Göksu, 2009).
Each unit in the network takes the values from the neurons that precede it and calculates their weighted sum. The input data are multiplied by the weights and passed to the next layer. Weights are chosen randomly at the beginning. The products are summed in the hidden layers, and the result is passed through a transfer function. Signals coming from the input layer thus move forward, layer by layer, towards the output layer. (1)
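The forward pass described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the study: the 3-2-1 layout, the bias terms, the uniform initialisation range and all function names are hypothetical choices made for the example.

```python
import math
import random


def sigmoid(z):
    """Logistic transfer function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def identity(z):
    """Linear transfer function: returns the weighted sum unchanged."""
    return z


def layer_forward(inputs, weights, biases, transfer):
    """Outputs of one layer: transfer(sum_i weights[i][j] * inputs[i] + biases[j])."""
    outputs = []
    for j in range(len(biases)):
        weighted_sum = sum(weights[i][j] * x for i, x in enumerate(inputs))
        outputs.append(transfer(weighted_sum + biases[j]))
    return outputs


def init_weights(n_in, n_out, rng):
    """Random starting weights, one row per neuron of the preceding layer."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_out)] for _ in range(n_in)]


def forward(x, layers):
    """Pass an input vector through each (weights, biases, transfer) layer in turn."""
    for weights, biases, transfer in layers:
        x = layer_forward(x, weights, biases, transfer)
    return x


rng = random.Random(0)
# Hypothetical 3-2-1 network: one sigmoid hidden layer, linear output (assumption).
layers = [
    (init_weights(3, 2, rng), [0.0, 0.0], sigmoid),
    (init_weights(2, 1, rng), [0.0], identity),
]
prediction = forward([0.2, 0.5, 0.1], layers)
```

With the weights fixed, `forward` is deterministic. Training would adjust `weights` and `biases` between calls, which this sketch does not attempt.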

Support Vector Regression (SVR)
Support Vector Machines (SVMs) are a machine learning tool that can solve classification, regression and novelty detection problems with better generalization than other traditional learning algorithms (Ali & Smith-Miles, 2006). They were first proposed by Vapnik in 1995 for classification and regression problems (Vapnik, 1995). SVM is based on structural risk minimization and is also successful in applications with high dimensions but few data. The type of SVM used in classification applications is known as SVC (Support Vector Classification), and the type used in regression applications is known as SVR (Shen, Pei, Fisher, & Lee, 2006). The purpose of SVM is to separate the input data, whose classes are defined by a class label, into two classes by finding the most appropriate hyperplane, as shown in Figure 2 (Fidan, Uzunhisarcıklı, & Çalıkuşu, 2019). SVMs can be divided into linear and nonlinear SVMs depending on whether the data can be separated linearly.
Linear SVM is the simplest SVM model and can only be applied to linearly separable data. Consider a training data set $D = \{(\vec{x}_i, y_i),\ i = 1, \ldots, N\}$ consisting of $N$ elements, where $y_i \in \{-1, +1\}$ is the class label and $\vec{x}_i \in \mathbb{R}^n$ is any example in $n$-dimensional space. In the expression $f(\vec{x}) = \vec{w} \cdot \vec{x} + b$, $\vec{w}$ is the normal vector of the decision function, $\vec{x}$ denotes the points on this line, and $b$ is the bias. The goal is to find $\vec{w}$ and $b$ with the help of the training data, and thereby to train the system (Küçüksille & Ateş).
Data sets that can be separated linearly are rare in practical applications, so a linear function gives no usable result in most of them. In such cases, SVM maps the input space to a higher-dimensional space through various transformations and tries to perform the linear separation there (Çomak, 2008).
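A small numerical check may help make the idea of a higher-dimensional space concrete. The sketch below assumes one common form of the polynomial kernel, $(\vec{x} \cdot \vec{z} + c)^d$, with $d = 2$ and $c = 0$. The function names, the parameter names `degree` and `coef0`, and the sample vectors are illustrative assumptions, not values from the study. It shows that the kernel evaluated in the 2-D input space equals an ordinary dot product in an explicit 3-D feature space.

```python
import math


def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def poly_kernel(x, z, degree=2, coef0=0.0):
    """Polynomial kernel (x . z + coef0) ** degree (one common parametrisation)."""
    return (dot(x, z) + coef0) ** degree


def feature_map_2d(x):
    """Explicit feature map matching the degree-2 kernel with coef0 = 0 for 2-D inputs."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]


x, z = [1.0, 2.0], [3.0, -1.0]
kernel_value = poly_kernel(x, z)  # computed in the 2-D input space
mapped_value = dot(feature_map_2d(x), feature_map_2d(z))  # computed in 3-D feature space
```

Both values agree up to floating-point rounding, which is why a kernel method can work in the larger space without ever constructing the mapped vectors.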

Application
In this study, Turkey's energy demand was estimated using 15 independent variables. Data such as hydraulic, imported coal, hard coal, natural gas, sun, wind, geothermal, biomass, asphaltite, fuel oil, electricity imports, Turkey's temperature average, Turkey's monthly population, the number of days worked, and the number of vacation days between 2016-2020 were used as the independent variables.
There was a total of 52 observations in the study. Because the data set was small, different cross-validation methods and groupings were used to increase the number of training and test combinations. In this approach, the data are first divided randomly into test and training data. The training data are used to build the model, whereas the test data are held out of model building, and the accuracy of the model is then tested on this new data set (Bishop, 1995).
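One way to picture the random division into training and test data is a k-fold split, sketched below. The study does not state which cross-validation scheme or how many folds were used, so `k=4`, the seed and the function name `k_fold_indices` are assumptions made for illustration only.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and yield (train, test) index lists for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 52 observations, as in the study; 4 folds is a hypothetical choice.
splits = list(k_fold_indices(n_samples=52, k=4))
```

Each observation appears in exactly one test fold, so every data point is used for testing once and for training in the remaining folds.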
In the study, the decimal scaling normalization method was used to improve the performance of the machine learning methods and increase their accuracy. This technique moves the decimal point of an attribute's values. How far the decimal point moves depends entirely on the maximum absolute value among all values of the attribute. The decimal scaling normalization formula is,

$v_i = \frac{v}{10^{j}}$ (2)
where $v_i$ is the scaled value, $v$ is the original value, and $j$ is the smallest integer such that $\max(|v_i|) < 1$.
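Decimal scaling as defined in equation (2) can be sketched as follows. The function name `decimal_scale` and the sample values are hypothetical, and the sketch assumes that $j$ is restricted to non-negative integers, so values already below 1 in absolute value are left unchanged.

```python
def decimal_scale(values):
    """Return (scaled values, j), dividing by 10**j for the smallest j >= 0
    that brings the largest absolute scaled value below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values], j


# Toy attribute values (not data from the study).
scaled, j = decimal_scale([-991, 12, 345])
```

For these toy values the largest magnitude is 991, so `j` is 3 and every value is divided by 1000.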
As seen in Figure 3, the artificial neural network model was designed with an input layer, two hidden layers and an output layer, and was trained with the backpropagation algorithm. There are 15 neurons in the input layer and 1 neuron in the output layer. The number of neurons in the hidden layers was determined by trial and error to give the best results for the training and validation sets, and each hidden layer was designed to have 3 neurons. The sigmoid function was used as the hidden layer activation function, and a linear function was used as the output layer activation function.
The nonlinear method was used for SVR. Different combinations of the SVR parameters C and ε were tested, and ε = 0.01 and C = 500 were chosen because they gave good performance. The polynomial kernel was preferred as the kernel function, as it yielded the most successful results in the tests. The polynomial kernel formula is,
(4) where $Y_i$ is the observed value, $Y_i^*$ is the predicted value, and $Y_{ave}$ is the average of the observed values.
Table 1 compares the success and error performance of the models. According to the MAPE criterion, the estimation error on the test data was 4.9% for ANN and 7.4% for SVR. Estimation models with MAPE values below 10% are considered to have "high accuracy" or a "very good" accuracy level (Lewis, 1982; Witt & Witt, 1992). Since the method with the lower value is considered more successful, the ANN model was found to be more successful here. MSE and RMSE values approaching zero indicate that a model is more successful and performs well (Singh, Basant, Malik, & Jain, 2009). According to MSE and RMSE, ANN was again the more successful model. MAE measures the average size of the errors in a series of estimates. Since an MAE close to zero indicates a less erroneous estimate, Table 1 shows that ANN yields estimates with fewer errors.
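The four error measures discussed above can be computed as in the following sketch, which uses their usual textbook definitions. The function names and the toy observed and predicted values are illustrative assumptions and do not come from Table 1. MAPE is expressed in percent and assumes that no observed value is zero.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Toy values for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
errors = {
    "MAE": mae(actual, predicted),
    "MSE": mse(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAPE": mape(actual, predicted),
}
```

On these toy values the MAPE comes out at about 5%, which would fall under the 10% threshold cited above.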
The correlation coefficient is used to determine whether there is a relationship between the true value and the estimated value and, if so, the direction (positive/negative) and strength of this relationship. In a positive relationship, one value increases as the other increases. In a negative relationship, one increases while the other decreases. A correlation coefficient greater than 0.8 is interpreted as a very high correlation. The correlation coefficient was found to be 0.936 for ANN and 0.9 for SVR. Although both methods show a strong relationship, the correlation is stronger for ANN.
The coefficient of determination indicates how much of the variation in one variable is explained by the other variable. A high value indicates that the dependent variable can be explained by the independent variables in the model. The coefficient of determination was calculated as 0.913 for ANN and 0.851 for SVR. Although it has acceptable values for both methods, ANN is the more successful model.
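Both measures can be sketched in plain Python. The sketch assumes the Pearson correlation coefficient and the common definition of the coefficient of determination as one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the observed values. The study does not state which exact formulas were used, and the function names and toy data are hypothetical.

```python
import math


def pearson_r(actual, predicted):
    """Pearson correlation between observed and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r = pearson_r(actual, predicted)
r2 = r_squared(actual, predicted)
```

Under this definition the coefficient of determination is not simply the square of the correlation coefficient, so the two measures can rank models slightly differently.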

Conclusion
In this study, a model that estimates monthly energy consumption using 15 different independent variables was developed. Data on hydraulic, imported coal, hard coal, lignite, natural gas, sun, wind, geothermal, biomass, asphaltite, fuel oil, electricity imports, Turkey's temperature average, Turkey's monthly population, the number of days worked, and the number of vacation days were used as the independent variables. The model used data from 2016-2020, and ANN and SVR were employed as methods.
The correlation coefficient, coefficient of determination ($R^2$), MAE, MSE, RMSE, and MAPE were used to measure the success and errors of the methods. Both methods were found to be acceptable and to predict with high accuracy, but ANN was more successful.
Meeting energy demand in a sustainable way is vital for the development and progress of countries. Accurate estimation of energy consumption supports investment planning and the appropriate use of resources. The industrial sector and energy policy will be shaped according to this information.
More successful and stable estimates can be made by researching and using different variables for estimating energy consumption.