Rule extraction and performance estimation by using variable neighborhood search for solar power plant in Konya

The use of renewable energy sources in electricity production has become inevitable in order to reduce the greenhouse gases released into the atmosphere, which cause the Earth to warm. Although countries have implemented a number of national policies to support electricity generated from renewable energy sources, local investments to produce electricity without a license remain below the desired level. Those who want to invest at small and medium scale mostly expect such an investment to be supported by real data. Although the electricity that a renewable investment would generate is usually estimated from simulation data, these data are not realistic for such investors. In this study, the climatic conditions and production data of a 1 MW power plant installed in Konya were monitored. An artificial neural network (ANN) can achieve high accuracy, but its results are sometimes complex and unclear. In the literature, a number of studies have used different methods to overcome such problems. Real-time solar power plant (SPP) data were used to determine the feasibility and success of the proposed method. The variable neighborhood search (VNS) metaheuristic method was used to find the optimal values of the input vectors, $G_h$, that maximize the fitness function $F_s$ of output class node $s$. The results obtained by the VNS method showed that the proposed method has the potential to produce correct rules. Generally, energy investors are curious about the return on their investment. It is very important for energy providers to estimate how much electricity will be generated from existing solar power plants and, accordingly, to determine the measures they will take to meet future electricity demand. In this study, the performance of the solar power plant was estimated from the weather conditions with 95.55% accuracy.


Introduction
Since electric energy is easily transformed into other types of energy, it is one of the most widely used types of energy today. The greenhouse gases released into the atmosphere by the coal- and petroleum-based energy sources used in electricity generation cause global warming. The climate change that accompanies global warming produces global problems such as floods in some regions, drought in others, and desertification. The constant increase of the human population and the widespread use of electric energy cause these problems to grow even more. Energy supply companies are trying to meet this increasing energy demand. For these companies, the variation in production caused by climate conditions at power plants that generate electricity from solar, wind, and wave energy is an important problem. Accurate estimation of the amount of energy these plants will produce is very important for network reliability. Inaccurate estimation of the amount of energy produced by renewable energy sources leads to undesirable situations such as voltage drops and frequency changes in the network.
Yumurtaci suggested an ANN controller approach to provide energy management and performance of a hybrid system including hydrogen, wind, and solar energy technologies [1]. In the literature review, studies using different methods for rule extraction from a trained ANN were also encountered. Kulaksiz et al. presented a genetic algorithm (GA) method to improve the maximum power point tracking capacity of a photovoltaic system [2]. Sharma et al. proposed an optimal power point tracking and control method for a hybrid renewable energy system under an independent environment [3]. Raju et al. suggested the application of an improved distributed energy management system and demand management of a solar microgrid using a multiagent system coordination method [4]. Zhang et al. developed a rule extraction method from a pruned GA [5]. Fukumi et al. obtained rules from the trained ANN by using the evolutionary algorithm (EA) method [6]. Zhenya et al. developed a method for obtaining rules from a fuzzy neural network (FNN) structure using the particle swarm optimization (PSO) method [7]. Dorado et al. presented a study to obtain rules from the trained ANN using the GA method [8]. Elalfi et al. suggested a method using a GA for extracting accurate and understandable rules from a trained ANN [9]. Tokinaga et al. proposed a method of obtaining rules using the GA method to create intelligent and descriptive evaluation systems [10]. Hruschka et al. developed a rule extraction application using a clustering method from a trained ANN for the classification problem [11]. Kahramanlı [15]. Marghny suggested a rule extraction method for a trained ANN using the GA [16]. El Alami suggested a destructive algorithm for rule extraction from a trained ANN [17]. Kamruzzaman et al. proposed a rule extraction method from a trained ANN [18]. Keedwell et al. developed a method to obtain rules from a trained ANN using the GA [21].
Rule extraction from a trained ANN is an important technique for easily interpreting the results obtained. Generally, the rules are expressed and used in the "if ... then ..." structure. One of the most important disadvantages of an ANN is its limited explanation capability [19]. ANNs have been used successfully in many studies, but their results are not fully understood, and the information on activation functions and connections between neurons is limited [20]. Researchers are working on a number of methods to obtain accurate and understandable rules from the trained ANN. In this study, the VNS algorithm was used to obtain correct, understandable, and valid rules from a trained ANN.
The structure of this study is as follows: In Section 2, the details of the materials and methods used in the study are explained. In Section 3, the results of the experimental studies are presented. In Section 4, evaluations and discussions of the study are described. In Section 5, conclusions are presented.

Solar power plant
The use of renewable energy sources is increasing day by day, and the infrastructure of the production system is shifting from a central network to many small-capacity production centers, known as a distributed network.
In this study, production data obtained from a 1 MW solar power plant (SPP) established in Konya's organized industry zone were used. The SPP is located in Konya at 37°58'32" N, 32°37'19" E. A representative view of the SPP, installed on a flat area of approximately 19,000 m² in Konya, is shown in Figure 1. At the SPP site, a pyrheliometer was used to measure daily solar irradiance (W/m²).
The inverter output is passed through a voltage transformer and fed to the center of the network at 31.5 kV. The solar panels, mounted on stainless steel structures, are tilted at 32 degrees and face southeast. The panels are fixed, because this angle is considered the most efficient for this direction. The SPP field is about 0.5 km from the center of the power distribution substation.
Twenty-one panels are connected in series, and these series-connected strings are then connected in parallel in 4 groups to the inverter input. In total, 50 inverters are combined in this way and the whole system is connected to the network in order to sell the generated electricity to the main grid. Weather conditions directly affect the efficiency of the solar panels and electricity production. Clouds, rain, lightning, hail, and snow reduce the amount of sunlight that reaches the panels' surfaces. Depending on how bad the weather conditions are, the energy efficiency of the solar panels drops noticeably. The production data of the SPP can be monitored instantly, daily, monthly, and yearly with the Supervisory Control and Data Acquisition (SCADA) software available in the system. As of June 2, 2015, 3152.7 GWh of energy had been generated since the installation of the SPP. For example, daily energy production was 5137 kWh on June 16, 2015, and approximately 5943 kWh on June 16, 2016. The cost of the installed SPP is approximately 1.1 million dollars and the income obtained in 16 months is 409859 dollars. If the cost of the land where the system is installed and the annual maintenance and breakdown costs are not taken into consideration, the investment cost of the facility appears to be amortized in approximately 4 years.
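The amortization estimate can be checked with a simple, undiscounted payback calculation. The sketch below uses only the cost and income figures quoted above; the function name is illustrative, and the assumption that income stays at the 16-month average rate is ours, not a statement from the plant operator.

```python
def simple_payback_years(capital_cost, income, income_period_months):
    """Undiscounted payback period in years.

    Assumes (for illustration only) that future income continues at the
    average monthly rate observed over the reporting period.
    """
    annual_income = income / income_period_months * 12
    return capital_cost / annual_income


# Figures quoted in the text: 1.1 million dollars cost, 409859 dollars in 16 months.
payback = simple_payback_years(1_100_000, 409_859, 16)
print(f"{payback:.2f} years")
```

Under these assumptions the payback is a little over three and a half years, which is in line with the "approximately 4 years" stated in the text.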
In the field where solar power plants are established, grass growing from the ground, dust accumulation on the panels, bird droppings, snow, etc. have adverse effects and decrease the electricity produced by the PV panels. In order to detect these situations, it is necessary to estimate the amount of power that the power plant should produce by processing the irradiance data (W/m²) from the pyrheliometer. The total power that the plant should generate at a given instant is calculated by multiplying the pyrheliometer reading, the total surface area of the PV panels (m²), and the panel efficiency (16.1%). The power actually generated by the plant can be obtained from the SCADA software. In this study, the ratio of the daily or instantaneous generated power to the calculated power is defined as performance. The performance value of the PV plant used in the study was observed to range from 75% to 80%. A value below this range indicates that the plant needs to be cleaned. The performance value of the PV power plant was observed to decrease by at most 2.4% due to dust. When the decrease exceeds this value, temporary or permanent faults have generally occurred in the plant. This situation can be detected instantly by the SCADA software already in the system.
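The performance definition above can be illustrated with a short sketch. Only the 16.1% panel efficiency comes from the text; the irradiance, panel area, and measured power in the example are hypothetical values chosen for illustration, and the function names are ours.

```python
PANEL_EFFICIENCY = 0.161  # panel efficiency stated in the text (16.1%)


def expected_power_w(irradiance_w_m2, panel_area_m2, efficiency=PANEL_EFFICIENCY):
    """Power the plant should produce: irradiance x total panel area x efficiency."""
    return irradiance_w_m2 * panel_area_m2 * efficiency


def performance(measured_power_w, irradiance_w_m2, panel_area_m2):
    """Ratio of measured (SCADA) power to the calculated expected power."""
    expected = expected_power_w(irradiance_w_m2, panel_area_m2)
    if expected <= 0:
        raise ValueError("expected power must be positive (no irradiance?)")
    return measured_power_w / expected


# Hypothetical reading: 800 W/m2 irradiance, 6500 m2 of panels, 650 kW measured.
expected = expected_power_w(800.0, 6500.0)
perf = performance(650_000.0, 800.0, 6500.0)
print(f"expected = {expected / 1000:.1f} kW, performance = {perf:.1%}")
```

With these illustrative inputs the ratio falls inside the 75% to 80% band reported for the plant; a reading below that band would, following the text, suggest that cleaning is needed.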

Artificial neural network
Artificial neural networks (ANNs) are a very popular method inspired by the neural networks in the brain. The backpropagation algorithm was used to calculate the weights of the ANN [22]. The basic ANN structure is shown in Figure 2. Basically, an ANN consists of a single input layer, at least one hidden layer, a single output layer, and weighted interconnections that connect the nodes in those layers. The number of nodes used in the input and output layers depends on the type of problem. In this study, the attribute values used in the input layer and the class values of the output layer were encoded as binary values. These binary vectors for the input and output layers were used to train the ANN. Training continued until the requested output value was obtained. After training was completed, two weight vectors between the layers, $W^1_{i,j}$ and $W^2_{j,s}$, were obtained. $W^1_{i,j}$ is the weight vector of the connections between the input and hidden layers and $W^2_{j,s}$ is the weight vector of the connections between the hidden and output layers. In addition, a sigmoid activation function was used for the activation calculation of the ANN.

Variable neighborhood search
Variable neighborhood search (VNS) is a metaheuristic method developed to solve optimization problems, based on the principle of changing neighborhood structures during a local search [23]. The VNS starts from an initial solution $y$ and a set of neighborhoods $V_l$ ($l = 1, 2, \ldots, l_{max}$). In each iteration of the VNS algorithm, a random solution $y'$ is generated from the $l$th neighborhood, $V_l(y)$. In the next step, a heuristic local search method is applied to obtain a new solution $y''$ from $y'$. If $y''$ is better than the current solution $y$, then $y$ is replaced by $y''$ and the process continues with the first neighborhood structure $V_1(y)$; otherwise, the operations are repeated with the next neighborhood structure, $V_{l+1}$. The final solution obtained from the iterations is accepted as the best local solution with respect to all neighborhoods of the VNS method [24]. The pseudocode of the basic VNS method is given in the following steps [25]:
1. Initialization. Select the set of neighborhood structures $V_l$ ($l = 1, 2, \ldots, l_{max}$) used in the VNS method; create an initial solution $y$; select a stopping criterion.
2. Repeat the following steps until the stopping criterion is satisfied:
(a) Set $l = 1$;
(b) Repeat the following steps until $l = l_{max}$:
i. Create a random solution $y'$ from the $l$th neighborhood structure of $y$ ($y' \in V_l(y)$);
ii. If this solution is better than the current solution, set $y = y'$ and continue the search with the first neighborhood structure $V_1$ ($l = 1$); otherwise, set $l = l + 1$.
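The basic VNS loop described above can be sketched in a few lines of Python for binary solution vectors. This is a minimal illustration, not the implementation used in the study: the neighborhood $V_l$ is assumed here to be "flip $l$ randomly chosen bits", the local search is a plain single-bit hill climber, and the toy fitness function simply counts agreement with a hypothetical target vector.

```python
import random


def shake(y, l, rng):
    """Random neighbor of y in V_l; here V_l is assumed to flip l distinct bits."""
    neighbor = list(y)
    for idx in rng.sample(range(len(y)), l):
        neighbor[idx] = 1 - neighbor[idx]
    return neighbor


def local_search(y, fitness):
    """Hill climbing over single-bit flips until no flip improves the fitness."""
    best, best_val = list(y), fitness(y)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            cand = list(best)
            cand[i] = 1 - cand[i]
            val = fitness(cand)
            if val > best_val:
                best, best_val, improved = cand, val, True
    return best


def vns(fitness, y0, l_max=2, max_iters=50, seed=0):
    """Basic VNS maximizing `fitness`; stops after max_iters outer iterations."""
    rng = random.Random(seed)
    y = list(y0)
    for _ in range(max_iters):
        l = 1
        while l <= l_max:
            y1 = shake(y, l, rng)          # shaking in the l-th neighborhood
            y2 = local_search(y1, fitness)  # local improvement
            if fitness(y2) > fitness(y):
                y, l = y2, 1                # accept and return to V_1
            else:
                l += 1                      # try the next neighborhood
    return y


# Toy problem (hypothetical): maximize the number of bits matching TARGET.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0]


def match_count(bits):
    return sum(int(a == b) for a, b in zip(bits, TARGET))


best = vns(match_count, [0] * len(TARGET), l_max=2, max_iters=20, seed=1)
print(best, match_count(best))
```

The outer loop plays the role of the stopping criterion in step 2, and resetting `l` to 1 after an improvement corresponds to returning to the first neighborhood structure.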

Classification accuracy
Accuracy calculation was used to measure the classification ability of the output class belonging to the ANN. Classification accuracy is the ratio of the total number of correct classification outputs to the total number of output samples. In this study, Eq. (1) was used to calculate the accuracy of the proposed algorithm:

\[
\text{Accuracy}(\%) = \frac{\text{Total number of correct classification outputs}}{\text{Total number of outputs}} \times 100 \tag{1}
\]
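As a small illustration of Eq. (1), the accuracy can be computed by comparing predicted and actual class labels. The label lists below are made-up examples, not data from the SPP dataset, and the function name is ours.

```python
def accuracy_percent(predicted, actual):
    """Classification accuracy in percent: correct outputs / total outputs x 100."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(int(p == a) for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Hypothetical two-class example: 9 of 10 outputs are classified correctly.
actual_labels = [1, 1, 2, 2, 1, 2, 1, 1, 2, 2]
predicted_labels = [1, 1, 2, 2, 1, 2, 1, 2, 2, 2]
acc = accuracy_percent(predicted_labels, actual_labels)
print(f"{acc:.2f}%")
```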

Dataset
The SPP dataset obtained from the 1 MW power plant installed in Konya was used for rule extraction from the trained ANN. The SPP dataset, with 6 attributes for the input layer of the ANN and one class value for the output layer, contains 866 samples. The ID symbols of the attributes ($P_1, P_2, \ldots, P_6$), attribute names, data types (continuous/categorical), numbers of binary variables, and subrange values of the attributes are shown in Table 1.
Subrange values were determined to convert the continuous data of the attributes in the dataset into categorical data. Simple partition and simple binning methods were applied to obtain the subrange values. To present the input layer of the ANN, the values of the categorical attributes were converted to binary-encoded values, and the binary-encoded values are shown in Table 2. A number of methods have been used for binary encoding of attribute values in the SPP dataset [9]. In the SPP dataset, T was considered to have a number  [6]. Since the SPP dataset used in the study has two classes, the number of nodes (S) in the output layer of ANN was determined to be 2.
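The conversion from continuous values to subranges and then to binary-encoded inputs can be sketched as follows. Because Table 1 and Table 2 are not reproduced here, the subrange boundaries and the two attributes in the example are hypothetical, and the one-bit-per-subrange encoding is an assumption chosen for illustration; the study may have used a different binary code.

```python
from bisect import bisect_left


def subrange_index(value, upper_bounds):
    """Index of the subrange containing `value`.

    Subranges are (-inf, b0], (b0, b1], ..., (b_last, +inf), i.e. right-closed.
    """
    return bisect_left(upper_bounds, value)


def one_hot(index, size):
    """Binary vector of length `size` with a single 1 at position `index`."""
    return [1 if k == index else 0 for k in range(size)]


def encode_sample(sample, bounds_per_attribute):
    """Concatenate the one-hot codes of all attributes into one input vector."""
    bits = []
    for value, bounds in zip(sample, bounds_per_attribute):
        bits.extend(one_hot(subrange_index(value, bounds), len(bounds) + 1))
    return bits


# Hypothetical boundaries for two attributes, three subranges each.
BOUNDS = [[300.0, 600.0], [10.0, 25.0]]
encoded = encode_sample((450.0, 25.0), BOUNDS)
print(encoded)
```

With six attributes and three subranges each, an encoding of this kind would yield 18 binary inputs; this is offered only as one layout that would be consistent with an 18-node input layer, not as the layout actually used.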

Results
In this study, a new method for rule extraction from the trained ANN was proposed using the SPP dataset. The VNS method was used to optimize the fitness function; thus, the most effective and correct rules were extracted. The aim is to obtain the optimum values that maximize the fitness function $F_s$.
Using the $W^1_{i,j}$ and $W^2_{j,s}$ weight vectors, the value of the $s$th output node, $F_s$, was obtained by Eq. (2), where $I$, $J$, and $S$ are the numbers of nodes in the input, hidden, and output layers, respectively, and $\theta_j$ is the threshold value for the $j$th neuron of the hidden layer. $F_s$ is the fitness function, based on the sigmoid function, used in the VNS method. The maximum value that the fitness function can take is 1. For rule extraction from the trained ANN, the input vector that maximizes the fitness function $F_s$ was sought. The optimal binary-coded values were used to acquire a rule for class $s$ by Eq. (3). The steps of the proposed method are as follows:
1. After separating the input and output vectors of the dataset, the data are converted to binary-encoded values.
2. Using the binary-coded input and output vectors, the ANN is trained.
5. An initial population g is created. The sets of neighborhood structures $V_l$ ($l = 1, 2, \ldots, l_{max}$) are determined, and a stopping criterion is established for terminating the system.
6. The VNS method is applied to obtain optimal solution values from the neighborhood structures, retaining the best fitness function value.
7. All solutions whose $F_s$ values are greater than the accepted threshold are kept.
8. The solutions with the maximum value are selected for rule extraction, added to the rule list, and deleted from the solution set.
9. These solutions are converted into linguistic rules in the if-then structure.
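Eq. (2) is not reproduced in this text. As a hedged illustration of the kind of fitness function described, the sketch below assumes the usual two-layer sigmoid form, $F_s = \sigma\big(\sum_{j} W^2_{j,s}\,\sigma(\sum_{i} x_i W^1_{i,j} + \theta_j)\big)$, with the threshold added as a bias term; the sign convention for $\theta_j$ and all weight values are assumptions made for the example, not values from the trained network.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def fitness(x, w1, w2, theta, s):
    """Value of output node s for binary input vector x.

    w1[i][j]: input-to-hidden weights, w2[j][s]: hidden-to-output weights,
    theta[j]: hidden-layer thresholds (assumed here to be added as biases).
    """
    hidden = [
        sigmoid(sum(x[i] * w1[i][j] for i in range(len(x))) + theta[j])
        for j in range(len(theta))
    ]
    return sigmoid(sum(hidden[j] * w2[j][s] for j in range(len(hidden))))


# Tiny hypothetical network: 2 inputs, 1 hidden node, 2 output classes.
w1 = [[0.0], [0.0]]
theta = [0.0]
w2 = [[2.0, -2.0]]
f0 = fitness([1, 0], w1, w2, theta, s=0)
f1 = fitness([1, 0], w1, w2, theta, s=1)
print(round(f0, 4), round(f1, 4))
```

Because the output passes through a sigmoid, the value always lies strictly between 0 and 1, which agrees with the statement that the maximum the fitness function can take is 1. A function of this form could be passed directly to a VNS routine as the objective to maximize over binary input vectors.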
In this study, 866 input vectors ($G_h$) were used for rule extraction from the trained ANN. Of these, 776 input vectors were used for training the ANN and 90 for testing. The ANN structure with the maximum fitness value was obtained with 18 input nodes, 7 hidden nodes, 2 output nodes, a test coefficient of 0.001, a momentum coefficient of 1, and a learning coefficient of 1. The activation function (AF) is the sigmoid function. The number of iterations and the minimum error rate were used as the stopping criteria of the ANN; they were set to 20000 iterations and a minimum error rate of 0.01. The ANN was trained on the SPP dataset with the trainlm training algorithm. The momentum parameter was used to prevent training from converging to and becoming stuck at a local minimum point. The maximum number of neighborhood structures $V_l$ ($l_{max}$) was set to 2. The neighborhood structures used in the study are random two interchange and random cross exchange, respectively. In the proposed method, changes in the number of rules were determined according to the accepted threshold values. The results showed that the proposed method is successful and stable. Figure 3 shows the graph of the results obtained from the experimental procedures together with the graph of the predictions produced by the ANN method. Figure 4 shows the regression curve of the performance class value of the SPP dataset.
The rules extracted from the trained ANN are indicated in Table 3 and Table 4. The binary-coded values in the dataset are used for rule extraction. If an attribute in the rule contains more than one value, the attribute values must be combined with the "OR" connector. The use of the operator "OR" is not desirable in the rule extraction process because "OR" is called the connector of uncertainty. Therefore, the rules without the operator "OR" have shown the consistency and stability of the proposed method.
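How a binary solution vector might be decoded into an if-then rule, and how an "OR" connector would arise, can be sketched as follows. The attribute names, subrange labels, and class label are hypothetical placeholders, since Tables 1 to 4 are not reproduced here; the layout of one bit per subrange is likewise an assumption.

```python
def decode_rule(bits, attributes, class_label):
    """Turn a binary vector into an if-then rule string.

    attributes: list of (name, [subrange labels]); each label owns one bit.
    More than one set bit within an attribute produces an "OR" in the rule.
    """
    expected_len = sum(len(labels) for _, labels in attributes)
    if len(bits) != expected_len:
        raise ValueError("bit vector length does not match the attribute layout")
    conditions, pos = [], 0
    for name, labels in attributes:
        segment = bits[pos:pos + len(labels)]
        pos += len(labels)
        chosen = [label for bit, label in zip(segment, labels) if bit == 1]
        if chosen:
            values = " OR ".join(chosen)
            if len(chosen) > 1:
                values = f"({values})"
            conditions.append(f"{name} is {values}")
    if not conditions:
        raise ValueError("no attribute value is selected")
    return "IF " + " AND ".join(conditions) + f" THEN {class_label}"


# Hypothetical layout: two attributes with three subranges each.
ATTRIBUTES = [
    ("P1", ["low", "medium", "high"]),
    ("P2", ["low", "medium", "high"]),
]
clean_rule = decode_rule([0, 1, 0, 1, 0, 0], ATTRIBUTES, "Class 1")
or_rule = decode_rule([1, 1, 0, 0, 0, 1], ATTRIBUTES, "Class 1")
print(clean_rule)
print(or_rule)
```

A rule such as the first one, with exactly one value per attribute, is the kind the text regards as free of uncertainty; the second shows the "OR" form that the study reports not observing in its extracted rules.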
When the rules in Table 3 and Table 4 are examined, no statements of uncertainty were observed in the structure of the rules. When an inappropriate rule is entered, the output "Not belonging to any class" is      0.157] is Performance = (-; 0.14]" is examined, the system will produce a performance prediction.
In this example rule, the estimated value generated is less than 0.14, as opposed to the input vector values entered. That is, the performance estimation value corresponding to the input values entered is below 77.77%. These rules are used in prediction, inference, and analysis.
When the regression curve of the performance attribute is examined, it is seen that the ANN method performs the training with an acceptable accuracy, with $R^2 = 0.834$.
With the proposed method, an accuracy value of 95.55% was obtained by using Eq. (1) in the process of extracting the rules from the trained ANN.

Discussion
Electricity generation from renewable energy sources can change rapidly and unpredictably because of sudden weather events. When demand increases, fossil-fueled power plants are put into operation to keep the mains voltage and frequency at the required values and to meet the energy demand. Due to the nature of renewable resources, the differences between estimated and real-time production vary and are uncertain. This volatility and uncertainty create a need for network flexibility in network planning and management. When large-scale renewable energy sources are connected to or disconnected from the network, it is difficult to respond to changes in demand within minutes at peak time. For the system to follow these rapid decreases and increases in output, generation plants other than renewable energy sources must be available and able to keep up with these accelerations. In this study, we investigated how the estimation and rule learning process can be improved by using VNS on an ANN trained with solar power plant data. In the literature review, we did not find a study that used these methods for solar power plants. In the rule learning study, the VNS method provided a more effective and efficient working environment. Researchers working on solar power plants can obtain the necessary estimates and inferences easily and quickly with this application. With the application developed, rules were extracted with an accuracy of 95.55%. The VNS metaheuristic method was used to optimize the ANN's prediction and rule extraction. It is an advantage that no rule extraction is performed on the ANN trained using the VNS method. It is a disadvantage that the number of hidden layers and the number of neurons in the hidden layer of the ANN are determined by random methods.
The study remains open to further development and analysis by those who research performance monitoring and analysis of solar power plants. We plan to test the method on datasets with more input features in the future. In the next study, new neighborhood structures will be developed for the VNS method with the aim of obtaining better results.

Conclusion
The impact of renewable energy sources, led by solar and wind energy, on the electricity grid is increasing day by day. Network flexibility is an important parameter in real-time electrical network operations. In systems with low flexibility, integrating new wind and solar plants into the grid reduces system durability, leads to high costs, and makes it difficult to incorporate sustainable energy into the system. Having more reserve power plants increases the unit cost of energy. Failure to have as many reserve power plants as necessary may result in loss of energy balance, undesirable changes in network frequency, and often service disruptions. This study is a first step toward obtaining the data required for predetermining network requirements by predicting the energy that any solar power plant can produce depending on the weather conditions. The proposed method is also intended to create an infrastructure that allows the auxiliary power plants and backup capacity to be kept in reserve to be estimated from the predicted production of the other existing solar plants. This study investigated whether it is beneficial to use a variable neighborhood search method together with an artificial neural network for performance estimation on data obtained from a solar power plant. Our study shows that it can benefit researchers by providing an interactive, interesting, and unique study experience. This study provides experimental support to the existing literature on estimation and optimization in solar power plant environments. The results of the experimental study showed that the VNS method is a potential support and improvement method, especially for trained ANN applications. In this study, a new method based on VNS was used to extract correct classification rules from an ANN trained on the SPP dataset. The proposed VNS method did not apply any approximation, examination, or information removal to the fitness function.
Computational results and experimental studies have shown the success of the proposed VNS method in achieving highly accurate and comprehensible rules. All 6 input attributes are used in the rules specified in Table 3 and Table 4, which indicates that the rules obtained are both successful and highly interpretable. In addition, the operator "OR", which expresses uncertainty, was not observed in the extracted rules. Thus, using this algorithm, high-accuracy decision support information can be accessed in large datasets. Experimental studies have shown that the proposed VNS algorithm can establish accurate and understandable classification rules. This allows quick decision-making and correct results for problems. In the future, we plan to conduct more efficient and realistic studies by adapting the VNS method to some machine learning algorithms. We aim to move the current study into mixed reality technology. We plan to increase the motivation of researchers by preparing more realistic and interactive models in future studies. With this system, energy-producing companies can meet their commitments to the companies that demand energy in a more planned and responsible way by estimating the amount of electricity they will produce. Because production depends on natural conditions, prediction is very important. Investors may be able to anticipate the potential energy to be generated based on weather forecasts, so that they can be informed in advance about the procedures required to manage their financial resources. With this study, energy provider companies will be able to perform weekly, monthly, and annual energy planning by using the predicted data from their own renewable energy power plants depending on the weather conditions. In addition, with the help of software that they can develop, they will be able to carry out this work more profitably.
Correct predictions will stabilize network reliability and prevent unnecessary operation of power plants, resulting in a significant increase in profitability. In the future, we aim to improve this approach and develop different methods.