Estimation of hourly global solar radiation using artificial neural network in Adana province, Turkey

Since global solar radiation (GSR) is an important parameter for the design, installation, and operation of solar energy-based systems, it is important to have precise information about it. Because the measuring devices are expensive and require regular operation and maintenance, solar radiation measurements cannot be taken frequently. On the other hand, measurements of other meteorological parameters such as relative humidity and ground surface temperature are more prevalent in meteorology stations. Therefore, the estimation of solar radiation is important for areas where measurements cannot be performed and for completing missing information in databases. Many different models, software, and simulation programs are utilized to calculate solar radiation data, provide an economic advantage, and obtain high accuracy. The main purpose of this study is to estimate solar radiation in Adana, which is located in the Eastern Mediterranean region of Turkey, by using an artificial neural network (ANN) model. The best estimation performance is obtained by optimizing the number of neurons used in the network's hidden layer with the trial and error method. With this aim, hourly data including wind speed, wind direction, humidity, actual pressure, and average temperature are taken as inputs while solar radiation is taken as the target. All these data, which are for 2018, have been taken from the Turkish State Meteorological Service. A linear correlation coefficient of about 0.87313 has been obtained with a mean square error (MSE) of 5.8262 × 10⁷ W/m² for the testing data set. The ANN's testing/validation results show that it has a low MSE, indicating the accuracy and adequacy of the network model. Besides, the predicted ANN output is found to be remarkably close to the measured target data in terms of the linear correlation coefficient.


INTRODUCTION
Population growth, techno-economic development, and the depletion of fossil fuel reserves in the world have revealed the need for energy and increased the demand for renewable energy sources. Solar energy, which is one of the renewable resources providing sustainable energy using solar panels, is a cost-effective alternative to fossil fuels and has attracted growing attention from researchers, governments, and industries in recent years. Solar energy has a wide range of applications such as generating electricity, comfort heating of spaces, production of hot water and steam for industries, water desalination, and solar drying [1]. There are two main ways to harness solar energy: solar photovoltaic (PV) systems and solar thermal collectors [2].
The literature contains numerous studies [1,3,4] emphasizing that using solar thermal collectors in the heating of buildings provides substantial energy savings. In [3], a transient simulation model of a solar-assisted heating system has been developed for a duplex house in a northern city of Iran. The solar collector angle has been optimized to absorb the maximum amount of solar irradiation for energy saving. In [4], a solar water heater has been modeled in MATLAB and simulated for a poultry farming home in Iran. The thermal performance of the system has been investigated by placing two auxiliary heaters within the storage tank and using them in three different cases. In [5], the energy and exergy efficiencies of a gas power plant integrated with solar collectors have been investigated. It is verified in [2,6-12] that nanofluids enhance the heat transfer and thermal conductivity properties of solar collectors. In [11], mathematical models have been developed using a neural network (NN), an adaptive neuro-fuzzy inference system (ANFIS), and experimental data to determine the thermophysical properties of nanofluids. In [12], statistical, regression, and experimental studies have been performed to evaluate the plausible application of an oil-based carbon nanotube nanofluid for solar thermal receivers. In [13], experimental research has been carried out to assess the thermal performance of an evacuated tube solar collector working with a carbon nanotube-water nanofluid. The thermal efficiency of the collector has been measured and then optimized using response surface methodology (RSM). In [14], the thermal energy consumption has been optimized by using electrochromic components with a new nanocomposite layer.
In recent years, a number of studies on the installation, technical characteristics, operation, and economic evaluation of solar photovoltaic systems have been reported. Hoseinzadeh et al. [15] analyzed the energy consumption of a zero-energy building using Design Builder software. The monthly electric power consumption of the building, the solar power generation capacity per month, and the surface area required by the solar panels were calculated. Dey et al. [16] presented the design, simulation, and economic evaluation of a 90 kW grid-connected PV system in India. Cui et al. [17] created a techno-economic model of a grid-connected PV system for domestic building applications to analyze energy production and economic performance.
In the design of solar energy systems, the climatic data of the region, such as global solar radiation (GSR), ambient temperature, moisture content, wind speed, rainfall profile, and clearness index, should be evaluated. Besides, attention should be paid to the proper selection of the major system components (PV module, inverter, and battery) and their sizing, the required installation area, shading conditions, and the optimum tilt angle when planning solar power system installations. For design purposes, for the continuity of solar systems, and for many other processes, precise knowledge of GSR is needed. The total amount of solar energy received by the Earth's surface is described as GSR, which is generally expressed in W/m². Pyranometers are used for measuring solar irradiance on a planar surface. However, the high cost of these devices, the requirements for calibration and maintenance, and the need for a specialist to use them reveal the need for alternative methods for collecting GSR data. In the literature, there are many techniques developed for the prediction of GSR by using other meteorological parameters that are readily available, such as humidity, temperature, and pressure.
Yıldırım et al. [18] performed a study on the estimation of global solar radiation by using an optimized model based on Artificial Neural Network (ANN) and Angström-Prescott methods. In this study, the accuracy of ten models with different functions was analyzed for 11 years of daily data collected from four stations in the Eastern Mediterranean Region of Turkey. In [19], two new methods were discussed to evolve parametric models for use in GSR estimation. With this aim, fourteen parametric models from the literature based on air temperature, maximum temperature, minimum temperature, precipitation, sunshine duration, and relative humidity were reviewed and evaluated. To perform the study, parametric data for Göksun, Tarsus, and Adana for the period between 2012 and 2015 were taken from the Turkish State Meteorological Service. Souza et al. [20] investigated eleven modified models (M1 to M11) based on the Angström-Prescott model to estimate the daily global and monthly averaged solar irradiation for Alagoas State, Northeastern Brazil, by using sunshine duration data between 2007 and 2010. It was stated in this study that cloudiness had a bigger effect on high errors than other geographical conditions. Results in [20] showed that the M1 model along with the M11 model had the best accuracy by virtue of their efficiency and simplicity.
There are some studies in the literature where GSR is estimated by making use of machine learning and deep learning methods. Khosravi et al. [21] took advantage of machine learning algorithms to make an accurate estimation of hourly GSR between the years 2010 and 2016 in Abu Musa Island. For this purpose, two different neural network models, namely N1 and N2, were built. The first model, N1, consisted of 5 meteorological parameters whereas the second one was a time-series estimation. Khosravi et al. [22] utilized some artificial intelligence methods in MATLAB to estimate daily GSR by taking data from twelve stations in distinct climatic zones of Iran. The results proved that the improved models used in that study could predict global horizontal irradiance. Kaba et al. [23] implemented deep learning methods on the daily GSR data taken from 34 stations in Turkey between 2001 and 2007. Astronomical factors, extraterrestrial radiation, and climatic parameters were taken into account as inputs and daily GSR was obtained as the output. It was shown that using more minimum and maximum temperature data increased the accuracy of the estimation and that, compared with previous studies, a deep learning method was a good alternative due to its high precision. In [24], two different architectures of the Weighted Gaussian Process Regression (WGPR) method were developed using data between 2013 and 2015 in the Saharan climate of the Ghardaia region in Algeria to forecast daily GSR. Results showed that the WGPR model could be chosen as a machine learning model with high accuracy. Hocaoğlu et al. [25] performed a study to estimate the GSR of Afyonkarahisar and Antalya in Turkey by applying the Mycielski Markov model to 6-hour data. Janjai et al. [26] showed that monthly average hourly geostationary satellite data collected from 25 stations between 1995 and 2002 could be used as an alternative for GSR prediction in Thailand. It was seen in [26] that the improved model gave results similar to real pyranometer measurements.
The LM algorithm is one of the most popular algorithms used to make an accurate estimation of GSR. Çelik et al. carried out a study using an optimized ANN model to estimate monthly GSR for Adana, Mersin, Antakya, and Kahramanmaraş. The readily available data for the years between 2000 and 2010 were taken from Turkish Meteorology Services, and the results showed that the model was appropriate and convenient due to its high accuracy [27]. Arslan et al. [28] developed an ANN model for the prediction of daily GSR in Mersin with data measured between April 2017 and March 2018 and compared the performance of the model with some models existing in the literature. Chang et al. [29] analyzed and improved two empirical solar models, namely Nimiya and Zhang, for all-sky conditions. A new solar model was developed by using hourly sunshine duration data for 2017 taken from the National Climate Data Center in Beijing, China. It was stated in [29] that the hourly sunshine duration parameter was very effective in estimating solar radiation under all-sky conditions and that the proposed model performed well. In [30], three new models were built using Moving Least Squares Approximation (MLSA) to estimate monthly average daily GSR for Antalya. The developed models were compared with thirteen models existing in the literature by using different error analysis methods.
Adana province is one of the places with high solar energy potential in Turkey, as can be seen from previous studies such as [31]. In this paper, a case study has been conducted in which the solar power potential of Adana province is revealed by the estimation of GSR. There is a lack of publications in the literature suggesting an ANN-based model developed using hourly meteorological data to estimate the GSR in that region. This study demonstrates the calculation of hourly GSR using the ANN model with the LM algorithm for Adana in order to fill this gap in the literature. With this aim, this paper is organized into four sections. Section 1 provides a literature review on GSR. In Section 2, the algorithm used in this study is presented. Section 3 focuses on how the collected data has been used and gives a discussion of the results. Finally, in Section 4, the significant contributions of the paper are highlighted.

MATERIALS AND METHODS
In the literature, ANN models based on the LM algorithm are mostly used for making accurate predictions of GSR. The ANN model is an alternative method to obtain predicted GSR data since it is not always possible to measure the GSR because of the cost and maintenance of the measuring instruments and the installation difficulties depending on surface conditions. The ANN consists of three main layers: an input layer that receives the measured data, an output layer that generates the estimated data after the calculations, and a hidden layer that creates a connection between the input and output layers. In this study, actual pressure, wind speed, wind direction, relative humidity, and average temperature have been chosen as inputs while GSR has been obtained as the output for Adana province, as shown in Figure 1. Table 1 summarizes the ranges of the data used to train the ANN.

Meteorological Data
Adana is one of the biggest cities in Turkey. Summers are hot and dry whereas winters are cool and rainy. Adana is located in the Eastern Mediterranean region and is bordered by Osmaniye to the east, Mersin to the west, and Nigde and Kayseri to the north. Its coordinates are 37°00′06″ northern (N) latitude and 35°19′44″ eastern (E) longitude. The altitude of Adana is 23 m and the distance from the sea is 160 km. The location of Adana province is shown in Figure 2.

Details of the Employed Methods
There are many methods in the literature to get an accurate estimation of GSR. However, choosing the most appropriate model for a particular subject or area is very important. In the literature, linear and non-linear types of the LM algorithm and ANN models have been discussed. Taking into account the statistical error made during estimation, the performance of the ANN has been improved and the best model has been chosen.

Artificial Neural Network
ANNs, which have been developed with inspiration from the nerve cell structure of the human brain, are systems with algorithms capable of modeling and learning. An ANN consists of artificial neurons, called process elements, that receive inputs, perform a nonlinear operation, and transmit the results to neighboring processors. Modeling of linear and non-linear systems can be performed easily using ANNs. The structure of a Feed Forward Backpropagation (FFBP) ANN is simple and consists of three main layers: input, hidden, and output, as illustrated in Figure 3.
The inputs are transported to the hidden layers after being multiplied by weights. The sum of the weighted inputs provides the net input of the neuron. The activation threshold of the neuron (bias), with its positive or negative value, is added to the net input. Since the neurons in the hidden layers make a connection with the inputs, a bias is implemented to obtain a much better relationship. The output value is calculated by applying a mathematical function (transfer function) to the net input and is sent to other neurons. This process is expressed mathematically in the equation (Eq. 1) given below [18,19].
$$\text{output} = f\left(\sum_{k=1}^{n} W_k x_k\right) \tag{1}$$
Here, 'output' is the result generated by the neuron with respect to the inputs 'x_k', 'f' is the activation function, 'n' is the number of signals, and 'W_k' is the weight.
The ANN generates an output by processing the input data and determines the error after comparing the output with the target. The synaptic weights of the network are modified by the training algorithm in proportion to the error. The goal of the training process is to reduce the error below a predetermined value on an iterative basis [32].

Levenberg-Marquardt Algorithm
The LM algorithm, composed of two minimization methods, namely Gauss-Newton (GN) and Gradient Descent (GD), is applied in various disciplines to find solutions for nonlinear least-squares curve-fitting problems. With this algorithm, the sum of the squares of the errors between the estimated values and the measured data is reduced in an iterative process. The behavior of the LM algorithm depends on the difference between the parameters and their optimal values. The LM method acts more like a gradient-descent method when the parameters are far from their optimal values. On the other hand, it acts more like the GN method when the parameters are close to their optimal values. Furthermore, the LM algorithm has the following advantages [33]:
i. It is more robust than the GN method and more convergent than the GD method.
ii. Models with multiple free parameters that are not precisely known can be handled by this algorithm.
iii. If the initial guess is far from the target, an optimal solution can still be found by the algorithm.
The LM algorithm adaptively modifies the parameter updates between the GD update and the GN update [34]:
$$w_{k+1} = w_k - \left(J_k^{T} J_k + \mu I\right)^{-1} J_k^{T} e_k$$
Here, 'w' represents the weight vector, 'I' indicates the unit matrix, 'µ' expresses the combination coefficient, 'J' states the Jacobian matrix (P×M×N), and 'e' denotes the error vector (P×M×1).
µ is a parameter that can be changed. If µ is high, the algorithm behaves like the GD method, and if µ is low, it behaves like the GN method.
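The role of µ described above can be illustrated with a minimal sketch. It assumes a hypothetical two-parameter toy model, y = a·exp(b·t), rather than the network used in this study, and a simple multiply/divide-by-ten rule for adapting µ; the function names `lm_fit`, `residual`, and `jacobian` are illustrative only and are not taken from MATLAB's 'trainlm'.

```python
import numpy as np

def lm_fit(residual, jacobian, w0, mu=1e-2, max_iter=100, tol=1e-12):
    """Basic Levenberg-Marquardt loop minimizing the sum of squared errors."""
    w = np.asarray(w0, dtype=float)
    e = residual(w)
    sse = float(e @ e)
    for _ in range(max_iter):
        J = jacobian(w)
        # LM step: solve (J^T J + mu*I) dw = J^T e, then w_new = w - dw
        A = J.T @ J + mu * np.eye(w.size)
        dw = np.linalg.solve(A, J.T @ e)
        w_new = w - dw
        e_new = residual(w_new)
        sse_new = float(e_new @ e_new)
        if sse_new < sse:
            # Step accepted: lower mu, so the update moves toward Gauss-Newton
            converged = (sse - sse_new) < tol
            w, e, sse = w_new, e_new, sse_new
            mu = max(mu / 10.0, 1e-12)
            if converged:
                break
        else:
            # Step rejected: raise mu, so the update moves toward gradient descent
            mu = min(mu * 10.0, 1e12)
    return w, sse

# Hypothetical toy problem (assumption): noise-free samples of y = 2*exp(-1.5*t)
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)

def residual(w):
    # Error vector e = prediction - target
    return w[0] * np.exp(w[1] * t) - y

def jacobian(w):
    # Partial derivatives of each residual with respect to a and b
    return np.column_stack([np.exp(w[1] * t), w[0] * t * np.exp(w[1] * t)])

w_fit, sse_fit = lm_fit(residual, jacobian, w0=[1.0, 0.0])
```

In a network setting, `w` would collect all weights and biases and `J` would hold the derivative of every training error with respect to every weight; the acceptance test and the µ schedule shown here are only one common choice.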
There are many statistical error assessment methods used in the literature to evaluate the performance of ANN models. Among them, mean squared error (MSE), root-mean-square error (RMSE), mean bias error (MBE), mean absolute percentage error (MAPE), coefficient of determination (R²), the sum of squared relative errors (SSRE), and t-statistic (t-stat) are the most common evaluation metrics used to compare the estimated results with the actual data [35]. These evaluation metrics can be calculated by using the following equations (Eqs. 4-10), written here in their commonly used forms.
$$\mathrm{MSE} = \frac{1}{m}\sum_{i=1}^{m}\left(y_i - x_i\right)^2 \tag{4}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(y_i - x_i\right)^2} \tag{5}$$
$$\mathrm{MBE} = \frac{1}{m}\sum_{i=1}^{m}\left(y_i - x_i\right) \tag{6}$$
$$\mathrm{MAPE} = \frac{100}{m}\sum_{i=1}^{m}\left|\frac{y_i - x_i}{x_i}\right| \tag{7}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{m}\left(y_i - x_i\right)^2}{\sum_{i=1}^{m}\left(x_i - \bar{x}\right)^2} \tag{8}$$
$$\mathrm{SSRE} = \sum_{i=1}^{m}\left(\frac{y_i - x_i}{x_i}\right)^2 \tag{9}$$
$$t\text{-stat} = \sqrt{\frac{(m-1)\,\mathrm{MBE}^2}{\mathrm{RMSE}^2 - \mathrm{MBE}^2}} \tag{10}$$
In these equations, 'm' is the number of observations, 'y_i' and 'x_i' represent the predicted (output) and measured (target) data, respectively, and 'x̄' represents the arithmetic mean of the measured data.
MSE, which is obtained by dividing the sum of the squared errors by the number of observations, is probably the most commonly used error metric. It penalizes larger errors because squaring larger numbers has a greater impact than squaring smaller numbers. MSE is preferred for the evaluation of the performance of the network in this study.

RESULTS AND DISCUSSION
This study aims to make an accurate estimation of GSR by using the five different inputs having the most influence for Adana. A total of 3285 input data points have been used, including hourly actual pressure, wind speed, wind direction, relative humidity, and average temperature for the year 2018. An FFBP ANN model has been used and initialized with random weights and biases. The training function is chosen as 'trainlm' (LM algorithm). Multiple models are trained instead of a single model to reduce the variance of the ANN and to obtain a better prediction. Twenty different ANN models are developed and tested with various structures and parameters, as given in Table 2.
As can be seen in Table 2, the numbers of neurons in the first hidden layer (N1) and the second hidden layer (N2) are selected randomly from 10 to 30. The learning rate has been randomly assigned between 0.75 and 0.85. The activation function is applied as 'logsig' (sigmoid) or 'tansig' (hyperbolic tangent sigmoid). Among the 20 models, the best network model was chosen as "ANN-5", taking into account the performance criterion (MSE) of these models.
The design parameters of the optimum ANN model developed in MATLAB are provided in Table 3. Two hidden layers are selected and the numbers of neurons used in the hidden layers are determined as 10 and 30, respectively. 'logsig' has been determined as an activation function for [...]. Using all these input values and functions mentioned above, GSR is predicted and the MSE is calculated as 5.8262 × 10⁷ W/m².
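To make the structure of such a network concrete, the following is a minimal sketch of a forward pass through a feed-forward network with five inputs, two hidden layers of 10 and 30 neurons, and one output. The weights here are random placeholders rather than the trained values of the study, the input vector is made up, and the linear output layer is an assumption, since the source does not state the output transfer function.

```python
import numpy as np

def logsig(z):
    """Log-sigmoid transfer function, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def identity(z):
    """Linear output transfer function (assumed, not stated in the source)."""
    return z

def forward(x, layers):
    """Propagate an input vector through a list of (W, b, activation) layers.

    Each layer computes activation(W @ a + b): the weighted sum of its
    inputs plus the bias, passed through the transfer function.
    """
    a = x
    for W, b, act in layers:
        a = act(W @ a + b)
    return a

rng = np.random.default_rng(0)
sizes = [5, 10, 30, 1]  # inputs, hidden layer 1, hidden layer 2, output
layers = []
for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
    W = rng.normal(scale=0.5, size=(n_out, n_in))  # placeholder weights
    b = rng.normal(scale=0.5, size=n_out)          # placeholder biases
    act = identity if i == len(sizes) - 2 else logsig
    layers.append((W, b, act))

# Five normalized inputs (made-up values): pressure, wind speed,
# wind direction, relative humidity, average temperature
x = np.array([0.2, 0.4, 0.6, 0.8, 1.0])
y_hat = forward(x, layers)
```

Training with the LM algorithm would adjust every `W` and `b` so that `y_hat` approaches the measured GSR; only the inference step is shown here.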
The weights and biases of the presented ANN model are given in Table 4 (from the input layer to the first hidden layer), Table 5 (from the first hidden layer to the second hidden layer), and Table 6 (from the second hidden layer to the output layer).
Figures 4 and 5 provide the forecast and target values for the training and test data in terms of GSR. For the test data, the linear correlation between the target and the predicted results is calculated as 0.87313. According to the results, the regression line of the test and predicted data is given by y = 1.0 T + 710. As can be seen from Figures 4 and 5, there is a strong correlation between the target and prediction values, which also means that the network settings are suitable for this type and amount of data.
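As an illustration of how such figures can be obtained from paired target and predicted values, the sketch below computes the MSE, RMSE, MBE, the linear correlation coefficient, and the least-squares regression line. The five data points are invented for demonstration and are not taken from the Adana data set; the sign convention for the bias error (predicted minus measured) is an assumption.

```python
import numpy as np

def evaluate(target, predicted):
    """Return basic error metrics and the regression line predicted = a*target + b."""
    x = np.asarray(target, dtype=float)      # measured (target) values
    y = np.asarray(predicted, dtype=float)   # predicted (output) values
    err = y - x                              # assumed convention: predicted - measured
    mse = float(np.mean(err ** 2))
    slope, intercept = np.polyfit(x, y, 1)   # least-squares straight line
    return {
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
        "mbe": float(np.mean(err)),
        "r": float(np.corrcoef(x, y)[0, 1]), # linear correlation coefficient
        "slope": float(slope),
        "intercept": float(intercept),
    }

# Made-up example values, not measurements from this study
target = [0.0, 100.0, 200.0, 300.0, 400.0]
predicted = [10.0, 90.0, 210.0, 310.0, 380.0]
metrics = evaluate(target, predicted)
```

A slope close to 1 and an intercept close to 0 indicate that predictions track the targets without systematic scaling or offset, which is how a regression line of the form y = a T + b is usually read.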
To assess the effectiveness of the ANN model in the estimation of hourly global solar radiation, the validation performance shown in Figure 6 has been used. The graph shows that the best validation performance is achieved at epoch 118, where the MSE reaches its minimum value. The MSE values decrease as the number of epochs increases. Both training and test losses reach their minimum values after the 118th epoch and remain constant, which means that the model is well trained and a good fit is achieved. The comparison between the real and the estimated hourly GSR data from the ANN model for the days in August 2018 is shown in Figure 7. The blue curve shown in the graph is the estimated GSR while the red curve is the measured hourly GSR data. The error between the predicted and the target test values is shown for each sample in Figure 8. The trained ANN model is also used to predict the monthly average daily global solar radiation in 2018, as given in Figure 9.
The importance of the input parameters for the output of the model is evaluated by using sensitivity analysis based on the Cosine Amplitude Method (CAM). The strength of the relationship between the output and input parameters is obtained by using the following equation [36]:
$$R_{ij} = \frac{\sum_{k=1}^{N} x_{ik}\, x_{jk}}{\sqrt{\sum_{k=1}^{N} x_{ik}^{2}\, \sum_{k=1}^{N} x_{jk}^{2}}}$$
where N is the number of data samples, x_ik is the input parameter, and x_jk is the output parameter. The value of R_ij lies between 0 and 1. If R_ij is 0, no relationship is observed between the input and output parameters; the closer the value is to 1, the stronger the relationship between the input and output parameters [36]. The importance of the input variables of the proposed ANN model is provided in Figure 10. It can be observed that the most sensitive input parameter is wind direction and the least sensitive parameter is relative humidity. The order of sensitivity of the input parameters is Wind Direction > Average Temperature > Actual Pressure > Wind Speed > Relative Humidity.

CONCLUSION
Some data, such as relative humidity, pressure, and wind speed, direction, and intensity, are always available in meteorology stations since they do not require measurement with expensive devices. Besides, there is no need for specialists to perform the measurements to obtain these data. Nevertheless, the measurement of GSR, which is an important parameter for the design, installation, and operation of solar energy-based systems, is rather difficult for meteorology stations as the measuring device is very expensive and its calibration is hard. The studies in the literature reveal the importance of achieving high-accuracy GSR prediction as it provides an economic advantage. In recent years, many studies have focused on the estimation of GSR in a specific region by developing various models and algorithms. In this study, an ANN model, which is known to be effective in forecasting, is developed to make an accurate estimation of hourly GSR for Adana province. For this objective, meteorological variables, namely wind speed, wind direction, relative humidity, actual pressure, and average temperature, have been utilized as inputs in the developed ANN model. The numbers of neurons used in the hidden layers of the ANN are optimized by trial and error to get a better estimation. The effects of the number of neurons employed in the hidden layers and of the input parameters have been evaluated by using the statistical measures MSE and R for both the training and test/validation stages. According to the obtained statistical test results, the linear correlation between the target and the predicted values is found to be 0.87313 and the MSE is calculated as 5.8262 × 10⁷ W/m². The validation results indicate that the developed ANN model can be considered as an alternative to the existing estimation models in the literature with its satisfactory accuracy. Further studies will be carried out to develop the ANN model to obtain a better correlation. It is suggested to use algorithms such as the Genetic Algorithm or Particle Swarm Optimization for tuning the parameters of the ANN to yield satisfactory performance.

Figure 1. The flowchart of the ANN model.

Figure 4. Correlation between prediction and target values of training data.

Figure 5. Correlation between prediction and target values of test data.

Figure 6. The validation performance of estimation of hourly GSR with ANN model.

Figure 7. Comparative graph of the real and estimated values for hourly GSR in August 2018.

Figure 8. Individual errors graph of the ANN model.

Figure 9. Comparative graph of the real and estimated values for monthly average daily GSR in 2018.

Figure 10. Importance of input parameters to predict the GSR using ANN model.

Table 1. The minimum, maximum and mean values of the measured data

Table 2. The developed ANN models

Table 3. ANN design parameters for MATLAB implementation

Table 4. Weights of connections between neurons in the first hidden and input layer of ANN5 with bias

Table 5. Weights of connections between neurons in the first and second hidden layer of ANN5 with bias

Table 6. Weights of connections between neurons in the second hidden and output layer of ANN5 with bias