A new model to determine the hierarchical structure of the wireless sensor networks

Abstract: Wireless sensor networks are one of the rising areas of scientific research. A common goal of such research is to construct an optimal network structure that prolongs the network's lifetime. In this study, a new model is proposed to construct a hierarchical structure of wireless sensor networks. The methods used in the model to determine the clusters and the appropriate cluster heads are k-means clustering and a fuzzy inference system (FIS), respectively. The weighted averaging based on levels (WABL) defuzzification method is used to calculate the crisp outputs of the FIS. A new theorem for the calculation of WABL values is proved in order to simplify obtaining crisp values from the complex fuzzy outputs of the FIS. The proposed methodology is tested on a simulation example, and the experiments confirm its validity.


Introduction
The investigation of wireless sensor networks is one of the rising areas of recent scientific research. There are many different approaches in the literature to optimize the topology and to increase the lifespan of a wireless network. The fuzzy logic approach is used for this aim in some studies. For example, in [1], the authors compared the low energy adaptive clustering hierarchy (LEACH) protocol with a fuzzy logic model in order to estimate lifespans of wireless sensor networks. As a result, they concluded that the fuzzy logic model is more suitable than LEACH. In their study, they used energy, centralization, and concentration as criteria. In [2], the authors compared LEACH, the cluster head election mechanism using fuzzy logic (CHEF), and the fuzzy-based master cluster head election LEACH protocol (F-MCHEL) in order to estimate lifespans of wireless sensor networks.
As a result, they concluded that F-MCHEL guarantees a longer network lifespan than the other methods. In their study, they used energy and closeness to the base station as criteria, and they ran a simulation using MATLAB. In [3], the authors compared LEACH and the stable election protocol with a fuzzy logic model in order to estimate lifespans of wireless sensor networks. In their study, they used energy, closeness to the base station, and area distance as criteria. In [4], the authors compared the LEACH protocol with a fuzzy logic model in order to estimate lifespans of wireless sensor networks. They concluded that fuzzy logic with a Mamdani-type fuzzy inference system is more efficient than LEACH. In their study, they used energy and closeness to the base station as criteria. In [5], the authors compared the LEACH, CHEF, and distributed fuzzy clustering algorithm protocols in order to estimate lifespans of wireless sensor networks. In their study, they used energy and concentration as criteria. In [6], the authors compared the LEACH, CHEF, fuzzy clustering algorithm using a fuzzy logic approach for blending two clustering descriptors to select cluster heads (LEACH-ERE), and fuzzy logic based clustering algorithm protocols in order to estimate lifespans of wireless sensor networks. In their study, they used energy and closeness to the base station as criteria. In [7], the authors compared the LEACH, CHEF, energy aware distributed dynamic clustering protocol using fuzzy logic, energy aware unequal clustering using fuzzy approach, and multiobjective fuzzy clustering algorithm protocols with the fuzzy logic based energy efficient clustering hierarchy in order to estimate lifespans of wireless sensor networks. In their study, they used energy, closeness to the base station, and centralization as criteria.
* Correspondence: resmiye.nasiboglu@deu.edu.tr. This work is licensed under a Creative Commons Attribution 4.0 International License.
In [8], the authors compared a fuzzy logic model with some other methods and concluded that the fuzzy logic model is more efficient than the others. In [9], the authors compared a fuzzy logic model with the power-efficient gathering in sensor information systems (PEGASIS) and PEGASIS-Topology Control (PEGASIS-TC) models. They concluded that the fuzzy-based model consumes less energy than PEGASIS and PEGASIS-TC because of smart scheduling of the leader. In [10], the authors discussed different types of wireless sensor network topologies. In [11], the authors proposed an energy-efficient multihop hierarchical routing protocol for wireless sensor networks to enhance the network lifetime and avoid the formation of energy holes. In [12], the authors investigated the behavior of fuzzy logic controllers of Mamdani type. In [13], the authors proposed several clustering protocols and compared them with each other. In [14], the author used the LEACH and k-means clustering methods. By targeting nodes that are subject to a threshold value and close to each other, second-level clustering was performed and an energy-efficient close-nodes clustering method was developed. In [15], the author presented an energy-driven architecture as a new architecture for minimizing the total energy consumption of wireless sensor networks. Based on the proposed architecture, he introduced a single overall model and proposed a feasible formulation to express the overall energy consumption of a generic wireless sensor network application in terms of its energy constituents.
The fuzzy inference system (FIS) is widely used in recent sensor network studies. Defuzzification is an important part of the FIS, as it calculates a crisp output. Defuzzification operators such as mean of maxima (MOM), centroid of area (COA), and bisector of area (BOA) are widely used in FIS. In this study, however, we use the weighted averaging based on levels (WABL) method as the FIS defuzzification method. The WABL method is more general and more flexible than these defuzzification methods. Thus, it can produce results simulating other methods when appropriate adjustments are made. In [16], the authors investigated the concept of WABL for discrete trapezoidal fuzzy numbers and proved analytical formulas to facilitate the calculation of the WABL value for these fuzzy numbers. In [17], the authors presented a comparative analysis of methods for the defuzzification of fuzzy numbers, namely WABL, centroid, and MOM.
In this study, we propose a methodology to investigate the lifespan of a wireless sensor network. The methodology combines k-means clustering with an FIS that uses the WABL defuzzification operator. A new approach is proposed for calculating the crisp value with WABL, and a theorem that facilitates the calculations is proved. A simulation application was written in the Java programming language. The application allowed us to compare the lifespan of the systems according to the size of the area, the consumed energy, the number of nodes, and the cluster heads.
The rest of the paper is organized as follows. Section 2 contains preliminaries about wireless sensor networks, k-means clustering algorithm, and fuzzy inference system. The WABL method and related theorems facilitating calculations are given in Section 3. Section 4 contains the proposed methodology with simulated application results. Finally, the last section concludes the paper.

Wireless sensor networks
A wireless sensor network is a network of multiple nodes in a wireless environment. Wireless sensor networks are easy to install, flexible, and preferred in many areas. Their main applications include border security services, environmental monitoring and protection, field observations, air pollution observations, forest fire detection, landslide detection, and water quality observations. To give more specific examples, a temperature map can be constructed by dropping sensors from a plane over a forest fire; living creatures can be observed with sensors placed in wildlife areas; and the coordinates, directions of advance, and velocities of troops in military operations can be detected and observed.
Wireless sensor networks can be organized in various topologies. In this study, the hierarchical model is used. In the hierarchical model, the nodes are placed in the lowest layer. The nodes can be divided into one or more clusters. Each cluster sends its data to a node on the upper layer, called the cluster head node. Cluster head nodes perform their function by transmitting the data they have collected to an upper point called the base station (Figure 1).
$E_{DA}$: energy exhausted in data fusion (nJ/bit/report).
Then the energy consumption for transmitting $m$ bits of data to a node at distance $d$ from the transmitting node is calculated as:
In Figure 2, a simple radio model of receiver and transmitter is shown. The energy consumption for transmitting $m$ bits of data received from $n$ nodes (each node transmits $m$ bits of data) to the base station at distance $d$ from a cluster head is calculated as:
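As a rough illustration of this kind of energy accounting, the sketch below uses the first-order radio model that is common in the LEACH literature. It is an assumption made for the example, not necessarily the formulation used in this study: the constants `E_ELEC`, `EPS_AMP`, and `E_DA`, their values, the free-space $d^2$ loss, and the way the cluster head cost is assembled are all hypothetical.

```python
# Illustrative first-order radio model (assumed; not taken from the paper).
E_ELEC = 50e-9      # J/bit, electronics energy (assumed value)
EPS_AMP = 100e-12   # J/bit/m^2, amplifier energy (assumed value)
E_DA = 5e-9         # J/bit/report, data-fusion energy (assumed value)


def tx_energy(m_bits, d):
    """Energy to transmit m_bits over distance d with d^2 path loss."""
    return m_bits * E_ELEC + m_bits * EPS_AMP * d ** 2


def rx_energy(m_bits):
    """Energy to receive m_bits."""
    return m_bits * E_ELEC


def cluster_head_energy(m_bits, n_nodes, d_to_base):
    """Assumed cluster head cost: receive n reports, fuse them, and send
    one aggregated m-bit report to the base station."""
    receive = n_nodes * rx_energy(m_bits)
    fuse = n_nodes * m_bits * E_DA
    send = tx_energy(m_bits, d_to_base)
    return receive + fuse + send
```

Under these assumptions, the cost of a cluster head grows linearly with the number of member nodes and quadratically with its distance to the base station, which is why both the energy level and the position of a node matter when cluster heads are chosen.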

K-means clustering
In the previous section, we mentioned the clustering process that determines the clusters in the hierarchy of the wireless sensor network. For this purpose, the k-means clustering algorithm is used in our study.
K-means clustering is one of the simplest and most popular unsupervised machine learning algorithms. The main idea is to define k centroids, one for each cluster. Once these k centroids are available, each point of the data set is bound to its nearest centroid. This process continues iteratively. As a result of this loop, the k centroids change their location step by step until no more changes occur; in other words, the centroids do not move any more. The pseudocode of the k-means clustering algorithm can be described as below.
K-means clustering algorithm.
Step 1: The count (k) of clusters is determined by the user.
Step 2: Cluster centers $m_i$, $i = 1, \dots, k$, are determined by assigning random points.
Step 3: The distance to each cluster center is determined for each element and the element is assigned to the cluster with the nearest center.
Step 4: After each iteration the cluster center is recalculated for each cluster.
Step 5: Repeat steps 3 and 4 until the termination condition is reached.

End.
Euclidean distance is generally used to determine the distance specified in Step 3. The algorithm is terminated when there is no change in the cluster centers. Another way to end the algorithm is a convergence criterion. The algorithm can be terminated when there is no significant change between the consecutive values of the minimized criterion $J$ after iterations: where $d(p, m_i)$ is the distance between a data point $p \in C_i$ and its cluster center $m_i$.
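Steps 1 to 5 can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this study; the function name `kmeans`, the `seed` and `max_iter` parameters, the use of squared Euclidean distance, and the handling of empty clusters are assumptions made for the example.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on a list of 2-D points; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # Step 2: random initial centers
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Step 3: assign each point to the nearest center.
        labels = [
            min(range(k), key=lambda i: (p[0] - centers[i][0]) ** 2
                                        + (p[1] - centers[i][1]) ** 2)
            for p in points
        ]
        # Step 4: recompute each center as the mean of its members.
        new_centers = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centers.append((sum(p[0] for p in members) / len(members),
                                    sum(p[1] for p in members) / len(members)))
            else:
                new_centers.append(centers[i])   # keep an empty cluster's center
        # Step 5: stop when no center moves.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels
```

The sketch stops on the first termination condition (no change in the centers); the convergence criterion on $J$ could be used instead by comparing the summed distances of two consecutive iterations.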

Fuzzy inference system
Another tool we use in the wireless sensor network is the fuzzy inference system (FIS), which calculates the chance of each node in a cluster being selected as the cluster head. FISs are among the most important mathematical tools used in fuzzy logic (Figure 3). There are four widely used types of FIS: the Mamdani-type, Sugeno-type, Tsukamoto-type, and Larsen-type fuzzy systems. In calculating the chance of each node in the cluster to be the cluster head, it is clear that the distance of the node from the cluster center and its energy level are decisive factors, but it is difficult to give a definite formula. Given these factors, it is easier to say that the chance of being a cluster head is low, medium, or high. For this reason, a Mamdani-type FIS was used in this study (Figure 4).
Linguistic variables (terms) are determined for the input values in Mamdani-type systems. The output is obtained as a new linguistic value as a result of predefined rules. For example, in our FIS, the input variables of the system are X, indicating the energy level of the node, and Y, indicating its distance to the cluster center. The output of the FIS is the variable Z, indicating the chance of the node to be the cluster head. In this case, an example fuzzy rule system of the FIS can be defined as follows:
Rule 1: if X is "low" and Y is "central", then Z is "medium chance",
Rule 2: if X is "low" and Y is "distant from center", then Z is "small chance",
Rule 3: if X is "high" and Y is "central", then Z is "high chance",
Rule 4: …….,
A Mamdani-type FIS is a system that produces fuzzy outputs by processing fuzzy inputs. However, with today's technology, measurements from sensors used as inputs are generally crisp values. These values must therefore be converted to fuzzy ones, i.e. their membership degrees in the fuzzy input terms must be calculated. This process is called fuzzification. On the other hand, it is seen from the rules that the output of the Mamdani-type system is a fuzzy set. This fuzzy output must be converted to a crisp number in order to calculate the crisp chance value of the node. This process is called defuzzification. In this study, fuzzification was performed using triangular and trapezoidal fuzzy membership functions, and the WABL defuzzification method is used to produce crisp outputs. In addition, in order to facilitate the calculation of the complex FIS outputs discussed, new theoretical investigations on the WABL method were conducted in this study. The WABL-related research is presented in the next section.
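The fuzzification and rule-firing steps described above can be sketched as follows. The breakpoints of the membership functions, the normalization of both inputs to [0, 1], and the names `ENERGY`, `DISTANCE`, `RULES`, and `fire_rules` are hypothetical choices for illustration; only the three example rules and the use of triangular and trapezoidal shapes come from the text. The sketch uses the usual Mamdani convention of min for "and" and max for merging rules with the same output term, and it stops before defuzzification.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet a, c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership with feet a, d and plateau [b, c]."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)


# Hypothetical input terms on inputs normalized to [0, 1].
ENERGY = {"low": lambda x: trapezoidal(x, 0.0, 0.0, 0.2, 0.6),
          "high": lambda x: trapezoidal(x, 0.4, 0.8, 1.0, 1.0)}
DISTANCE = {"central": lambda y: trapezoidal(y, 0.0, 0.0, 0.2, 0.6),
            "distant": lambda y: trapezoidal(y, 0.4, 0.8, 1.0, 1.0)}

# Rules 1 to 3 from the text: (energy term, distance term, output term).
RULES = [("low", "central", "medium"),
         ("low", "distant", "small"),
         ("high", "central", "high")]


def fire_rules(x, y):
    """Return the firing strength of each output term for crisp inputs x, y."""
    strengths = {}
    for e_term, d_term, out in RULES:
        w = min(ENERGY[e_term](x), DISTANCE[d_term](y))   # fuzzy AND
        strengths[out] = max(strengths.get(out, 0.0), w)  # merge by max
    return strengths
```

Each firing strength clips the corresponding output term, and the union of the clipped terms is the fuzzy output that the defuzzification step then reduces to a single crisp chance value.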

WABL defuzzification method for complex fuzzy numbers
Some defuzzification operators, such as MOM, COA, and BOA, are commonly used in fuzzy inference systems. Another defuzzification operator, WABL, is more universal than these operators. The formula to calculate a defuzzified crisp representative value of a fuzzy number $A$ is as follows:
where $c \in [0, 1]$ is the optimism coefficient of the decision-maker, $p(\alpha)$ is the density function of the degrees' importance (degree-importance function), and $h$ is the height of the fuzzy number. We will consider the normal fuzzy numbers with $h = 1$. Moreover, in the literature, a parametric definition of $p(\alpha)$ is used, as shown below [18]:
There are various studies aiming to calculate WABL values easily for fuzzy numbers with $p(\alpha)$ functions as determined in (8) [16,17]. For example, [17] proved the following theorem:
Theorem 1 The WABL value of a trapezoidal fuzzy number $A(l, m_l, m_r, r)$ with degree-importance function (8) can be calculated as below (Figure 5):
In their study, [16] proved the following theorem in the case of equally weighted levels ($k = 0$):
Theorem 2 The WABL value of a trapezoidal fuzzy number $A(l, m_l, m_r, r)$ in the case of a degree-importance function with equally weighted levels ($k = 0$) can be calculated as below:
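To make the role of such closed-form results concrete, the sketch below compares a closed form with direct numerical integration for a trapezoidal number. It assumes the form of WABL commonly given in the literature, $\int_0^1 [c\,R(\alpha) + (1-c)\,L(\alpha)]\,p(\alpha)\,d\alpha$ with $p(\alpha) = (k+1)\alpha^k$, where $L(\alpha)$ and $R(\alpha)$ are the left and right ends of the $\alpha$-cut. This definition, the assignment of $c$ to the right side, and the function names are assumptions for illustration and may differ in detail from the formulas of this paper.

```python
def wabl_trapezoid(l, ml, mr, r, c=0.5, k=0):
    """Closed form under the assumed definition, with q = (k+1)/(k+2)."""
    q = (k + 1) / (k + 2)
    left = l + q * (ml - l)
    right = r - q * (r - mr)
    return c * right + (1 - c) * left


def wabl_numeric(l, ml, mr, r, c=0.5, k=0, steps=10000):
    """Midpoint-rule integration of the same definition over alpha in [0, 1]."""
    total = 0.0
    for i in range(steps):
        a = (i + 0.5) / steps
        left = l + a * (ml - l)        # left end of the alpha-cut
        right = r - a * (r - mr)       # right end of the alpha-cut
        total += (c * right + (1 - c) * left) * (k + 1) * a ** k
    return total / steps
```

For example, under these assumptions the trapezoid $A(0, 2, 4, 6)$ with $c = 0.5$ and $k = 0$ gives $0.5 \cdot 5 + 0.5 \cdot 1 = 3$, and the numerical integral agrees with the closed form.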
As shown in the theorems above, the calculation formulas of the WABL method are valid only for simple triangular and trapezoidal fuzzy numbers. In this study, the fuzzy numbers are formed as a composition of triangular and trapezoidal fuzzy numbers (Figure 6). For this reason, the analytic formulas in Theorems 1 and 2 cannot be used directly. Thus, the following novel theorem is proven:
Theorem 3 The WABL value of a complex fuzzy number consisting of fuzzy trapezoidal numbers in Figure 6, in the case of a degree-importance function with equally weighted levels ($k = 0$), can be calculated as follows:
NASİBOĞLU and ERTEN/Turk J Elec Eng & Comp Sci
Proof The general definition of the WABL is:
where
When the appropriate $x$ values are defined in place of $l$, $r$, $m_l$, and $m_r$ (see Figure 5) and the integral of the formula is taken over the ranges $[0, \mu_1]$ and $[\mu_1, h]$, the following can be written:
Eq. (19) can be simplified as follows, considering the condition $k = 0$ (hence, $p(\alpha) \equiv 1$):
When the trapezoidal fuzzy numbers $A_1$ and $A_2$ are considered separately as normal ones:
is obtained. When formula (20) is taken into consideration with equalities (22) and (24),
can be written, which completes the proof.
In the proposed model, the sensor nodes periodically send data at certain time intervals to the cluster head node of the cluster to which they belong. After each data transmission, the cluster heads are refreshed according to the chances of being a cluster head, calculated via the fuzzy inference system. Pseudocode of the algorithm that demonstrates the lifecycle of the sensor network is given below, and a more detailed flowchart is given in Figure 7.

Lifecycle of the system:
Repeat until the count of active nodes in the system is less than a fixed threshold:
Step 1. Construct sensor clusters via k-means clustering.
Step 2. For each cluster:
2.1. For each node in the cluster:
a. Calculate the energy level of the node;
b. Calculate the centrality of the node according to its location among the cluster nodes;
c. Calculate the chance of the node to be the cluster head via FIS;
2.2. Mark the node with maximum chance in the cluster as the cluster head.

End repeat.
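Step 2 of the lifecycle can be sketched for a single cluster as follows. The node representation (dictionaries with `x`, `y`, and `energy` keys), the use of the distance to the cluster centroid as the centrality measure, and the function `simple_chance` are hypothetical; in particular, `simple_chance` is only a stand-in for the FIS with WABL defuzzification and does not reproduce it.

```python
import math


def simple_chance(energy, dist):
    """Hypothetical stand-in for the FIS: more energy and a smaller
    distance to the cluster center give a higher chance."""
    return energy / (1.0 + dist)


def select_cluster_head(nodes, chance=simple_chance):
    """Return the index of the node with the highest chance in one cluster.

    nodes: list of dicts with keys 'x', 'y', 'energy'.
    """
    cx = sum(n["x"] for n in nodes) / len(nodes)
    cy = sum(n["y"] for n in nodes) / len(nodes)
    best, best_score = None, -math.inf
    for i, n in enumerate(nodes):
        dist = math.hypot(n["x"] - cx, n["y"] - cy)   # Step 2.1.b: centrality
        score = chance(n["energy"], dist)             # Step 2.1.c: chance
        if score > best_score:                        # Step 2.2: keep the max
            best, best_score = i, score
    return best
```

Passing the scoring function as a parameter keeps the selection loop independent of how the chance is computed, so a full FIS can replace the stand-in without changing the loop.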
In our model, the inputs of the FIS are fuzzy linguistic variables reflecting the energy level (Figure 8) and the centrality (Figure 9) of the node.
The output values of the rules, reflecting the chance of the node to be the cluster head, are given in Figure 10. According to the predefined rules, the output of the FIS is defuzzified via the WABL value, which reflects the overall chance of the given node to be the cluster head. Some of the rules constructed in the proposed model, which determine the chance of a node to be a cluster head according to its energy level ($E$) and its distance to the center ($C$), are as follows:
$R_1$: if E is very low and C is very close, then Chance is medium,
$R_2$: if E is very low and C is far, then Chance is low,
$R_3$: if E is low and C is very close, then Chance is medium,
$R_4$: if E is low and C is close, then Chance is medium,
$R_5$: if E is medium and C is close, then Chance is high,
$R_6$: if E is medium and C is medium, then Chance is medium,
$R_7$: if E is high and C is very close, then Chance is medium,
$R_8$: if E is high and C is far, then Chance is medium,
$R_9$: if E is very high and C is close, then Chance is high,
$R_{10}$: if E is very high and C is very close, then Chance is very high.

Simulation of the model
An application that simulates the lifecycle of the sensor network was developed in this study. In this application, the coordinates of the nodes are randomly generated in the area [0, 100] × [0, 100] to represent wireless sensor nodes dropped from a plane into a certain area. Cluster validity indexes can be used to determine the optimal number of clusters, but in this simulation, for convenience, the number of clusters is set to 2. The application is coded in the Java programming language, and the experiments were performed on a PC with an Intel i7 CPU and 8 GB of RAM. The input of the application is the set of nodes with their (x, y) coordinates and their energy levels. Given these inputs, the simulation works in accordance with the model proposed in the previous subsection. The other parameters used to perform the simulation are as follows:
Once the application is launched, the counts of active and passive nodes can be tracked simultaneously. The energy level of a node can be seen by clicking the node on the graphic. When the count of active nodes is less than 10% of all nodes, the simulation ends (Figures 11 and 12).
The summary values after every 30 iterations of the simulation application are presented in the table given in Figure 13. The table shows the number of active and passive nodes for each iteration, the minimum and maximum energy levels, and the energy levels of the cluster head nodes. Nodes with a remaining energy level below 5% of the initial energy level ($E < 0.05 \cdot E_0$) are removed from the system. If the number of nodes remaining within the system is above 10% of the initial number of nodes, the remaining nodes are clustered again with the $k$-means algorithm, and this is repeated iteratively. If the number of active nodes remaining is less than 10% of all nodes, the system reaches the end of its life and the process ends. With the running application, the proposed model for establishing a hierarchical structure of wireless sensor networks has been tested. From the table and the graphical results of the application, we can conclude the following:
• The wireless sensor network works successfully, and the energy usage of the nodes is balanced effectively.
• In the visual graph of the program, the cluster head nodes are selected from the regions close to the center.
• The distance between the nodes and their cluster head remains reasonable and stable.
• Clusters appear around the cluster head nodes.
• The remaining energy levels of the nodes can be observed instantaneously on the graph, and the energy of each node decreases in parallel with that of the other nodes.
• Nodes falling below the desired energy level are left out of the system.
• When the total number of nodes in the system drops below a certain level, the system stops.
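The removal and termination rules used in the simulation can be written compactly as below. The function name `prune_and_check` and its parameters are hypothetical, and treating the exact 5% and 10% boundaries as "still active" and "still alive" is an assumption, since the text only specifies the cases strictly below and strictly above these thresholds.

```python
def prune_and_check(energies, e0, initial_count,
                    dead_fraction=0.05, stop_fraction=0.10):
    """Remove nodes with E < dead_fraction * e0 and report whether the
    network is still alive.

    Returns (active_energies, alive), where alive is True while the number
    of active nodes is at least stop_fraction of the initial node count.
    """
    active = [e for e in energies if e >= dead_fraction * e0]
    alive = len(active) >= stop_fraction * initial_count
    return active, alive
```

In a full simulation loop, this check would run after every transmission round, and re-clustering with k-means would be applied to the active nodes only while `alive` remains true.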

Conclusion
In this study, a new model has been proposed to construct a hierarchical structure of wireless sensor networks.
In the model, the clusters are detected by using the k-means clustering algorithm, and the node to be the cluster head is selected by means of an FIS using a fuzzy rule system. In this system, the decision-maker can use fuzzy linguistic values for the input and output variables, which is a great convenience. Without making mathematical calculations, the decision-maker can decide on the basis of more understandable linguistic fuzzy rules.
The WABL defuzzification method is used to calculate the crisp outputs of the FIS with fuzzy rules. A new theorem for the calculation of WABL values has been proved in order to simplify obtaining crisp values from the complex fuzzy outputs of the FIS.
Finally, the proposed methodology is demonstrated through a simulated application, and the practical usage of this approach is shown. In future studies, we aim to investigate the effects of the WABL parameters on the lifetime of the wireless sensor network.