Decision-making for small industrial Internet of Things using decision fusion

The industrial Internet of Things (IIoT) is a branch of the Internet of Things (IoT) that has recently gained popularity in industrial units because it makes information accessible anywhere and at any time. In other words, geographic location is no longer a barrier to reaching equipment and its data. Today, equipment can be managed and controlled through the IIoT without spending time in the operational area. The system collects data from manufacturing and production units through wireless sensor networks or other networks for fault detection and classification. After analysis, these data allow operational decisions to be made in less time. In fact, the IIoT increases the efficiency and accuracy of the “connection, collection, analysis, and operation” cycle. However, the information collected through different sensors in the IIoT is unreliable and uncertain because the sensors are sensitive to noise and failure and because information can be lost during transmission. One of the most important techniques for dealing with this uncertainty is decision fusion. Among decision fusion techniques, Dempster–Shafer theory and improved Dempster–Shafer theory, also known as Yager theory, are efficient and effective ways to manage uncertainty and have been used in many studies. This paper offers an architecture for decision fusion in a small IIoT using the Dempster–Shafer and Yager theories. In this architecture, data collected from the target environment are fed to classifiers, namely artificial neural networks and a dendrogram-based support vector machine. To increase the accuracy of the classifier results, the Dempster–Shafer and Yager theories are used to combine them. To evaluate its performance, the proposed method was applied to fault detection in an induction motor and to human activity detection. According to the results from these two use cases, the proposed method improved the accuracy of the system and significantly decreased its uncertainty.

The IoT is an approach that promotes interaction among objects and between objects and people, and new services are expected to emerge from it [3]. One of the primary goals of the IoT is to increase intelligence in life, business, industry, and economics [4]. It also offers solutions for a vast range of applications such as smart cities, traffic, waste management, security, emergency services, logistics, retailing, industrial control, and healthcare [5]. The industrial Internet of Things (IIoT) stands out as one of the most important and most widely used areas of the IoT [6]. In industrial units, this technology can connect all objects and create an integrated system for all kinds of information exchange, control, and monitoring tasks. It also provides better visibility and insight into corporate operations and assets by integrating sensors, middleware, software, and cloud processing and storage systems [7].
One of the important issues in industrial units is having a proper system for managing production and making decisions about its future. In this regard, mathematical and other models from various sciences and industries support decision-making and its management. However, the system parameters cannot be determined accurately because the details of the problem are uncertain. Uncertainty arises in many ways. For example, information about an issue is often obtained under unreliable conditions, and this uncertainty creates a potential risk of failure for devices and industrial systems. One of the most important issues in industrial systems is understanding the effect of device stoppages and how risk can be controlled and reduced by checking and increasing the reliability of the control system. Calculating reliability requires historical data on the operation of the devices, and if not enough data are available, the estimate is uncertain. Therefore, the probability that a device operates properly over a particular period cannot be determined with traditional methods [8]. To measure the reliability of a system, a failure-rate model is selected and its parameters are estimated from the available statistical data; however, such data are not always available, so in real situations decision-making faces uncertainty.
Information fusion theory uses multiple information sources to make optimal decisions. It involves the simultaneous integration of data or information from different sources to understand the conditions of a problem better and, consequently, to make more precise decisions. As mentioned before, the data received from different sources about an industrial environment are usually incomplete, ambiguous, and even contradictory. Accordingly, information fusion is used as a powerful tool for data mining and knowledge discovery [9]. At present, methods for combining information are employed in a wide range of applications, including web mining, biological data processing, remote sensing, intelligent transportation systems, human activity recognition [10], and disaster relief [11]. An IIoT equipped with a variety of sensors can also monitor industrial environments. However, the information collected through different sensors is unreliable and uncertain because the sensors are sensitive to noise and failure and because information can be lost during transmission [12,13]. Under such uncertainty, information fusion methods and tools are needed to predict system failures precisely. Various approaches have been used to predict system failures under risk. The most important of these are probability theory [14-16], evidence theory [17,18], the belief transfer model [19], and Bayesian inference [20]. Even though there is no consensus on the best model for dealing with uncertainty, the method applied in this paper is Dempster-Shafer theory [18,21], which is usually used to make decisions under uncertainty and when information about a particular decision is scarce.

Industrial Internet of Things (IIoT)
The IoT is an emerging technology that is expected to bring about dramatic changes in many existing industrial systems, such as transportation and production systems. The term "Internet of Things" was initially used for connected and identifiable objects equipped with radio frequency identification (RFID) systems [21]. Today, however, the IoT is a dynamic public infrastructure with self-configuring capabilities. It is based on sensor technologies, wireless communications, networking, and information processing techniques, and it integrates physical and virtual objects that have physical identities, virtual attributes, and intelligent interfaces into an information network using standard and interoperable protocols [22].
The presence of the IIoT everywhere, at all times, and in everything provides intelligent solutions to industrial problems in various areas [23]. The IIoT is created by placing IoT devices inside industrial infrastructure. Many manufacturers have used the IoT in various industrial applications, including supply chain management [24], product life-cycle management [25], quality management [26], and much more [27].
Data generated by IoT devices attached to industrial objects must be processed and converted into information and knowledge. The conversion of data into information is carried out by various data mining algorithms, after which the gathered data become more suitable for efficient decision-making [28]. Decision-making systems are used in different fields. According to Wu et al. [29], cognitive IoT can be used efficiently in industrial systems, and data obtained from IoT devices are used in the cognitive decision-making process. Researchers have used various models for decision-making systems, such as the consensus model [30], agent-based model [31], Bayesian model [32], neural networks [33], game theory model [34], and many others. Kaur and Sood [35] provided a game-theoretic method for assessing workers in an industrial environment, in which sensors embedded in intelligent industrial systems collect information to identify the workers' industrial activities, and each detected activity is then classified as positive, negative, or neutral. Decisions on employees' rewards and penalties are then based on these activities.

Information fusion and Dempster-Shafer theory
In simple terms, information fusion methods combine data from different sensors to predict the properties and states of a system more precisely. They then move the combined data towards the best decision by linking the information obtained to the analyzed system conditions. The purpose is to create an advanced, predictive model of the system based on the data obtained from several independent sensors. Using a larger number of sensors can greatly reduce the probability of errors, including random error, tool error, analysis error, and so on. A multisensor data fusion strategy can also increase the stability of the system's performance, since each sensor can still provide its own information to the user even if other sensors are corrupted or unavailable. Increasing the degree of confidence, reducing noise, and reducing the uncertainty of data rates are some of the significant advantages of data fusion in all applications. Therefore, data fusion is one of the most important requirements for increasing reliability in applications ranging from managerial decisions to status monitoring and troubleshooting [9,11,36].

The proposed method
In the context of the IIoT explained in the previous section, Figure 1 shows the general architecture of the proposed method for a small IIoT, following the model suggested by the ITU [37]. It consists of the device, network, services and application support, and application layers. Figure 2 shows a flowchart of the proposed method according to this architecture. The device layer includes sensors and actuators. Data are collected from the industrial environment using a wireless sensor network (WSN) or other sensors and are delivered to the network layer directly or through the WSN gateway. Actuators receive the decisions made by the layers above them through the network layer and act accordingly. The network layer includes different protocols for forwarding data through the Internet, such as IPv6, which is a key protocol in the IIoT because of its large address space and autoconfiguration capability, and 6LoWPAN, which enables the WSN to be part of the IIoT and to exchange information with industrial equipment and actuators [38]. The services and application support layer is responsible for data conversion, classification, and the combination of decisions, which increases the accuracy of the system for early warning and timely action. The application layer contains the specific application for the system. The services and application support layer is explained in detail below. It consists of four blocks: data conversion and feature extraction, database, classification, and information fusion.

Data conversion and feature extraction
Feature extraction is a process that identifies the significant and decisive features of the data and reduces their redundancy. Its purpose is to make raw data more accessible for subsequent processing. Thus, in this block, data conversion and feature extraction are performed if necessary. In this paper, the feature vectors of the signals collected from the system are extracted and reduced using statistical functions: kurtosis, skewness, standard deviation, mean value, fourth central moment, third central moment, figure of merit 4 (FM4), and root mean square [39].
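The statistical functions listed above can be sketched in a few lines. This is a minimal illustration under assumptions, not the implementation used in the paper: the function name `statistical_features` and the sample signal are hypothetical, population (not sample) moments are assumed, and FM4 is left out because it is computed on a gear difference signal that is not described here.

```python
import numpy as np

def statistical_features(signal):
    """Return time-domain statistics of a 1-D signal as a dict.

    Assumes a non-constant signal (std > 0); population moments are used.
    """
    x = np.asarray(signal, dtype=float)
    mean = x.mean()
    centered = x - mean
    std = x.std()                       # population standard deviation
    m3 = np.mean(centered ** 3)         # third central moment
    m4 = np.mean(centered ** 4)         # fourth central moment
    return {
        "mean": mean,
        "std": std,
        "rms": np.sqrt(np.mean(x ** 2)),
        "third_central_moment": m3,
        "fourth_central_moment": m4,
        "skewness": m3 / std ** 3,
        "kurtosis": m4 / std ** 4,      # non-excess kurtosis (3 for a Gaussian)
    }

# Hypothetical toy signal, only to show the call.
features = statistical_features([1.0, 2.0, 3.0, 4.0, 5.0])
```

Each signal segment would be reduced to one such feature vector before being passed to the classifiers.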

Database
An intelligent industrial system usually includes a database that stores raw data, processed information, and knowledge. The database coordinates all industrial system operations by providing the information needed now and in the future.

Classification
Classification is an area of supervised learning in machine learning and statistics in which samples are divided into categories, referred to as classes, whose members are similar to each other. A class is therefore a set of objects or features that are similar to each other and dissimilar to the objects or features in other classes. Many algorithms are used for this purpose, including artificial neural networks [40], the adaptive neuro-fuzzy inference system (ANFIS) [41], the dendrogram-based support vector machine (DSVM) [42], and fuzzy theory [43]. In this block, features extracted from signals are classified using historical and real-time data in order to decide on the actions that must be taken in the environment. Because different algorithms or human evidence can be used for classification in this block, the resulting decisions may be uncertain and may even contradict each other. To reduce such discrepancies and uncertainties, another block called information fusion is required, which is explained in a later section.
In this paper, DSVM and ANN are used to classify the input data.

Dendrogram-based support vector machine (DSVM)
The support vector machine (SVM) is a learning machine, sometimes described as a particular kind of neural network, that minimizes structural risk instead of only the classification or modeling error [44]. This tool is very powerful and can be used in various fields such as classification, clustering, and modeling (regression). DSVM is a supervised learning method that uses binary SVMs for multiclass classification. For multiclass classification with DSVM, assume a set of input samples $x_1, x_2, \ldots, x_n$, each labeled by $y_i \in \{c_1, c_2, \ldots, c_k\}$, where $k$ is the number of classes ($k \le n$). The first step of the DSVM method is to compute the $k$ centers of gravity of the $k$ classes.
Then agglomerative hierarchical clustering (AHC) [45] is applied to these $k$ centers. In the next step, an SVM is associated with each internal node of the resulting dendrogram and is trained with the elements of the two subtrees of this node [42].
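The first two DSVM steps (class centers of gravity, then agglomerative clustering of those centers) can be sketched as follows. This is an illustrative sketch under assumptions: the helper names are hypothetical, the distance between clusters is taken as the Euclidean distance between the means of their member centroids (the linkage is not stated in the text), class labels are assumed to be non-tuple values, and the binary SVM training at each node is only listed as a task rather than performed.

```python
import numpy as np

def class_centroids(X, y):
    """Center of gravity of each class, keyed by class label."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    return {c: X[y == c].mean(axis=0) for c in np.unique(y).tolist()}

def build_dendrogram(centroids):
    """Merge the two closest clusters until one binary tree remains."""
    clusters = [(label, [label]) for label in centroids]   # (tree, members)

    def center(members):
        return np.mean([centroids[m] for m in members], axis=0)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = np.linalg.norm(center(clusters[i][1]) - center(clusters[j][1]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return clusters[0][0]

def leaves(tree):
    """Class labels under a (sub)tree."""
    return leaves(tree[0]) + leaves(tree[1]) if isinstance(tree, tuple) else [tree]

def svm_tasks(tree):
    """One binary task per internal node: left-subtree classes vs right-subtree classes."""
    if not isinstance(tree, tuple):
        return []
    left, right = tree
    return [(leaves(left), leaves(right))] + svm_tasks(left) + svm_tasks(right)

# Hypothetical 2-D samples for three classes.
X = [[0, 0], [0, 0], [1, 0], [1, 0], [10, 10], [10, 10]]
y = [0, 0, 1, 1, 2, 2]
tree = build_dendrogram(class_centroids(X, y))
tasks = svm_tasks(tree)
```

For k classes the tree has k - 1 internal nodes, so k - 1 binary SVMs would be trained, and a test sample would be classified by descending from the root to a leaf.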

Artificial neural networks (ANN)
Artificial neural networks (ANNs) are important tools in the field of computational intelligence. Various types of ANN have been introduced, and they are mainly used in applications such as classification, clustering, pattern recognition, modeling and function approximation, control, estimation, and optimization. One of the simplest and most efficient models of real neurons is the multilayer perceptron (MLP) [46], which consists of an input layer, one or more hidden layers, and an output layer. In this structure, every neuron of one layer is connected to every neuron of the next layer [47].
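The fully connected structure described above can be illustrated with a forward pass through a small MLP. This is a sketch under assumptions rather than the network used in the paper: the layer sizes (8 inputs, 10 hidden neurons, 5 outputs), the tanh hidden activation, the softmax output, and the random untrained weights are all chosen only for illustration.

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass: tanh hidden layers, softmax output layer."""
    a = np.asarray(x, dtype=float)
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.tanh(W @ a + b)          # every neuron sees all outputs of the previous layer
    z = weights[-1] @ a + biases[-1]
    e = np.exp(z - z.max())             # shift by the max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
sizes = [8, 10, 5]                      # assumed: 8 features, 10 hidden neurons, 5 classes
weights = [rng.normal(size=(n_out, n_in)) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

probs = mlp_forward(rng.normal(size=8), weights, biases)
```

The output vector sums to one and can be read as the classifier's support for each class, which is the kind of per-class output a later fusion step can consume.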

Information fusion
In this block, the results of the different classifiers are combined to reach high accuracy and decrease uncertainty. Various inferential methods are used for this purpose, such as Bayesian inference, Dempster-Shafer theory, fuzzy logic, neural networks, abductive reasoning, and semantic information fusion [9,48]. In this paper, Dempster-Shafer theory is used for decision fusion in the final step of the proposed method.

Dempster-Shafer theory
Dempster-Shafer theory is one of the popular theories used for decision-making in intelligent systems under uncertainty and imprecision. Dempster's combination rule is a powerful tool for combining evidence from distinct information sources [49], and it can be used to evaluate risk and reliability in engineering applications when it is impossible to measure the accuracy of experiments or to obtain knowledge from expert inferences. One of the important aspects of this theory is the combination of evidence from different sources and the modeling of the conflict between them [21].
In its standard form, the basic probability assignment (BPA) $m$ over a frame of discernment $\Omega$ is a function $m: 2^{\Omega} \to [0, 1]$ with $m(\emptyset) = 0$ and $\sum_{A \subseteq \Omega} m(A) = 1$, where $2^{\Omega}$ is the set of all subsets of $\Omega$ and $\emptyset$ is the empty set.
The belief function (Bel) and the plausibility function (Pls) are among the most important functions in reasoning under uncertainty and represent, respectively, the lower and upper limits of the belief in a proposition $A$. In their standard form they are defined as:

$$\mathrm{Bel}(A) = \sum_{B \subseteq A} m(B), \qquad \mathrm{Pls}(A) = \sum_{B \cap A \neq \emptyset} m(B)$$

When there are several sources of information for decision-making, the information from these sources should be combined with an appropriate method before a final decision is made. For two sources with mass functions $m_1$ and $m_2$, Dempster's rule of combination in its standard form is:

$$m(A) = \frac{1}{1 - k} \sum_{B \cap C = A} m_1(B)\, m_2(C), \qquad A \neq \emptyset$$

where:

$$k = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)$$

Here $k$ represents the degree of conflict between the views of the information resources about the event of interest. If the $k$ value for two sources is close to 1, the two sources are almost completely contradictory, and as $k$ approaches zero, the two sources become more compatible. Dempster's rule of combination discards the conflicting mass through normalization, so it is not appropriate for highly conflicting information resources. In this regard, Yager improved this theory by assigning the conflicting information to the set Θ, which means the classifier does not know the class of the input data [21]. In this formulation, $\alpha_i$ is the importance factor of the $i$th evidence or information resource, as shown in the flowchart of Figure 2, and the average of each classifier is taken as its $\alpha_i$.
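Belief, plausibility, and Dempster's rule can be made concrete with a short sketch. The code below implements the standard textbook definitions, not necessarily the exact formulation used in the paper; the function names, the representation of a mass function as a dict from `frozenset` to mass, and the numeric masses are assumptions chosen for illustration.

```python
from itertools import product

def belief(m, A):
    """Bel(A): total mass of the focal sets contained in A."""
    return sum(v for B, v in m.items() if B <= A)

def plausibility(m, A):
    """Pls(A): total mass of the focal sets that intersect A."""
    return sum(v for B, v in m.items() if B & A)

def dempster(m1, m2):
    """Combine two mass functions; return (fused masses, conflict k)."""
    fused, k = {}, 0.0
    for (B, v1), (C, v2) in product(m1.items(), m2.items()):
        inter = B & C
        if inter:
            fused[inter] = fused.get(inter, 0.0) + v1 * v2
        else:
            k += v1 * v2                # mass that falls on the empty set
    if k >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    return {A: v / (1.0 - k) for A, v in fused.items()}, k

# Hypothetical masses from two classifiers over two classes plus ignorance (theta).
c0, c1 = frozenset({"class0"}), frozenset({"class1"})
theta = c0 | c1
m1 = {c0: 0.6, c1: 0.3, theta: 0.1}
m2 = {c0: 0.7, c1: 0.2, theta: 0.1}

fused, k = dempster(m1, m2)
bel_c0 = belief(fused, c0)
pls_c0 = plausibility(fused, c0)
```

With these example masses the conflict is moderate (k = 0.33) and the fused support for class0 is higher than either source gave on its own, which is the effect the fusion step relies on.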
Also, Yager employed ground probability assignments ($q$) instead of basic probability assignment functions ($m$). In Yager's original formulation, $q(A) = \sum_{B \cap C = A} m_1(B)\, m_2(C)$, so that $q(\emptyset)$ collects the conflicting mass. The Yager method applies when $q(\emptyset)$ is greater than zero, which means that a conflict has occurred between two classifiers; otherwise, Dempster's rule must be employed. The new BPAs are defined from $O_i$, the output value of the $i$th information resource, and a corresponding combination rule is then applied, with a generalized form when the number of information resources is greater than 2.

In this section, we proposed an architecture according to the ITU model for the IoT. We used information fusion in the services and application support layer to increase the accuracy of the system, relying mainly on the combination rules of Dempster and Yager. In the next section, the proposed method is simulated and evaluated in an industrial case.
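The difference between the two rules is easiest to see on a highly conflicting example. The sketch below implements Yager's classical rule, in which the conflicting mass q(∅) is moved to the whole frame Θ instead of being normalized away. It is an assumption-laden illustration: the importance factors α_i and the output-based BPAs mentioned in the text are not included, and the function name and the masses are hypothetical.

```python
from itertools import product

def yager(m1, m2, theta):
    """Yager's classical rule: return (fused masses, conflict q(empty set))."""
    q = {}
    for (B, v1), (C, v2) in product(m1.items(), m2.items()):
        inter = B & C
        q[inter] = q.get(inter, 0.0) + v1 * v2   # ground probability assignment
    conflict = q.pop(frozenset(), 0.0)           # q(empty set)
    q[theta] = q.get(theta, 0.0) + conflict      # conflict becomes ignorance
    return q, conflict

# Two sources that almost completely disagree (a Zadeh-style example).
a, b, c = (frozenset({s}) for s in ("class1", "class2", "class3"))
theta = a | b | c
m1 = {a: 0.99, b: 0.01}
m2 = {c: 0.99, b: 0.01}

fused, conflict = yager(m1, m2, theta)
```

Here Dempster's rule would assign all mass to class2 even though both sources consider it very unlikely, whereas the classical Yager rule leaves almost all mass on Θ, signalling that the class is unknown. When the conflict is zero, the two rules give the same result.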

Results and discussion

Case study 1
In this section, a three-phase induction motor with a speed of 1800 rpm, four poles, 24 stator slots, and an air gap length of 0.7 mm is used to illustrate the performance of the proposed method. Sound and vibration signals were collected under healthy conditions and four fault conditions, namely bearing fault, mass unbalance, stator fault, and broken rotor bar. In the remainder of this paper, class0, class1, class2, class3, and class4 denote the healthy status and the four fault conditions, respectively [50].

Result of proposed decision fusion method
The proposed method is simulated on the MATLAB platform according to the flowchart in Figure 2. As explained in the previous section, sound and vibration signals were collected under five conditions of the system (class0, class1, …, class4). For each type of signal, a wavelet transform is used to transform the data to the time-frequency domain. Because these signals contain a large set of data for each class, statistical functions are used for feature extraction and data reduction. After that, 75% of the data are fed to the ANN and DSVM, which are considered as evidence (classifiers) here, for training, and the remaining 25% are used for testing.
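The 75%/25% division of the feature vectors can be sketched as follows. The text does not say how samples were assigned to the two sets, so the random shuffle, the fixed seed, and the function name `split_75_25` are assumptions made only for illustration.

```python
import numpy as np

def split_75_25(features, labels, seed=0):
    """Shuffle the samples and return (X_train, y_train, X_test, y_test)."""
    X, y = np.asarray(features), np.asarray(labels)
    order = np.random.default_rng(seed).permutation(len(X))
    n_train = int(0.75 * len(X))
    train, test = order[:n_train], order[n_train:]
    return X[train], y[train], X[test], y[test]

# Hypothetical data: 20 feature vectors, 4 per class for 5 classes.
X = np.arange(40).reshape(20, 2)
y = np.repeat(np.arange(5), 4)
X_train, y_train, X_test, y_test = split_75_25(X, y)
```

A per-class (stratified) split would be a reasonable alternative when the classes are small, since a plain shuffle does not guarantee that every class appears in the test set.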
The results of applying the ANN and DSVM classifiers to the vibration and sound signals are given in Tables 1, 2, 3, and 4 and are compared with each other in Figure 3. As seen in these tables and in Figure 3, this level of precision does not allow a proper decision about the status of the system, and a single information resource with a single classifier cannot provide the desired accuracy. Therefore, more information resources and more classifiers are needed to increase the accuracy of the decision.

As mentioned in the previous sections, the combination rules of Dempster-Shafer or Yager are used to combine the classifier results. In this section, these combination rules are applied to each pair of results obtained from the classifiers; the results are reported in Tables 5, 6, 7, and 8 and compared in Figure 4. As shown in these tables and in Figure 4, the accuracy is better than in the previous results, but it is still not enough to make a decision close to certainty.

In the next step, decision fusion is applied to the results of all classifiers together. As shown in Table 9, when the results obtained from different classifiers that combine the data of several information resources are combined at the decision level, a result with acceptable accuracy is obtained. In this simulation, the average accuracy is 98.3705% (Table 9). Table 10 shows the results of the decision fusion of all classifiers. As seen, the accuracy obtained is acceptable for making a decision. To highlight this, the results of decision fusion, DSVM, and ANN are compared in Table 10 and Figure 5. As illustrated in Figure 5, decision fusion has high accuracy for all five classes.
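Fusing the outputs of all classifiers at once amounts to applying the combination rule repeatedly, since Dempster's rule is associative. The sketch below is an illustration under assumptions, not the paper's procedure: the four output vectors are invented, each is turned into a BPA by discounting it with an assumed reliability of 0.9 (the remaining 0.1 is assigned to Θ), and only singleton classes plus Θ are used as focal sets.

```python
from functools import reduce
import numpy as np

def to_bpa(probs, reliability):
    """Discounted BPA: masses on the singleton classes, remainder on theta (last entry)."""
    p = np.asarray(probs, dtype=float)
    return np.append(reliability * p / p.sum(), 1.0 - reliability)

def combine(a, b):
    """Dempster's rule when the focal sets are the singletons and theta."""
    singles = a[:-1] * b[:-1] + a[:-1] * b[-1] + a[-1] * b[:-1]
    theta = a[-1] * b[-1]
    total = singles.sum() + theta       # equals 1 - k
    if total == 0.0:
        raise ValueError("sources are in total conflict")
    return np.append(singles, theta) / total

# Hypothetical per-class outputs for one test sample (class0 .. class4),
# e.g. ANN and DSVM on vibration, then ANN and DSVM on sound.
outputs = [
    [0.60, 0.20, 0.10, 0.05, 0.05],
    [0.50, 0.30, 0.10, 0.05, 0.05],
    [0.55, 0.15, 0.20, 0.05, 0.05],
    [0.40, 0.35, 0.15, 0.05, 0.05],
]
bpas = [to_bpa(p, reliability=0.9) for p in outputs]
fused = reduce(combine, bpas)
decision = int(np.argmax(fused[:-1]))   # index of the class with the largest fused mass
```

In this toy example no single source gives class0 more than 0.54 after discounting, while the fused mass on class0 is well above that, which mirrors the accuracy gain reported when all classifier results are combined.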

Case study 2
In this section, the proposed algorithm is evaluated further on the activity recognition system based on multisensor data fusion (AReM) dataset [10]. This dataset contains received signal strength (RSS) data for cycling, lying down, sitting, standing, and walking activities, collected by wearable wireless sensors from system actors. These activities are denoted as class1, class2, class3, class4, and class5, respectively, in the rest of the paper. There are 15 temporal sequences available for each activity in this dataset. Each of them contains 480 samples and is collected by three pairs of sensor nodes (i.e. chest-right ankle, chest-left ankle, right ankle-left ankle) worn by the system actors. In [10], a decision tree was used to fuse the data flow from the sensors in order to achieve an effective classifier, and the data were finally classified using RNNs.

The proposed algorithm is applied to this dataset according to the flowchart shown in Figure 2. All of the data obtained from the sensors are given to the ANN and DSVM algorithms, the results obtained from these classifiers are combined using the Dempster-Shafer or Yager rule, and the final decision is obtained. Tables 11 and 12 show the accuracy of the proposed algorithm for human activity detection using the DSVM and ANN classifiers. To obtain these results, the RSS data obtained from the sensors worn by five system actors are first given to the above classifiers, and then the results obtained from each of them are combined. As can be seen, by combining the results of each of the DSVM and ANN classifiers, the average accuracy is 90.54179% and 91.5000%, respectively, which is not acceptable for decision-making. Table 13 shows the accuracy of the proposed algorithm when the results of the previous step are fused with each other.

As can be seen, when the results of the two classifiers are combined, an average accuracy of 99.3761% is obtained, which is acceptable for decision-making. To highlight the results in the three tables above, they are compared in Figure 6, where the difference between the accuracy of the proposed method and that of the two classifiers is significant. Figure 7 compares the accuracy of the proposed algorithm with the results of the Leaky Integrator Echo State Network (LI-ESN) and Input Delay Neural Network (IDNN) methods reported in [10] for this dataset. As can be seen, the proposed method performs better than these two methods, and its results are close to each other across the classes, whereas the results of the other two methods vary more from class to class. Also, the average accuracy of 99.3% obtained in this paper for this dataset is greater than that of the two mentioned methods, which have average accuracies of 98.8% and 96.90% for the same dataset.

Conclusion
In this paper, an architecture based on the ITU model is introduced for a small IIoT. In its services and application support layer, the Dempster-Shafer decision fusion method is used for data analysis and for making a sound decision. Before decision fusion, different classifiers are used to classify the data.
In this paper, DSVM and ANN are used for this purpose. Data collected from an induction motor are used to demonstrate the efficiency of the method. As shown in the previous section, with one information resource, or even with one classifier, the accuracy required for a precise decision cannot be obtained. Therefore, two information resources and two classifiers are used in this paper. According to the comparisons in the previous section, when the information obtained from several sources is evaluated using several classifiers and the results of these classifiers are combined using decision fusion methods such as Dempster-Shafer, high accuracy in decision-making is obtained. The efficiency of this algorithm was studied using two use cases. As seen in the results section, the proposed method obtained average accuracies of 98.3705 percent and 99.3761 percent for the first and second case studies, respectively. To use this method in operational, real-world applications, the algorithm must be implemented in a programming language that is suitable for real-time programming instead of MATLAB. The method can support decision-making in the services and application support layer of a small IIoT, helping to make accurate decisions and to apply the correct action to prevent possible errors and high losses in industrial units, because in such units the failure of one part can impose large losses on the entire system.

Future work
Our future work will focus on testing deep convolutional neural networks (DCNNs), model-based feature learning, and data fusion approaches on further mechanical objects, fault modes, and sensor types, which can confirm the effectiveness of the approaches and help us find further application guidance. Finally, combinations of different deep learning architectures should improve the effectiveness of fault decision-making.
Adding recurrent architectures such as RNNs may make the model suitable for predicting future faults, and combining it with an autoencoder architecture may improve its ability to learn more complex features. The training of complex models is made possible by fast NVIDIA graphics processing units, and pretrained neural network models can be deployed to the STM32 MCU family with the new STM32CubeMx.AI.