Reconstruction of the Taguchi Orthogonal Arrays with the Support Vector Machines Method

Copyright © BAJECE ISSN: 2147-284X http://dergipark.gov.tr/bajece

Abstract: Design of Experiment (DOE) is a widely used method for examining experiments, especially in industrial production and robust design processes. It is a set of statistical approaches in which mathematical models are developed through experimental testing to estimate possible outputs for given input values or parameters. The method aims to determine the main factors that affect the results with the smallest number of experimental studies. In this study, the $L_{16}(2^{15})$ orthogonal array used in Taguchi parameter design was reconstructed with the Support Vector Machines learning model and the Pearson VII kernel function. With this model, array elements were classified with 87.04% accuracy. The new and original arrays were compared, and a 3.8% difference was measured between their Signal to Noise (S/N) ratios in an exemplary experiment.


I. INTRODUCTION
TODAY, in many manufacturing applications, Taguchi's orthogonal array catalogs are used for industrial designs. In the 1980s, at AT&T Bell Laboratories, Genichi Taguchi reworked the offline quality control methods that had been developed in Japan after World War II, under the name of robust design [1]. This method is generally called Taguchi orthogonal array design. For instance, determining the optimum levels of nine control variables requires testing 6,000 possible combinations, while only 18 experiments are adequate with this method [2]. To give another example, assume that eight factors affect the experiment during the development of a robust model. If one of these factors has two levels and seven of them have three levels, $2 \times 3^{7}$ = 4,374 experiments should be performed to reach optimum values. If there are 15 parameters with three levels each, $3^{15}$ = 14,348,907 distinct experiments are required. Taguchi's orthogonal array method is the solution for experiment designs of this type, which are practically difficult to carry out in full. This technique is used conventionally in industrial production.
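The run counts of a full factorial design follow from multiplying the number of levels of every factor. The short sketch below reproduces the two counts for the examples in this section; the function name `full_factorial_runs` is only an illustrative choice.

```python
from math import prod


def full_factorial_runs(levels_per_factor):
    """Number of runs in a full factorial design: the product of all level counts."""
    return prod(levels_per_factor)


# One factor at two levels and seven factors at three levels.
mixed_design = full_factorial_runs([2] + [3] * 7)

# Fifteen factors at three levels each.
fifteen_factors = full_factorial_runs([3] * 15)

print(mixed_design, fifteen_factors)
```

An orthogonal array replaces this exhaustive list with a small, balanced subset of the runs.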
Taguchi designs are also used frequently in Environmental Sciences [3,4], Agricultural Sciences [5], Physics [6], Chemistry [7], Statistics [8], and Medicine [9]. However, Genichi Taguchi did not share any information about the methods used to construct these arrays. Furthermore, it is difficult to distinguish the links between Taguchi's arrays and similar arrays published elsewhere [10,11].
In this study, we aimed to regenerate the $L_{16}(2^{15})$ Taguchi array using support vector machine methods, based on the two-level L4, L8, and L12 arrays found in the Taguchi array catalog. We compared the array that we obtained with the original array in a sample experiment. In addition, the variation of the factors affecting the results is discussed for the sample experiment.
This paper is organized as follows: In Section 2 we present information about orthogonal arrays. In Section 3 we briefly describe support vector machines and the Pearson VII function, and we describe the sequential minimal optimization algorithm that we used for classification. Finally, in Section 4 we describe our proposal and discuss our results.

II. ORTHOGONAL ARRAYS AND EXPERIMENT DESIGN
Orthogonal arrays were first described by C. R. Rao as hypercubes in 1946 [12] and were introduced as statistical combinatorial arrangements. An orthogonal array consists of N rows and k columns, with entries selected from a set of s symbols or levels defined as S = {0, 1, ... , s-1}. The term "level" in the array definition indicates the level of the factors and variables that have an impact on the test components in experimental designs using these arrays. An N × k matrix A formed in this way is called an orthogonal array of level s and strength t. In this notation, s denotes the number of levels. If s is equal to 2, there are two levels in the array, 0 and 1.
The strength t lies in the interval 0 ≤ t ≤ k. In every N × t subarray of A, each possible t-tuple over S appears as a row the same number of times. This number of repetitions, λ, is called the index of the array, so that $N = \lambda s^{t}$. When λ = 1, the orthogonal array is said to have index unity. Orthogonal arrays are briefly denoted as OA(N, k, t, s) or OA(N, $s^{k}$, t). In summary, the integers N, k, s, and t are the parameters of an orthogonal array: N, the number of rows, is the number of runs (experiments performed); k, the number of columns, is the number of factors affecting the experiment; s is the number of levels; and t is the strength.

Let OA(12, 11, 2, 2) be the orthogonal array with 12 rows (runs), 11 columns (factors), 2 levels, and strength 2. This array is shown in Table 1. For instance, let the first and last columns of the orthogonal array in Table 1 be selected. The resulting array is shown in Table 2. In Table 2, four distinct rows can be seen: 0 0, 0 1, 1 0, and 1 1, each repeated three times, so the index of this array is λ = 3.

The values 0 and 1 in this array can be interpreted statistically as two different levels of a factor. For instance, consider an experiment conducted on food ingredients, in which the designers have standardized all ingredients except the sugar amount and the fat type. The values in the two columns of Table 2 could then be read as "sugary" and "sugarless", and as "margarine" and "butter", respectively. There are two factors in this experiment, and each factor has two levels. In this respect, the array is called a two-level array. The 11 columns of the matrix in Table 1 correspond to 11 different variables, and the 12 rows correspond to 12 different experiments that can be run on these 11 variables.
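The strength property can be checked mechanically: for every choice of t columns, each possible t-tuple must occur equally often. The following is a minimal sketch of such a check, not the construction used in this study. The function name `is_orthogonal` is an illustrative choice, and the small two-level array with 4 runs and 3 columns is used only as test input.

```python
from collections import Counter
from itertools import combinations


def is_orthogonal(array, strength=2):
    """Return True if every set of `strength` columns shows each tuple equally often."""
    levels = {value for row in array for value in row}
    expected_tuples = len(levels) ** strength
    n_columns = len(array[0])
    for cols in combinations(range(n_columns), strength):
        counts = Counter(tuple(row[c] for c in cols) for row in array)
        # Every possible tuple must be present, and all counts must be equal.
        if len(counts) != expected_tuples or len(set(counts.values())) != 1:
            return False
    return True


# Two-level array with 4 runs and 3 columns (third column = XOR of the first two).
L4 = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 0),
]

print(is_orthogonal(L4, strength=2))  # strength-2 check
print(is_orthogonal(L4, strength=3))  # only 4 of the 8 possible triples occur
```

The same function can be applied to a reconstructed array to see whether the balance property survived the reconstruction.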
Mixed-level orthogonal arrays are also defined, in addition to arrays with a single number of levels. In cases where the factors affecting an experiment have different numbers of levels, the orthogonal arrays used are extended accordingly. In mixed-level orthogonal arrays, the levels are drawn from more than one set S. In this case, the mixed-level arrays are denoted as OA(N, $s_1^{k_1}, s_2^{k_2}, \dots, s_v^{k_v}$, t). The first $k_1$ columns of the array take their values from the set $S_1$, the next $k_2$ columns from the set $S_2$, and so on. However, there is no index definition for mixed-level arrays.
The algorithms for constructing standard orthogonal arrays have been explained mathematically over time. Although Taguchi arrays are formed from existing arrays, there is no known construction method that covers all of them. Taguchi's catalog contains a total of 21 orthogonal arrays with 2, 3, and 4 levels. Unlike the existing orthogonal array notations, Taguchi arrays are denoted as follows, where N is the number of runs, s the number of levels, and k the number of factors:

$L_{N}(s^{k})$
Here, for example, in the array denoted as $L_4(2^3)$, 4 is the number of experiments (runs), and there are 3 factors with 2 levels each. Mixed-level arrays are defined similarly in the Taguchi catalog. For example, $L_8(2^3 4^1)$ means that the design has 8 runs, 3 factors with 2 levels, and 1 factor with 4 levels. $L_4(2^3)$ and $L_8(2^3 4^1)$ are shown in Table 3.

In Taguchi designs, the results obtained from the experiments are analyzed with a performance statistic called the Signal to Noise (S/N) ratio. Performance statistics are used to measure the effects of uncontrollable factors (noise factors) in the experiment. Noise factors cause deviations in the values obtained from the experiment. The three S/N ratios used in the Taguchi design are shown in Table 4. As the S/N ratio increases, the variance of the result around the target value decreases. When choosing the preferred level of an effective factor in experiment design, the level with the highest S/N ratio is taken.
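As an illustration of the S/N statistic, the sketch below implements the three textbook forms of the Taguchi S/N ratio (smaller is better, larger is better, nominal is best). It is assumed, not confirmed, that these correspond to the three ratios listed in Table 4. The function names and the sample responses are hypothetical.

```python
import math


def sn_smaller_is_better(ys):
    """S/N = -10 log10(mean(y^2)); rewards small responses."""
    return -10.0 * math.log10(sum(y * y for y in ys) / len(ys))


def sn_larger_is_better(ys):
    """S/N = -10 log10(mean(1 / y^2)); rewards large responses."""
    return -10.0 * math.log10(sum(1.0 / (y * y) for y in ys) / len(ys))


def sn_nominal_is_best(ys):
    """S/N = 10 log10(mean^2 / sample variance); rewards low spread around the mean."""
    n = len(ys)
    mean = sum(ys) / n
    variance = sum((y - mean) ** 2 for y in ys) / (n - 1)
    return 10.0 * math.log10(mean * mean / variance)


# Hypothetical repeated measurements of one experimental run.
responses = [9.0, 11.0]
print(sn_smaller_is_better(responses))
print(sn_larger_is_better(responses))
print(sn_nominal_is_best(responses))
```

In all three forms a higher value is preferred, which is why the level with the highest S/N ratio is selected for each factor.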

III. SUPPORT VECTOR MACHINES AND PEARSON VII KERNEL
Support vector machines (SVM) belong to the supervised learning paradigm of machine learning. They are used for case classification and pattern recognition in many research disciplines, such as medicine [13] and signal processing [14]. Combining decision trees with SVM is also an effective way to solve multi-class problems; decision trees are preferred for determining measurement performance and making predictions [15]. The algorithms used in SVM applications are very successful in classification problems, including data sets that cannot be separated linearly. In summary, these algorithms look for a separating hyperplane, or decision boundary, that separates the members of one class from those of another. If this decision boundary separates the classes well, the desired classification result is achieved. An SVM classification model requires two main components: the support vectors and the optimal decision boundary.
Assuming that there are two classes that can be easily separated from each other, there can be an infinite number of linear planes separating them. If each point indexed by i is represented as $(x_i, y_i)$, where the label $y_i$ is either +1 or -1, it is necessary to identify the planes separating these points and to determine which of them gives the best classification. An example classification result is shown in Fig. 1. When the values in classes 1 and 2 shift over time, the planes must be reconstructed to determine the optimum state, and the model must be rebuilt. SVM algorithms are used for this purpose. Two classification methods are used in SVM: C-SVM and nu-SVM.
C-SVM uses an error function that is conventionally employed to improve the fit of the SVM. In the C-SVM model, the value C systematically controls the tolerance for misclassified values. The coefficient C is an empirical parameter that is usually tuned by grid search. The C-SVM function is shown in equation 1, where w is the vector of coefficients and $\xi_i$ is the slack variable, also known as the training error of the ith training vector.
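To make the roles of w, C, and the slack variables concrete, the sketch below evaluates the standard soft-margin objective, one half of w·w plus C times the sum of the slacks, for a fixed hyperplane. This textbook form is assumed here and is not taken from equation 1. The slack of each point is computed as max(0, 1 - y(w·x + b)); the names `c_svm_objective`, `b`, and the sample points are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def c_svm_objective(w, b, xs, ys, c):
    """Return (objective, slacks) of the soft-margin SVM for a fixed hyperplane (w, b)."""
    # A point that lies on the correct side with margin >= 1 has zero slack.
    slacks = [max(0.0, 1.0 - y * (dot(w, x) + b)) for x, y in zip(xs, ys)]
    objective = 0.5 * dot(w, w) + c * sum(slacks)
    return objective, slacks


# Hypothetical training points: two outside the margin, one inside it.
points = [(2.0, 0.0), (-2.0, 0.0), (0.5, 0.0)]
labels = [1, -1, 1]
objective, slacks = c_svm_objective((1.0, 0.0), 0.0, points, labels, c=10.0)
print(objective, slacks)
```

A larger C makes every unit of slack more expensive, so the optimizer tolerates fewer margin violations.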
nu-SVM models have been developed as an alternative to the C parameter. In the solution of nonlinear problems, the C value is used as a weight parameter to measure learning errors and the tolerance of the plane and support vectors in the classification process. In the nu-SVM model, the parameter C of equation 1 is replaced by the parameter ϑ, which represents the fraction of support vectors expected in the solution of the problem. Thus, for any given value of ϑ in (0, 1], the shape of the classifier can be predetermined. The nu-SVM classifier is shown in equation 2,
subject to $\xi_i \ge 0$, $\rho \ge 0$.

In n-dimensional space, a hyperplane is an (n-1)-dimensional flat subspace. As can be seen in Fig. 1, the hyperplane that acts as a separator in 2-dimensional space is a one-dimensional line. The hyperplane for classes distributed in 3-dimensional space is a 2-dimensional plane. Expressed mathematically, in a space with n dimensions, a separating hyperplane is the set of points for which a linear combination of the coordinates sums to 0:

$$w_0 + w_1 x_1 + w_2 x_2 + \dots + w_n x_n = 0 \qquad (3)$$

The value $w_0$ in equation 3 is often referred to as the bias. If $w_0$ equals zero, the hyperplane passes through the origin.
However, most classification problems cannot be solved by a linear function. Various types of functions can be used to obtain the hyperplane between the two classes; these functions are called kernels. Besides the linear kernel function, polynomial, radial basis, and sigmoid functions are also used. The kernel functions are expressed as $K(x_i, x_j)$. With kernel functions, the data set classes are separated by using the C and ϑ parameters [16,17]. As shown in Fig. 2, when a classification problem cannot be solved by a linear function, a solution can be found in a space of higher dimension. The kernel function and its parameters should be chosen according to the problem. In addition to linear solutions, different kernel functions should be used to separate and classify data sets belonging to multiple and nested classes.
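The four kernel families named above can be written in a few lines each. The sketch below uses their commonly cited forms; the parameter names `gamma`, `coef0`, and `degree` follow a widespread convention and are assumptions, not values or names taken from this study.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(x, z):
    return dot(x, z)


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    return (gamma * dot(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=1.0):
    # Depends only on the squared Euclidean distance between x and z.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(x, z) + coef0)


x, z = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(x, z), polynomial_kernel(x, z, degree=2), rbf_kernel(x, x))
```

Each function returns a similarity score for a pair of points; the SVM only ever sees the data through these scores, which is what allows a linear separator in the implicit higher-dimensional space.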

A. Pearson VII kernel function
The Pearson VII function, which was developed in 1895 and has been used for curve fitting of X-ray diffraction scans, was used for classification. With the parameter ω, it gives a more accurate estimation of Gaussian and Lorentzian curves. These curves, whose shape is controlled by the σ and ω parameters, behave similarly to the sigmoid function that is frequently used in artificial neural network research [18]. The kernel functions used in support vector machines are symmetric and positive semidefinite. In general, any kernel function that fulfills the conditions of Mercer's theorem belongs to the class of valid kernel functions. Mercer's theorem implies that a valid kernel function must be symmetric [19], and the resulting symmetric kernel matrix must be positive semidefinite. The Pearson VII function is as follows:

$$K(x_i, x_j) = \frac{1}{\left[1 + \left(\dfrac{2\sqrt{\lVert x_i - x_j \rVert^{2}}\,\sqrt{2^{1/\omega} - 1}}{\sigma}\right)^{2}\right]^{\omega}} \qquad (4)$$

where $x_i$ and $x_j$ are two vector arguments. The single variable in the original version of the function is replaced by two vector arguments so that the Euclidean distance between them is measured. If both the symmetry and the positive semidefiniteness conditions are met, the conditions of Mercer's theorem are satisfied. The PUK function satisfies all these requirements.
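A minimal sketch of the Pearson VII universal kernel (PUK) is given below, assuming its commonly published form: one divided by (1 + (2 d sqrt(2^(1/ω) - 1) / σ)^2) raised to the power ω, where d is the Euclidean distance between the two vectors. The function name `puk_kernel` and the default parameter values are illustrative assumptions, not the tuned values of Table 5.

```python
import math


def puk_kernel(x, z, omega=1.0, sigma=1.0):
    """Pearson VII universal kernel between two equal-length vectors."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, z)))
    # sigma sets the width of the peak, omega its tailing behaviour.
    scaled = 2.0 * distance * math.sqrt(2.0 ** (1.0 / omega) - 1.0) / sigma
    return 1.0 / (1.0 + scaled ** 2) ** omega


print(puk_kernel((0.0, 1.0), (0.0, 1.0)))            # identical points
print(puk_kernel((0.0,), (1.0,)))                    # omega = sigma = 1
print(puk_kernel((0.0,), (0.5,), omega=2.0))         # distance = sigma / 2
```

Because the value depends only on the distance between the arguments, the kernel is symmetric by construction, and it falls to one half at a distance of σ/2 regardless of ω.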

B. Sequential minimal optimization-SMO
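The SMO algorithm is used to train support vector machine classifiers. SMO was developed to solve the quadratic programming optimization problem that arises in SVM training [20]. The optimization problem is as follows:

$$\max_{\alpha}\ \sum_{i=1}^{N}\alpha_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} y_i y_j K(x_i, x_j)\,\alpha_i \alpha_j \quad \text{subject to } 0 \le \alpha_i \le C,\ \sum_{i=1}^{N} y_i \alpha_i = 0 \qquad (5)$$

This method solves the main optimization problem of SVM by taking two Lagrange multipliers $\{\alpha_1, \alpha_2\}$ at a time from the previous set of solutions $\{\alpha_1, \alpha_2, \alpha_3, \dots, \alpha_N\}$ of a classification problem, with starting values $\alpha_1 = \alpha_2 = 0$. In equation 5, let $s = y_1 y_2$. Multiplying the linear constraint $y_1\alpha_1 + y_2\alpha_2 = y_1\alpha_1^{old} + y_2\alpha_2^{old}$ by $y_1$ gives $\alpha_1 + s\alpha_2 = \alpha_1^{old} + s\alpha_2^{old} = \gamma$, so that $\alpha_1 = \gamma - s\alpha_2$. Substituting these values and using the first and second derivatives of the objective with respect to $\alpha_2$, the value $\alpha_2^{new}$ is obtained in equation 6:

$$\alpha_2^{new} = \alpha_2^{old} - \frac{y_2 (E_1 - E_2)}{\eta} \qquad (6)$$

Here E is the prediction error and η is defined as $2K_{12} - K_{11} - K_{22}$, with $K_{ij} = K(x_i, x_j)$. The value $E_i$ is the error on the ith training example, $E_i = (u_i - y_i)$, where $u_i$ is the value of the classification plane for that example.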
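The analytic two-multiplier step at the heart of SMO can be sketched as follows, under the usual conventions: η = 2K12 - K11 - K22, errors E = u - y, and the new second multiplier clipped to the box [0, C] along the constraint line. This is a simplified illustration of one step, not the full training loop; the function name `smo_pair_update` and the sample values are hypothetical.

```python
def smo_pair_update(a1, a2, y1, y2, e1, e2, k11, k12, k22, c):
    """One analytic SMO step for a pair of multipliers; returns (a1_new, a2_new)."""
    s = y1 * y2
    # Bounds that keep both multipliers inside [0, C] while y1*a1 + y2*a2 stays fixed.
    if s < 0:
        low, high = max(0.0, a2 - a1), min(c, c + a2 - a1)
    else:
        low, high = max(0.0, a1 + a2 - c), min(c, a1 + a2)
    eta = 2.0 * k12 - k11 - k22
    if eta >= 0.0 or low >= high:
        return a1, a2  # degenerate pair: this sketch simply skips it
    a2_new = a2 - y2 * (e1 - e2) / eta      # unconstrained optimum along the line
    a2_new = min(high, max(low, a2_new))    # clip to the feasible segment
    a1_new = a1 + s * (a2 - a2_new)         # restore the linear constraint
    return a1_new, a2_new


# Hypothetical pair: x1 = 1 with label +1, x2 = -1 with label -1, linear kernel,
# all multipliers initially zero, so u = 0 and E_i = 0 - y_i.
print(smo_pair_update(0.0, 0.0, 1, -1, -1.0, 1.0, 1.0, -1.0, 1.0, c=1.0))
```

After the step, the sum of y_i times the multipliers is unchanged, which is the property that lets SMO optimize two multipliers at a time without a general quadratic programming solver.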
The main problem encountered when training on data sets with the SVM method is that time performance is poor when the training set is split into chunks. The underlying task is called a constrained quadratic programming problem in the literature. The SMO algorithm greatly improves the training performance on sparse data sets in these classification tasks. Although SMO offers a fast solution for sparse data sets, especially for linear SVMs, it can be extremely slow on non-sparse data sets. In this paper, we did not have a performance problem, as we used a small data set and a nonlinear kernel function.

IV. CLASSIFICATION RESULTS AND CONCLUSION
In this study, in order to reconstruct the $L_{16}(2^{15})$ Taguchi array, we built the classification model with a support vector machine based on the $L_4(2^3)$, $L_8(2^7)$, $L_8(2^4)$, and $L_{12}(2^{11})$ two-level arrays [21]. During the creation of the model, we tested various kernel functions used for SVM. We achieved the highest classification performance with the Pearson VII kernel (PUK) function used together with the Sequential Minimal Optimization algorithm. First, we created the model for the L16 two-level orthogonal array with the PUK kernel. During model creation, the best classification results were obtained by varying the classifier parameters C and ϑ and the Pearson VII kernel function parameters omega (ω) and sigma (σ); these are shown in Table 5. To classify the array data, we prepared a data set from the matrix elements of the other two-level orthogonal arrays. The confusion matrix of the results in Table 5 is shown in Fig. 3. As seen in Fig. 3, most of the matrix elements are classified correctly with the kernel function we used.

During the study, we observed that nonlinear kernel functions performed better in the classification of orthogonal arrays. The performance of the RBF and PUK kernels is noticeably better than that of the other kernels, especially when classifying two-level classes. The output in two-dimensional space obtained during the PUK classification process is shown in Fig. 4. We used the sequential minimal optimization algorithm to train the support vector machine classifier, and we achieved better classification results with SMO than with the model applied without it. The plot of the classification results using the SMO algorithm is shown in Fig. 5.
We plotted the solution function and examined the graph in three dimensions to see whether it is a sufficient hyperplane to separate the basic array values that we used and obtained during classification. Fig. 6 shows the values of the two-level class in green, and the solution function of the best-performing classification as a convex curve.
As a result of the most successful classification, we obtained the new array shown in Table 6. The values marked in red in Table 6 differ from the original array. When we examined the distribution of elements in the array, we found a 40% difference from the original array.
In addition, the Taguchi array catalog contains mixed-level arrays as well as two-level arrays, as defined in Section 2. We applied the model obtained for the $L_{16}(2^{15})$ array to find the $L_{16}(2^{3} 4^{4})$ mixed-level orthogonal array. To find the array $L_{16}(2^{3} 4^{4})$, we classified the arrays $L_{16}(2^{6} 4^{3})$, $L_{16}(2^{9} 4^{2})$, $L_{16}(2^{12} 4^{1})$, and $L_8(2^{4} 4^{1})$, but we could not achieve satisfactory results. These results are shown in Table 7. Our method fails on mixed-level arrays because the SMO algorithm is used to solve an optimization problem with two variables. According to the results in Table 7, the algorithm we use in the classification method is not suitable for the different factors and levels in mixed-level arrays.

Moreover, the new array we obtained was not orthogonal. Therefore, we could not fully analyze it in accordance with the principles of experimental design. However, as a non-orthogonal variance analysis, we examined the S/N ratio values using the ANOVA (Analysis of Variance) method. These results are shown in Table 8 and Table 9.

TABLE VIII
STANDARD L16 TAGUCHI 2-LEVEL ARRAY SIGNAL TO NOISE RATIOS (LARGER IS BETTER)

Level  A      B      C      D      E      F      G      H      J      K      L      M      N      O   P
1      3,763  3,763  4,515  1,505  2,258  3,763  3,010  3,763  3,010  3,010  2,258  2,258  3,010  3,

We examined the parameters of the sample experiment (drought survey) shown in Table 8 for the reconstructed array. Which factor is important for the experiment varies, of course, depending on the type of experiment, the characteristics of the components under study, and which change has a greater effect on the outcome of the experiment or process. The results obtained with the new array, without changing the value and factor information of the original array, are shown in Table 9.
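Per-factor S/N summaries are typically obtained by averaging the S/N ratios of all runs in which a factor is held at a given level. The sketch below shows this averaging on a small two-level array with hypothetical S/N values; it illustrates the general procedure and does not reproduce the numbers of Tables 8 and 9. The function name `sn_main_effects` is an illustrative choice.

```python
def sn_main_effects(array, sn_values):
    """Mean S/N ratio for every factor (column) at each of its levels."""
    effects = []
    for col in range(len(array[0])):
        by_level = {}
        for row, sn in zip(array, sn_values):
            by_level.setdefault(row[col], []).append(sn)
        effects.append(
            {level: sum(vals) / len(vals) for level, vals in sorted(by_level.items())}
        )
    return effects


# Two-level array with 4 runs and 3 factors, with one hypothetical S/N value per run.
design = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 0),
]
sn_per_run = [10.0, 12.0, 14.0, 16.0]

for factor, means in zip("ABC", sn_main_effects(design, sn_per_run)):
    print(factor, means)
```

A factor whose level means differ strongly is an effective factor; a factor whose level means coincide, like the third column here, has no visible main effect. If a level never occurs in a column, as can happen in a non-orthogonal reconstructed array, that level is simply absent from the result.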
In addition to the results in Table 9, when we examined the mean signal-to-noise ratios, we observed that parameters that are inactive in the standard orthogonal array can become effective in the newly constructed array, depending on the order of the parameters. For instance, in the sample experiment of this study, the effect of the North-South aspect parameter (factor J) on the result for the newly constructed array is shown in Figs. 7 and 8. Figs. 7 and 8 also show that we could not examine the effect of the last three parameters (N, O, P) on the experiment, because of the limited success of the classification. Using a better kernel function for the classification should eliminate this failure. However, when these experiment-specific parameters are evaluated, their effect on the results is seen to be low, so the classification can be regarded as sufficient rather than unsuccessful. In the classification we performed under normal conditions, the loss of orthogonality seems to prevent the study of parameter effects. Comparisons of our results are shown in Table 10.

As in the studies compared in Table 10, research on orthogonal arrays focuses particularly on creating new variations of 2-level, 3-level, 4-level, and mixed-level arrays with mathematical approaches. The arrays found in this way are intended to develop and simplify experimental designs. Thus, a more useful optimization of production designs can be achieved using multiple factor levels and a minimal number of experiments. In this paper, we used a novel approach that applies machine learning methods to obtain existing 2-level and mixed-level orthogonal arrays by classifying existing arrays. However, the model, which was more successful on fixed-level arrays, did not achieve the same performance on mixed-level arrays.
In optimal design approaches with a non-orthogonal row-column structure, the effects of column values are more important [25]. It is known that non-orthogonal approaches are not suitable for optimal design methods. In addition, in some of the studies that inspired this work, artificial neural network training methods [26,27] are reported to give more realistic results than orthogonal arrays when compared with the Taguchi method. In this study, depending on the type of experiment and the relationship between the physically investigated values, we investigated whether a Taguchi array can be obtained that approaches the results of artificial neural networks. As shown in Table 9, we could not obtain the second-level values of the factors in the last three columns. The effects of these columns need to be examined in more depth. Forming an orthogonal array that takes into account the order of the factors affecting the experiment can give more accurate results.
Nowadays, as the application areas of artificial intelligence increase, it is possible to use machine learning methods to improve the robust design approach, which is of great value in product design. This is very important in production activities in terms of more efficient use of resources.