Classification of Wheat Rootstock and Their Hybrids According to Color Features by Machine Learning Algorithms

ABSTRACT


Introduction
The temperature of our world is increasing day by day. From 2000 to 2020, the global temperature increased by about 1 °C [1]. This increase prolongs the residence time of water in the atmosphere, which leads to two kinds of natural disaster. The first is regional drought, caused by a decrease in the number of rainy days per year. The other is flooding, which occurs when the water accumulated in the atmosphere falls suddenly [2], [3]. Both disasters reduce agricultural production [4]. Another factor affecting agricultural production is the Covid-19 pandemic, which is still ongoing. The precautionary rules applied by countries lead to shortages of the labor and inputs required for food production. This causes inadequate agricultural production, disrupts the distribution network and increases the need for food [5]-[7]. At the same time, food demand is rising with the growing world population. Approximately 7.79 billion people live on Earth, and 8.9% of them, about 690 million people, are starving. The aim is to end hunger by 2030, and worldwide studies are being carried out to this end [8]. Increasing food production and ensuring food security are of vital importance for humanity. Therefore, new strategies are being developed.
The first of these strategies is to increase the production efficiency of cereals, which supply a large part of the nutrients we need. The aim is to increase the production and crop yield of cereals, which are cheaper and more filling than other food products. Wheat is one of the most cultivated cereals; it is resistant to environmental stresses and has rich biodiversity. It contains carbohydrates, essential amino acids, vitamins, beneficial phytochemicals and fibrous components [9], [10], and it meets 20% of our daily energy needs [11]. Considering its price and nutritional value, wheat is an important grain that can be used to meet the increasing food demand. It therefore has a large market around the world. In 2020, global wheat production was 760.9 million tons on an area of approximately 219 million hectares. Of this production, 43.9% came from Asia, 33.1% from Europe, 16.3% from America, 3.3% from Africa and 3.3% from Oceania. The largest producers, and the largest wheat markets, are China (114.58 million tons), India (84.29 million tons) and Russia (56.18 million tons) [12]. The wheat market has economic value for many countries, and an increase in wheat yield also increases the income obtained from the market. Choosing varieties that suit the growing environment increases plant yield, so that more production and profit are achieved. It is equally important to choose the right rootstocks for the new varieties developed in breeding studies: by crossing the right cultivars, new high-yielding cultivars can be developed. Varieties are generally identified by molecular screening, using methods such as gel electrophoresis or near-infrared reflectance (NIR) spectroscopy [13], [14]. These methods require expensive instrumentation and long analysis times. In addition, the analyzed sample cannot be reused.
Recent developments in image processing techniques and artificial intelligence make it possible to analyze products quickly, without human-induced errors and without damaging the product [15]-[17]. These techniques have been widely used in wheat classification studies that aim to increase crop yield on a country basis. Ronge and Sardeshmukh [18] classified wheat according to grain texture. Images of four local Indian varieties were acquired, and 131 texture features were extracted with five different methods. The k-NN and ANN algorithms were used for classification, and the classification accuracy reached 100% with the ANN architecture. In another study, the width, height, area, perimeter, kernel asymmetry and diameter of 210 wheat grains were extracted using image processing techniques. These data were processed in the WEKA program using the Neural Network (NN), Naive Bayes (NB), J48 and multilayer perceptron (MLP) machine learning algorithms. The highest classification success, 97.17%, was obtained with the MLP [19]. In a similar study, physical properties extracted via image processing were optimized with the artificial bee colony (ABC) algorithm and used by an artificial neural network to distinguish bread wheat from durum wheat. The wheat grains were classified with 100% success using the optimized network architecture [20]. In another study that classified wheat according to its physical properties, classification was based on the similarity of the grains to known geometric shapes. Emmer (tetraploid) and Einkorn (diploid) varieties were used in addition to modern varieties. It was observed that wheat with an aspect ratio of 1.6 was bread wheat and wheat with an aspect ratio of 2.4 was durum wheat [21]. Deep learning, which has become popular in the past few years, has also achieved high success rates. Laabassi et al. [22] classified the Simeto, Vitron, ARZ and HD varieties of local Algerian wheat using deep learning networks. The state-of-the-art architectures DenseNet201, MobileNet and Inception V3 were used to classify the wheat grains. According to the results, the best classification was obtained with the DenseNet201 network, with 95.68% success.
In this study, unlike other studies, Ahmetbugdayi and Cesare and the BC1F6 and BC2F5 hybrids obtained from Ahmetbugdayi x Cesare were classified according to their color characteristics only. For this purpose, images of the samples were taken in an imaging cabinet that we developed. Using image processing techniques, data consisting of 12 color channels from four different color spaces were obtained. A square was drawn at the center of each wheat grain in the images of these channels, and the average of the pixels in the square was taken as the color of that grain for machine learning. Using the resulting data set, high classification success was obtained with the ANN, SVM, k-NN, DT, RF and Naive Bayes algorithms. The results show that with this method wheat can be classified quickly, without the need for expensive analyses and equipment. The method can also be used as an alternative for color analysis in wheat.

Plant material
In this study, the Ahmetbugdayi and Cesare durum wheat varieties used as rootstocks were grown in the speed breeding greenhouse of the Engineering Faculty of Karamanoglu Mehmetbey University (location: 37°10'35.8"N, 33°15'09.7"E). Two different hybrids were obtained from these rootstocks using the backcross method. The F1 hybrid was obtained by crossing Ahmetbugdayi and Cesare. Then, BC1F6 was obtained by backcrossing F1 with Ahmetbugdayi once and self-pollinating the hybrid six times. In the same way, BC2F5 was obtained by backcrossing F1 with Ahmetbugdayi and self-pollinating the hybrid five times. The hybrids were bred on the research land in Karaman (location: 37°14'40"N, 33°20'08"E). The BC1F6 and BC2F5 hybrids were sown in September 2019 and harvested in July 2020.

Collecting images
A custom imaging system was designed to acquire the images (Fig. 1). The box was produced with a 3D printer to block ambient light and to allow photographs to be taken at a constant light intensity. The box can move up and down. The light intensity and the movement of the box are controlled by an ATmega328P chip. A Samsung 4000 K strip LED is used for lighting. A Basler MP acA2500-14uc / um CMOS color camera and a Basler C125-0618-5M F1.8 f6 mm lens are mounted on the box. The camera is fixed at a distance of approximately 15 cm from the viewing area. A LattePanda minicomputer card (Intel Cherry Trail Z8300 Quad Core 1.8 GHz CPU, 4 GB DDR3L RAM, Intel HD Graphics, 12 EUs @200-500 MHz GPU) is used to acquire and store the images. Specially designed black stands covered with double-sided tape were used to prevent the wheat grains from slipping.
For each image, 70 grains were randomly selected from the wheat sample. Five different images of each of Ahmetbugdayi, Cesare, BC1F6 and BC2F5 were used to prepare the image data, so a total of 1400 wheat grains are included in the images. All images were acquired at the same light intensity. The Pylon Viewer 6.2.0.8205 program was used to acquire the images, which were saved in Tagged Image File Format (TIFF).

Image processing
Images were processed using the Python (version 3.8) programming language and the OpenCV 4.1.3 library. First, the RGB images were preprocessed in order to extract the color features of each wheat grain. For this, the images were converted to grayscale and Otsu thresholding was applied. The salt-and-pepper noise caused by Otsu thresholding was eliminated by dilation with a 3x3 kernel. The position of each wheat grain was determined using the contour-finding function in OpenCV. Then, frames of 25 x 25 pixels were drawn at the center of each grain (Fig. 2). After the position of each square was determined, the images were converted to the RGB, HSV, Lab and YCrCb color spaces. The squares were then drawn at the same positions in the images of all four color spaces.

Fig. 1 Imaging system. a. 3D designed model, b. Manufactured imaging system
Fig. 2 The process of finding the color frames of each wheat grain
Fig. 3 Extracting color features

Every pixel inside each extracted square was listed, and the average of these pixels was calculated. In this way, twelve color channel values were calculated one by one for each of the 1400 grains in the images.

Dataset
The dataset consists of Ahmetbugdayi, Cesare, BC1F6 and BC2F5 wheat grains. Each variety and hybrid is represented by 350 grains. In total, the 1400 grain samples each have twelve color channel features drawn from four different color spaces. Of the total data, 840 samples (60%) were randomly selected for training, and the remaining 560 were used as test data. All values in the dataset were standardized by the z-score method before the machine learning algorithms were developed.
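The standardization and the 60/40 split can be sketched in plain Python as below. The helper names `zscore_columns` and `random_split`, the fixed seed and the random placeholder data are assumptions for illustration; the real feature values come from the image processing step.

```python
import random
import statistics


def zscore_columns(rows):
    """Standardize each column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]  # assumes no constant column
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def random_split(rows, labels, train_fraction=0.6, seed=0):
    """Shuffle the sample indices and cut them into a training and a test part."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = round(train_fraction * len(rows))
    train, test = indices[:cut], indices[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


# Placeholder data: 1400 samples x 12 channels, 350 samples per class.
rng = random.Random(1)
X = [[rng.uniform(0, 255) for _ in range(12)] for _ in range(1400)]
y = [i % 4 for i in range(1400)]
X_train, y_train, X_test, y_test = random_split(zscore_columns(X), y)
print(len(X_train), len(X_test))
```

Here the z-scores are computed over the whole dataset before splitting, which follows the wording of this section; computing the mean and standard deviation on the training part only is a common alternative.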

Classification Method
Confusion matrices were used to determine the performance of the machine learning algorithms [23]. The performance values were derived from the confusion matrices with the formulas shown in Eqs. 1-5 [24], [25], computed with the Python scikit-learn library. Additionally, the receiver operating characteristic (ROC) score, a performance measure for probabilistic predictions, was determined [26]. The ROC curve has the false positive rate on the x-axis and the true positive rate on the y-axis, and each point on the curve corresponds to the pair of rates obtained at one decision threshold. It is very convenient for evaluating the performance of a machine learning algorithm, especially in analyses with unbalanced data sets. The ROC score is defined as the area under the curve and ranges from 0 to 1. In general, prediction models with higher scores are considered more skillful.
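As an illustration of how such measures follow from a confusion matrix, the sketch below computes accuracy and macro-averaged precision, recall, specificity and F1 score. It assumes that Eqs. 1-5 are the standard definitions of these five measures and that rows hold the true classes and columns the predicted classes; the matrix shown is made up and is not a result of this study.

```python
def per_class_counts(matrix, k):
    """Return (TP, FP, FN, TN) for class k, treating all other classes as negative."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                  # true class k, predicted as something else
    fp = sum(row[k] for row in matrix) - tp   # other classes predicted as k
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


def macro_metrics(matrix):
    """Accuracy plus macro-averaged precision, recall, specificity and F1 score."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    sums = {"precision": 0.0, "recall": 0.0, "specificity": 0.0, "f1": 0.0}
    for k in range(n):
        tp, fp, fn, tn = per_class_counts(matrix, k)
        precision = tp / (tp + fp)   # assumes every class is predicted at least once
        recall = tp / (tp + fn)
        sums["precision"] += precision
        sums["recall"] += recall
        sums["specificity"] += tn / (tn + fp)
        sums["f1"] += 2 * precision * recall / (precision + recall)
    result = {name: value / n for name, value in sums.items()}
    result["accuracy"] = sum(matrix[k][k] for k in range(n)) / total
    return result


# Hypothetical 4-class confusion matrix (rows: true class, columns: predicted class).
example = [
    [48, 2, 0, 0],
    [1, 49, 0, 0],
    [0, 0, 50, 0],
    [0, 0, 3, 47],
]
for name, value in macro_metrics(example).items():
    print(f"{name}: {value:.4f}")
```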

Machine Learning Algorithms
Six machine learning algorithms were employed in this study: ANN, SVM, k-NN, DT, RF and NB (Fig. 4). The algorithms were implemented with the scikit-learn library. Confusion matrices and ROC graphs were drawn for all the algorithms. All results were saved to an Excel file while the program was running, and the confusion matrices and ROC graphs were saved as JPEG files.

Artificial Neural Network (ANN)
Artificial neural networks mimic the biological neural structure of the human brain. An ANN can derive or classify new information using previously learned or classified information. It consists of an input layer, hidden layers that contain computational neurons, and an output layer [27]-[29]. In this study we used a multilayer perceptron (MLP) with two hidden layers. In our code, nested for loops were used: in every iteration, the accuracy of the network was computed and appended to a list, and if statements were then used to check each list entry and find the neuron numbers that gave the highest accuracy.

Fig.4 Machine learning algorithms used in the study
The neuron numbers with the highest accuracy were determined and printed. The search started with 10 neurons in both the first and the second hidden layer, and the neuron numbers were decreased down to 1 for each hidden layer. In this way, the best-performing neuron numbers for the hidden layers were determined automatically by the program. The ReLU activation function and 200 epochs were used. The ANN architecture comprised twelve inputs (color channels), two hidden layers and four outputs (wheat varieties and hybrids).
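A search of this kind might look like the sketch below, which uses scikit-learn's `MLPClassifier`. The function name `search_hidden_layers`, the fixed `random_state`, the tie-breaking rule and the synthetic data from `make_classification` are assumptions for illustration and are not taken from the study's code.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier


def search_hidden_layers(X_train, y_train, X_test, y_test, max_neurons=10):
    """Try every (n1, n2) pair from max_neurons down to 1 and keep the most accurate."""
    best_accuracy, best_sizes = -1.0, None
    for n1 in range(max_neurons, 0, -1):
        for n2 in range(max_neurons, 0, -1):
            model = MLPClassifier(hidden_layer_sizes=(n1, n2), activation="relu",
                                  max_iter=200, random_state=0)
            with warnings.catch_warnings():
                # Small networks may not converge within 200 epochs; silence the warning.
                warnings.simplefilter("ignore", ConvergenceWarning)
                model.fit(X_train, y_train)
            accuracy = model.score(X_test, y_test)
            # ">=" lets a later (smaller) configuration win when accuracies tie.
            if accuracy >= best_accuracy:
                best_accuracy, best_sizes = accuracy, (n1, n2)
    return best_accuracy, best_sizes


# Synthetic stand-in for the standardized 12-channel color data with four classes.
X, y = make_classification(n_samples=200, n_features=12, n_informative=6,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.6, random_state=0)
print(search_hidden_layers(X_tr, y_tr, X_te, y_te, max_neurons=3))
```

The sketch selects the architecture by test accuracy, as the description above suggests; a separate validation split would keep the test data untouched during tuning.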

Support Vector Machine (SVM)
The support vector machine is a non-parametric, supervised learning algorithm. SVM is characterized by its use of kernels and by acting on the margins [30]. Its most important advantage is its low processing time. The kernel function and the C value are the hyperparameters that need tuning. In this study, we used the linear, polynomial and RBF kernel functions. For all these kernels, the C value was tuned automatically according to accuracy. A for loop was used to test the C values, and all accuracy results were collected in a list with the append function. The index with the highest accuracy gave the best C value.
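The C sweep can be sketched with scikit-learn's `SVC` as below. The range 0.1 to 10 in steps of 0.1 follows the values reported in the Results section; the function name `sweep_c` and the synthetic data are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def sweep_c(X_train, y_train, X_test, y_test, kernel, c_max=10.0, step=0.1):
    """Fit one SVC per C value and return the C with the highest test accuracy."""
    c_values = [round(step * i, 10) for i in range(1, round(c_max / step) + 1)]
    accuracies = []
    for c in c_values:
        model = SVC(kernel=kernel, C=c).fit(X_train, y_train)
        accuracies.append(model.score(X_test, y_test))
    best_index = accuracies.index(max(accuracies))  # first maximum, i.e. the smallest C
    return c_values[best_index], accuracies[best_index]


# Synthetic stand-in for the standardized 12-channel color data with four classes.
X, y = make_classification(n_samples=200, n_features=12, n_informative=6,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.6, random_state=0)
for kernel in ("linear", "poly", "rbf"):
    print(kernel, sweep_c(X_tr, y_tr, X_te, y_te, kernel))
```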

k-Nearest Neighborhood (k-NN)
The k-nearest neighbor (k-NN) algorithm is an instance-based, supervised learning algorithm. The labeled training data are placed in the feature space, the test data are then placed in the same space, and each test sample is classified by looking at its nearest neighbors [31]. The k-NN algorithm requires a k value from the user. In this study, the k value giving the best k-NN performance was determined automatically. Again, for loops and lists were used, with the maximum k value supplied by the user. In the for loop, the algorithm was run for every k value and the accuracies were collected in a list. The index of the highest accuracy in this list gave the fine-tuned k value. The Minkowski distance was used as the metric parameter.
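To make the procedure concrete, the sketch below implements a small k-NN classifier from scratch and loops over k. It is not the scikit-learn implementation used in the study; the function names, the toy data and the Minkowski order p = 2 (Euclidean distance) are assumptions, since the order is not stated in the text.

```python
from collections import Counter


def minkowski(a, b, p=2):
    """Minkowski distance of order p; p=2 gives the Euclidean distance."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def knn_predict(train_X, train_y, query, k, p=2):
    """Label a query point by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: minkowski(pair[0], query, p))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def best_k(train_X, train_y, test_X, test_y, max_k):
    """Evaluate k = 1..max_k and return the k with the highest test accuracy."""
    accuracies = []
    for k in range(1, max_k + 1):
        hits = sum(knn_predict(train_X, train_y, q, k) == t for q, t in zip(test_X, test_y))
        accuracies.append(hits / len(test_y))
    best_index = accuracies.index(max(accuracies))
    return best_index + 1, accuracies[best_index]


# Toy two-class data standing in for the color features.
train_X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["A", "A", "A", "B", "B", "B"]
test_X, test_y = [[0.5, 0.5], [5.5, 5.5]], ["A", "B"]
print(best_k(train_X, train_y, test_X, test_y, max_k=3))
```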

Decision Tree (DT)
The decision tree algorithm consists of a root, nodes and several leaves. The nodes make binary decisions to determine the categories. This non-parametric algorithm can be used effectively in classification and regression studies [32]. Here, we tested the Gini index and entropy criteria, calculated with Eqs. (6) and (7), to find the optimum node split that gives the highest performance.
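The two split criteria can be illustrated with the short sketch below. It assumes that Eqs. (6) and (7) are the standard definitions, Gini = 1 - sum(p_k^2) and Entropy = -sum(p_k log2 p_k), where p_k is the proportion of class k in a node; the function names and sample labels are illustrative.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Fraction of the node's samples that belongs to each class present."""
    counts = Counter(labels)
    return [count / len(labels) for count in counts.values()]


def gini(labels):
    """Gini impurity: 0 for a pure node, larger for more mixed nodes."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits: 0 for a pure node, log2(n_classes) at most."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


mixed_node = ["Ahmetbugdayi", "Cesare", "BC1F6", "BC2F5"] * 5
pure_node = ["Cesare"] * 20
print(gini(mixed_node), entropy(mixed_node))
print(gini(pure_node), entropy(pure_node))
```

A split is preferred when it lowers the weighted impurity of the child nodes relative to the parent, whichever of the two measures is used.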

Random Forest (RF)
Random forest is an algorithm based on classification and regression trees and consists of many decision trees. Each tree is built using a different set of randomly selected input data. The trees then vote for the most likely class, and the final label is determined by majority voting [33], [34]. In this study we used the Gini and entropy criteria to find the most successful model.
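The voting step can be illustrated in a few lines. The function name `majority_vote` and the list of votes are hypothetical; in a real forest each vote would come from one fitted tree.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical predictions of five trees for a single grain.
votes = ["BC1F6", "Cesare", "BC1F6", "BC1F6", "BC2F5"]
print(majority_vote(votes))
```

scikit-learn's `RandomForestClassifier` combines trees by averaging their predicted class probabilities rather than counting hard votes, so its output can differ slightly from this simple rule.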

Naïve Bayes (NB)
The Naïve Bayes algorithm is a powerful probabilistic classification algorithm that is usually employed when working with huge data sets. It is based on Bayes' theorem: the algorithm computes the probability of each category given the observed features, and classification is done according to the highest probability [35], [36]. We used it to assess the classification success of the NB algorithm on the wheat samples and to compare it with the other machine learning algorithms.
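A minimal sketch of this idea for continuous features is given below. It assumes the Gaussian variant of Naïve Bayes, which the text does not specify; the function names, the small variance offset and the toy data are also assumptions.

```python
import math
import statistics
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Store the prior, per-feature means and per-feature variances of each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        columns = list(zip(*rows))
        means = [statistics.fmean(col) for col in columns]
        # A small offset keeps the variance positive for constant features.
        variances = [statistics.pvariance(col) + 1e-9 for col in columns]
        model[label] = (len(rows) / len(X), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior under feature independence."""
    def log_posterior(label):
        prior, means, variances = model[label]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(row, means, variances)
        )
        return math.log(prior) + log_likelihood
    return max(model, key=log_posterior)


# Toy two-feature data standing in for the color channels.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]]
y = ["A", "A", "A", "B", "B", "B"]
nb_model = fit_gaussian_nb(X, y)
print(predict(nb_model, [1.1, 2.0]), predict(nb_model, [5.1, 6.1]))
```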

Results and Discussions
In this study, grains of the Ahmetbugdayi and Cesare varieties and of their BC1F6 and BC2F5 hybrids were classified according to their color features. As a first step, image processing techniques were used to convert the images to four different color spaces and to separate all the color channels, in order to build a data set for the machine learning algorithms. After the color channels were separated, 25x25 pixel square portions of the image, referred to as regions of interest (ROIs), were extracted from the center of every grain. All pixel values in each ROI were listed and their average was determined. As a result, each color channel of each grain was reduced to a single averaged value. All data were standardized; 60% of the data was reserved for training and the rest for testing.
We employed six different machine learning algorithms to reach the highest classification accuracy. In addition, we changed the hyperparameters of some algorithms to increase classification success. Five algorithms gave accuracies of approximately 99% (Table 1). Accuracies above 99% were obtained with the ANN, SVM and DT classifier models. The lowest accuracy, 96%, was obtained with the NB algorithm.
For the ANN algorithm, we used for loops and if statements to find the neuron numbers with the highest accuracy, instead of choosing them randomly [37], [38]. The maximum neuron number for each hidden layer was supplied by the user, and the neuron numbers were then decreased one by one at each iteration. We decreased the neuron numbers because we observed that the same accuracy can be obtained with different numbers of neurons in the hidden layers. The ANN architecture was therefore fine-tuned to use the fewest neurons. The program started with 10 neurons in each hidden layer.
Table 1 Machine learning algorithm performance results.

Fig.5 ANN architecture
The highest accuracy, 99.29%, was obtained by the network with 3 neurons in both the first and the second hidden layer. The best network architecture is shown in Fig. 5. In the SVM model, the linear, polynomial and RBF kernel functions were used and their accuracies were compared. A for loop was used to determine the optimum C value for each SVM model [39]. The loop started with a C value of 0.1 and increased it by 0.1 per iteration up to 10, an upper limit set by the user.
As shown in Table 1, the highest accuracy among the SVM models, 99.11%, was obtained with the model tuned with the linear kernel and a C value of 3. The second model, tuned with the polynomial kernel and C = 3, reached 98.93% accuracy, the same as the third model, which had the RBF kernel and C = 4. As seen in Table 1, all SVM models gave very close accuracies.
In the DT model, the Gini index and entropy criteria were used for node split optimization, and the contribution of each to accuracy was determined. As seen in Table 1, the model using the Gini index had a slightly lower accuracy than the other model: the accuracy of the model fine-tuned with entropy was 99.11%, and that of the model with the Gini index was 98.57%.
Confusion matrices are shown in Fig. 6. The areas under the curve (AUCs) obtained from the ROC analysis of the test data showed that the highest value was 0.9984 for SVM, followed by 0.9971 for ANN and 0.9843 for DT. The AUC results imply that these three algorithms have very good classification capacity for data based on color features. For the three algorithms, precision, F1 score and recall were approximately 0.99, and the specificity values were about 0.945 (Table 1).
For the k-NN algorithm, a for loop was again used to find the best number of neighbors instead of choosing it randomly [41]. The highest accuracy (98.57%) was achieved with k = 5. Precision, F1 score and recall were roughly 0.986, the specificity of k-NN was 0.933, and the AUC was 0.998.
In the RF algorithm, the Gini index and entropy criteria were applied one at a time to find the contribution of each to classification accuracy. However, the accuracy, precision, F1 score and recall obtained were almost the same (0.988), as shown in Table 1. The specificity and AUC were 0.934 and 0.996, respectively.
The lowest accuracy was obtained with the NB algorithm: the accuracy was approximately 96.0%, precision, F1 score and recall were roughly 0.96, and specificity and AUC were 0.93 and 0.99, respectively.

Conclusions
Wheat classification is a very important process for producers and breeders. Growing varieties suited to the environmental conditions means more profit for producers, and choosing the right parents is one of the most important criteria for breeders selecting desired traits in future generations. In addition, the physical appearance of wheat, such as color and grain size, is an important reason for producers' preferences. Particularly for durum wheat, grain color is one of the important quality criteria. Conventional methods require expensive equipment, and the samples used during analysis cannot be reused. In this study, we presented a different analysis method for wheat classification that combines image processing techniques and machine learning algorithms. In this way, the product loss that occurs during analysis was prevented. Four different color spaces were used, and the color channels of these spaces were obtained by image processing techniques. The color feature samples, 25x25 pixel squares (ROIs), were collected from the center of each grain instead of taking a single pixel. All the pixels in each ROI were listed, and the average of the list gave the color channel value for the grain. This method was applied to 1400 wheat grains. Twelve different color channel features were determined and used as input data for the classification algorithms. Six machine learning algorithms were employed to classify the wheat grains. Some of the hyperparameters of the algorithms were determined by the program, which was coded in Python. The ANN, SVM and DT algorithms achieved accuracies above 99%. In addition, the k-NN and RF algorithms had almost 99% accuracy. The NB algorithm also achieved a high accuracy of approximately 96%. This study suggests that wheat grains can be classified using machine learning algorithms according to the color channels of different color spaces. In addition, determining the color values of the pixels by drawing a square at the center of the grain can be used in the selection of wheat varieties with the desired color characteristics.

Fig. 6 Confusion matrices. a. ANN, b. SVM, c. DT

In Fig. 7, we show the overlap of the test and predicted results for each class. ROC is another way to present a classifier model's performance. The ROC graph shows the true positive rate on the y-axis and the false positive rate on the x-axis.

Fig. 7 Test-Predict blasted results. a. ANN, b. SVM, c. DT

The point (0,1), where the false positive rate is 0 and the true positive rate is 1, indicates that the classifier reaches the best possible classification, whereas the point (1,0) indicates that all the data were placed in the wrong classes. Misclassified data appear in the ROC graph as a dip in the curve [40]. The ROC curves of the three classifier models are shown in Fig. 8. These curves demonstrate that the three classifier models reached almost perfect classification performance.
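The relation between the points on the ROC graph and the AUC can be illustrated with the sketch below, which builds a ROC curve for one class against the rest and integrates it with the trapezoidal rule. The function names and the scores are hypothetical and do not reproduce the results of this study.

```python
def roc_points(scores, is_positive):
    """Return (false positive rate, true positive rate) pairs, one per score threshold."""
    ranked = sorted(zip(scores, is_positive), key=lambda pair: -pair[0])
    positives = sum(is_positive)
    negatives = len(is_positive) - positives
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with equal scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical scores for one class versus the rest; 1 marks a true member of the class.
perfect = roc_points([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
mixed = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
print(auc(perfect), auc(mixed))
```

A classifier that separates the classes perfectly passes through the point (0,1) and has an AUC of 1, while every misranked sample pulls the curve away from that corner.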
, represent the classification of 560 test grains for the most successful