CLASSIFICATION OF FIVE DIFFERENT RICE SEEDS GROWN IN TURKEY WITH DEEP LEARNING METHODS

The increase in the world population and harmful environmental factors such as global warming necessitate a change in traditional agricultural practices. Precision agriculture solutions offer many innovations to meet this increasing need. Using healthy, suitable and high-quality seeds is the first option that comes to mind for harvesting more products from the fields. Seed classification is currently carried out in a labor-intensive manner. Due to the nature of this process, it is error-prone and also requires a high budget and much time. The use of state-of-the-art methods such as Deep Learning in computer vision solutions enables the development of different applications in many areas. Rice is the most widely used grain worldwide after wheat and barley. This study aims to classify five different rice species grown in Turkey using four different Convolutional Neural Network (CNN) architectures. First, a new rice image dataset of five different species was created. Then, known and widely applied CNN architectures, namely Visual Geometry Group (VGG), Residual Network (ResNet) and EfficientNets, were trained and results were obtained. In addition, a new CNN architecture was designed and its results were compared with those of the other three architectures. The results showed that the VGG architecture produced the best accuracy value of 97%.


Introduction
Since cereals are a source of protein, carbohydrates and fat, they play a crucial role in sustaining human life. Rice (Oryza sativa (Asian rice) or Oryza glaberrima (African rice)), which is one of the most fundamental food sources in the world, has an essential place in the food culture of many countries, especially Far Eastern countries [1]. It is estimated that there are thousands of rice types grown by farmers. There are 33 registered rice species in Turkey, but only 15 of them are widely cultivated [2]. Some species have more commercial value in terms of their nutritional value, taste, appearance and structure. The most cultivated species is Osmancık due to its high yield. Some other rice species produced in Turkey are Ergene, İpsala, Altınyazı, Yatkın, Cameo, Demir, Sarıkılçık and Özgür.
The commercial value of rice depends on its genetic characteristics and quality. Currently, the classification of seeds is carried out visually by experienced employees. This process, which is both laborious and slow, can cause financial losses because it cannot be done with adequate precision. Furthermore, even a skilled person can provide accurate and reliable quality control on only a few common rice varieties. If this problem can be solved with an automated process based on a computer vision system, seed classification will become more accurate and stable. Thus, the incomes of both farmers and countries will increase with the quality of the crops to be harvested. However, the natural variability of seeds makes classification with computer vision a challenging problem.
Computer vision gives computers the ability to recognize the world around them through visual inputs. Thus, the necessary actions can be taken. Basically, a computer vision system has two components. The first component is a sensing device that captures an image or video. The other is an interpreting device that processes the input and generates an output. Together, they try to understand and interpret the objects and the events that occur around them.
Computer vision has been improving rapidly in parallel with the great developments in Artificial Intelligence (AI) and Deep Learning (DL) in recent years [3]. It is applied in many fields, from autonomous vehicles to disease detection. Moreover, DL is now capable of creating new content, such as human faces that did not exist before [4]. A traditional computer vision pipeline requires the following steps: visual data acquisition, pre-processing, feature extraction and model building. DL, in contrast, eliminates the feature extraction phase from the traditional computer vision pipeline. DL methods can learn representations through a series of data transformations in their successive layers.
The main objective of this study is to create a new rice dataset that did not exist before and to compare the performances of various CNN architectures on this dataset. For this purpose, seeds of five different types of rice, which are widely grown and consumed in Turkey, were obtained from distinct locations. The images of the seeds were captured using a digital microscope. A database was created by randomly selecting approximately a thousand images of each type.
1.1. Related Studies. Before the widespread use of DL methods, the seed classification problem was solved with traditional computer vision methods. Huang and Chien extracted the shape features of rice seeds and trained a Multilayer Neural Network on the collected dataset to classify them [5]. The highest accuracy value, 97.35%, was achieved for the Tainan-11 type. Ali et al. derived morphological, texture, and color features of six different rice species and applied fuzzy logic methods [6]. According to their results, the lowest accuracy value is 94.2% and the highest value is 98.9%. There are also numerous comparative studies using traditional machine learning and DL methods. Qui et al. captured hyperspectral images of four rice species at two different spectral ranges. KNN, SVM and CNN models were chosen as classifiers [7]. As expected, the CNN model gave better results than the other methods. They also concluded that more accurate CNN models could be built by increasing the number of samples. Kiratiratanapruk et al. classified 14 rice varieties with four statistical machine learning methods (LR, LDA, k-NN, and SVM) and five different CNN architectures (VGG16, VGG19, Xception, InceptionV3, and InceptionResNetV2) [8]. The SVM method gave the highest accuracy value among the machine learning methods with 90.61%. The InceptionResNetV2 model, on the other hand, reached an accuracy value of 95.15% and outperformed the other models.
In parallel with the developments in DL methods and technology, various CNN architectures have started to be preferred in solving the seed classification problem. Hoang et al. classified six rice types using eleven different CNN architectures [9]. DenseNet with 121 layers reached an accuracy value of 99.05%. In contrast, MobileNet, which trains fewer parameters than the other CNN models, gave the worst results. Gilanie et al. proposed a CNN model called RiceNet to classify seven different rice varieties grown in Pakistan [10]. They also trained VGG-19, ResNet50, and GoogleNet (Inception-V3) to compare the results with their model. RiceNet classified all rice types with 100% accuracy.

Convolutional Neural Networks
Recent developments in Artificial Intelligence and Deep Learning have allowed computer vision systems to classify more accurately and reliably than a human can. DL methods, such as CNNs, enhance the accuracy of classification models by using a large data set and sufficient computing power. A CNN consists of a variety of layers, such as input, convolution, pooling, fully connected, softmax and output. In a DL architecture, the first layer, which receives the raw pixels of the input data, learns how to represent simple features. Each successive layer learns more complex features by collecting and recombining the features of the previous layers. Adding more layers to the network allows it to handle higher-dimensional data.
Basically, a CNN architecture has three types of layers that can be seen in almost any model.
• Convolutional layer: Extracts features from visual inputs.
• Pooling layer: Reduces the number of parameters passed to the next layer.
• Fully connected layer: A Multi-layer Perceptron (MLP) network that takes features learned from the previous layer as input and returns class labels as output.
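The three layer types listed above can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this study: the 5x5 input patch, the edge-detecting kernel and the two-class weights are hypothetical values chosen only to make the data flow visible, and a real CNN learns its kernel and weight values during training.

```python
import math

def conv2d(image, kernel):
    """Valid (no padding), stride-1 2D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]

def dense_softmax(features, weights, biases):
    """Fully connected layer followed by softmax over the class scores."""
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 5x5 grayscale patch containing a vertical edge.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical kernel that responds to vertical edges.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = conv2d(image, kernel)        # convolutional layer: 3x3 feature map
pooled = max_pool2d(feature_map, size=2)   # pooling layer: 1x1 map
flat = [v for row in pooled for v in row]  # flatten before the dense layer

# Hypothetical weights for two classes.
probs = dense_softmax(flat, weights=[[1.0], [-1.0]], biases=[0.0, 0.0])
print(feature_map, pooled, probs)
```

The convolution turns the raw pixels into a feature map, the pooling step shrinks that map, and the dense layer converts the remaining features into class probabilities that sum to one.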
2.1. Visual Geometry Group (VGG). Although VGG did not win the 2014 ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) competition, it has been used in many applications due to its modular and simpler architecture [11]. Unlike previous CNNs, VGG prefers smaller kernel sizes. This property increases the expressiveness of VGG. The VGG architecture has several variations according to the number of layers. The smallest one (VGG-11) has 11 layers and 133 million parameters. The largest one (VGG-19) consists of 19 layers and requires 144 million parameters to be tuned. The architecture of VGG is displayed in Figure 1.
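One commonly cited reason for preferring small kernels is that a stack of 3x3 convolutions covers the same receptive field as one larger kernel while using fewer weights. The sketch below works through that arithmetic; the channel count of 64 is a hypothetical value, and biases are ignored for simplicity.

```python
def conv_params(kernel_size, in_channels, out_channels, bias=True):
    """Number of trainable parameters in one 2D convolutional layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)

def receptive_field(num_layers, kernel_size):
    """Receptive field of a stack of stride-1 convolutions with equal kernel size."""
    return 1 + num_layers * (kernel_size - 1)

channels = 64  # hypothetical number of input and output channels

stacked_3x3 = 3 * conv_params(3, channels, channels, bias=False)
single_7x7 = conv_params(7, channels, channels, bias=False)

print(receptive_field(3, 3), receptive_field(1, 7))  # both stacks see a 7x7 region
print(stacked_3x3, single_7x7)                       # the 3x3 stack needs fewer weights
```

Three stacked 3x3 layers also apply three non-linearities instead of one, which is one way to read the increased expressiveness mentioned above.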

2.2. Residual Network (ResNet). ResNet, which took first place in five tracks of the ILSVRC and COCO 2015 competitions, including ImageNet classification, was proposed by a Microsoft Research team [13]. The top-5 error rate of ResNet is 3.57%. As more layers are added to deep learning models, the vanishing gradient problem arises. ResNet tackles this problem by devising a residual module with a skip connection. Since identity skip connections add no extra parameters, this approach makes it possible to train networks with 50, 101, and 152 layers.
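The idea of a residual module with a skip connection can be sketched in a few lines. This is a simplified illustration and not the ResNet implementation: it uses small square weight matrices instead of convolutions and omits batch normalization, and the input vector and zero weights are hypothetical.

```python
def relu(vector):
    return [max(0.0, v) for v in vector]

def linear(vector, weights):
    """Multiply a square weight matrix by a vector (no bias)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

def plain_block(x, w1, w2):
    """Two stacked layers without a skip connection: F(x) = w2 * relu(w1 * x)."""
    return linear(relu(linear(x, w1)), w2)

def residual_block(x, w1, w2):
    """Residual module: relu(F(x) + x), where the '+ x' term is the skip connection."""
    fx = plain_block(x, w1, w2)
    return relu([f + xi for f, xi in zip(fx, x)])

x = [1.0, 2.0, 3.0]                      # hypothetical non-negative input features
zeros = [[0.0] * 3 for _ in range(3)]    # weights that contribute nothing

plain_out = plain_block(x, zeros, zeros)        # the signal is lost
residual_out = residual_block(x, zeros, zeros)  # the input passes through unchanged
print(plain_out, residual_out)
```

Even when the weighted path contributes nothing, the skip connection carries the input forward, so an extra block can fall back to an identity mapping; the addition itself introduces no trainable parameters.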
2.3. EfficientNets. EfficientNets propose a new scaling technique that scales all dimensions using a compound coefficient [14]. Previous CNN models prefer to scale only one of the three dimensions (depth, width and resolution). The authors designed seven different versions of the model, called EfficientNets. The top-1 accuracy of EfficientNet-B7 is 84.3% on ImageNet. They also conducted experiments using five different learning data sets, including CIFAR-100. The results show that the proposed method can perform better using fewer parameters. The architecture of EfficientNet-B0 is given in Figure 2.
The custom CNN architecture contains successive convolutional and pooling layers, followed by the fully connected, softmax and output layers. The architecture of the custom CNN model is given in Figure 3. In addition, a detailed description of the custom architecture can be found in Table 1.
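Returning to the compound scaling of EfficientNets described in Section 2.3, the idea can be sketched as follows. The base constants are the values reported in [14] for depth, width and resolution; they are not stated elsewhere in this text and should be treated as assumptions here. The released B1 to B7 models use rounded settings, so the numbers below only illustrate the principle.

```python
# Assumed base constants for depth, width and resolution, as reported in [14].
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(phi):
    """Scale all three dimensions together using one compound coefficient phi."""
    return {
        "depth": ALPHA ** phi,
        "width": BETA ** phi,
        "resolution": GAMMA ** phi,
    }

def flops_multiplier(phi):
    """Approximate growth in computation: depth * width^2 * resolution^2."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi

for phi in range(4):
    print(phi, compound_scale(phi), round(flops_multiplier(phi), 2))
```

With these constants, each unit increase of the compound coefficient roughly doubles the computation while growing depth, width and resolution in a fixed ratio, instead of enlarging only one of them.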

Methodology
3.1. Image Acquisition. The rice seed images were acquired using a digital microscope with a spatial resolution of 0.01 mm/pixel and a color depth of 24 bits. The images were captured under natural lighting conditions. The dataset contains 6833 images of the five most extensively grown rice species in Turkey. These species are Osmancık, Cameo, Özgür, Sarıkılçık and Yatkın. Between 703 and 1736 images of each type were randomly selected to ensure that the dataset was balanced proportionately. An expert on rice seeds labelled all seed images according to their class. A sample of each rice type is displayed in Figure 4.

Performance Metrics.
The success of classification models should be evaluated using various criteria. In this way, the results of different architectures can be compared and the most successful one selected. Accuracy is the most common and frequently used criterion. The accuracy formula is as follows:
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1} \]
where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives, respectively. Accuracy alone is not a sufficient criterion in some special cases. In addition, Precision and Recall values are also preferred to evaluate the performance and robustness of classification models. F1-Score is the harmonic mean of Precision and Recall.

\[ \text{Precision} = \frac{TP}{TP + FP} \tag{2} \]
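The metrics described in this section can be computed directly from a confusion matrix. The sketch below is a minimal illustration under stated assumptions: the function name `per_class_metrics` is hypothetical, and the 3-class matrix contains made-up counts that are not results from this study.

```python
def per_class_metrics(confusion):
    """Compute accuracy and per-class precision, recall and F1-Score.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    report = {"accuracy": correct / total, "classes": []}
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(confusion[k]) - tp                       # true k, predicted class differs
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1-Score is the harmonic mean of precision and recall.
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report["classes"].append(
            {"precision": precision, "recall": recall, "f1": f1}
        )
    return report

# Hypothetical 3-class confusion matrix (illustrative counts only).
confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
report = per_class_metrics(confusion)
print(report["accuracy"])
for k, scores in enumerate(report["classes"]):
    print(k, scores)
```

In the multi-class case each class is treated in turn as the positive class, which is why precision, recall and F1-Score are reported per class while accuracy is a single number.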

Results and Discussion
Classification of rice seeds is traditionally carried out by methods that require heavy human labor. However, rice seed types are very similar in terms of shape, texture and color, so the probability of making mistakes is high. Recently, many studies have been published showing that DL methods successfully classify different types of seeds. For this purpose, an image data set of five different species grown in Turkey was created. Three popular CNN architectures and a custom CNN architecture were trained on this data set. The accuracy and loss graphs of each CNN architecture for the training and validation data are displayed in Figures 5, 6, 7 and 8. The confusion matrix and overall performance of the CNN models are presented in Tables 3 and 4, respectively. As shown in Tables 3 and 4, VGG and the custom CNN architecture generated successful and robust results. The best performance belongs to VGG, with an accuracy value of 0.97. The custom model is more successful than both ResNet and EfficientNet. Even though it is worse than VGG, its performance as a classifier is promising.

Conclusions and Future Works
The main objective of this study is to classify a rice seed data set with three well-known CNN architectures and a new custom CNN architecture. For this purpose, a data set consisting of five different types of rice grown in Turkey was created. According to the results, the CNN architecture with the best accuracy value was VGG. The other architectures also generated acceptably close results. Thus, it was shown that the classification of rice seeds could be carried out more successfully with advanced computer vision methods. This study used only images of five different seed types and four different CNN architectures. In future studies, we aim to broaden the data set by increasing both the number of images and the number of rice species. We also plan to obtain results using Vision Transformers (ViT).

Declaration of Competing Interests
The author declares that he has no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figure 3. The architecture of the custom CNN model.

Figure 9. Overall performance of CNN architectures.

Table 1. Description of the custom CNN model.

Table 2. Description of rice seed image dataset.

Table 3. Confusion matrix of CNN architectures.

Table 4. Overall performance of CNN architectures.