Classification of Serranidae Species Using Color Based Statistical Features

In this study, 6 species of the Serranidae family (Epinephelus aeneus, Epinephelus caninus, Epinephelus costae, Epinephelus marginatus, Hyporthodus haifensis, Mycteroperca rubra) were classified by using a color based feature extraction method. A database consisting of 112 fish images was used. In each image, a fish was placed on a white background in the same position, and the images were taken from different distances. A combination of manual processes and automatic algorithms was applied to the images to obtain colored fish sample images with a black background. Since the presented color based feature extraction method must exclude the background, these images were processed by an automatic algorithm in order to obtain a solid texture image of the fish and extract features from it. The obtained solid texture image was in HSV color space and was used to extract species-specific information from the fish samples. Each of the hue, saturation and value components of the HSV color space was used separately to extract 7 statistical features. Hence, a total of 21 features were extracted for each fish sample. The extracted features were used with the Nearest Neighbor algorithm, and the 112 fish samples from the 6 species were classified with an overall accuracy of 86%.


Introduction
Computer vision provides the basis for automated image analysis applications. In most applications, computers are preprogrammed to perform a specific task that makes use of computer vision. Controlling processes, navigation, detecting events or the existence of objects, and modelling environments or objects are some examples of computer vision based applications (Forsyth and Ponce, 2002; Manzoor et al., 2014; Atasoy et al., 2015; Iscimen et al., 2015).
Developing fish classification systems is becoming more important because researchers can use them for fish counting, stock assessment, evaluating ecological impacts, monitoring fish behavior, etc. (Benson et al., 2009). In addition, scientists believe that there are more than 24600 different fish species in the world and that more than 1200 of these are venomous. Fish classification systems can help avoid harm from the venomous species. Despite the need for fish classification, performing it manually is a difficult task. In the light of developments in computer technologies, computer vision and image processing methods are increasingly used for fish classification.
There are several studies that use various features in order to classify fishes. Bothmann et al. (2016) presented a real-time fish classification system for underwater sonar videos. The extracted features were based on the shape and movement of the fishes. Shafait et al. (2016) presented an image set-based approach for fish species identification by taking advantage of Principal Component Analysis. Chuang et al. (2016) proposed an unsupervised feature learning and object recognition system by using geometric attributes of the fish body within a hierarchical partial classifier. Mizuno et al. (2015) proposed a new method for fish classification by using a high resolution acoustic video camera. Classification was performed by using normalized cross correlation. Ogunlana et al. (2015) presented a Support Vector Machine based method for fish classification. The extracted features were the body and five fin lengths of the fishes. Iscimen et al. (2015) presented a method that uses centroid-contour distances of fish images in order to classify fish species with two dorsal fins. Iscimen et al. (2014) presented a Naive Bayesian classifier based system for the classification of fish families and species by using biometric measurement techniques. D'Elia et al. (2014) used frequency response, morphometric, bathymetric or other energetic parameters as features within a multifrequency acoustics approach to detect and classify fish species. Fabic et al. (2013) described a method that uses blob counting and Zernike moment-based shape analysis for fish detection, counting, and species classification from underwater video sequences. Alsmadi et al. (2010) presented a fish recognition system based on feature extraction from size and shape measurements using a neural network. Cabreira et al. (2009) applied methods involving artificial neural networks (ANNs) to digital echo recordings of schools for automatic recognition and classification. Energetic, morphometric, and bathymetric school descriptors were extracted from the echo recordings as the input for the ANNs. Also, there are some studies that take advantage of color and texture features for fish classification. Daramola et al. (2016) proposed a method that uses Singular Value Decomposition in order to extract features of the fish body pattern. Huang et al. (2015) presented a balance-enforced optimized tree with a reject option for live fish recognition. The proposed system uses features that combine color, shape and texture properties. Chuang et al. (2014) presented an underwater fish species recognition system that uses a systematic hierarchical partial classification algorithm. The extracted features consist of size, shape and texture attributes of fish body parts. Hu et al. (2012) presented a method for classifying fish species based on color and texture features by using a multi-class support vector machine. Alsmadi et al. (2010) presented a fish recognition system based on feature extraction from color texture measurements by using a back-propagation classifier.
The aim of this study is to propose a color based feature extraction method and to classify species of the Serranidae family. The proposed system applies various image processing methods and uses statistical features extracted from the hue, saturation and value components of colored images in the HSV color space.

Materials and Methods
The proposed system consists of data acquisition, preprocessing, feature extraction, classification and performance measurement phases. Details of these phases are described in the following subsections.

Acquiring the database
A fish database consisting of 112 fish images was used in this study. In each image, a fish was placed on a white background in the same position, and the images were taken from different distances. Figure 1 shows a sample image.
The database consists of 6 fish species of the Serranidae family in total. Each species includes a different number of samples. Table 1 shows the composition of the collected database. Species identification and systematic discrimination were performed according to Turan et al. (2007).

Preprocessing
As mentioned above, all images were taken on a solid white background. However, there might be some image noise arising from the background or environmental conditions. To reduce the effect of such conditions before image processing, the backgrounds of all images were manually converted to blue. A series of image processing algorithms was then applied to each image in order to obtain the minimum image containing the fish sample. Figure 2 shows the image processing phases from the raw image to the minimum image that includes the fish sample. The following steps were performed for this task:
• The colored input image was converted to a grayscale image.
• The grayscale image was converted to a binary image and a Canny filter was applied for edge detection.
• A morphological structuring element in the form of a 3-by-3 matrix was used to filter the binary image. This process fixed the edge irregularities and made the edges more apparent.
• Erosion and dilation were performed in order to fix missing or overflowing pixels.
• The whole visible area of the fish was detected by filling the regions enclosed by the edges.
• The minimum image that includes the fish sample was obtained. The image was in HSV color space. The fish sample was colored and the background was black.
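The final step of the list above can be sketched as follows. This is a minimal illustration in plain Python, not the implementation used in the study: it assumes the filled fish region is already available as a boolean mask (the edge detection, morphological filtering and filling steps are not reproduced here), and the function name `minimum_image` and its parameters are hypothetical.

```python
def minimum_image(pixels, mask, background=(0, 0, 0)):
    """Crop `pixels` to the bounding box of the True cells in `mask`.

    pixels: 2D list of color tuples (assumed layout, e.g. HSV triples).
    mask:   2D list of booleans, True where the filled fish region is.
    Pixels inside the bounding box but outside the mask are replaced
    by `background` (black by default).
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("mask contains no fish pixels")
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    return [
        [pixels[r][c] if mask[r][c] else background
         for c in range(left, right + 1)]
        for r in range(top, bottom + 1)
    ]
```

The result is the smallest rectangular image that still contains every fish pixel, with a black background, which is the input expected by the feature extraction phase.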

Feature extraction
Each fish occupies a differently sized area in its image (Figure 3). This raises the issue of selecting the area of interest. Since the proposed feature extraction method is based on color, it is necessary to focus on a solid fish texture and to avoid including the black background. For this purpose, the four sides of the image (left, right, top and bottom borders) were treated as a frame. Frames of the minimum image were removed in sequence from outside to inside until only a solid fish texture remained. Figure 4 shows the subtraction process.
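The frame subtraction can be sketched as below. This is only an illustration under stated assumptions: one frame is taken to be a one-pixel border removed from all four sides at once, the stopping rule is "no background pixel remains", and the name `peel_frames` and the `is_background` predicate are hypothetical.

```python
def peel_frames(image, is_background):
    """Remove one-pixel frames from outside to inside until the
    remaining region contains no background pixel.

    image:         2D list of pixel values.
    is_background: function returning True for a background pixel.
    Returns the remaining solid region, or [] if nothing is left.
    """
    top, bottom = 0, len(image)
    left, right = 0, len(image[0]) if image else 0

    def has_background():
        return any(is_background(image[r][c])
                   for r in range(top, bottom)
                   for c in range(left, right))

    while bottom > top and right > left and has_background():
        # Subtract one frame: shrink every border by one pixel.
        top, bottom = top + 1, bottom - 1
        left, right = left + 1, right - 1

    if bottom <= top or right <= left:
        return []
    return [row[left:right] for row in image[top:bottom]]
```

Because each step removes the same number of pixels from every side, the result stays centered in the minimum image; a variant that trims only the sides still touching background would keep more of the fish, but that is not what the text describes.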
The actual process was performed by removing one frame at a time. Since the individual subtraction steps are not apparent in Figure 4, Figure 5 gives a representative illustration that explains the subtraction process with wider frames. The obtained solid fish texture was an image in HSV color space. The hue, saturation and value components of the image were used separately to extract statistical features. The values of each HSV component were stored in a matrix with the same dimensions as the image to which they belong. Since the obtained fish texture images have different dimensions, the number of extracted features might vary with the row and column sizes of the images. To avoid this, the values of each HSV component were converted from a matrix to a vector before feature extraction. Afterwards, 7 statistical features were extracted from the vector of each HSV component. The extracted features were combined into a new vector in order to obtain the feature set of the fish sample. Hence, there were 21 features in total in each feature set. The extracted features are the minimum value, maximum value, mean, variance, standard deviation, kurtosis and skewness. Figure 6 shows the feature extraction phases from solid fish texture to classification.
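A minimal sketch of this feature extraction is given below. The text does not state which estimators were used, so the choices here are assumptions: population (divide-by-n) variance, skewness as m3 / m2^1.5 and non-excess kurtosis as m4 / m2^2, where m2, m3 and m4 are central moments. The function names are hypothetical.

```python
import math


def seven_stats(values):
    """Return [min, max, mean, variance, std, kurtosis, skewness]
    of a flat list of numbers (population moments, assumed)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    std = math.sqrt(m2)
    # A constant channel has zero variance; report 0.0 instead of dividing by 0.
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return [min(values), max(values), mean, m2, std, kurtosis, skewness]


def hsv_features(hsv_image):
    """Build the 21-element feature vector of a solid texture image.

    hsv_image: 2D list of (h, s, v) tuples. Each channel is flattened
    from a matrix to a vector, so the feature count does not depend
    on the image dimensions.
    """
    features = []
    for channel in range(3):  # 0 = hue, 1 = saturation, 2 = value
        vector = [pixel[channel] for row in hsv_image for pixel in row]
        features.extend(seven_stats(vector))
    return features
```

Any solid texture image, whatever its size, maps to a vector of length 21, which is what makes samples of different dimensions comparable in the classifier.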

Classification and performance measurement
Classification was performed by using the Nearest Neighbor algorithm (Cunningham and Delany, 2007). In this method, an unclassified sample is compared with all classified samples by applying the Euclidean distance to their features. The class of the unclassified sample is selected as the class of its nearest neighbor, that is, the classified sample at the smallest distance from it.
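The rule described above can be written in a few lines. This is a sketch of the 1-nearest-neighbor rule with Euclidean distance, not the code used in the study; the function name and argument layout are hypothetical, and ties are resolved in favor of the first training sample encountered.

```python
import math


def nearest_neighbor(train_features, train_labels, query):
    """Return the label of the training sample closest to `query`.

    train_features: list of equal-length feature vectors.
    train_labels:   list of class labels, one per training vector.
    query:          feature vector of the unclassified sample.
    """
    best = min(range(len(train_features)),
               key=lambda i: math.dist(train_features[i], query))
    return train_labels[best]
```

For example, `nearest_neighbor([[0, 0], [10, 10]], ["a", "b"], [1, 1])` returns `"a"`, because the query lies closer to the first training vector.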
Performance measurement was performed by using a five-fold cross validation scheme. In 5-fold cross validation, the dataset is randomly divided into 5 disjoint sets. Four of these sets are used for training and the remaining one is used for testing. This procedure is repeated until each set has been used once for testing. Recall (1), precision (2) and specificity (3) of each class, and also the average accuracy (4), measure the performance of classification tasks. These terms were calculated from the confusion matrix. They are formulated in terms of the number of true positives (TP), the number of true negatives (TN), the number of false positives (FP) and the number of false negatives (FN). The positive and negative terms refer to the classifier's prediction. The true and false terms refer to whether that prediction corresponds to the real class of the samples.
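The evaluation procedure can be sketched as follows, using the standard one-vs-rest definitions: recall = TP / (TP + FN), precision = TP / (TP + FP) and specificity = TN / (TN + FP). The exact form of the average accuracy is an assumption: the sketch reports both the fraction of correctly classified samples and the per-class accuracy (TP + TN) / (TP + TN + FP + FN) averaged over classes. Function names, the fixed seed and the convention that confusion matrix rows are true classes and columns are predictions are all hypothetical choices.

```python
import random


def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Randomly split sample indices into disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[k::n_folds] for k in range(n_folds)]


def class_metrics(cm, k):
    """Recall, precision, specificity and accuracy of class k.

    cm[i][j] = number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                 # class k predicted as another class
    fp = sum(row[k] for row in cm) - tp  # other classes predicted as k
    tn = total - tp - fn - fp

    def ratio(a, b):
        return a / b if b else 0.0

    return {
        "recall": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, total),
    }


def overall_accuracy(cm):
    """Fraction of all samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total if total else 0.0


def average_accuracy(cm):
    """Per-class accuracy averaged over all classes."""
    return sum(class_metrics(cm, k)["accuracy"] for k in range(len(cm))) / len(cm)
```

With 112 samples, `five_fold_indices(112)` yields two folds of 23 samples and three folds of 22. Each fold serves once as the test set while the other four form the training set, and the predictions of all five rounds can be accumulated into a single confusion matrix before the metrics are computed.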

Results
This study proposed a color based feature extraction method and classification system. First, the images were preprocessed in order to extract meaningful color based features. After obtaining a solid fish texture image, 7 statistical features were extracted separately for each of the hue, saturation and value components of the HSV image. Hence, a total of 21 features were obtained for each fish sample. The fishes were classified according to species by using the Nearest Neighbor algorithm with an overall accuracy of 86%. The results for the other performance measurement terms are given in Table 2.
It is difficult to compare the system constructed in this study with similar systems in the literature, because they differ in classification techniques, the fish species classified, the number of samples and species, and the performance measures. Nevertheless, some conclusions can be drawn. Daramola and Omololu (2016) used the Singular Value Decomposition method in order to extract features from the fish body pattern, and 18 fish species were classified with an overall accuracy of 94% by using an Artificial Neural Network. Huang et al. (2015) used normalized color histograms of the red and green components and of the hue component in HSV color space. Additionally, texture was described by calculating the co-occurrence matrix, Fourier descriptor and Gabor filters. That work classified 15 fish species with an overall accuracy of 97% by using a Support Vector Machine (SVM) as the primary classifier in a hierarchical classification system. Chuang et al. (2014) used the tail and eye textures of the fish as features. Texture properties were represented by the histogram of local binary patterns. In that study, 7 fish species were classified by using an SVM as the primary classifier in a hierarchical classification system. Each of the three mentioned studies achieved an overall accuracy of 94% or higher and used different color or texture based features. All three studies have complex feature extraction methods and classification systems. Even though the proposed method achieved an overall accuracy of 86% on the classification of 6 species, the proposed feature extraction method and the classifier used are much simpler than the others. This study shows that simple statistical features can be useful for classification.

Discussion
The previous studies presented by Iscimen et al. (2014, 2015) were also about the classification of different fish species. These studies applied morphometric methods in order to classify fishes according to biological landmark points and centroid-contour distance features, respectively. The features of Iscimen et al. (2014), biological landmark points, need to be marked manually by a supervisor, and the number of features depends on the biological shape of the fish, such as the number of fins. Also, both Iscimen et al. (2014) and Iscimen et al. (2015) depend on calibration and scale because their features are shape based. In this study, an automatic algorithm handles both obtaining a solid fish texture and feature extraction. Also, the proposed color based features are independent of scale and do not need any calibration process. Additionally, these features are independent of the biological shape of the fishes and can be used to classify fishes that have different numbers of fins. In the light of this information, it can be noted that the proposed color based feature extraction method is open to improvement and promising for future work.
Separating the fish from the background in an image can be regarded as a separate problem or research area. Therefore, this study does not focus on this problem, and the background was extracted manually. Additionally, the fish images were taken with the same orientation on the ground. In future work, this study can be extended by applying other image processing algorithms for automatic segmentation of the fish from the background and by including fishes with different orientations. Also, appending new features, applying feature selection and using different classifiers may increase the classification performance.

Figure 1. A sample image from the database (photo C. Turan)

Figure 2. Image processing phases from raw image to evaluated minimum image which includes the fish sample

Figure 3. Fishes from different species occupy differently sized areas in the image. The species of the sample fishes: A) Epinephelus aeneus B) Epinephelus marginatus C) Mycteroperca rubra

Figure 6. Feature extraction phases from solid fish texture to classification

Table 2. Results of the performance measurement terms