An automated eye disease recognition system from visual content of facial images using machine learning techniques


The accuracy rate is 97%, with specificity of 98% and sensitivity of 92%. Melih Gunay et al. [11] proposed an automated system for diagnosing adenoviral conjunctivitis using facial pictures of the diseased face. They measured the vascularization and the intensity of redness in pink eyes after segmenting the sclera regions of the eye images, diagnosing conjunctivitis with only 30 images (18 healthy and 12 adenoviral conjunctivitis eye images). The average accuracy rate is 96%.

Most of the existing works present recognition results on only one or two eye diseases, and there is still a lack of research on the recognition of other important eye diseases. The main pitfall in this area is the lack of benchmark datasets.

Our method achieves recognition rates of 96.13% and 80.45%, with sensitivity 90.5%, 93.5%, and 12.5% and specificity 94.13%, 98.87%, and 87.37%, respectively. We also compare our method with some existing methods. We compare our method with the method in [9] for diseases such as periorbital cellulitis and corneal ulcer; that method's average recognition rate is 76.6%. We also evaluate our method for adenoviral conjunctivitis recognition against the method in [11], whose average accuracy rate is 96%. Overall, our method achieves better accuracy than the others: the recognition rate of our method is 98.79%, with sensitivity 97% and specificity 99%.

The rest of the paper is organized as follows. Section 2 describes our proposed methodology. Section 3 presents the experimental and quantitative analysis of the results obtained from the algorithms. Finally, Section 4 concludes the paper.

2. Proposed methodology

In this paper, an automated eye disease recognition system is designed using various machine learning techniques.

In our eye disease recognition system, the human eye region is obtained from the facial image automatically. The first stage is image capture, performed with a digital camera. Initially, the original image is loaded as the input image, and the method then detects the face in the input image using our algorithm. Our method scales the facial images to 500×500 pixels and then segments the various facial components. Once the eye part is segmented from the face, we pass those eye regions of the image to the learning stage.
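The stages above can be sketched as a simple pipeline. This is only an illustrative skeleton, not the authors' implementation: `detect_face`, `segment_eyes`, and `classify` are hypothetical stand-ins for the HOG+SVM detector, the landmark-based segmentation, and the learner described later, and the nearest-neighbour rescale stands in for a proper image resize.

```python
import numpy as np

def resize_nn(img, out_h=500, out_w=500):
    """Nearest-neighbour rescale to the fixed 500x500 face size used in
    the pipeline above (a real system would use a library resize)."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h   # source row index per output row
    cols = np.arange(out_w) * w // out_w   # source column index per output column
    return img[rows][:, cols]

def eye_disease_pipeline(image, detect_face, segment_eyes, classify):
    """End-to-end sketch of the stages described above."""
    x0, y0, x1, y1 = detect_face(image)        # 1. face localization
    face = resize_nn(image[y0:y1, x0:x1])      # 2. scale face to 500x500
    eyes = segment_eyes(face)                  # 3. eye-region segmentation
    return classify(eyes)                      # 4. disease recognition
```

A usage call would pass real detector, segmenter, and classifier callables in place of the stand-ins.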

Facial feature points are generally obtained from facial components such as the eyes, nose, jaw, and mouth. In our proposed system, facial feature point detection involves three steps: step 1, localizing the face in the image; step 2, detecting the facial feature points of each face component; and step 3, segmenting the facial components. For face detection, we apply the method proposed in [12], which uses the histogram of oriented gradients (HOG) to construct the feature vector and a linear SVM classifier for detection. We use HOG-based features because they outperform other existing feature selection techniques for human detection [12]. The linear SVM is used as the baseline classifier for face/non-face classification because of its simplicity and higher speed.

The HOG descriptor technique computes occurrences of gradient orientations in small spatial regions of an image referred to as "cells" [12,13]. The image is partitioned into cells of size N×N pixels. The method then evaluates the vectors that represent the histograms of oriented gradients of each cell in the detection window. The gradient vectors in the x and y directions are calculated by Equations (1) and (2), respectively, where L is the pixel intensity (grayscale) function at position (x, y) in an image I:

G_x(x, y) = L(x + 1, y) − L(x − 1, y)    (1)
G_y(x, y) = L(x, y + 1) − L(x, y − 1)    (2)

The gradients are then used to calculate the gradient magnitude M_{x,y} and the gradient orientation θ_{x,y} by Equations (3) and (4), respectively:

M_{x,y} = sqrt(G_x(x, y)^2 + G_y(x, y)^2)    (3)
θ_{x,y} = arctan(G_y(x, y) / G_x(x, y))      (4)
The orientations of all pixels are computed and accumulated into an M-bin histogram of orientations over each N×N spatial cell. All the resulting histograms are then concatenated into a single histogram vector to construct the final feature vector. Finally, the feature vectors are given to a linear SVM to classify face or non-face; the SVM is discussed in Section 2.2.2. After detecting the face, we apply the facial landmark extraction method to localize and label the facial components. These landmarks lie around the edges of facial components such as the eyes, nose, mouth, and eyebrows.
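A minimal NumPy sketch of this cell-histogram computation follows. It implements the central-difference gradients, magnitude, orientation, and per-cell histograms described above; the overlapping-block normalization of the full descriptor in [12] is omitted for brevity, and the cell size and bin count are illustrative defaults.

```python
import numpy as np

def hog_cell_features(img, cell=8, bins=9):
    """HOG-style feature vector: central-difference gradients (Eqs. 1-2),
    magnitude/orientation (Eqs. 3-4), and per-cell orientation histograms
    concatenated into a single vector (block normalization omitted)."""
    img = img.astype(float)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]   # Eq. (1)
    gy[1:-1, :] = img[2:, :] - img[:-2, :]   # Eq. (2)
    mag = np.hypot(gx, gy)                   # Eq. (3)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0  # Eq. (4), unsigned 0-180 deg
    h, w = img.shape
    feats = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            m = mag[i:i + cell, j:j + cell].ravel()
            a = ang[i:i + cell, j:j + cell].ravel()
            # Magnitude-weighted orientation histogram for this cell
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    return np.concatenate(feats)
```

The resulting vector is what would be fed to the linear SVM for face/non-face classification.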

This facial landmark extraction method is based on an ensemble of regression trees, as proposed in [14]. In this technique, each stage regressor in the cascaded shape regression framework is an ensemble of regression trees [14], and the ensemble is used to regress the positions of the facial landmarks.

Each regressor in the cascade makes its predictions based on features such as pixel intensity values extracted from the face image [14]. Each regressor in the cascade returns a shape update vector that is used to update the current shape estimate at each stage. The process can be formulated as follows:

S_{t+1} = S_t + r_t(I, φ_t(I, S_t))

where r_t is the regressor at stage t, I is the input image, and S_t is the currently estimated shape vector. S_t is updated stage by stage, where φ_t(I, S_t) is the function referred to as the shape-indexed feature, which depends on the current estimate S_t.
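The stage-by-stage update can be sketched in a few lines. This is a toy illustration only: each regressor here is a hypothetical callable (in [14] it is an ensemble of regression trees fitted by gradient boosting), and the example regressors simply move the estimate halfway toward a fixed target shape.

```python
import numpy as np

def cascade_refine(shape0, image, regressors):
    """Cascaded shape regression sketched above:
    S_{t+1} = S_t + r_t(I, S_t), applied once per stage."""
    shape = shape0.copy()
    for r_t in regressors:
        shape = shape + r_t(image, shape)  # stage-t additive update
    return shape

# Toy usage: hypothetical stand-in regressors that each step halfway
# toward a target shape (two landmarks with (x, y) coordinates).
target = np.array([[10.0, 20.0], [30.0, 40.0]])
step = lambda img, s: 0.5 * (target - s)     # stand-in for r_t
refined = cascade_refine(np.zeros((2, 2)), None, [step] * 5)
```

After five halving stages the estimate has covered 1 − 0.5^5 of the distance to the target, illustrating how the cascade converges toward the true shape.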

The main goal of facial feature detection is to segment the facial components, in particular for eye region extraction and segmentation. We detect and locate the positions of the eyes using the facial landmark detector of [14].

The proposed CNN architecture is presented in Figure 5, and Table 1 gives its description. The convolutional layer performs a mapping in which each entry C_s(x, y) is defined as shown in Equation (7), where σ is a nonlinear function. The result passes through a leaky ReLU. Although ReLU is the most commonly used activation function in CNN architectures, our architecture applies the leaky ReLU activation, with a slope of 0.1 for negative inputs, to fix the problem of dying neurons during back-propagation when the gradient becomes 0. The feature extractor with the leaky ReLU activation produced feature vectors that consistently outperformed those produced with other activation functions. We add several dropout layers to reduce model overfitting. The output of the convolutional layers is a 2D feature map, which is flattened into a one-dimensional vector and fed into three fully connected layers. The final fully connected layer performs the classification, followed by a softmax activation function [16]. For an input sample x, weight matrix W, and K distinct linear functions, the softmax function for the i-th class can be defined as

P(y = i | x) = exp(x^T W_i) / Σ_{k=1}^{K} exp(x^T W_k)

The SVM is one of the most widely used intelligent machine techniques for condition monitoring and medical diagnosis because of its excellent classification ability. This paper focuses on the SVM with radial basis function (RBF) kernels for solving non-linearly separable classification problems [19].
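A minimal NumPy sketch of the two activation functions used in the head of the network follows; the 0.1 negative slope matches the leaky ReLU setting above, and the max-subtraction in the softmax is the standard numerical-stability trick, which does not change the result.

```python
import numpy as np

def leaky_relu(z, slope=0.1):
    """Leaky ReLU: passes positive inputs through unchanged and scales
    negative inputs by a small slope so gradients never die."""
    return np.where(z > 0, z, slope * z)

def softmax(scores):
    """Softmax over K class scores, returning a probability vector."""
    e = np.exp(scores - np.max(scores))  # subtract max for stability
    return e / e.sum()
```

For classification, the predicted class is simply the index of the largest softmax probability.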
Given a supervised soft-margin classification problem and a training set of N data points {(x_i, y_i)}, i = 1, ..., N, where x_i ∈ R^n is the i-th input pattern and y_i ∈ R is the i-th output pattern, the SVM method aims at constructing a classifier defined as

y(x) = sign( Σ_{i=1}^{N} α_i y_i K(x, x_i) + b )

where the α_i and b are obtained from a quadratic optimization problem. This quadratic optimization problem has a trade-off parameter C, which is set experimentally or by the user. In this paper we apply the Gaussian radial basis function (RBF) kernel, defined as

K(x, x_i) = exp(−γ ||x − x_i||^2)

where γ is the parameter of the Gaussian kernel function [18]. We also apply two feature reduction methods, PCA and t-SNE. PCA is used to reduce dimensionality by eliminating redundant information in the feature vectors.

3. Experimental results

In this section, we describe the dataset used for the eye disease experiments, present the experimental setup and results, analyze different machine learning algorithm settings, and then compare our method with other eye disease recognition techniques. Table 2 shows the statistics of the image dataset. The entire experiment was conducted on a system with an Intel® Core i7-7700HQ processor and an additional GPU. Several performance metrics are used, namely accuracy, precision, recall, and F-score, as shown in Table 3. Figure 7 depicts the CNN model accuracy and loss over the number of epochs; the resulting mean error rate of the proposed system is 3%.

Table 4 reports the SVM results. We select a range for the regularization parameter C and for the value of γ and apply the 10-fold cross-validation technique on our dataset using the SVM with RBF kernel. We also compute the specificity and sensitivity, which are defined as

sensitivity = TP / (TP + FN),    specificity = TN / (TN + FP)

where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively. Moreover, we see that the results of our method are better than those of the other eye disease recognition systems. As shown in Table 5, it is difficult to compare the performance of the methods directly, but the results give an indication of the improvement achieved by our method over the other methods.
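The two metrics just defined can be computed directly from prediction counts. The sketch below uses toy binary labels (1 = diseased, 0 = healthy), not our dataset, purely to illustrate the definitions.

```python
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP/(TP+FN) and specificity = TN/(TN+FP), as defined
    above, for binary labels where 1 = diseased and 0 = healthy."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))  # diseased, flagged diseased
    tn = np.sum((y_true == 0) & (y_pred == 0))  # healthy, flagged healthy
    fp = np.sum((y_true == 0) & (y_pred == 1))  # healthy, flagged diseased
    fn = np.sum((y_true == 1) & (y_pred == 0))  # diseased, flagged healthy
    return tp / (tp + fn), tn / (tn + fp)
```

For example, a classifier that misses one diseased sample out of four and misflags one healthy sample out of four scores 0.75 on both metrics.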
From the experimental results in Table 6, we see that our method shows better results than the existing methods.

4. Conclusion

In this paper, we propose a visual content based eye disease recognition system from facial images using image processing and machine learning techniques. We have developed a benchmark visual content