A COMPARATIVE STUDY OF CLASSIFIERS FOR EARLY DIAGNOSIS OF GESTATIONAL DIABETES MELLITUS

Gestational Diabetes Mellitus (GDM), usually detected with a medical test called the Oral Glucose Tolerance Test (OGTT), is a prevalent complication of pregnancy. Early detection of GDM and identification of its most influential risk factors is a challenging problem, and a crucial one, as GDM has serious health implications for both mother and baby. The performance of the computational techniques Radial Basis Function (RBF) neural network and Multilayer Perceptron (MLP) network was compared with that of the statistical technique Discriminant Analysis (DA) on real-time GDM datasets for the diagnosis of GDM in multigravida women, that is, women who have been pregnant more than once, without even a visit to the hospital. The most influential risk factors were identified using DA, while the MLP, on overall performance, established itself as the most effective technique for early diagnosis of GDM in women during pregnancy.


Introduction
Diabetes mellitus is causing grave concern among health experts, as it contributes greatly to the increasing burden of noncommunicable diseases. Sadly, India is no different. According to the World Health Organization (WHO), the number of adults with diabetes mellitus (DM) is projected to rise by more than 120%, from 135 million in 1995 to a staggering 300 million in 2025 [1]. In one survey, 16.2% of pregnant women in the urban population of Chennai were diagnosed with GDM [2].
GDM is defined as carbohydrate intolerance of varying severity with onset or first recognition during pregnancy [3]. A child born to a mother with GDM is susceptible to obesity while growing up and possibly to type 2 DM in later stages of life [4]. Moreover, such offspring are more prone to health issues like jaundice, hypoglycemia and fetal macrosomia. Delivery complications like Caesarean section and pre-eclampsia, and an increased risk of developing type 2 or even type 1 diabetes after delivery, are more frequent among women with GDM. However, gestational diabetes is a treatable condition. The WHO has recommended a 2-hour 75 g OGTT to standardize the diagnosis of GDM, which is generally performed between 24 and 28 weeks [5]. Thus a pregnant woman who may be prone to gestational diabetes undergoes the conventional blood tests only in the sixth or seventh month of her pregnancy. Identifying individuals who are at risk of developing GDM is therefore a pressing need. Various studies have recorded that early detection of gestational diabetes lowered the mortality of mother and child and also improved the woman's health [6][7][8]. More importantly, as the stillbirth rate is relatively high in India and gestational diabetes mellitus is undoubtedly one of the causes, early diagnosis and awareness of GDM are of the utmost priority today [9].

Literature Review
Nanda et al. [10] used an analysis of early-stage prediction of pregnancy complications to build a methodology for forecasting gestational diabetes from biochemical markers and characteristics of the pregnant women. Tran et al. [11] compared the classification power of models for detecting GDM in pregnant women prone to developing it, using several diagnostic criteria based on the 75-g oral glucose tolerance test, and concluded that for selective screening of GDM in places like Vietnam, a simple prognostic model using Body Mass Index (BMI) and age at booking was adequate. Okeh et al. [12] applied a semi-parametric linear mixed model to determine the effect of covariates on the precision of diagnostic test results, deriving a general cut-off estimate for selecting patients for glucose testing during pregnancy, and illustrated it with gestational diabetes data. Zhang et al. [13] used the fuzzy integral to develop a classification model of GDM: a back-propagation (BP) neural network was trained to obtain the Sugeno measure and was optimized using simulated annealing to obtain an approximate globally optimal solution. Lohse et al. [14] examined whether screening pregnant women for GDM was economical, using a published core diabetes model to estimate the long-term impact of screening, and concluded that a universal screening program to detect GDM was extremely cost-effective in Israel and India.
The above survey indicates that, among the facts and figures that must be collected for the analysis, there is always at least one item for which the pregnant woman needs the help of medical staff at a hospital. By providing newly designed input variables, this article aims to diagnose GDM at an early stage among pregnant women without performing a blood test. The article utilizes Artificial Neural Networks, namely a supervised MLP network trained with the back propagation algorithm and an RBF network, together with the statistical technique Discriminant Analysis, for classification of GDM, and compares the efficiency of these diagnostic models.

Methodology
3.1. Artificial Neural Network. An artificial neural network is a computational arrangement which bears a strong resemblance to the biological networks of neurons in the human brain. Because of their ability to adapt easily, a salient feature of these networks, they go a long way in solving problems in the diagnosis of diseases. Neural networks are known for recognizing hidden patterns between predictor variables and dependent variables and are commonly applied to model complex relationships between them.
3.1.1. Fundamentals of Multilayer Perceptron Network. The essence of neural networks is the use of hidden layers to separate the relationship between the inputs and the output into a sequence of linearly separable stages [15]. The diagnostic system comprises three modules. The first is the input module, which receives data from the patient and transfers it to the second module, which classifies the patient's case record. The third is the output module, which displays the output of the classification system. For an input pattern $z_p$, the output of the MLP network is evaluated in a single forward pass. For every output unit $o_k$, the output is determined by the following quantities: $f_{o_k}$, the activation function for $o_k$; $f_{y_j}$, the activation function for the hidden unit $y_j$; $w_{kj}$, the weight linking hidden unit $y_j$ and output unit $o_k$; and $z_{i,p}$, the value of input unit $z_i$ for input pattern $z_p$. The bias units indicate the threshold values of the neurons in the following layer.
Back propagation Training Algorithm. The hugely popular back propagation algorithm is probably the most powerful tool for training ANNs. It trains a Multilayer Perceptron network on a group of input values whose outputs are already known. For every entry of the sample set that is submitted, the network compares its output with the target output and the error value is determined. The sample patterns are presented to the MLP network repeatedly until the error is brought to a minimum [16].
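For reference, the forward pass described in Section 3.1.1 can be sketched in the usual single-hidden-layer form. The symbols $v_{ji}$ (weight linking input unit $z_i$ and hidden unit $y_j$), $b_{y_j}$ and $b_{o_k}$ (threshold terms contributed by the bias units), and the counts $I$ (inputs) and $J$ (hidden units) are assumptions introduced here for illustration; the remaining symbols follow the text.

```latex
y_{j,p} = f_{y_j}\!\left( \sum_{i=1}^{I} v_{ji}\, z_{i,p} + b_{y_j} \right),
\qquad
o_{k,p} = f_{o_k}\!\left( \sum_{j=1}^{J} w_{kj}\, y_{j,p} + b_{o_k} \right)
```

Back propagation then adjusts $w_{kj}$, $v_{ji}$ and the threshold terms so as to reduce the error between $o_{k,p}$ and the known target for each training pattern.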

Fundamentals of Radial Basis Function Network
RBF networks are among the most frequently implemented neural network algorithms in various medical and engineering domains because of their faster learning speed, more compact topology and universal approximation capability. These networks have been proposed independently by several authors. Every neuron in the hidden module has a parameter vector called its center. The outputs of the first module are determined by evaluating the distance between the inputs of the network and the centers of the hidden module. The outputs of the linear output layer are the weighted forms of the returns of the first module. The general expression of the RBF network [21] is written in terms of the quantities listed below. The Euclidean distance is taken as the norm, and the radial basis function is assumed to be the frequently used Gaussian function, as it has well-known mathematical properties, is highly nonlinear and provides good locality as a local RBF [22]. In this notation, $I$ denotes the neuron count in the middle layer, with $i \in \{1, 2, \ldots, I\}$; $J$ denotes the neuron count in the output layer, with $j \in \{1, 2, \ldots, J\}$; $c_i$ denotes the center vector of the $i$th neuron; $x$ denotes the input data vector; $w_{ij}$ denotes the weight connecting the $i$th neuron and the $j$th output; $y^l_j$ denotes the output of the $j$th neuron; $\phi$ denotes the radial basis function; $\sigma_i$ denotes the spread parameter of the $i$th neuron; and $\beta_j$ denotes the bias of the $j$th output neuron.
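Using the quantities defined above (center vectors $c_i$, weights $w_{ij}$, a spread parameter per hidden neuron and a bias per output neuron), the RBF network and its Gaussian basis function can be sketched as follows. The Greek letters $\phi$, $\sigma_i$ and $\beta_j$ and the $2\sigma_i^2$ normalization are assumptions made for this sketch; other conventions for the Gaussian width are also in common use.

```latex
y^{l}_{j} = \sum_{i=1}^{I} w_{ij}\, \phi\!\left( \lVert x - c_i \rVert \right) + \beta_j,
\qquad j = 1, \ldots, J

\phi\!\left( \lVert x - c_i \rVert \right)
  = \exp\!\left( -\frac{\lVert x - c_i \rVert^{2}}{2\sigma_i^{2}} \right)
```

Here $\lVert \cdot \rVert$ is the Euclidean norm, so each hidden neuron responds most strongly to inputs close to its own center.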

Figure 2. Architecture Design of RBF network
The structure of a radial basis function neural network is depicted in Fig. 2. The inputs of $m$ dimensions $(x_1, \ldots, x_m)$ in the input module are first passed on to the hidden module, which comprises $I$ neurons. Each neuron contains the basis function, which serves as its activation function, and evaluates the Euclidean distance between its center and the inputs. The RBF, very often taken to be the Gaussian function, contains a spread parameter $(\sigma_1, \ldots, \sigma_I)$ that shapes the curve. The weighted outputs of the hidden layer, with weights $(w_{11}, \ldots, w_{IJ})$, are then passed to the last module. Here the dimension of the middle layer is $I$, the number of neurons in that layer, with $i \in \{1, 2, \ldots, I\}$, while the dimension of the output is $J$, with $j \in \{1, 2, \ldots, J\}$, and the bias parameters are $(\beta_1, \ldots, \beta_J)$. The last layer evaluates the linear combination of the bias parameters and the returns of the second module, which yields the results of the radial basis network, $(y^l_1, \ldots, y^l_J)$. During the training period, the parameters of the RBF network are adjusted so that the network model fits the training data in the best possible way [23].
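The data flow through the RBF network described above can be illustrated with a minimal Python sketch. It assumes a Gaussian basis function with a 2*spread^2 normalization, and all function names and numeric values are hypothetical; it is not the MATLAB implementation used in the study.

```python
import math

def gaussian_rbf(x, center, spread):
    """Gaussian basis value for the Euclidean distance between x and a center.

    The 2 * spread**2 normalization is an assumption of this sketch.
    """
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * spread ** 2))

def rbf_forward(x, centers, spreads, weights, biases):
    """Return the J outputs of an RBF network with I hidden neurons.

    weights[i][j] connects hidden neuron i to output neuron j.
    """
    # Hidden module: one Gaussian response per center.
    hidden = [gaussian_rbf(x, c, s) for c, s in zip(centers, spreads)]
    # Output module: weighted sum of hidden responses plus a bias per output.
    return [
        sum(weights[i][j] * hidden[i] for i in range(len(hidden))) + biases[j]
        for j in range(len(biases))
    ]

# Toy network: 2 inputs, 2 hidden neurons, 1 output (all values hypothetical).
centers = [[0.0, 0.0], [1.0, 1.0]]
spreads = [1.0, 1.0]
weights = [[1.0], [-1.0]]
biases = [0.5]
outputs = rbf_forward([0.0, 0.0], centers, spreads, weights, biases)
```

In this toy case the input coincides with the first center, so the first hidden neuron responds with 1 and the second with exp(-1).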

Discriminant Analysis.
Discriminant Analysis is one of the most commonly accepted and extensively implemented statistical techniques in biomedical models [24]. It is basically a multivariate method which separates different sets of observations and assigns new observations to previously defined sets [25]. The statistical problem is to build a classification function based on the population size. The discriminant function score can be generated from the unstandardized discriminant function coefficients and the raw scores. The discriminant function coefficients are chosen to maximize the differences between the two groups, and the scores are standardized to a mean of zero and a standard deviation of one. For every group, the mean discriminant function score, known as the centroid, can be found; it is generated by the discriminant function derived from the original independent variables.
Differences in the location of these centroids show the dimensions along which the groups differ. The utility of these functions can be examined through their capacity to assign every data point correctly to its group. Groups are differentiated once the classification functions are ascertained. For this purpose, the classification functions are derived from the linear discriminant functions.
The classification function $C_j$ for the $j$th group, $j = 1, \ldots, k$, when all sample sizes are equal, is a linear combination of a constant $c_{j0}$ and the raw scores $x$ of each predictor. If $M_j$ denotes the column matrix of means for group $j$ and $W$ denotes the within-group variance-covariance matrix, then $c_{j0} = (-1/2) C_j M_j$. When the sample sizes are unequal, $C_j$ is adjusted using the size $n_j$ of group $j$ and the total sample size $N$.
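The classification functions described above can be written out in the form commonly given in multivariate statistics texts. The number of predictors $p$, the individual coefficients $c_{j1}, \ldots, c_{jp}$ and the relation $C_j = W^{-1} M_j$ for the coefficient matrix are assumptions of this sketch, taken from the standard textbook treatment rather than from the text. Note that $C_j$ on the left of the first line is the classification score, while $C_j$ in the second line is the matrix of coefficients.

```latex
C_j = c_{j0} + c_{j1} x_1 + c_{j2} x_2 + \cdots + c_{jp} x_p
\qquad \text{(equal group sizes)}

C_j = W^{-1} M_j, \qquad c_{j0} = -\tfrac{1}{2}\, C_j^{\top} M_j

C_j = c_{j0} + \sum_{i=1}^{p} c_{ji} x_i + \ln\!\left( \frac{n_j}{N} \right)
\qquad \text{(unequal group sizes)}
```

A new case is assigned to the group for which its classification score is largest; the $\ln(n_j / N)$ term shifts the scores toward the larger group.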

Data Analysis
The variables used in the study were selected, in consultation with gynecologists, on the basis of characteristics that are medically relevant to a pregnant woman developing gestational diabetes. The real-time data set comprised 336 records, of which 188 were of multigravida patients; every record contained ten variables and was collected from the outpatient records of a multi-specialty hospital in Chennai, India, during the period January to May 2013. Table 1 shows the variables chosen for the study. Of the ten variables, three cover common details: the BMI and age of the patient and the history of diabetes among first-degree relatives. Five others cover details of previous pregnancies: a child born weighing above 3.8 kg, presence or absence of GDM, the loss of a child within 5 months, the birth of a baby with defects in major organs like the heart or brain, and the birth of an infant that died in the womb after having survived at least 5 months of pregnancy. The remaining two variables cover history of infections and polycystic ovary syndrome [26]. The statistics of the patient history records are shown as a graph in Figure 3. The average age of the pregnant women was 32.8 years, while the average BMI of the patients was 26.4. The prevalence of GDM in this study was an alarming 34.04%.

Results
The results of the three diagnostic models are discussed below.

Results of MLP Model
The Neural Network Toolbox of MATLAB R2014a was used to construct the diagnostic models for both MLP and RBF. A typical feed-forward neural network (FFNN) trained with back propagation was implemented to develop the classification system. The input layer consisted of ten input neurons, the middle layer of fifteen hidden neurons and the output layer of a single neuron. As diagnosing GDM was treated as a binary classification problem, 1 and 0 were the only possible outputs of the model: an output of 1 was regarded as "GDM patient" and an output of 0 as "non-GDM patient". As the optimal neuron count in the middle layer cannot be predetermined, the stopping criteria, the number of neurons in the second module and the number of network layers were determined by trial and error. The number of neurons in the hidden layer was therefore varied and tests were carried out on various architectures, from which it was found that the architecture with 15 hidden neurons produced the best classification results. 70% of the data set was selected for training, 15% for validation and the remaining 15% for testing. The learning rate for network training was set to 0.28 and the momentum to 0.8. The model was run until the mean squared error fell below 0.045. The regression results of the MLP architecture for training, validation and testing, and for all of them combined, are depicted in Fig. 4. The performance of the MLP for training, validation and testing with respect to the mean square error is shown in Fig. 5. The mean square error was found to be 0.12506, and the best validation performance was reached in the 3rd generation. As the generations proceeded, the gradient descent learning algorithm was seen to minimize the error. The global minimum of the mean square error was 0.075309 at the ninth generation, as depicted in Fig. 6.
A surge in the gradient value was noted right after the ninth generation. Fig. 7 depicts the linear separability of the chosen data set, classified into two distinct groups, namely pregnant women with GDM (output 1) and non-GDM patients (output 0). A notable 92.86% of the given data was classified correctly, while the remaining 7.14% was classified incorrectly. These results show that the MLP system was trained effectively and could plausibly be used to identify pregnant women at high or low risk of gestational diabetes.
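To make the roles of the learning rate (0.28) and the momentum (0.8) concrete, a minimal Python sketch of a gradient-descent-with-momentum weight update is given here. The update rule is the textbook form and is an assumption: the text does not state which training function was used, and toolbox implementations may scale the gradient term differently. The function name and the gradient values are hypothetical.

```python
LEARNING_RATE = 0.28  # value reported in the text
MOMENTUM = 0.8        # value reported in the text

def momentum_step(weights, gradients, velocity,
                  learning_rate=LEARNING_RATE, momentum=MOMENTUM):
    """One gradient-descent-with-momentum update (textbook form, assumed).

    Returns the updated weights and the new velocity (the last weight change).
    """
    new_velocity = [momentum * v - learning_rate * g
                    for v, g in zip(velocity, gradients)]
    new_weights = [w + dv for w, dv in zip(weights, new_velocity)]
    return new_weights, new_velocity

# Hypothetical single weight with a constant error gradient of 1.0.
w, v = [0.0], [0.0]
w, v = momentum_step(w, [1.0], v)  # first change: -0.28
w, v = momentum_step(w, [1.0], v)  # second change: 0.8 * (-0.28) - 0.28
```

With a constant gradient, the momentum term makes each successive weight change larger than the last, which is why a high momentum can speed up descent along a consistent error slope.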

Results of RBF Model
The data set was divided equally between training and testing. The outputs of the model were either 1 or 0, as detection of GDM was treated as a binary classification problem.

Figure 6. Validation Performance
The graphs generated for the training data set and the test data set are shown above in Fig. 8 and Fig. 9. The RBF neural network performed best with nine centers, out of a maximum of 16 centers tried. With the best centers, the root mean square error was 0.1213. The execution time of the RBF network was less than that of the MLP. The classification accuracy of a model is used to analyze its discriminatory power. The accuracy measures sensitivity and specificity summarize the accuracy of the test. The sensitivity, or true positive rate, of a model is its capacity to correctly identify patients with GDM, while the specificity, or true negative rate, is its capacity to correctly identify patients without gestational diabetes. The overall accuracy of the model is the sum of the numbers of true negatives and true positives divided by the overall sample size. The classification results of the RBF model are shown in Table 2. 50% of the records were used for testing. For the RBF neural network model, sensitivity was 96.00% and specificity was 33.33%. The overall accuracy of the model was calculated to be a modest 50.00%.

Results of Discriminant Analysis Model
To distinguish GDM from non-GDM patients and to determine the most significant parameters of GDM, the Discriminant Analysis model was implemented using version 20 of SPSS (Statistical Package for the Social Sciences) for Windows. In DA, Wilks' lambda is used in the ANOVA F test of mean differences. The lambda value lies between 0 and 1, where 0 indicates that the group means differ and 1 indicates that all group means are equal. Hence the smaller the lambda value for an independent variable, the more that variable contributes to the discriminant function. The significance of the contributions of the variables is thus revealed by the F test of Wilks' lambda. The structure matrix table in SPSS shows the Pearson correlations of all the variables with each discriminant function, which are known as discriminant loadings, structure correlations or structure coefficients.
The significance of the discriminant analysis was indicated using the Wilks' lambda test. From the table, it is inferred that pre-pregnancy body mass index, family history of diabetes and presence or absence of a history of GDM were the variables most influential in GDM occurrence, since they had the smallest p values. Moreover, large infant delivery, age and past infections were significant at the 5% level, whereas history of miscarriage was significant at the 1% level [27]. The discriminant functions are explained using the structure matrix and the standardized coefficients. In each discriminant function, standardized beta coefficients are given for every variable. The smaller the standardized coefficient of a variable, the smaller its contribution to the discrimination between GDM and non-GDM patients, and vice versa. It is concluded from Table 4 that history of GDM contributed most to discriminating the two groups, while a few other variables such as history of infections, family history of diabetes and history of miscarriage also played crucial roles. 64 of the 188 pregnant women in the study had GDM in the current pregnancy. Table 5 shows that the discriminant analysis model correctly identified 45 of the 64 pregnant women with GDM and 112 of the 124 patients who did not have GDM.
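The relationship between Wilks' lambda and the ANOVA F test for a single variable can be illustrated with a short Python sketch. It assumes the univariate case, where lambda is the ratio of the within-group to the total sum of squares; the function name and the toy values are hypothetical and are not study data.

```python
def wilks_lambda_univariate(groups):
    """Return (lambda, F) for one variable measured in k groups.

    lambda = SS_within / SS_total; F is the one-way ANOVA statistic,
    which can be written in terms of lambda.
    """
    all_values = [v for g in groups for v in g]
    n_total, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n_total
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_within = sum(
        sum((v - sum(g) / len(g)) ** 2 for v in g) for g in groups
    )
    lam = ss_within / ss_total
    f_stat = ((1.0 - lam) / lam) * ((n_total - k) / (k - 1))
    return lam, f_stat

# Toy values for two well-separated groups (hypothetical, not study data).
lam, f_stat = wilks_lambda_univariate([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```

A lambda near 0 means that most of the variation lies between the groups, which corresponds to a large F value; a lambda near 1 means the group means are nearly equal.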
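The accuracy measures defined earlier can be computed directly from the counts reported for the discriminant analysis model (45 of 64 GDM cases and 112 of 124 non-GDM cases correctly identified). The following is a minimal Python sketch; the function and variable names are hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall accuracy
    return sensitivity, specificity, accuracy

# Counts taken from the text for the discriminant analysis model.
tp, fn = 45, 64 - 45      # GDM patients: correctly / incorrectly classified
tn, fp = 112, 124 - 112   # non-GDM patients: correctly / incorrectly classified
sens, spec, acc = classification_metrics(tp, fn, tn, fp)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
# prints: sensitivity=70.31% specificity=90.32% accuracy=83.51%
```

The sensitivity of 70.31% obtained from these counts matches the lower end of the sensitivity range reported in the Discussion.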

Discussion
To determine the most efficient model and the model with the best discriminatory power, the accuracy measures of the three diagnostic models were compared and analyzed. Another measure of the classification accuracy of a test, Youden's index, is calculated from the specificity and sensitivity of the model and is defined as follows: Youden's index = Specificity + Sensitivity - 1.
This index lies between -1 and 1. A test with no false negatives and no false positives is considered flawless and yields a value of 1. Thus, the larger the Youden's index of a model, the higher its accuracy. Table 6 compares accuracy, specificity, sensitivity and Youden's index for all three classification methods. Across the models, specificity, sensitivity, accuracy and Youden's index ranged over 33.33-94.74%, 70.31-96.00%, 50.00-92.86% and 0.29-0.84 respectively. The sensitivity was highest for the RBF model (96.00%), while the MLP model had the best classification accuracy (92.86%) and the highest Youden's index (0.84). Based on the above comparison, the MLP model was found to be the best classification method and clearly outperformed the RBF and discriminant analysis models.
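Youden's index, defined as sensitivity plus specificity minus 1, can be checked against the figures reported in the text with a small Python sketch. The function name is hypothetical; the RBF rates are those reported in the results, and the discriminant analysis rates are derived from the counts given for that model.

```python
def youden_index(sensitivity, specificity):
    """Youden's index: sensitivity + specificity - 1 (rates as fractions)."""
    return sensitivity + specificity - 1.0

# RBF model: sensitivity 96.00% and specificity 33.33%, as reported.
rbf_index = youden_index(0.9600, 0.3333)

# Discriminant analysis: rates derived from the reported counts
# (45 of 64 GDM and 112 of 124 non-GDM patients correctly identified).
da_index = youden_index(45 / 64, 112 / 124)
```

The RBF value rounds to 0.29, the lower end of the reported range, which reflects the low specificity of that model despite its high sensitivity.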

Conclusion
GDM is a public health concern. Usually only women who have traditional risk factors like obesity or a family history of GDM are screened early in pregnancy. Unfortunately, women who do not have these common risk factors and develop GDM often remain undiagnosed until the second trimester, and a delay in diagnosis often makes therapies for GDM less effective. Hence, there is a growing need for early detection of gestational diabetes. Nearly three-fourths of the population of India lives in rural areas, where basic facilities even for the diagnosis of DM are inadequate. Performing an OGTT to diagnose GDM is burdensome and impractical in this setting. Furthermore, the cost of undergoing three medical tests is exorbitant. There is therefore also a need for an inexpensive and uncomplicated procedure to detect gestational diabetes. To address these needs, the methods identified in this study offer every pregnant woman the opportunity to know her risk early on without a visit to the hospital, which saves the cost of the various blood tests and would hence prove immensely favorable for all pregnant women. In conclusion, with a 92.86% overall accuracy, the MLP neural network with the back propagation algorithm clearly outperformed the RBF and discriminant analysis models. Moreover, discriminant analysis showed that family history of diabetes, pre-pregnancy BMI and the patient's history of GDM are the significant factors which play the most crucial role in diagnosing gestational diabetes. Pregnant women can be mindful of these factors at an early stage and take precautionary measures, such as actively participating in physical exercise and changing dietary behavior, so that gestational diabetes can be successfully warded off.