<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20241031//EN"
        "https://jats.nlm.nih.gov/publishing/1.4/JATS-journalpublishing1-4.dtd">
<article  article-type="research-article"        dtd-version="1.4">
            <front>

                <journal-meta>
                                    <journal-id></journal-id>
            <journal-title-group>
                                                                                    <journal-title>Balkan Journal of Electrical and Computer Engineering</journal-title>
            </journal-title-group>
                            <issn pub-type="ppub">2147-284X</issn>
                                        <issn pub-type="epub">2147-284X</issn>
                                                                                            <publisher>
                    <publisher-name>MUSA YILMAZ</publisher-name>
                </publisher>
                    </journal-meta>
                <article-meta>
                                        <article-id pub-id-type="doi">10.17694/bajece.973129</article-id>
                                                                <article-categories>
                                            <subj-group  xml:lang="en">
                                                            <subject>Artificial Intelligence</subject>
                                                    </subj-group>
                                            <subj-group  xml:lang="tr">
                                                            <subject>Yapay Zeka</subject>
                                                    </subj-group>
                                    </article-categories>
                                                                                                                                                        <title-group>
                                                                                                                        <article-title>Machine Learning based Early Prediction of Type 2 Diabetes: A New Hybrid Feature Selection Approach using Correlation Matrix with Heatmap and SFS</article-title>
                                                                                                                                        </title-group>
            
                                                    <contrib-group content-type="authors">
                                                                        <contrib contrib-type="author">
                                                                    <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0001-7844-3168</contrib-id>
                                                                <name>
                                    <surname>Buyrukoğlu</surname>
                                    <given-names>Selim</given-names>
                                </name>
                                                                    <aff>CANKIRI KARATEKIN UNIVERSITY</aff>
                                                            </contrib>
                                                    <contrib contrib-type="author">
                                                                    <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-6425-104X</contrib-id>
                                                                <name>
                                    <surname>Akbaş</surname>
                                    <given-names>Ayhan</given-names>
                                </name>
                                                                    <aff>ABDULLAH GUL UNIVERSITY</aff>
                                                            </contrib>
                                                                                </contrib-group>
                        
                                        <pub-date pub-type="pub" iso-8601-date="20220430">
                    <day>30</day>
                    <month>04</month>
                    <year>2022</year>
                </pub-date>
                                        <volume>10</volume>
                                        <issue>2</issue>
                                        <fpage>110</fpage>
                                        <lpage>117</lpage>
                        
                        <history>
                                    <date date-type="received" iso-8601-date="20210719">
                        <day>19</day>
                        <month>07</month>
                        <year>2021</year>
                    </date>
                                                    <date date-type="accepted" iso-8601-date="20220104">
                        <day>04</day>
                        <month>01</month>
                        <year>2022</year>
                    </date>
                            </history>
                                        <permissions>
                    <copyright-statement>Copyright © 2013, Balkan Journal of Electrical and Computer Engineering</copyright-statement>
                    <copyright-year>2013</copyright-year>
                    <copyright-holder>Balkan Journal of Electrical and Computer Engineering</copyright-holder>
                </permissions>
            
                                                                                                <abstract><p>This study introduces and describes in detail a new hybrid machine learning method for predicting type 2 diabetes, and compares its outcomes with those of similar studies. Early prediction of diabetes is crucial: it allows necessary measures to be taken (e.g., changing eating habits and controlling patient weight), defers the onset of diabetes, reduces the death rate to some extent, and eases medical care professionals’ decision-making in preventing and managing diabetes mellitus. The purpose of this study is to create a new hybrid feature selection approach that combines a Correlation Matrix with Heatmap and Sequential Forward Selection (SFS) to reveal the most effective features for detecting diabetes. A diabetes data set with 520 instances and seven features was studied by applying the proposed hybrid feature selection approach. The selected optimal features were evaluated with Support Vector Machine (SVM), Random Forest (RF), and Artificial Neural Network (ANN) classifiers. All five evaluation metrics were highest with ANN: Accuracy (99.1%), F-measure (99.1%), Precision (99.3%), Recall (99.1%), and AUC (99.2%). The proposed hybrid feature selection model thus performed better with ANN than with the other machine learning algorithms.</p></abstract>
                                                                                    
            
                                                            <kwd-group>
                                                    <kwd>Artificial Neural Network</kwd>
                                                    <kwd>Correlation Matrix</kwd>
                                                    <kwd>Sequential Forward Selection</kwd>
                                                    <kwd>Diabetes Mellitus</kwd>
                                                    <kwd>Hybrid Feature Selection</kwd>
                                            </kwd-group>
                                                        
                                                                                                                                                    </article-meta>
    </front>
    <back>
                            <ref-list>
                                    <ref id="ref1">
                        <label>1</label>
                        <mixed-citation publication-type="journal">Stephanie Watson, “Everything You Need to Know About Diabetes,” 2020. [Online]. Available: https://www.healthline.com/health/diabetes</mixed-citation>
                    </ref>
                                    <ref id="ref2">
                        <label>2</label>
                        <mixed-citation publication-type="journal">K. Shailaja, B. Seetharamulu, and M. A. Jabbar, “Machine learning in healthcare: A review,” in 2018 Second International Conference on Electronics, Communication, and Aerospace Technology (ICECA), 2018, pp. 910–914.</mixed-citation>
                    </ref>
                                    <ref id="ref3">
                        <label>3</label>
                        <mixed-citation publication-type="journal">N. Peiffer-Smadja, T. Rawson, R. Ahmad, A. Buchard, G. Pantelis, F.-X. Lescure, G. Birgand, and A. Holmes, “Machine learning for clinical decision support in infectious diseases: A narrative review of current applications,” Clinical Microbiology and Infection, vol. 26, 09 2019.</mixed-citation>
                    </ref>
                                    <ref id="ref4">
                        <label>4</label>
                        <mixed-citation publication-type="journal">E. Sevinc, “A novel evolutionary algorithm for data classification problem with extreme learning machines,” IEEE Access, vol. 7, pp. 122419–122427, 2019.</mixed-citation>
                    </ref>
                                    <ref id="ref5">
                        <label>5</label>
                        <mixed-citation publication-type="journal">K. D. Silva, W. K. Lee, A. Forbes, R. T. Demmer, C. Barton, and J. Enticott, “Use and performance of machine learning models for type 2 diabetes prediction in community settings: A systematic review and meta-analysis,” International Journal of Medical Informatics, vol. 143, no. August, p. 104268, 2020. [Online]. Available: https://doi.org/10.1016/j.ijmedinf.2020.104268</mixed-citation>
                    </ref>
                                    <ref id="ref6">
                        <label>6</label>
                        <mixed-citation publication-type="journal">J. Chaki, S. Thillai Ganesh, S. K. Cidham, and S. Ananda Theertan, “Machine learning and artificial intelligence-based Diabetes Mellitus detection and self-management: A systematic review,” Journal of King Saud University - Computer and Information Sciences, 2020. [Online]. Available: https://doi.org/10.1016/j.jksuci.2020.06.013</mixed-citation>
                    </ref>
                                    <ref id="ref7">
                        <label>7</label>
                        <mixed-citation publication-type="journal">I. Kavakiotis, O. Tsave, A. Salifoglou, N. Maglaveras, I. Vlahavas, and I. Chouvarda, “Machine Learning and Data Mining Methods in Diabetes Research,” Computational and Structural Biotechnology Journal, vol. 15, pp. 104–116, 2017. [Online]. Available: https://doi.org/10.1016/j.csbj.2016.12.005</mixed-citation>
                    </ref>
                                    <ref id="ref8">
                        <label>8</label>
                        <mixed-citation publication-type="journal">D. Jashwanth Reddy, B. Mounika, S. Sindhu, T. Pranayteja Reddy, N. Sagar Reddy, G. Jyothsna Sri, K. Swaraja, K. Meenakshi, and P. Kora, “Predictive machine learning model for early detection and analysis of diabetes,” Materials Today: Proceedings, 2020. [Online]. Available: https://doi.org/10.1016/j.matpr.2020.09.522</mixed-citation>
                    </ref>
                                    <ref id="ref9">
                        <label>9</label>
                        <mixed-citation publication-type="journal">H. Lai, H. Huang, K. Keshavjee, A. Guergachi, and X. Gao, “Predictive models for diabetes mellitus using machine learning techniques,” BMC Endocrine Disorders, vol. 19, no. 1, pp. 1–9, 2019.</mixed-citation>
                    </ref>
                                    <ref id="ref10">
                        <label>10</label>
                        <mixed-citation publication-type="journal">N. Nai-Arun and R. Moungmai, “Comparison of Classifiers for the Risk of Diabetes Prediction,” Procedia Computer Science, vol. 69, pp. 132–142, 2015. [Online]. Available: http://dx.doi.org/10.1016/j.procs.2015.10.014</mixed-citation>
                    </ref>
                                    <ref id="ref11">
                        <label>11</label>
                        <mixed-citation publication-type="journal">Kaggle, “Pima Indians Diabetes Dataset,” 2021. [Online]. Available: https://www.kaggle.com/uciml/pima-indians-diabetes-database</mixed-citation>
                    </ref>
                                    <ref id="ref12">
                        <label>12</label>
                        <mixed-citation publication-type="journal">S. Pratama, A. Muda, Y.-H. Choo, and N. Muda, “Computationally inexpensive sequential forward floating selection for acquiring significant features for authorship invariances in writer identification,” International Journal of New Computer Architectures and their Applications (IJNCAA), vol. 1, pp. 581–598, 01 2011.</mixed-citation>
                    </ref>
                                    <ref id="ref13">
                        <label>13</label>
                        <mixed-citation publication-type="journal">Y. A. Christobel and P. Sivaprakasam, “A New Classwise k Nearest Neighbor (CKNN) Method for the Classification of Diabetes Dataset,” International Journal of Engineering and Advanced Technology, vol. 2, no. 3, pp. 396–400, 2013.</mixed-citation>
                    </ref>
                                    <ref id="ref14">
                        <label>14</label>
                        <mixed-citation publication-type="journal">Wikipedia, “Support vector machine,” 2021. [Online]. Available: https://en.wikipedia.org/wiki/Support-vector_machine</mixed-citation>
                    </ref>
                                    <ref id="ref15">
                        <label>15</label>
                        <mixed-citation publication-type="journal">A. Guha, “Building Explainable and Interpretable model for Diabetes Risk Prediction,” International Journal of Engineering Research and Technology, vol. 9, no. 09, pp. 1037–1042, 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref16">
                        <label>16</label>
                        <mixed-citation publication-type="journal">A. Kareem, L. Shi, L. Wei, and Y. Tao, “A Comparative Analysis and Risk Prediction of Diabetes at Early Stage using Machine Learning Approach,” International Journal of Future Generation Communication and Networking, vol. 13, no. 3, pp. 4151–4163, 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref17">
                        <label>17</label>
                        <mixed-citation publication-type="journal">K. Alpan and G. S. Ilgi, “Classification of Diabetes Dataset with Data Mining Techniques by Using WEKA Approach,” in 2020 4th International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT). IEEE, Oct 2020, pp. 1–7.</mixed-citation>
                    </ref>
                                    <ref id="ref18">
                        <label>18</label>
                        <mixed-citation publication-type="journal">J. Xue, F. Min, and F. Ma, “Research on diabetes prediction method based on machine learning,” Journal of Physics: Conference Series, vol. 1684, no. 1, 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref19">
                        <label>19</label>
                        <mixed-citation publication-type="journal">L. Tapak, H. Mahjub, O. Hamidi, and J. Poorolajal, “Real-data comparison of data mining methods in prediction of diabetes in Iran,” Healthcare Informatics Research, vol. 19, no. 3, p. 177, 2013.</mixed-citation>
                    </ref>
                                    <ref id="ref20">
                        <label>20</label>
                        <mixed-citation publication-type="journal">D. Reddy, B. Mounika, S. Sindhu, T. Reddy, N. Reddy, G. Sri, K. Swaraja, M. Kollati, and P. Kora, “Predictive machine learning model for early detection and analysis of diabetes,” Materials Today: Proceedings, 10 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref21">
                        <label>21</label>
                        <mixed-citation publication-type="journal">A. Mujumdar and V. Vaidehi, “Diabetes Prediction using Machine Learning Algorithms,” Procedia Computer Science, vol. 165, pp. 292–299, 2019. [Online]. Available: https://doi.org/10.1016/j.procs.2020.01.047</mixed-citation>
                    </ref>
                                    <ref id="ref22">
                        <label>22</label>
                        <mixed-citation publication-type="journal">M. Maniruzzaman, M. J. Rahman, B. Ahammed, and M. M. Abedin, “Classification and prediction of diabetes disease using machine learning paradigm,” Health Information Science and Systems, vol. 8, no. 1, Jan. 2020.</mixed-citation>
                    </ref>
                                    <ref id="ref23">
                        <label>23</label>
                        <mixed-citation publication-type="journal">D. Deng and N. Kasabov, “On-line pattern analysis by evolving self-organizing maps,” Neurocomputing, vol. 51, pp. 87–103, Apr 2003.</mixed-citation>
                    </ref>
                                    <ref id="ref24">
                        <label>24</label>
                        <mixed-citation publication-type="journal">M. Farahmandian, Y. Lotfi, and I. Maleki, “Data Mining Algorithms Application in Diabetes Diseases Diagnosis: A Case Study,” MAGNT Research Report, vol. 3, no. 1, pp. 989–997, 2015.</mixed-citation>
                    </ref>
                                    <ref id="ref25">
                        <label>25</label>
                        <mixed-citation publication-type="journal">M. Khashei, S. Eftekhari, and J. Parvizian, “Diagnosing diabetes type ii using a soft intelligent binary classification model,” Review of Bioinformatics and Biometrics, vol. 1, no. 1, pp. 9–23, 2012.</mixed-citation>
                    </ref>
                                    <ref id="ref26">
                        <label>26</label>
                        <mixed-citation publication-type="journal">N. Nai-Arun and R. Moungmai, “Comparison of classifiers for the risk of diabetes prediction,” Procedia Computer Science, vol. 69, pp. 132–142, 2015.</mixed-citation>
                    </ref>
                                    <ref id="ref27">
                        <label>27</label>
                        <mixed-citation publication-type="journal">H. T. Abbas, L. Alic, M. Erraguntla, J. X. Ji, M. Abdul-Ghani, Q. H. Abbasi, and M. K. Qaraqe, “Predicting long-term type 2 diabetes with support vector machine using oral glucose tolerance test,” PLOS ONE, vol. 14, no. 12, p. e0219636, Dec. 2019.</mixed-citation>
                    </ref>
                            </ref-list>
                    </back>
    </article>
