<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20241031//EN"
        "https://jats.nlm.nih.gov/publishing/1.4/JATS-journalpublishing1-4.dtd">
<article  article-type="research-article"        dtd-version="1.4">
            <front>

                <journal-meta>
                                    <journal-id></journal-id>
            <journal-title-group>
                                                                                    <journal-title>Erzincan University Journal of Science and Technology</journal-title>
            </journal-title-group>
                                        <issn pub-type="epub">2149-4584</issn>
                                                                                            <publisher>
                    <publisher-name>Erzincan Binali Yıldırım Üniversitesi</publisher-name>
                </publisher>
                    </journal-meta>
                <article-meta>
                                        <article-id/>
                                                                <article-categories>
                                            <subj-group  xml:lang="en">
                                                            <subject>Information Modelling, Management and Ontologies</subject>
                                                            <subject>Information Systems Development Methodologies and Practice</subject>
                                                            <subject>Information Systems (Other)</subject>
                                                    </subj-group>
                                            <subj-group  xml:lang="tr">
                                                            <subject>Bilgi Modelleme, Yönetim ve Ontolojiler</subject>
                                                            <subject>Bilgi Sistemleri Geliştirme Metodolojileri ve Uygulamaları</subject>
                                                            <subject>Bilgi Sistemleri (Diğer)</subject>
                                                    </subj-group>
                                    </article-categories>
                                                                                                                                                        <title-group>
                                                                                                                        <article-title>Feature Engineering for Parkinson’s Disease Diagnosis: A Hybrid Approach Using Random Forest Feature Selection and Correlation Analysis</article-title>
                                                                                                                                                                                                <trans-title-group xml:lang="tr">
                                    <trans-title>Parkinson Hastalığı Teşhisi için Özellik Mühendisliği: Rastgele Orman Özellik Seçimi ve Korelasyon Analizini Kullanan Hibrit Bir Yaklaşım</trans-title>
                                </trans-title-group>
                                                                                                    </title-group>
            
                                                    <contrib-group content-type="authors">
                                                                        <contrib contrib-type="author">
                                                                    <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-1283-0530</contrib-id>
                                                                <name>
                                    <surname>Birdal</surname>
                                    <given-names>Ramiz Görkem</given-names>
                                </name>
                                                                    <aff>İstanbul Üniversitesi - Cerrahpaşa</aff>
                                                            </contrib>
                                                                                </contrib-group>
                        
                                        <pub-date pub-type="pub" iso-8601-date="20260330">
                    <day>30</day>
                    <month>03</month>
                    <year>2026</year>
                </pub-date>
                                        <volume>19</volume>
                                        <issue>1</issue>
                                        <fpage>331</fpage>
                                        <lpage>356</lpage>
                        
                        <history>
                                    <date date-type="received" iso-8601-date="20250215">
                        <day>15</day>
                        <month>02</month>
                        <year>2025</year>
                    </date>
                                                    <date date-type="accepted" iso-8601-date="20250717">
                        <day>17</day>
                        <month>07</month>
                        <year>2025</year>
                    </date>
                            </history>
                                        <permissions>
                    <copyright-statement>Copyright © 2008, Erzincan Üniversitesi Fen Bilimleri Enstitüsü Dergisi</copyright-statement>
                    <copyright-year>2008</copyright-year>
                    <copyright-holder>Erzincan Üniversitesi Fen Bilimleri Enstitüsü Dergisi</copyright-holder>
                </permissions>
            
                                                                                                <abstract><p>Feature selection is a crucial step in optimizing machine learning models, particularly in biomedical applications such as Parkinson’s disease classification based on speech data. This study employs multiple feature importance techniques to identify the most significant predictors and remove redundant variables, thereby improving model interpretability and efficiency. Four methods are applied to assess the contribution of each feature to classification performance: Permutation Importance, Mutual Information (MI), ANOVA F-score, and Random Forest Importance. In addition, a correlation analysis is conducted to detect highly correlated features that may introduce multicollinearity. Many studies in the existing literature on Parkinson’s disease classification overlook the impact of multicollinearity and redundant features, which can affect model stability and interpretability. This study addresses that gap by systematically comparing the four feature selection methods and incorporating correlation analysis to refine the feature set for improved accuracy and efficiency. The refined feature set balances model complexity against predictive power, with the aim of enhancing the reliability of automated Parkinson’s disease diagnosis from speech recordings.</p></abstract>
                                                                                                                                    <trans-abstract xml:lang="tr">
                            <p>Özellik seçimi, makine öğrenimi modellerini optimize etmede kritik bir adımdır ve özellikle konuşma verilerine dayalı Parkinson hastalığı sınıflandırması gibi biyomedikal uygulamalarda büyük önem taşır. Bu çalışma, en önemli öngörücü değişkenleri belirlemek ve gereksiz değişkenleri ortadan kaldırarak modelin yorumlanabilirliğini ve verimliliğini artırmak amacıyla birden fazla özellik önem derecelendirme tekniği kullanmaktadır. Sınıflandırma performansına her özelliğin katkısını değerlendirmek için Dizinleme Önem (Permutation Importance), Karşılıklı Bilgi (Mutual Information - MI), ANOVA F-skoru ve Rastgele Orman Önemi (Random Forest Importance) olmak üzere dört farklı yöntem uygulanmaktadır. Ayrıca, yüksek derecede ilişkili özellikleri tespit ederek çoklu bağlantı (multicollinearity) sorununu önlemek için bir korelasyon analizi gerçekleştirilmiştir. Mevcut literatürde Parkinson hastalığı sınıflandırmasına yönelik birçok çalışma, çoklu bağlantı ve gereksiz özelliklerin model kararlılığı ve yorumlanabilirliği üzerindeki etkisini göz ardı etmektedir. Bu çalışma, dört farklı özellik seçme yöntemini sistematik olarak karşılaştırarak ve korelasyon analizini entegre ederek bu boşluğu gidermeyi amaçlamaktadır. Özellik kümesini titizlikle rafine eden bu yaklaşım, model karmaşıklığı ile tahmin gücü arasında bir denge sağlayarak konuşma kayıtlarından otomatik Parkinson hastalığı teşhisinin güvenilirliğini artırmaktadır.</p></trans-abstract>
                                                            
            
                                                            <kwd-group>
                                                    <kwd>Parkinson’s Disease (PD)</kwd>
                                                    <kwd>feature engineering</kwd>
                                                    <kwd>ensemble learning</kwd>
                                                    <kwd>random forest</kwd>
                                                    <kwd>ANOVA</kwd>
                                                    <kwd>Mutual Information</kwd>
                                            </kwd-group>
                                                        
                                                                            <kwd-group xml:lang="tr">
                                                    <kwd>Parkinson’s Disease (PD)</kwd>
                                                    <kwd>feature engineering</kwd>
                                                    <kwd>ensemble learning</kwd>
                                                    <kwd>random forest</kwd>
                                                    <kwd>ANOVA</kwd>
                                                    <kwd>Mutual Information</kwd>
                                            </kwd-group>
                                                                                                            </article-meta>
    </front>
    <back>
                            <ref-list>
                                    <ref id="ref1">
                        <label>1</label>
                        <mixed-citation publication-type="journal">Shahid, A. H., &amp; Singh, M. P. (2020). A deep learning approach for prediction of Parkinson’s disease progression. Biomedical Engineering Letters, 10, 227-239.</mixed-citation>
                    </ref>
                                    <ref id="ref2">
                        <label>2</label>
                        <mixed-citation publication-type="journal">Bakar, Z. A., Ispawi, D. I., Ibrahim, N. F., &amp; Tahir, N. M. (2012, March). Classification of Parkinson&#039;s disease based on Multilayer Perceptrons (MLPs) Neural Network and ANOVA as a feature extraction. In 2012 IEEE 8th International Colloquium on Signal Processing and Its Applications (pp. 63-67). IEEE.</mixed-citation>
                    </ref>
                                    <ref id="ref3">
                        <label>3</label>
                        <mixed-citation publication-type="journal">Caliskan, A., Badem, H., Basturk, A., &amp; Yuksel, M. (2017). Diagnosis of the parkinson disease by using deep neural network classifier. IU-Journal of Electrical &amp; Electronics Engineering, 17(2), 3311-3318.</mixed-citation>
                    </ref>
                                    <ref id="ref4">
                        <label>4</label>
                        <mixed-citation publication-type="journal">Almeida, J. S., Rebouças Filho, P. P., Carneiro, T., Wei, W., Damaševičius, R., Maskeliūnas, R., &amp; de Albuquerque, V. H. C. (2019). Detecting Parkinson’s disease with sustained phonation and speech signals using machine learning techniques. Pattern Recognition Letters, 125, 55-62.</mixed-citation>
                    </ref>
                                    <ref id="ref5">
                        <label>5</label>
                        <mixed-citation publication-type="journal">Oktay, A. B., &amp; Kocer, A. (2020). Differential diagnosis of Parkinson and essential tremor with convolutional LSTM networks. Biomedical Signal Processing and Control, 56, 101683.</mixed-citation>
                    </ref>
                                    <ref id="ref6">
                        <label>6</label>
                        <mixed-citation publication-type="journal">Khojasteh, P., Viswanathan, R., Aliahmad, B., Ragnav, S., Zham, P., &amp; Kumar, D. K. (2018, October). Parkinson&#039;s disease diagnosis based on multivariate deep features of speech signal. In 2018 IEEE life sciences conference (LSC) (pp. 187-190). IEEE.</mixed-citation>
                    </ref>
                                    <ref id="ref7">
                        <label>7</label>
                        <mixed-citation publication-type="journal">Leung, K. H., Salmanpour, M. R., Saberi, A., Klyuzhin, I. S., Sossi, V., Jha, A. K., ... &amp; Rahmim, A. (2018, November). Using deep-learning to predict outcome of patients with Parkinson’s disease. In 2018 IEEE Nuclear Science Symposium and Medical Imaging Conference Proceedings (NSS/MIC) (pp. 1-4). IEEE.</mixed-citation>
                    </ref>
                                    <ref id="ref8">
                        <label>8</label>
                        <mixed-citation publication-type="journal">Xiao, B., He, N., Wang, Q., Cheng, Z., Jiao, Y., Haacke, E. M., ... &amp; Shi, F. (2019). Quantitative susceptibility mapping based hybrid feature extraction for diagnosis of Parkinson&#039;s disease. NeuroImage: Clinical, 24, 102070.</mixed-citation>
                    </ref>
                                    <ref id="ref9">
                        <label>9</label>
                        <mixed-citation publication-type="journal">Zhang, Y. N. (2017). Can a smartphone diagnose parkinson disease? a deep neural network method and telediagnosis system implementation. Parkinson’s Disease, 2017(1), 6209703.</mixed-citation>
                    </ref>
                                    <ref id="ref10">
                        <label>10</label>
                        <mixed-citation publication-type="journal">Tsanas, A.; Little, M.; McSharry, P.; Ramig, L. Accurate telemonitoring of Parkinson’s disease progression by non-invasive speech tests. Nat. Preced. 2009.</mixed-citation>
                    </ref>
                                    <ref id="ref11">
                        <label>11</label>
                        <mixed-citation publication-type="journal">Tsanas, A.; Little, M.A.; McSharry, P.E.; Ramig, L.O. Nonlinear speech analysis algorithms mapped to a standard metric achieve clinically useful quantification of average Parkinson’s disease symptom severity. J. R. Soc. Interface 2011, 8, 842–855.</mixed-citation>
                    </ref>
                                    <ref id="ref12">
                        <label>12</label>
                        <mixed-citation publication-type="journal">Appakaya, S.B.; Sankar, R. Classification of Parkinson’s disease Using Pitch Synchronous Speech Analysis. In Proceedings of the 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Honolulu, HI, USA, 18–21 July 2018; pp. 1420–1423.</mixed-citation>
                    </ref>
                                    <ref id="ref13">
                        <label>13</label>
                        <mixed-citation publication-type="journal">Appakaya, S.B.; Pratihar, R.; Sankar, R. Parkinson’s Disease Classification Framework Using Vocal Dynamics in Connected Speech. Algorithms 2023, 16, 509.</mixed-citation>
                    </ref>
                                    <ref id="ref14">
                        <label>14</label>
                        <mixed-citation publication-type="journal">Quan, C.; Ren, K.; Luo, Z. A deep learning based method for Parkinson’s disease detection using dynamic features of speech. IEEE Access 2021, 9, 10239–10252.</mixed-citation>
                    </ref>
                                    <ref id="ref15">
                        <label>15</label>
                        <mixed-citation publication-type="journal">Gunduz, H. Deep learning-based Parkinson’s disease classification using vocal feature sets. IEEE Access 2019, 7, 115540–115551.</mixed-citation>
                    </ref>
                                    <ref id="ref16">
                        <label>16</label>
                        <mixed-citation publication-type="journal">Chen, H.-L.; Huang, C.-C.; Yu, X.-G.; Xu, X.; Sun, X.; Wang, G.; Wang, S.-J. An efficient diagnosis system for detection of Parkinson’s disease using fuzzy k-nearest neighbor approach. Expert Syst. Appl. 2013, 40, 263–271.</mixed-citation>
                    </ref>
                                    <ref id="ref17">
                        <label>17</label>
                        <mixed-citation publication-type="journal">Hassin-Baer, S.; Cohen, O.S.; Israeli-Korn, S.; Yahalom, G.; Benizri, S.; Sand, D.; Issachar, G.; Geva, A.B.; Shani-Hershkovich, R.; Peremen, Z. Identification of an early-stage Parkinson’s disease neuromarker using event-related potentials, brain network analytics and machine-learning. PLoS ONE 2022, 17, e0261947.</mixed-citation>
                    </ref>
                                    <ref id="ref18">
                        <label>18</label>
                        <mixed-citation publication-type="journal">Vieira, S.; Pinaya, W.H.; Mechelli, A. Using deep learning to investigate the neuroimaging correlates of psychiatric and neurological disorders: Methods and applications. Neurosci. Biobehav. Rev. 2017, 74, 58–75.</mixed-citation>
                    </ref>
                                    <ref id="ref19">
                        <label>19</label>
                        <mixed-citation publication-type="journal">Güçlü, U.; Van Gerven, M.A. Modeling the dynamics of human brain activity with recurrent neural networks. Front. Comput. Neurosci. 2017, 11, 7.</mixed-citation>
                    </ref>
                                    <ref id="ref20">
                        <label>20</label>
                        <mixed-citation publication-type="journal">Nishimoto, S.; Vu, A.T.; Naselaris, T.; Benjamini, Y.; Yu, B.; Gallant, J.L. Reconstructing visual experiences from brain activity evoked by natural movies. Curr. Biol. 2011, 21, 1641–1646.</mixed-citation>
                    </ref>
                                    <ref id="ref21">
                        <label>21</label>
                        <mixed-citation publication-type="journal">Riaz, A.; Asad, M.; Al-Arif, S.M.R.; Alonso, E.; Dima, D.; Corr, P.; Slabaugh, G. Fcnet: A convolutional neural network for calculating functional connectivity from functional mri. In Connectomics in NeuroImaging: Proceedings of the First International Workshop, CNI 2017, Held in Conjunction with MICCAI 2017, Quebec City, QC, Canada, 14 September 2017; Proceedings; Springer: Cham, Switzerland, 2017; pp. 70–78.</mixed-citation>
                    </ref>
                                    <ref id="ref22">
                        <label>22</label>
                        <mixed-citation publication-type="journal">Rehman, R.Z.U.; Del Din, S.; Guan, Y.; Yarnall, A.J.; Shi, J.Q.; Rochester, L. Selecting clinically relevant gait characteristics for classification of early Parkinson’s disease: A comprehensive machine learning approach. Sci. Rep. 2019, 9, 17269.</mixed-citation>
                    </ref>
                                    <ref id="ref23">
                        <label>23</label>
                        <mixed-citation publication-type="journal">Birdal, R., &amp; Sertbaş, A. (2023). 3-D Gait Identification Utilizing Latent Canonical Covariates Consisting of Gait Features. Computers, Materials and Continua, 76(3).</mixed-citation>
                    </ref>
                                    <ref id="ref24">
                        <label>24</label>
                        <mixed-citation publication-type="journal">Zheng, Y.; Weng, Y.; Yang, X.; Cai, G.; Cai, G.; Song, Y. SVM-based gait analysis and classification for patients with Parkinson’s disease. In Proceedings of the 2021 15th International Symposium on Medical Information and Communication Technology (ISMICT), Xiamen, China, 14–16 April 2021; pp. 53–58.</mixed-citation>
                    </ref>
                                    <ref id="ref25">
                        <label>25</label>
                        <mixed-citation publication-type="journal">Perumal, S.V.; Sankar, R. Gait monitoring system for patients with Parkinson’s disease using wearable sensors. In Proceedings of the 2016 IEEE Healthcare Innovation Point-of-Care Technologies Conference (HI-POCT), Cancun, Mexico, 9–11 November 2016; pp. 21–24.</mixed-citation>
                    </ref>
                                    <ref id="ref26">
                        <label>26</label>
                        <mixed-citation publication-type="journal">Joshi, D.; Khajuria, A.; Joshi, P. An automatic non-invasive method for Parkinson’s disease classification. Comput. Methods Programs Biomed. 2017, 145, 135–145.</mixed-citation>
                    </ref>
                                    <ref id="ref27">
                        <label>27</label>
                        <mixed-citation publication-type="journal">Lee, S.; Hussein, R.; McKeown, M.J. A Deep Convolutional-Recurrent Neural Network Architecture for Parkinson’s Disease EEG Classification. In Proceedings of the 2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP), Ottawa, ONT, Canada, 11–14 November 2019.</mixed-citation>
                    </ref>
                                    <ref id="ref28">
                        <label>28</label>
                        <mixed-citation publication-type="journal">Tsanas, A.; Little, M.A.; McSharry, P.E.; Ramig, L.O. Accurate telemonitoring of Parkinson’s disease progression by non-invasive speech tests. Nat. Preced. 2009, 57, 884–893.</mixed-citation>
                    </ref>
                                    <ref id="ref29">
                        <label>29</label>
                        <mixed-citation publication-type="journal">Karabayir, I.; Goldman, S.M.; Pappu, S.; Akbilgic, O. Gradient boosting for Parkinson’s disease diagnosis from voice recordings. BMC Med. Inform. Decis. Mak. 2020, 20, 228.</mixed-citation>
                    </ref>
                                    <ref id="ref30">
                        <label>30</label>
                        <mixed-citation publication-type="journal">Zhang, Y.N. Can a Smartphone Diagnose Parkinson Disease? A Deep Neural Network Method and Telediagnosis System Implementation. Park. Dis. 2017, 2017, 6209703.</mixed-citation>
                    </ref>
                                    <ref id="ref31">
                        <label>31</label>
                        <mixed-citation publication-type="journal">Pereira, C.R.; Pereira, D.R.; Papa, J.P.; Rosa, G.H.; Yang, X.-S. Convolutional Neural Networks Applied for Parkinson’s Disease Identification. Lect. Notes Comput. Sci. 2016, 9605, 377–390.</mixed-citation>
                    </ref>
                                    <ref id="ref32">
                        <label>32</label>
                        <mixed-citation publication-type="journal">Sancar, Y. (2024). Enhanced Classification of Skin Lesions Using Fine-Tuned MobileNet and DenseNet121 Models with Ensemble Learning. Erzincan University Journal of Science and Technology, 17(3), 870-883.</mixed-citation>
                    </ref>
                                    <ref id="ref33">
                        <label>33</label>
                        <mixed-citation publication-type="journal">Aksakallı, I., Kaçdıoğlu, S., &amp; Hanay, Y. S. (2021). Kidney x-ray images classification using machine learning and deep learning methods. Balkan Journal of Electrical and Computer Engineering, 9(2), 144-151.</mixed-citation>
                    </ref>
                                    <ref id="ref34">
                        <label>34</label>
                        <mixed-citation publication-type="journal">Isenkul, M.E.; Sakar, B.E.; Kursun, O. Improved spiral test using digitized graphics tablet for monitoring Parkinson&#039;s disease. The 2nd International Conference on e-Health and Telemedicine (ICEHTM-2014), pp. 171-175, 2014.</mixed-citation>
                    </ref>
                                    <ref id="ref35">
                        <label>35</label>
                        <mixed-citation publication-type="journal">Das, R. (2010). A comparison of multiple classification methods for diagnosis of Parkinson disease. Expert Systems with Applications, 37(2), 1568–1572.</mixed-citation>
                    </ref>
                            </ref-list>
                    </back>
    </article>
