EEG-BASED SEIZURE CLASSIFICATION: THE EFFECT OF PCA AND A COMPARISON OF LOGISTIC REGRESSION, SVM, AND XGBOOST

Burcu Kocarık Gacar

doi:10.30794/pausbed.1775480

Research Article

Year 2025, Issue 71 (EYS'25 Special Issue), 21-40, 29.12.2025

https://izlik.org/JA49YB89MC

Ethical Statement

I declare that all stages of this study comply with research and publication ethics, and that I have adhered to ethical rules and the principles of scientific citation.

Abstract

Epilepsy is a disorder that adversely affects individuals’ social relationships, societal integration, and overall quality of life. In this study, Logistic Regression, Support Vector Machines (SVM), and Extreme Gradient Boosting (XGBoost) models were evaluated on the multi-feature (16-channel), multiclass (healthy, generalized seizure, focal seizure, seizure activity), and balanced Bangalore EEG Epilepsy Dataset (BEED) using two processing pipelines, one with and one without Principal Component Analysis (PCA). The marginal contribution of PCA was quantified in terms of the accuracy–runtime trade-off, and the models were compared for consistency and efficiency in cases where the classes were harder to separate. The findings indicate that, on datasets with highly correlated features, XGBoost outperformed SVM and Logistic Regression in accuracy, F1, and ROC-AUC. This suggests that PCA does not always improve model performance in epileptic seizure detection, and that data pre-processing strategies should therefore be chosen carefully according to the model type. The results offer a methodological comparison to the cognitive signal processing literature and guidance for practitioners on model selection. A main contribution of the study is a benchmark based on standard metrics, ten-fold cross-validation, and test sets held out at different split ratios.

Keywords

Epileptic Seizure Classification, BEED, Principal Component Analysis, Machine Learning
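
The evaluation protocol named in the abstract (ten-fold cross-validation on a balanced four-class dataset, scored with accuracy and F1) can be illustrated with a minimal standard-library sketch. This is not the study's code: the function names, the round-robin fold assignment, and the choice of macro-averaged F1 are all assumptions made for illustration, and model fitting and PCA are left out entirely.

```python
from collections import defaultdict


def stratified_folds(labels, k=10):
    """Split sample indices into k folds with equal class proportions.

    Assumption: simple round-robin assignment within each class, without
    shuffling; the study may have used a different splitting scheme.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def cv_splits(labels, k=10):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = stratified_folds(labels, k)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 = 2TP / (2TP + FP + FN).

    Assumption: macro averaging; the abstract does not say which
    F1 averaging the study reports.
    """
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical label names standing in for the four classes.
    classes = ["healthy", "generalized", "focal", "seizure_activity"]
    labels = classes * 20  # balanced toy label vector, 20 samples per class
    for train, test in cv_splits(labels, k=10):
        print(len(train), len(test))
```

In practice each `(train, test)` pair would be used to fit a model once with and once without PCA, so that both pipelines are scored on identical folds. Libraries such as scikit-learn ship equivalent splitters and metrics.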

Kaynakça

Abdi, H., ve Williams, L. J. (2010). Principal Component Analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2(4), 433–459. https://doi.org/10.1002/wics.101.
Alpar, R. (2013). Çok Değişkenli İstatistiksel Yöntemler, Detay Yayıncılık: Ankara, Türkiye.
Berrich, Y. ve Guennoun, Z. (2025). EEG-Based Epilepsy Detection Using CNN-SVM and DNN-SVM with Feature Dimensionality Reduction By PCA. Sci Rep 15, 14313 (2025). https://doi.org/10.1038/s41598-025-95831-z.
Bishop, C. M. (2006). Pattern Recognition and Machine Learning, Springer.
Bradley, A. P. (1997). The Use of the Area Under the ROC Curve in the Evaluation of Machine Learning Algorithms. Pattern Recognition, 30(7), 1145–1159. https://doi.org/10.1016/S0031-3203(96)00142-2
Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324.
Carvajal-Dossman, J.P., Guio, L., García-Orjuela, D. vd. (2025). Retraining and Evaluation of Machine Learning and Deep Learning Models for Seizure Classification from EEG Data. Sci Rep 15, 15345 (2025). https://doi.org/10.1038/s41598-025-98389-y.
Chan, J.-L., Leow, S., Bea, K., Cheng, W., Phoong, S., Hong, Z.-W. ve Chen, Y. L. (2022). Mitigating the Multicollinearity Problem and its Machine Learning Approach: A Review, Mathematics, 10(8), 1283. https://doi.org/10.3390/math10081283.
Chen, T., ve Guestrin, C. (2016). XGBoost: A Scalable Tree Boosting System, In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785–794). https://dl.acm.org/doi/pdf/10.1145/2939672.2939785.
Chen, T., ve He, T. (2020). XGBoost: Extreme Gradient Boosting. R Package Version 1.0.0.2.
Cortes, C., ve Vapnik, V. (1995). Support-Vector Networks. Machine Learning, 20(3), 273–297. https://doi.org/10.1007/BF00994018.
Diler, S. ve Demir, Y. (2024). Çoklu Doğrusal Bağlantı Olması Durumunda Veri Madenciliği Algoritmaları Performanslarının Karşılaştırılması, Nicel Bilimler Dergisi, 6(1), 40-67. doi: 10.51541/nicel.1371834.
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861–874. https://people.inf.elte.hu/kiss/13dwhdm/roc.pdf
Friedman, J., Hastie, T., ve Tibshirani, R. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer. https://tinyurl.com/4cpndh32.
Friedman, J. H. (2001). Greedy Function Approximation: A Gradient Boosting Machine, Annals of Statistics, 29(5), 1189–1232.
Geron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, Concepts, Tools, and Techniques to Build Intelligent Systems, 2nd Edition. O’Reilly Media, Inc., Canada. https://www.oreilly.com/catalog/errata.csp?isbn=9781492032649.
Gupta, A. ve Goel, S. (2023). Deep Learning Empowered Weather Image Classification for Accurate Analysis. IEEE 2nd International Conference on Industrial Electronics: Developments & Applications (ICIDeA), Imphal, India, 2023, pp. 567-572, doi: 10.1109/ICIDeA59866.2023.10295216.
Han, J., Kamber, M., ve Pei, J. (2012). Data Mining: Concepts and Techniques (3rd ed.). Morgan Kaufmann. https://homes.di.unimi.it/ceselli/IM/2012-13/slides/02-KnowYourData.pdf.
Hanley, J. A., ve McNeil, B. J. (1982). The Meaning and Use of the Area Under a Receiver Operating Characteristic (ROC) Curve. Radiology, 143(1), 29–36. https://doi.org/10.1148/radiology.143.1.7063747.
Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2019). Multivariate Data Analysis, (8th ed.). Cengage. Hastie, T., Tibshirani, R., ve Friedman, J. (2009). The Elements of Statistical Learning, (2nd ed.). Springer.
Hosmer, D. W., Lemeshow, S. and Sturdivant, R. X. (2013). Applied Logistic Regression (3rd ed.). John Wiley & Sons, Inc: New Jersey, USA.
Islam, M. S., Thapa, K. and Yang, S.-H. (2022). Epileptic-Net: An Improved Epileptic Seizure Detection System Using Dense Convolutional Block with Attention Network from EEG. Sensors, 22(3), 728. https://doi.org/10.3390/s22030728.
Jolliffe, I. T. (2002). Principal Component Analysis (2nd ed.). Springer Series in Statistics. Springer-Verlag, New York.
Kecman, V. (2005). Support Vector Machines: An Introduction. In L. Wang (Ed.), Support Vector Machines: Theory and Applications (pp. 1–47). Springer, Berlin Heidelberg. https://doi.org/10.1007/b95439.
Kendirkıran, G. and Doğan, S. (2024). Makine Öğrenimi Teknikleri ile Kredi Risk Tahmininde Yeniden Örnekleme Yöntemlerinin Karşılaştırılması, Söke İşletme Fakültesi Dergisi, 1(2), 48–60. https://dergipark.org.tr/tr/download/article-file/4364826.
Kıvrak, O. (2025). Araç Fiyat Tahmininde Makine Öğrenmesi Algoritmalarının Karşılaştırılması ve Performans Analizi, İktisadi İdari ve Siyasal Araştırmalar Dergisi, 10(27), 454–474. https://doi.org/10.25204/iktisad.1494020.
Kong, G., Ma, S., Zhao, W., Wang, H., Fu, Q. and Wang, J. (2024). A Novel Method for Optimizing Epilepsy Detection Features Through Multi-Domain Feature Fusion and Selection, Frontiers in Computational Neuroscience, 18, 1416838. https://doi.org/10.3389/fncom.2024.1416838.
Lewis, N. D. (2017). Machine Learning Made Easy with R: An Intuitive Step by Step Blueprint for Beginners, CreateSpace Independent Publishing Platform: Carolina, USA.
Li, Q., Cao, W. and Zhang, A. (2025). Multi-Stream Feature Fusion of Vision Transformer and CNN for Precise Epileptic Seizure Detection from EEG Signals. Journal of Translational Medicine, 23(1), 871. https://doi.org/10.1186/s12967-025-06862-z. PMID: 40770757; PMCID: PMC12329966.
Mason, C. H. and Perreault, W. D. (1991). Collinearity, Power, and Interpretation of Multiple Regression Analysis, Journal of Marketing Research, 28(3), 268–280. https://doi.org/10.2307/3172863.
Moguerza, J. and Muñoz, A. (2006). Support Vector Machines with Applications, Statistical Science, 21, 322–336. https://doi.org/10.1214/088342306000000493.
Montgomery, D. C., Peck, E. A. and Vining, G. G. (2021). Introduction to Linear Regression Analysis. John Wiley & Sons.
Murphy, K. P. (2012). Machine Learning: A Probabilistic Perspective. MIT Press.
Najmusseher and Banu, P. K. N. (2024). BEED: Bangalore EEG Epilepsy Dataset. [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5K33B. https://archive.ics.uci.edu/dataset.
Najmusseher and Banu, P. K. N. (2025). Feature Engineering for Epileptic Seizure Classification Using SeqBoostNet. International Journal of Computing and Digital Systems, 17(1), 1–15. http://dx.doi.org/10.12785/ijcds/1571020131.
Powers, D. M. W. (2011). Evaluation: From Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation. Journal of Machine Learning Technologies, 2(1), 37–63. https://arxiv.org/abs/2010.16061.
Provost, F. and Fawcett, T. (2013). Data Science for Business. O’Reilly Media. https://www.oreilly.com/library/view/data-science-for/9781449374273/.
Rahman, M. M., Ghasemi, Y., Suley, E., Zhou, Y., Wang, S. and Rogers, J. (2021). Machine Learning Based Computer Aided Diagnosis of Breast Cancer Utilizing Anthropometric and Clinical Features, IRBM, 42(4), 215–226.
Saito, T. and Rehmsmeier, M. (2015). The Precision-Recall Plot is More Informative Than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets. PLOS ONE, 10(3), e0118432. https://doi.org/10.1371/journal.pone.0118432.
Shalev-Shwartz, S. and Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press.
Shlens, J. (2014). A Tutorial on Principal Component Analysis. arXiv preprint arXiv:1404.1100.
Smola, A. J. and Schölkopf, B. (2004). A Tutorial on Support Vector Regression. Statistics and Computing, 14, 199–222. https://doi.org/10.1023/B:STCO.0000035301.49549.88.
Wei, L. and Mooney, C. (2020). Epileptic Seizure Detection in Clinical EEGs Using an XGboost-based Method, IEEE SPMB, v1.0:1-6. https://isip.piconepress.com/conferences/ieee_spmb/2020/papers/l02_03.pdf.
Wu, J., Zhou, T. and Li, T. (2020). Detecting Epileptic Seizures in EEG Signals with Complementary Ensemble Empirical Mode Decomposition and Extreme Gradient Boosting. Entropy, 22(2), 140. https://doi.org/10.3390/e22020140.

There are 44 references in total.

Details

Primary Language	Turkish
Subjects	Econometric and Statistical Methods
Section	Research Article
Authors	Burcu Kocarık Gacar (ORCID: 0000-0001-5944-4456)
Submission Date	September 1, 2025
Acceptance Date	October 16, 2025
Publication Date	December 29, 2025
DOI	https://doi.org/10.30794/pausbed.1775480
IZ	https://izlik.org/JA49YB89MC
Published in Issue	Year 2025, Issue 71 (EYS'25 Special Issue)

Cite This Article

APA	Kocarık Gacar, B. (2025). EEG TABANLI NÖBET SINIFLANDIRMASI: PCA’NIN ETKİSİ VE LOJİSTİK REGRESYON, SVM, XGBOOST KARŞILAŞTIRMASI. Pamukkale Üniversitesi Sosyal Bilimler Enstitüsü Dergisi, Sayı:71 (EYS’25 Özel Sayısı), 21-40. https://doi.org/10.30794/pausbed.1775480

Works published in this journal are licensed under a Creative Commons Attribution 4.0 International License.