Research Article

Cross-Validation and Normalization in EEG Sleep Staging: Impacts on Generalization, Calibration, and Clinical Validity

Year 2026, Volume: 14 Issue: 1, 72 - 85, 21.01.2026
https://doi.org/10.29130/dubited.1773372

Abstract

Accurate and reliable sleep staging from electroencephalography (EEG) is essential for both research and clinical applications. However, evaluation practices differ widely, and subtle methodological choices can strongly influence reported results. In this study, we examined how cross-validation strategies and normalization protocols affect the reliability and generalizability of EEG-based sleep staging models. We systematically compared common approaches on two benchmark datasets, SleepEDF and ISRUC. Record-wise evaluation, which is often used in the literature, produced overly optimistic results, whereas subject-wise and leave-one-subject-out (LOSO) evaluation provided more realistic estimates. On SleepEDF and ISRUC, the record-wise median Macro-F1 was 0.70 and 0.71, respectively; under subject-wise evaluation it was 9 and 7 percentage points lower. Normalization strategy also mattered: although fold-aware normalization performed better in standard tests, subject-aware normalization combined with test-time adaptation produced the most consistent and clinically relevant outcomes, improving calibration (lower expected calibration error, ECE) and supporting safer decisions. In particular, it reduced errors and improved both classification accuracy and probability reliability; for example, on ISRUC, subject-aware normalization improved Macro-F1 by 0.08, reduced ECE by 0.02, and increased kappa by 0.10 compared with fold-aware normalization. These results provide protocol-level, model-independent evidence that evaluation and normalization decisions can matter as much as model selection, particularly when datasets change. We recommend subject-wise or LOSO evaluation for internal assessment, and subject-aware normalization with test-time adaptation for deployment, to obtain better-calibrated predictions and safer clinical decisions.
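
The difference between the evaluation and normalization protocols named in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function names, the toy data, and the reading of "fold-aware" as scaling with training-fold statistics and "subject-aware" as label-free per-subject z-scoring are all assumptions made for illustration.

```python
import random
import statistics
from collections import defaultdict


def subject_wise_folds(subject_ids, n_folds, seed=0):
    """Yield (train_idx, test_idx) pairs in which no subject is on both sides.

    Setting n_folds to the number of subjects gives leave-one-subject-out.
    """
    subjects = sorted(set(subject_ids))
    random.Random(seed).shuffle(subjects)
    for k in range(n_folds):
        held_out = set(subjects[k::n_folds])  # every n_folds-th subject
        train = [i for i, s in enumerate(subject_ids) if s not in held_out]
        test = [i for i, s in enumerate(subject_ids) if s in held_out]
        yield train, test


def fold_aware_zscore(values, train_idx):
    """Assumed reading of fold-aware: scale everything with training-fold stats."""
    train_vals = [values[i] for i in train_idx]
    mu = statistics.fmean(train_vals)
    sd = statistics.pstdev(train_vals) or 1.0  # guard against zero spread
    return [(v - mu) / sd for v in values]


def subject_aware_zscore(values, subject_ids):
    """Assumed reading of subject-aware: scale each subject with its own stats.

    Uses no labels, so it can also be applied to an unseen test subject.
    """
    by_subject = defaultdict(list)
    for v, s in zip(values, subject_ids):
        by_subject[s].append(v)
    stats = {
        s: (statistics.fmean(vals), statistics.pstdev(vals) or 1.0)
        for s, vals in by_subject.items()
    }
    return [(v - stats[s][0]) / stats[s][1] for v, s in zip(values, subject_ids)]


# Hypothetical toy data: one feature value per epoch, two epochs per subject.
subject_ids = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"]
values = [1.0, 3.0, 10.0, 14.0, 100.0, 102.0, -5.0, 5.0]

folds = list(subject_wise_folds(subject_ids, n_folds=2))
fold_scaled = fold_aware_zscore(values, folds[0][0])
normalized = subject_aware_zscore(values, subject_ids)
```

A record-wise split, by contrast, would shuffle the eight epochs directly, so epochs from the same subject could land in both training and test sets; that leakage is the likely source of the optimistic estimates the abstract reports.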

Ethical Statement

This study does not involve human or animal participants. All procedures followed scientific and ethical principles, and all referenced studies are appropriately cited.

Supporting Institution

This research received no external funding.

Thanks

The author does not wish to acknowledge any individual or institution.

References

  • Albuquerque, I., Monteiro, J., Rosanne, O., & Falk, T. H. (2022). Estimating distribution shifts for predicting cross-subject generalization in electroencephalography-based mental workload assessment. Frontiers in Artificial Intelligence, 5, Article 992732. https://doi.org/10.3389/frai.2022.992732
  • Alsolai, H., Qureshi, S., Iqbal, S. M. Z., Vanichayobon, S., Henesey, L. E., Lindley, C., & Karrila, S. (2022). A systematic review of literature on automated sleep scoring. IEEE Access, 10(11), 79419–79443. https://doi.org/10.1109/ACCESS.2022.3194145
  • Berry, R. B., Quan, S. F., Abreu, A. R., Bibbs, M. L., DelRosso, L., Harding, S. M., Mao, M.-M., Plante, D. T., Pressman, M. R., Troester, M. M., & Vaughn, B. V. (2020). The AASM manual for the scoring of sleep and associated events: Rules, terminology and technical specifications (Version 2.6). American Academy of Sleep Medicine.
  • Buriro, A. B., Ahmed, B., Baloch, G., Ahmed, J., Shoorangiz, R., Weddell, S. J., & Jones, R. D. (2021). Classification of alcoholic EEG signals using wavelet scattering transform-based features. Computers in Biology and Medicine, 139, Article 104969. https://doi.org/10.1016/j.compbiomed.2021.104969
  • Cesari, M., Portscher, A., Stefani, A., Angerbauer, R., Ibrahim, A., Brandauer, E., Feuerstein, S., Egger, K., Högl, B., & Rodriguez-Sanchez, A. (2024). Machine learning predicts phenoconversion from polysomnography in isolated REM sleep behavior disorder. Brain Sciences, 14(9), Article 871. https://doi.org/10.3390/brainsci14090871
  • Chambon, S., Galtier, M. N., Arnal, P. J., Wainrib, G., & Gramfort, A. (2018). A deep learning architecture for temporal sleep stage classification using multivariate and multimodal time series. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 26(4), 758–769. https://doi.org/10.1109/TNSRE.2018.2813138
  • Chato, L., & Regentova, E. (2023). Survey of transfer learning approaches in the machine learning of digital health sensing data. Journal of Personalized Medicine, 13(12), Article 1703. https://doi.org/10.3390/jpm13121703
  • Cheng, X., Huang, K., Zou, Y., & Ma, S. (2024). SleepEGAN: A GAN-enhanced ensemble deep learning model for imbalanced classification of sleep stages. Biomedical Signal Processing and Control, 92, Article 106020. https://doi.org/10.1016/j.bspc.2024.106020
  • Collins, G. S., Moons, K. G. M., Dhiman, P., Riley, R. D., Beam, A. L., Van Calster, B., Ghassemi, M., Liu, X., Reitsma, J. B., van Smeden, M., Boulesteix, A.-L., Camaradou, J. C., Celi, L. A., Denaxas, S., Denniston, A. K., Glocker, B., Golub, R. M., Harvey, H., Heinze, G., … Logullo, P. (2024). TRIPOD+AI statement: Updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ, 385, Article e078378. https://doi.org/10.1136/bmj-2023-078378
  • Eldele, E., Chen, Z., Liu, C., Wu, M., Kwoh, C.-K., Li, X., & Guan, C. (2021). An attention-based deep learning approach for sleep stage classification with single-channel EEG. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 29, 809–818. https://doi.org/10.1109/TNSRE.2021.3076234
  • Eldele, E., Ragab, M., Chen, Z., Wu, M., Kwoh, C.-K., Li, X., & Guan, C. (2023). ADAST: Attentive cross-domain EEG-based sleep staging framework with iterative self-training. IEEE Transactions on Emerging Topics in Computational Intelligence, 7(1), 210–221. https://doi.org/10.1109/TETCI.2022.3189695
  • Fiorillo, L., Pedroncelli, D., Agostini, V., Favaro, P., & Di Faraci, F. (2023). Multi-scored sleep databases: How to exploit the multiple labels in automated sleep scoring. Sleep, 46(5), Article zsad028. https://doi.org/10.1093/sleep/zsad028
  • Fultz, N. E., Bonmassar, G., Setsompop, K., Stickgold, R. A., Rosen, B. R., Polimeni, J. R., & Lewis, L. D. (2019). Coupled electrophysiological, hemodynamic, and cerebrospinal fluid oscillations in human sleep. Science, 366(6465), 628–631. https://doi.org/10.1126/science.aax5440
  • Goldberger, A. L., Amaral, L. A. N., Glass, L., Hausdorff, J. M., Ivanov, P. Ch., Mark, R. G., Mietus, J. E., Moody, G. B., Peng, C.-K., & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet. Circulation, 101(23), e215–e220. https://doi.org/10.1161/01.CIR.101.23.e215
  • He, Z., Du, L., Wang, P., Xia, P., Liu, Z., Song, Y., Chen, X., & Fang, Z. (2022). Single-channel EEG sleep staging based on data augmentation and cross-subject discrepancy alleviation. Computers in Biology and Medicine, 149, Article 106044. https://doi.org/10.1016/j.compbiomed.2022.106044
  • Huang, G., Zhao, Z., Zhang, S., Hu, Z., Fan, J., Fu, M., Chen, J., Xiao, Y., Wang, J., & Dan, G. (2023). Discrepancy between inter- and intra-subject variability in EEG-based motor imagery brain–computer interface: Evidence from multiple perspectives. Frontiers in Neuroscience, 17, Article 1122661. https://doi.org/10.3389/fnins.2023.1122661
  • Irwin, M. R. (2015). Why sleep is important for health: A psychoneuroimmunology perspective. Annual Review of Psychology, 66(1), 143–172. https://doi.org/10.1146/annurev-psych-010213-115205
  • Jirakittayakorn, N., Wongsawat, Y., & Mitrirattanakul, S. (2024). ZleepAnlystNet: A novel deep learning model for automatic sleep stage scoring based on single-channel raw EEG data using separating training. Scientific Reports, 14(1), Article 9859. https://doi.org/10.1038/s41598-024-60796-y
  • Khalighi, S., Sousa, T., Santos, J. M., & Nunes, U. (2016). ISRUC-Sleep: A comprehensive public dataset for sleep researchers. Computer Methods and Programs in Biomedicine, 124, 180–192. https://doi.org/10.1016/j.cmpb.2015.10.013
  • Kryger, M. H., Roth, T., & Dement, W. C. (2010). Principles and practice of sleep medicine (5th ed.). Elsevier Saunders.
  • Lee, H., Choi, Y. R., Lee, H. K., Jeong, J., Hong, J., Shin, H. W., & Kim, H. S. (2025). Explainable vision transformer for automatic visual sleep staging on multimodal PSG signals. NPJ Digital Medicine, 8(1), Article 55. https://doi.org/10.1038/s41746-024-01378-0
  • Lee, Y. J., Lee, J. Y., Cho, J. H., & Choi, J. H. (2022). Interrater reliability of sleep stage scoring: A meta-analysis. Journal of Clinical Sleep Medicine, 18(1), 193–202. https://doi.org/10.5664/jcsm.9538
  • Liu, Y., Ghafoor, A. A., Hajipour, M., & Ayas, N. (2023). Role of precision medicine in obstructive sleep apnoea. BMJ Medicine, 2(1), Article e000218. https://doi.org/10.1136/bmjmed-2022-000218
  • Perslev, M., Darkner, S., Kempfner, L., Nikolic, M., Jennum, P. J., & Igel, C. (2021). U-Sleep: Resilient high-frequency sleep staging. NPJ Digital Medicine, 4, Article 72. https://doi.org/10.1038/s41746-021-00440-5
  • Phan, H., Andreotti, F., Cooray, N., Chén, O. Y., & De Vos, M. (2019). SeqSleepNet: End-to-end hierarchical recurrent neural network for sequence-to-sequence automatic sleep staging. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 27(3), 400–410. https://doi.org/10.1109/TNSRE.2019.2896659
  • Phan, H., & Mikkelsen, K. (2022). Automatic sleep staging of EEG signals: Recent development, challenges, and future directions. Physiological Measurement, 43(4), Article 04TR01. https://doi.org/10.1088/1361-6579/ac6049
  • Rasch, B., & Born, J. (2013). About sleep’s role in memory. Physiological Reviews, 93(2), 681–766. https://doi.org/10.1152/physrev.00032.2012
  • Saha, S., & Baumert, M. (2020). Intra- and inter-subject variability in EEG-based sensorimotor brain–computer interface: A review. Frontiers in Computational Neuroscience, 13, Article 87. https://doi.org/10.3389/fncom.2019.00087
  • Samaee, M., Yazdi, M., & Massicotte, D. (2025). Multi-modal signal integration for enhanced sleep stage classification: Leveraging EOG and 2-channel EEG data with advanced feature extraction. Artificial Intelligence in Medicine, 166, Article 103152. https://doi.org/10.1016/j.artmed.2025.103152
  • Sarafraz, G., Behnamnia, A., Hosseinzadeh, M., Balapour, A., Meghrazi, A., & Rabiee, H. R. (2024). Domain adaptation and generalization of functional medical data: A systematic survey of brain data. ACM Computing Surveys, 56(10), 1–39. https://doi.org/10.1145/3654664
  • Satapathy, S. K., & Loganathan, D. (2023). Automated classification of multi-class sleep stages classification using polysomnography signals: A nine-layer 1D-convolution neural network approach. Multimedia Tools and Applications, 82, 8049–8091. https://doi.org/10.1007/s11042-022-13195-2
  • Sentner, T., Wang, X., de Groot, E. R., van Schaijk, L., Tataranno, M. L., Vijlbrief, D. C., Benders, M. J. N. L., Bartels, R., & Dudink, J. (2022). The Sleep Well Baby project: An automated real-time sleep–wake state prediction algorithm in preterm infants. Sleep, 45(10), Article zsac143. https://doi.org/10.1093/sleep/zsac143
  • Supratak, A., Dong, H., Wu, C., & Guo, Y. (2017). DeepSleepNet: A model for automatic sleep stage scoring based on raw single-channel EEG. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 25(11), 1998–2008. https://doi.org/10.1109/TNSRE.2017.2721116
  • Toma, T. I., & Choi, S. (2023). An end-to-end multi-channel convolutional Bi-LSTM network for automatic sleep stage detection. Sensors, 23(10), Article 4950. https://doi.org/10.3390/s23104950
  • Uçar, M. K., & Düzayak, S. (2020). Papüloskuamöz hastalıkların belirlenmesi için yapay zeka yöntemleriyle kural tabanlı teşhis algoritmalarının geliştirilmesi. Duzce University Journal of Science and Technology, 8(3), 1903–1922.
  • van der Plas, D., Verbraecken, J., Willemen, M., Meert, W., & Davis, J. (2021). Evaluation of automated hypnogram analysis on multi-scored polysomnographies. Frontiers in Digital Health, 3, Article 707589. https://doi.org/10.3389/fdgth.2021.707589
  • van Sweden, B., Kemp, B., Kamphuisen, H. A., & van der Velde, E. A. (1990). Alternative electrode placement in (automatic) sleep scoring (Fpz–Cz/Pz–Oz versus C4–A1). Sleep, 13(3), 279–283. https://doi.org/10.1093/sleep/13.3.279
  • van Twist, E., Hiemstra, F. W., Cramer, A. B. G., Verbruggen, S. C. A. T., Tax, D. M. J., Joosten, K., Louter, M., Straver, D. C. G., de Hoog, M., Kuiper, J. W., & de Jonge, R. C. J. (2024). An electroencephalography-based sleep index and supervised machine learning as a suitable tool for automated sleep classification in children. Journal of Clinical Sleep Medicine, 20(3), 389–397. https://doi.org/10.5664/jcsm.10880
  • von Ellenrieder, N., Peter-Derex, L., Gotman, J., & Frauscher, B. (2022). SleepSEEG: Automatic sleep scoring using intracranial EEG recordings only. Journal of Neural Engineering, 19(2), Article 026057. https://doi.org/10.1088/1741-2552/ac6829
  • Walker, M. P., & Stickgold, R. (2006). Sleep, memory, and plasticity. Annual Review of Psychology, 57(1), 139–166. https://doi.org/10.1146/annurev.psych.56.091103.070307
  • Wolpert, E. A. (1969). A manual of standardized terminology, techniques and scoring system for sleep stages of human subjects. Archives of General Psychiatry, 20(2), 246–247. https://doi.org/10.1001/archpsyc.1969.01740140118016
  • Zhang, W., Li, C., Peng, H., Qiao, H., & Chen, X. (2024). CTCNet: A CNN transformer capsule network for sleep stage classification. Measurement, 226, Article 114157. https://doi.org/10.1016/j.measurement.2024.114157
  • Zhao, M., Yue, S., Katabi, D., Jaakkola, T. S., & Bianchi, M. T. (2017). Learning sleep stages from radio signals: A conditional adversarial architecture. In Proceedings of the 34th International Conference on Machine Learning (pp. 4100–4109). Proceedings of Machine Learning Research, Vol. 70. https://proceedings.mlr.press/v70/zhao17d.html

Cross-Validation and Normalization in EEG-Based Sleep Staging: Effects on Generalization, Calibration, and Clinical Validity

Abstract

Electroencephalography (EEG)-based sleep staging is critically important for both research and clinical applications. However, evaluation approaches vary widely in the literature, and methodological choices can strongly influence reported results. This study examined how cross-validation strategies and normalization protocols affect the reliability and generalizability of EEG-based sleep staging models. Common approaches were systematically compared using two benchmark datasets (SleepEDF and ISRUC). The findings show that record-wise evaluation, which is frequently used in the literature, leads to overly optimistic results, whereas subject-wise and leave-one-subject-out approaches provide more realistic estimates. Normalization strategies likewise play a critical role: although fold-aware normalization performed better in standard tests, subject-aware normalization combined with test-time adaptation yielded the most consistent and most clinically meaningful results. In particular, test-time adaptation reduced errors and improved both classification accuracy and probability reliability across datasets. In conclusion, the choice of evaluation protocol requires as much care as the choice of algorithm. By adopting more rigorous and clinically oriented evaluation strategies, EEG-based sleep staging systems can become more reliable and better suited to real-world healthcare applications.
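
Probability reliability of the kind discussed above is commonly quantified with the expected calibration error (ECE), the metric named in the English abstract. The sketch below shows one common equal-width-bin variant; the binning scheme, the default of 10 bins, and the toy inputs are assumptions, since the page does not state how the author computed ECE.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: size-weighted mean of |accuracy - confidence|.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct:     booleans, True where the chosen class matched the label.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes to last bin
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - avg_conf)
    return ece


# Hypothetical toy predictions: the model is 90% confident but right 3 times in 4.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, True, True, False]
)
# 50% confident and right half of the time: confidence matches accuracy.
calibrated = expected_calibration_error([0.5, 0.5], [True, False])
```

In the first toy case the gap between confidence (0.9) and accuracy (0.75) is what ECE measures; in the second the two agree, so the error is zero.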


Details

Primary Language: English
Subjects: Bioinformatics
Journal Section: Research Article
Authors: Ahmet Sertol Köksal (ORCID: 0000-0002-3452-828X)
Submission Date: August 28, 2025
Acceptance Date: October 23, 2025
Publication Date: January 21, 2026
Published in Issue: Year 2026, Volume 14, Issue 1

Cite

APA Köksal, A. S. (2026). Cross-Validation and Normalization in EEG Sleep Staging: Impacts on Generalization, Calibration, and Clinical Validity. Duzce University Journal of Science and Technology, 14(1), 72-85. https://doi.org/10.29130/dubited.1773372
AMA Köksal AS. Cross-Validation and Normalization in EEG Sleep Staging: Impacts on Generalization, Calibration, and Clinical Validity. DUBİTED. January 2026;14(1):72-85. doi:10.29130/dubited.1773372
Chicago Köksal, Ahmet Sertol. “Cross-Validation and Normalization in EEG Sleep Staging: Impacts on Generalization, Calibration, and Clinical Validity”. Duzce University Journal of Science and Technology 14, no. 1 (January 2026): 72-85. https://doi.org/10.29130/dubited.1773372.
EndNote Köksal AS (January 1, 2026) Cross-Validation and Normalization in EEG Sleep Staging: Impacts on Generalization, Calibration, and Clinical Validity. Duzce University Journal of Science and Technology 14 1 72–85.
IEEE A. S. Köksal, “Cross-Validation and Normalization in EEG Sleep Staging: Impacts on Generalization, Calibration, and Clinical Validity”, DUBİTED, vol. 14, no. 1, pp. 72–85, 2026, doi: 10.29130/dubited.1773372.
ISNAD Köksal, Ahmet Sertol. “Cross-Validation and Normalization in EEG Sleep Staging: Impacts on Generalization, Calibration, and Clinical Validity”. Duzce University Journal of Science and Technology 14/1 (January 2026), 72-85. https://doi.org/10.29130/dubited.1773372.
JAMA Köksal AS. Cross-Validation and Normalization in EEG Sleep Staging: Impacts on Generalization, Calibration, and Clinical Validity. DUBİTED. 2026;14:72–85.
MLA Köksal, Ahmet Sertol. “Cross-Validation and Normalization in EEG Sleep Staging: Impacts on Generalization, Calibration, and Clinical Validity”. Duzce University Journal of Science and Technology, vol. 14, no. 1, 2026, pp. 72-85, doi:10.29130/dubited.1773372.
Vancouver Köksal AS. Cross-Validation and Normalization in EEG Sleep Staging: Impacts on Generalization, Calibration, and Clinical Validity. DUBİTED. 2026;14(1):72-85.