Research Article

Machine Learning-Driven Metabolomic Biomarker Discovery for PCOS: An Interpretable Approach Using Random Forest and SHAP

Volume: 7 Issue: 3, 9 September 2025

Abstract

Aim: This study aimed to predict Polycystic Ovary Syndrome (PCOS) using follicular fluid metabolomic data and the Random Forest algorithm, and to interpret the contributions of the most influential metabolites using SHapley Additive exPlanations (SHAP) analysis.

Material and Method: An untargeted metabolomic dataset of follicular fluid from 35 PCOS patients and 37 age-matched controls was utilized. The dataset was partitioned into 70% training and 30% testing subsets using stratified sampling. A Random Forest algorithm was employed, with hyperparameter optimization performed using RandomizedSearchCV. Model performance was evaluated using accuracy, sensitivity, specificity, F1 score, balanced accuracy, and Brier score. SHAP analysis was then applied to interpret the model's predictions and identify key contributing metabolites.

Results: The Random Forest model achieved robust classification performance, with an accuracy of 0.86, sensitivity of 0.82, specificity of 0.91, F1 score of 0.86, balanced accuracy of 0.85, and a Brier score of 0.13. SHAP analysis identified L-Histidine, L-Glutamine, and L-Tyrosine as the top three most influential metabolites. Specifically, decreased levels of L-Histidine and L-Tyrosine, and elevated levels of L-Glutamine, were associated with an increased risk of PCOS.

Conclusion: Our findings demonstrate the potential of integrating machine learning with explainable AI to accurately predict PCOS based on metabolomic profiles. The identified metabolites, particularly alterations in amino acid metabolism, offer novel insights into the metabolic underpinnings of PCOS and highlight their promise as diagnostic biomarkers, paving the way for more precise and interpretable diagnostic strategies.
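The evaluation metrics named in the abstract can all be derived from true labels and predicted probabilities. The following is a minimal sketch in plain Python, not the author's code: the function names `safe_ratio` and `classification_metrics`, the 0.5 decision threshold, and the toy labels and probabilities are assumptions made for illustration and are unrelated to the study's data.

```python
import math


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Compute binary classification metrics (positive class = 1).

    y_true: list of 0/1 labels; y_prob: predicted probability of class 1.
    The threshold of 0.5 is an assumption, not taken from the article.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))

    # Confusion-matrix counts.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = safe_ratio(tp, tp + fn)  # true positive rate
    specificity = safe_ratio(tn, tn + fp)  # true negative rate

    # Brier score: mean squared error between probability and outcome.
    squared_errors = sum((p - t) ** 2 for t, p in zip(y_true, y_prob))

    return {
        "accuracy": safe_ratio(tp + tn, len(y_true)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": safe_ratio(2 * tp, 2 * tp + fp + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "brier": safe_ratio(squared_errors, len(y_true)),
    }


# Toy data (hypothetical, for illustration only).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.7, 0.1]

for name, value in classification_metrics(toy_labels, toy_probs).items():
    print(f"{name}: {value:.2f}")
```

Balanced accuracy is the mean of sensitivity and specificity, so it matches plain accuracy when the two classes are the same size. A lower Brier score indicates better-calibrated probabilities, with 0 being a perfect score.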

Keywords

Ethical Statement

As the research utilized only publicly available open-access data, ethical approval was not required under institutional and national guidelines.

Details

Primary Language

English

Subjects

Obstetrics and Gynecology

Section

Research Article

Publication Date

9 September 2025

Submission Date

13 June 2025

Acceptance Date

22 July 2025

Published in Issue

Year 2025 Volume: 7 Issue: 3

Cite

AMA
1. Yaşar Ş. Machine Learning-Driven Metabolomic Biomarker Discovery for PCOS: An Interpretable Approach Using Random Forest and SHAP. Med Records. 2025;7(3):763-7. doi:10.37990/medr.1718952