Research Article

Year 2025, Volume: 7 Issue: 3, 763 - 7, 09.09.2025
https://doi.org/10.37990/medr.1718952

Machine Learning-Driven Metabolomic Biomarker Discovery for PCOS: An Interpretable Approach Using Random Forest and SHAP

Abstract

Aim: This study aimed to predict Polycystic Ovary Syndrome (PCOS) using follicular fluid metabolomic data and the Random Forest algorithm, and to interpret the contributions of the most influential metabolites using SHapley Additive exPlanations (SHAP) analysis.
Material and Method: An untargeted metabolomic dataset of follicular fluid from 35 PCOS patients and 37 age-matched controls was used. The dataset was split into 70% training and 30% testing subsets using stratified sampling. A Random Forest classifier was trained, with hyperparameters tuned using RandomizedSearchCV. Model performance was evaluated using accuracy, sensitivity, specificity, F1 score, balanced accuracy, and Brier score. SHAP analysis was then applied to interpret the model's predictions and identify the metabolites contributing most to them.
Results: The Random Forest model achieved an accuracy of 0.86, sensitivity of 0.82, specificity of 0.91, F1 score of 0.86, balanced accuracy of 0.85, and a Brier score of 0.13. SHAP analysis identified L-Histidine, L-Glutamine, and L-Tyrosine as the three most influential metabolites. Specifically, decreased levels of L-Histidine and L-Tyrosine and elevated levels of L-Glutamine were associated with a higher predicted risk of PCOS.
Conclusion: Our findings indicate that integrating machine learning with explainable AI can predict PCOS from metabolomic profiles. The identified metabolites, which point to alterations in amino acid metabolism, offer insight into the metabolic underpinnings of PCOS and show promise as diagnostic biomarkers that could support more precise and interpretable diagnostic strategies.
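
The six evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so the standard definitions are assumed here. The function name `evaluate`, the 0.5 decision threshold, and the toy labels and probabilities are all hypothetical; they are not the study's code or data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    f1: float
    balanced_accuracy: float
    brier: float


def evaluate(y_true: List[int], y_prob: List[float], threshold: float = 0.5) -> BinaryMetrics:
    """Compute the metrics from true labels (1 = PCOS, 0 = control) and
    predicted probabilities of the positive class.

    The 0.5 threshold is an assumption; the abstract does not state one.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and equally long")

    # Turn probabilities into hard class predictions.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    n = len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (
        2 * precision * sensitivity / (precision + sensitivity)
        if (precision + sensitivity)
        else 0.0
    )
    # Brier score: mean squared error between probability and outcome (lower is better).
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / n

    return BinaryMetrics(
        accuracy=(tp + tn) / n,
        sensitivity=sensitivity,
        specificity=specificity,
        f1=f1,
        balanced_accuracy=(sensitivity + specificity) / 2,
        brier=brier,
    )


if __name__ == "__main__":
    # Made-up labels and probabilities, for illustration only.
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    probs = [0.9, 0.8, 0.6, 0.3, 0.1, 0.2, 0.4, 0.7]
    print(evaluate(labels, probs))
```

Balanced accuracy is the mean of sensitivity and specificity, so it is less affected by class imbalance than plain accuracy. The Brier score is the only one of the six that uses the predicted probabilities directly rather than thresholded labels.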

Ethical Statement

As the research utilized only publicly available open-access data, ethical approval was not required under institutional and national guidelines.

Details

Primary Language: English
Subjects: Obstetrics and Gynaecology
Journal Section: Original Articles
Authors: Şeyma Yaşar (ORCID 0000-0003-1300-3393)

Publication Date: September 9, 2025
Submission Date: June 13, 2025
Acceptance Date: July 22, 2025
Published in Issue: Year 2025, Volume: 7, Issue: 3

Cite

AMA Yaşar Ş. Machine Learning-Driven Metabolomic Biomarker Discovery for PCOS: An Interpretable Approach Using Random Forest and SHAP. Med Records. September 2025;7(3):763-7. doi:10.37990/medr.1718952
