TY - JOUR
T1 - Robust gene co-expression networks via partial robust M regression
AU - Alın, Aylin
AU - Ölmez, Ayça
PY - 2025
DA - August
Y2 - 2025
DO - 10.15672/hujms.1679033
JF - Hacettepe Journal of Mathematics and Statistics
PB - Hacettepe University
WT - DergiPark
SN - 2651-477X
SP - 1518
EP - 1532
VL - 54
IS - 4
LA - en
AB - Gene expression data provide valuable information on the regulation and interactions of thousands of genes. However, constructing robust gene co-expression networks in the presence of outliers remains an open challenge. We propose a partial robust M regression-based method for building gene co-expression networks, which downweights extreme observations instead of discarding them. This preserves critical biological information while safeguarding the overall network structure from distortion. Through comprehensive simulations on the syntren300 dataset, including various outlier distributions (e.g. N(0, 5), N(1, 5), N(100, 10) and t(2)) and contamination levels up to 30%, the partial robust M regression-based approach outperforms widely used methods (weighted gene co-expression network analysis, bi-weighted midcorrelation and partial least squares regression-based connectivity) in terms of precision, F1 and Matthews correlation coefficient. Real-data analysis of mouse liver gene expression further validates the stability and biological relevance of partial robust M regression-based gene co-expression networks, as it accurately identifies functionally enriched genes even under data contamination. These findings underscore the potential of partial robust M regression-based network construction to enhance reliability and uncover novel insights in high-dimensional genomic studies, offering a robust alternative to traditional correlation-based or partial least squares regression-based methods.
KW - Bi-weight mid-correlation
KW - gene co-expression network analysis
KW - outliers
KW - partial robust M regression
KW - robust
KW - weighted gene co-expression network analysis
CR - [1] M. Ackermann and K. Strimmer. A general modular framework for gene set enrichment analysis. BMC Bioinform. 10, 120, 2009.
CR - [2] I.J. Broce, L. Stein, E. Jones and T. Kwan. C9orf72 gene networks in the human brain correlate with cortical thickness in C9-FTD and implicate vulnerable cell types. Front. Neurosci. 18, 1258996, 2024.
CR - [3] J.B. Brown. Classifiers and their metrics quantified. Mol. Inform. 37, 1700127, 2018.
CR - [4] Q. Chen, R. Liu, X. Wang and H. Chen. Identification and analysis of spinal cord injury subtypes using weighted gene co-expression network analysis. Ann. Transl. Med. 9, 466, 2021.
CR - [5] D. Chicco and G. Jurman. The Matthews correlation coefficient (MCC) should replace the ROC AUC as the standard metric for assessing binary classification. BioData Min. 16, 4, 2023.
CR - [6] R Core Team et al., R: A language and environment for statistical computing, R Found. Stat. Comput., Vienna, 2013.
CR - [7] D.J. Cummins and C.W. Andrews. Iteratively reweighted partial least squares: a performance analysis by Monte Carlo simulation. J. Chemom. 9, 489-507, 1995.
CR - [8] S. Datta. Exploring relationships in gene expressions: a partial least squares approach. Gene Expr. 9, 249, 2018.
CR - [9] Y. Di, D. Chen, W. Yu and L. Yan. Bladder cancer stage-associated hub genes revealed by WGCNA co-expression network analysis. Hereditas 156, 111, 2019.
CR - [10] A.S. Feltrin, P.R. Castro, L.F. Silva and R. Costa. Assessment of complementarity of WGCNA and NERI results for identification of modules associated to schizophrenia spectrum disorders. PLoS One 14, e0210431, 2019.
CR - [11] E. Galán-Vásquez and E. Perez-Rueda. Identification of modules with similar gene regulation and metabolic functions based on co-expression data. Front. Mol. Biosci. 6, 139, 2019.
CR - [12] A. Ghazalpour, B. Bennett, S. Plaisier and J. Chen. Integrating genetic and network analysis to characterize genes related to mouse weight. PLoS Genet. 2, e130, 2006.
CR - [13] D.J. Guo, R. Lin and Y. Li. Identification of breast cancer prognostic modules via differential module selection based on weighted gene co-expression network analysis. BioSystems 199, 104317, 2021.
CR - [14] M. Hubert and S. Verboven. Robust methods for partial least squares regression. J. Chemom. 17, 537-549, 2003.
CR - [15] S. Langfelder and S. Horvath. Fast R functions for robust correlations and hierarchical clustering. J. Stat. Softw. 46, 117, 2012.
CR - [16] S. Langfelder and S. Horvath. WGCNA: an R package for weighted correlation network analysis. BMC Bioinform. 9, 113, 2008.
CR - [17] B. Liebmann, P. Filzmoser and K. Varmuza. Robust and classical PLS regression compared. J. Chemom. 24, 111-120, 2010.
CR - [18] W. Li, Y. Zhang, L. Zhang and Q. Wang. Weighted gene co-expression network analysis to identify key modules and hub genes associated with atrial fibrillation. Int. J. Mol. Med. 45, 401-416, 2020.
CR - [19] S. Pan, Y. Lin, X. Liu and L. Wang. A comprehensive weighted gene co-expression network analysis uncovers potential targets in diabetic kidney disease. J. Transl. Intern. Med. 10, 359-368, 2023.
CR - [20] Y. Peng, X. Chen, Q. Zhang and D. Liu. Identification of immune-related biomarkers in adrenocortical carcinoma: immune-related biomarkers for ACC. Int. Immunopharmacol. 88, 106930, 2020.
CR - [21] V. Pihur, S. Datta and S. Datta. Reconstruction of genetic association networks from microarray data: a partial least squares approach. Bioinformatics 24, 561-568, 2008.
CR - [22] E. Polat, The effects of different weight functions on partial robust m-regression performance: a simulation study. Comms. in Stats. - Sim. and Comput. 49:4, 1089-1104, 2020.
CR - [23] S. Serneels, K. Suykens, B. De Moor and S. Van Huffel. Partial robust M-regression. Chemom. Intell. Lab. Syst. 79, 55-64, 2005.
CR - [24] T. Van Den Bulcke, K. Van Leemput, B. Naudts and P. Van Remortel. SynTReN: a generator of synthetic gene expression data for design and analysis of structure learning algorithms. BMC Bioinform. 7, 112, 2006.
CR - [25] F. Wang, H. Miao, S. Zhang, X. Hu, Y. Chu, W. Yang and J. Chen. Weighted gene co-expression network analysis reveals hub genes regulating response to salt stress in peanut. BMC Plant Biol. 24, 425, 2024.
CR - [26] B. Wang, X. Li, Y. Zheng and R. Zhang. Research on a weighted gene co-expression network analysis method for mining pathogenic genes in thyroid cancer. PLoS One 17, e0272403, 2022.
CR - [27] S. Wold, M. Sjöström and L. Eriksson. PLS-regression: a basic tool of chemometrics. Chemom. Intell. Lab. Syst. 58, 109-130, 2001.
CR - [28] C.H. Zheng, Y. Li, L. Wang and W. Zhang. Gene differential coexpression analysis based on biweight correlation and maximum clique. BMC Bioinform. 15, 17, 2014.
CR - [29] B. Zhang and S. Horvath. A general framework for weighted gene co-expression network analysis. Stat. Appl. Genet. Mol. Biol. 4, 129, 2005.
UR - https://doi.org/10.15672/hujms.1679033
L1 - https://dergipark.org.tr/en/download/article-file/4785780
ER -