Research Article

Vine copula graphical models in the construction of biological networks

Year 2021, Volume: 50 Issue: 4, 1172 - 1184, 06.08.2021
https://doi.org/10.15672/hujms.728352

Abstract

The copula Gaussian graphical model (CGGM) is one of the major mathematical models for high-dimensional biological networks; it provides a graphical representation that is especially suited to sparse networks. The model builds on the regression formulation of the Gaussian graphical model (GGM), whose precision matrix describes the conditional dependence between the variables and is used to estimate the coefficients of the linear regression model. Bayesian inference for the model parameters is used to overcome the dimensional limitation of the GGM under sparse networks and small sample sizes. Applications to benchmark data sets show, however, that although the CGGM is successful in certain systems, it may not fit non-normal multivariate observations well. In this study, we propose vine copulas to relax the strict normality assumption of the CGGM and to describe networks with a variety of alternative copula families besides the Gaussian copula. Accordingly, we select the best-fitting bivariate copula distribution for every pair of genes and compute the estimated adjacency matrix, whose entries denote the presence of an edge between the corresponding genes. We assess the performance of the proposed approach on three network data sets via distinct accuracy measures, comparing the outputs with the results of the CGGM.
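
The idea of choosing a bivariate copula per gene pair and reading an adjacency matrix off the result can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the article: the candidate set (independence, Gaussian, Clayton), parameter estimation by inversion of Kendall's tau, selection by AIC, the rule that an edge is drawn whenever the independence copula is not selected, and the function names `pseudo_obs`, `select_pair_copula` and `estimate_adjacency`. The sketch also treats only unconditional pairs, whereas a full vine additionally models conditional pairs in its higher trees.

```python
import numpy as np
from scipy.stats import kendalltau, norm, rankdata


def pseudo_obs(x):
    """Rank-transform each column into (0, 1) so only the dependence remains."""
    n = x.shape[0]
    return rankdata(x, axis=0) / (n + 1)


def gaussian_loglik(u, v, rho):
    """Log-likelihood of the bivariate Gaussian copula with correlation rho."""
    a, b = norm.ppf(u), norm.ppf(v)
    quad = (rho**2 * (a**2 + b**2) - 2 * rho * a * b) / (2 * (1 - rho**2))
    return np.sum(-0.5 * np.log(1 - rho**2) - quad)


def clayton_loglik(u, v, theta):
    """Log-likelihood of the bivariate Clayton copula, theta > 0."""
    s = u**(-theta) + v**(-theta) - 1
    return np.sum(np.log1p(theta)
                  - (1 + theta) * (np.log(u) + np.log(v))
                  - (2 + 1 / theta) * np.log(s))


def select_pair_copula(u, v):
    """Return the name of the candidate family with the smallest AIC (assumed rule)."""
    tau, _ = kendalltau(u, v)
    tau = float(np.clip(tau, -0.99, 0.99))
    # Independence copula: log-likelihood 0 and no parameters, hence AIC 0.
    aic = {"independence": 0.0}
    rho = np.sin(np.pi * tau / 2)            # tau inversion for the Gaussian copula
    aic["gaussian"] = 2 - 2 * gaussian_loglik(u, v, rho)
    if tau > 0.01:                           # Clayton only covers positive dependence
        theta = 2 * tau / (1 - tau)          # tau inversion for the Clayton copula
        aic["clayton"] = 2 - 2 * clayton_loglik(u, v, theta)
    return min(aic, key=aic.get)


def estimate_adjacency(x):
    """Binary adjacency matrix: edge (i, j) if a non-independence copula is selected."""
    u = pseudo_obs(x)
    p = u.shape[1]
    adj = np.zeros((p, p), dtype=int)
    families = {}
    for i in range(p):
        for j in range(i + 1, p):
            fam = select_pair_copula(u[:, i], u[:, j])
            families[(i, j)] = fam
            adj[i, j] = adj[j, i] = int(fam != "independence")
    return adj, families
```

Calling `estimate_adjacency` on an n-by-p data matrix (rows as samples, columns as genes) returns the symmetric 0/1 adjacency matrix together with the family chosen for each pair. The candidate set is kept deliberately small here; adding further families only requires another log-likelihood function and its tau-inversion formula.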

Supporting Institution

European Union 7th Framework Project

Project Number

CA15109

Thanks

The second author thanks the COSTNET Project (No: CA15109) for their support.

There are 46 citations in total.

Details

Primary Language English
Subjects Statistics
Journal Section Statistics
Authors

Hajar Farnoudkia 0000-0001-9201-663X

Vilda Purutcuoglu 0000-0002-3913-9005

Project Number CA15109
Publication Date August 6, 2021
Published in Issue Year 2021

Cite

APA Farnoudkia, H., & Purutcuoglu, V. (2021). Vine copula graphical models in the construction of biological networks. Hacettepe Journal of Mathematics and Statistics, 50(4), 1172-1184. https://doi.org/10.15672/hujms.728352
AMA Farnoudkia H, Purutcuoglu V. Vine copula graphical models in the construction of biological networks. Hacettepe Journal of Mathematics and Statistics. August 2021;50(4):1172-1184. doi:10.15672/hujms.728352
Chicago Farnoudkia, Hajar, and Vilda Purutcuoglu. “Vine Copula Graphical Models in the Construction of Biological Networks”. Hacettepe Journal of Mathematics and Statistics 50, no. 4 (August 2021): 1172-84. https://doi.org/10.15672/hujms.728352.
EndNote Farnoudkia H, Purutcuoglu V (August 1, 2021) Vine copula graphical models in the construction of biological networks. Hacettepe Journal of Mathematics and Statistics 50 4 1172–1184.
IEEE H. Farnoudkia and V. Purutcuoglu, “Vine copula graphical models in the construction of biological networks”, Hacettepe Journal of Mathematics and Statistics, vol. 50, no. 4, pp. 1172–1184, 2021, doi: 10.15672/hujms.728352.
ISNAD Farnoudkia, Hajar - Purutcuoglu, Vilda. “Vine Copula Graphical Models in the Construction of Biological Networks”. Hacettepe Journal of Mathematics and Statistics 50/4 (August 2021), 1172-1184. https://doi.org/10.15672/hujms.728352.
JAMA Farnoudkia H, Purutcuoglu V. Vine copula graphical models in the construction of biological networks. Hacettepe Journal of Mathematics and Statistics. 2021;50:1172–1184.
MLA Farnoudkia, Hajar and Vilda Purutcuoglu. “Vine Copula Graphical Models in the Construction of Biological Networks”. Hacettepe Journal of Mathematics and Statistics, vol. 50, no. 4, 2021, pp. 1172-84, doi:10.15672/hujms.728352.
Vancouver Farnoudkia H, Purutcuoglu V. Vine copula graphical models in the construction of biological networks. Hacettepe Journal of Mathematics and Statistics. 2021;50(4):1172-84.