Conditional density estimation using population Monte Carlo based approximate Bayesian computation
Year 2023, Volume: 52, Issue: 4, 1120 - 1134, 15.08.2023
Faiza Afzaal, Dr. Maryam Ilyas
Abstract
Most statistical methods require evaluating the likelihood to draw inference. In some situations, however, the likelihood is difficult to evaluate analytically or computationally. Several likelihood-free methods are available that eliminate the need to compute the likelihood function. Approximate Bayesian Computation (ABC) is a framework that implements likelihood-free inference by replacing likelihood evaluation with simulations from a forward model. The goal of ABC methods is to approximate the posterior distribution. However, posterior approximation via ABC remains considerably expensive in high dimensions, because ABC requires many simulations, which becomes computationally infeasible for complex models. Here, a technique is proposed that combines a somewhat more efficient form of ABC (Population Monte Carlo, PMC) with a Conditional Density Estimation (CDE) approach. The proposed framework, referred to as PMC-CDE, provides an estimate of the posterior distribution. A simulation study provides empirical evidence of the efficiency of PMC-CDE in terms of integrated squared error loss. Furthermore, applications to real-life datasets illustrate the proposed method.
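To make the simulation-based idea in the abstract concrete, the following is a minimal sketch of a generic ABC-PMC sampler on a toy problem. It is not the authors' PMC-CDE implementation and includes no conditional density estimation step. The model (Gaussian data with unknown mean), the Gaussian prior, the sample-mean summary statistic, the tolerance schedule, and the function name `abc_pmc` are all assumptions chosen purely for illustration.

```python
import numpy as np


def abc_pmc(y_obs, n_particles=200, tolerances=(1.0, 0.5, 0.25), seed=0):
    """Toy ABC-PMC for the mean of a N(theta, 1) model (illustrative only).

    Assumed setup: prior theta ~ N(0, 10^2), summary statistic = sample mean,
    distance = absolute difference between simulated and observed means.
    """
    rng = np.random.default_rng(seed)
    n = len(y_obs)
    s_obs = np.mean(y_obs)
    prior_sd = 10.0

    def prior_pdf(theta):
        return np.exp(-0.5 * (theta / prior_sd) ** 2) / (prior_sd * np.sqrt(2 * np.pi))

    def simulate_summary(theta):
        # Forward model: simulate a dataset and reduce it to its mean.
        return rng.normal(theta, 1.0, size=n).mean()

    # Iteration 1: plain rejection ABC with proposals drawn from the prior.
    particles = np.empty(n_particles)
    weights = np.full(n_particles, 1.0 / n_particles)
    for i in range(n_particles):
        while True:
            theta = rng.normal(0.0, prior_sd)
            if abs(simulate_summary(theta) - s_obs) <= tolerances[0]:
                particles[i] = theta
                break

    # Later iterations: resample, perturb, accept under a tighter tolerance,
    # then reweight by prior density over the mixture proposal density.
    for eps in tolerances[1:]:
        mean = np.sum(weights * particles)
        # Kernel scale: twice the weighted variance of the previous population.
        tau = np.sqrt(2.0 * np.sum(weights * (particles - mean) ** 2))
        new_particles = np.empty(n_particles)
        new_weights = np.empty(n_particles)
        for i in range(n_particles):
            while True:
                base = rng.choice(particles, p=weights)
                theta = rng.normal(base, tau)
                if abs(simulate_summary(theta) - s_obs) <= eps:
                    break
            kernel = np.exp(-0.5 * ((theta - particles) / tau) ** 2) / (
                tau * np.sqrt(2 * np.pi)
            )
            new_particles[i] = theta
            new_weights[i] = prior_pdf(theta) / np.sum(weights * kernel)
        particles = new_particles
        weights = new_weights / new_weights.sum()

    return particles, weights
```

Calling `abc_pmc(y_obs)` on a one-dimensional array of observations returns weighted particles that approximate the posterior of the mean. The likelihood is never evaluated; only forward simulations and a distance on summary statistics are used, and each iteration proposes from the previous population rather than from the prior.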
Supporting Institution
University of the Punjab, Pakistan
Thanks
Many thanks to the team of Hacettepe University for providing a platform to publish our research articles free of cost.