Machine learning-enabled classification of global human development using INFORM risk indicators

Merve Doğruel

Research Article

Year 2026, Volume: 6 Issue: 1, 1 - 15

Abstract

This study aims to identify the most effective machine learning model for classifying countries' Human Development Index (HDI) levels using indicators from the INFORM Risk Index. The motivation for this work lies in the growing need for data-driven methods to analyze and predict human development outcomes, particularly in the context of complex and high-dimensional socio-economic and disaster-related risk data. Traditional models often fail to capture the non-linear relationships that influence human development. To address this gap, six supervised machine learning algorithms were systematically evaluated: k-Nearest Neighbors (KNN), Linear and Nonlinear Support Vector Machines (SVM), Classification and Regression Trees (CART), Bagging, and Random Forest (RF). Performance was measured using weighted F1-scores on both training and testing datasets. The results reveal that KNN, Linear SVM, and CART have limited predictive power, while the Nonlinear SVM suffers from overfitting. In contrast, the ensemble-based models, Bagging and RF, demonstrate superior and balanced performance, with F1-scores around 0.80 on both datasets. These methods also support interpretability through feature importance analysis. Socio-economic, institutional, and infrastructure-related indicators were identified as the most influential variables in predicting HDI levels. The findings highlight the strength of ensemble learning in modeling complex development-related risks and provide a robust framework for integrating machine learning into global human development analysis. This study offers valuable insights for policymakers and researchers aiming to improve forecasting, resilience planning, and development strategies.
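The weighted F1-score named in the abstract can be illustrated with a short sketch. This is not the author's code: it is a minimal, standard-library-only illustration of the usual definition (per-class F1 averaged with weights proportional to class support), and the HDI tier labels and predictions below are hypothetical values chosen only for the example.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Weighted F1: per-class F1 averaged with weights equal to class support."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # True positives: samples of this class that were predicted as this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += (n_true / total) * f1
    return score


# Hypothetical HDI tiers for six countries (illustrative labels, not study data).
y_true = ["low", "low", "medium", "high", "high", "high"]
y_pred = ["low", "medium", "medium", "high", "high", "low"]

print(round(weighted_f1(y_true, y_pred), 3))  # prints 0.678
```

Computing this score separately on the training and testing sets, as the abstract describes, makes overfitting visible as a large gap between the two values.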

Keywords

Machine learning, Bagging, Random forest, Human development index, INFORM risk


Machine learning-based classification of global human development using INFORM risk indicators

Abstract

Drawing on advanced machine learning techniques, this study investigates the most suitable predictive model for classifying countries' Human Development Index (HDI) levels using INFORM risk indicators. Six classification algorithms were systematically evaluated: k-Nearest Neighbors (k-NN), Linear and Nonlinear Support Vector Machines (SVM), Classification and Regression Trees (CART), Bagging, and Random Forest (RF). Model performance was assessed using the weighted F1-score computed on both the training and test datasets, and meaningful differences between the methods were found. KNN, Linear SVM, and CART showed relatively limited predictive accuracy, while the Nonlinear SVM showed signs of overfitting: it performed well on the training set but its performance dropped on the test set.
In contrast, ensemble methods such as Bagging and Random Forest consistently performed best, with balanced and high F1-scores of approximately 0.80 on both the training and test datasets, which demonstrates their robustness and strong generalization ability.
The findings clearly support the effectiveness of ensemble learning techniques for handling high-dimensional and complex socio-economic data in development research.
In addition, the feature importance analysis shows that socio-economic, institutional, and infrastructure-related variables have a decisive effect on HDI prediction. By combining a comprehensive comparative evaluation with interpretability analyses, this study provides empirical evidence for the applicability of ensemble machine learning approaches to socio-economic risk assessment and human development forecasting. The resulting insights offer a valuable data-driven framework for policymakers and researchers working to monitor and improve development processes, and they underline the transformative potential of machine learning in socio-economic research.
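The bootstrap aggregating (Bagging) idea behind the ensemble results above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's implementation: the base learner is a hand-written one-split decision stump rather than a full CART tree, the two features stand in for hypothetical risk scores, and the toy rows and labels are invented for the example. The helper names `fit_stump`, `fit_bagging`, and `predict_bagging` are likewise hypothetical.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-split tree: pick the (feature, threshold) with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= thr]
            right = [lab for row, lab in zip(X, y) if row[j] > thr]
            left_lab = Counter(left).most_common(1)[0][0]
            # If no row falls to the right of the threshold, reuse the left label.
            right_lab = Counter(right).most_common(1)[0][0] if right else left_lab
            errors = sum(l != left_lab for l in left) + sum(r != right_lab for r in right)
            if best is None or errors < best[0]:
                best = (errors, j, thr, left_lab, right_lab)
    return best[1:]


def predict_stump(stump, row):
    j, thr, left_lab, right_lab = stump
    return left_lab if row[j] <= thr else right_lab


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, row):
    """Aggregate the stumps' predictions by majority vote."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: [risk score A, risk score B] -> HDI tier (not study data).
X_train = [[1.0, 3.0], [1.5, 5.0], [2.0, 4.0], [2.5, 6.0],
           [6.0, 5.0], [6.5, 3.0], [7.0, 6.0], [7.5, 4.0]]
y_train = ["high"] * 4 + ["low"] * 4

ensemble = fit_bagging(X_train, y_train)
print(predict_bagging(ensemble, [0.5, 4.5]), predict_bagging(ensemble, [8.0, 4.5]))
```

Random Forest follows the same resample-and-vote pattern but additionally restricts each split to a random subset of features, which further decorrelates the individual trees.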

Keywords

Machine learning, Bootstrap aggregating, Random forest, Human development index, INFORM Risk Index

There are 34 citations in total.

Details

Primary Language	English
Subjects	Supervised Learning, Machine Learning Algorithms, Data Mining and Knowledge Discovery, Data Analysis
Journal Section	Research Articles
Authors	Merve Doğruel 0000-0003-2299-7182
Publication Date	November 5, 2025
Submission Date	June 21, 2025
Acceptance Date	August 17, 2025
Published in Issue	Year 2026 Volume: 6 Issue: 1

Cite

APA	Doğruel, M. (2026). Machine learning-enabled classification of global human development using INFORM risk indicators. Journal of Innovative Engineering and Natural Science, 6(1), 1–15.

Journal of Innovative Engineering and Natural Science by İdris Karagöz is licensed under CC BY 4.0