Research Article

Machine learning-enabled classification of global human development using INFORM risk indicators

Year 2026, Volume: 6 Issue: 1, 1 - 15

Abstract

This study aims to identify the most effective machine learning model for classifying countries' Human Development Index (HDI) levels using indicators from the INFORM Risk Index. The work is motivated by the growing need for data-driven methods to analyze and predict human development outcomes from complex, high-dimensional socio-economic and disaster-related risk data. Traditional models often fail to capture the non-linear relationships that influence human development. To address this gap, six supervised machine learning algorithms were systematically evaluated: k-Nearest Neighbors (KNN), Linear and Nonlinear Support Vector Machines (SVM), Classification and Regression Trees (CART), Bagging, and Random Forest (RF). Performance was measured using weighted F1-scores on both training and testing datasets. The results show that KNN, Linear SVM, and CART have limited predictive power, while the Nonlinear SVM overfits, scoring well on the training data but losing performance on the test data. In contrast, the ensemble-based models, Bagging and RF, deliver superior and balanced performance, with F1-scores around 0.80 on both datasets. These methods also support interpretability through feature importance analysis, which identified socio-economic, institutional, and infrastructure-related indicators as the most influential variables in predicting HDI levels. The findings highlight the strength of ensemble learning in modeling complex development-related risks and provide a framework for integrating machine learning into global human development analysis. The study offers insights for policymakers and researchers aiming to improve forecasting, resilience planning, and development strategies.
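The weighted F1-score named in the abstract is the mean of the per-class F1-scores, with each class weighted by its share of the true labels. The following is a minimal sketch of that calculation in plain Python. The function name `weighted_f1` and the eight example labels are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1-scores."""
    support = Counter(y_true)  # number of true samples in each class
    total = len(y_true)
    score = 0.0
    for cls, n_cls in support.items():
        # True positives and false positives for this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n_cls - tp  # true members of the class that were missed
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n_cls / total) * f1
    return score


# Hypothetical HDI tiers for eight countries (illustrative labels only).
y_true = ["low", "low", "medium", "medium",
          "high", "high", "very high", "very high"]
y_pred = ["low", "medium", "medium", "medium",
          "high", "high", "very high", "high"]

score = weighted_f1(y_true, y_pred)
print(round(score, 3))  # prints 0.733
```

Here the "low" and "very high" classes each score 2/3 and the two middle classes each score 0.8, so with equal class sizes the weighted mean is about 0.733. Computing the same score on the training and test sets separately is what exposes a gap such as the one reported for the Nonlinear SVM.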

References

  • UNDP (United Nations Development Programme) (1990) Human Development Report 1990: Concept and Measurement of Human Development. Oxford University Press, New York.
  • Casau M, Ferreira Dias M, Leite Mota G (2024) Economics, happiness and climate change: exploring new measures of progress. Environ Dev Sustain. https://doi.org/10.1007/s10668-024-05702-2
  • UNDRR (United Nations Office for Disaster Risk Reduction) (2022) Global Assessment Report on Disaster Risk Reduction 2022: Our World at Risk – Transforming Governance for a Resilient Future. UNDRR, Geneva. https://www.undrr.org/gar2022
  • Raikes J, Smith TF, Baldwin C, Henstra D (2021) Linking disaster risk reduction and human development. Clim Risk Manag 32:100291. https://doi.org/10.1016/j.crm.2021.100291
  • Oran FÇ (2023) Afet risk yönetiminin insani gelişim endeksi çerçevesinde incelenmesi. Anadolu Univ J Econ Adm Sci 24(2):233–257. https://doi.org/10.53443/anadoluibfd.1185246
  • Feldmeyer D, Birkmann J, McMillan JM et al (2021) Global vulnerability hotspots: differences and agreement between international indicator-based assessments. Clim Change 169(12). https://doi.org/10.1007/s10584-021-03203-z
  • Eze E, Siegmund A (2024) Identifying disaster risk factors and hotspots in Africa from spatiotemporal decadal analyses using INFORM data for risk reduction and sustainable development. Sustain Dev 32(4):4020–4041. https://doi.org/10.1002/sd.2886
  • Mochizuki J, Naqvi A (2019) Reflecting disaster risk in development indicators. Sustainability 11(4):996. https://doi.org/10.3390/su11040996
  • Inter-Agency Standing Committee and the European Commission (2024) INFORM Report 2024: 10 years of INFORM. Publications Office of the European Union, Luxembourg. https://data.europa.eu/doi/10.2760/555548
  • Ricardo M (2011) Inequality and the new human development index. Appl Econ Lett 19:1–3. https://doi.org/10.1080/13504851.2011.587762
  • UNDP (United Nations Development Programme) (2024) Human Development Report 2024: Breaking the Gridlock. UNDP, USA.
  • Halder RK, Uddin MN, Uddin MA et al (2024) Enhancing K-nearest neighbor algorithm: a comprehensive review and performance analysis of modifications. J Big Data 11:113. https://doi.org/10.1186/s40537-024-00973-y
  • Srisuradetchai P, Suksrikran K (2024) Random kernel k-nearest neighbors regression. Front Big Data 7. https://doi.org/10.3389/fdata.2024.1402384
  • Ali A, Hamraz M, Khan DM, Deebani W, Khan Z (2023) A random projection k-nearest neighbours ensemble for classification via extended neighbourhood rule. arXiv preprint arXiv:2303.12210
  • Du KL, Jiang B, Lu J, Hua J, Swamy MNS (2024) Exploring kernel machines and support vector machines: principles, techniques, and future directions. Math 12. https://doi.org/10.3390/math12243935
  • Parlak B, Uysal AK (2019) On classification of abstracts obtained from medical journals. J Inf Sci 46(5):648–663. https://doi.org/10.1177/0165551519860982
  • Amaya-Tejera N, Gamarra M, Velez JI, Zurek E (2024) A distance-based kernel for classification via support vector machines. Front Artif Intell 7. https://doi.org/10.3389/frai.2024.1287875
  • Almaspoor MH, Safaei A, Salajegheh A, Minaei-Bidgoli B (2021) Support vector machines in big data classification: a systematic literature review. Preprint, Research Square. https://doi.org/10.21203/rs.3.rs-663359/v1
  • Sun H (2024) pSVM: Soft-margin SVMs with p-norm hinge loss. arXiv preprint arXiv:2408.09908
  • Breiman L, Friedman J, Olshen R, Stone C (1984) Classification and Regression Trees. Wadsworth, Belmont.
  • Doğruel M, Soner Kara S (2023) Determining the happiness class of countries with tree-based algorithms in machine learning. Acta Infologica 7(2):243–252. https://doi.org/10.26650/acin.1251650
  • Ngo G, Beard R, Chandra R (2022) Evolutionary bagging for ensemble learning. Neurocomputing 510:1–14. https://doi.org/10.1016/j.neucom.2022.08.055
  • Malhotra R, Cherukuri M (2024) A systematic review of hyperparameter tuning techniques for software quality prediction models. Intell Data Anal 28. https://doi.org/10.3233/IDA-230
  • Breiman L (1996) Bagging predictors. Mach Learn 24:123–140. https://doi.org/10.1007/BF00058655
  • Fernández A, García S, Galar M, Prati RC, Krawczyk B, Herrera F (2018) Learning from Imbalanced Data Sets. Springer Nature, Switzerland. https://doi.org/10.1007/978-3-319-98074-4
  • Wei B, Wang F et al (2024) Adaptive bagging-based dynamic ensemble selection. Expert Syst Appl 255:124860. https://doi.org/10.1016/j.eswa.2024.124860
  • Breiman L (2001) Random forests. Mach Learn 45:5–32. https://doi.org/10.1023/A:1010933404324
  • Cutler A, Cutler DR, Stevens JR (2012) Random forests. In: Zhang C, Ma Y (eds) Ensemble Machine Learning: Methods and Applications. Springer, New York, pp 157–175. https://doi.org/10.1007/978-1-4419-9326-7_5
  • Geurts P, Ernst D, Wehenkel L (2006) Extremely randomized trees. Mach Learn 63:3–42. https://doi.org/10.1007/s10994-006-6226-1
  • Birkmann J et al (2022) Understanding human vulnerability to climate-induced disasters. Sci Total Environ 803:150065. https://doi.org/10.1016/j.scitotenv.2021.150065
  • Jamshed A et al (2023) A bibliometric and systematic review of the Methods for the Improvement of Vulnerability Assessment in Europe (MOVE) framework: a guide for the development of further multi-hazard holistic frameworks. Jàmbá J Disaster Risk Stud 15:1486. https://doi.org/10.4102/jamba.v15i1.1486
  • Reimann L et al (2024) An empirical social vulnerability map for flood risk assessment at global scale (GlobE-SoVI). Earth’s Future 12:e2023EF003895. https://doi.org/10.1029/2023EF003895
  • Verschuur J et al (2024) Quantifying climate risks to infrastructure systems: a comparative review of developments across infrastructure sectors. PLOS Clim 3(4):e0000331. https://doi.org/10.1371/journal.pclm.0000331
  • Düzen MA, Bölükbaşı İB, Çalık E (2024) How to combine ML and MCDM techniques: an extended bibliometric analysis. J Innov Eng Nat Sci 4:642–657. https://doi.org/10.61112/jiens.1475948

Machine learning-based classification of global human development with INFORM risk indicators


Abstract

This study draws on advanced machine learning techniques to find the most suitable predictive model for classifying countries' Human Development Index (HDI) levels using INFORM risk indicators. Six classification algorithms were systematically evaluated: k-Nearest Neighbors (KNN), Linear and Nonlinear Support Vector Machines (SVM), Classification and Regression Trees (CART), Bagging, and Random Forest (RF). Model performance was assessed with the weighted F1-score computed on both the training and test datasets, which revealed meaningful differences among the methods. KNN, Linear SVM, and CART showed relatively limited predictive accuracy, while the Nonlinear SVM showed signs of overfitting, performing well on the training set but losing performance on the test set.
In contrast, ensemble methods such as Bagging and Random Forest performed consistently better, with balanced and high F1-scores of approximately 0.80 on both the training and test datasets, which demonstrates their robustness and strong generalization ability.
The findings clearly support the effectiveness of ensemble learning techniques for handling high-dimensional and complex socio-economic data in development research.
In addition, the feature importance analysis shows that socio-economic, institutional, and infrastructure-related variables have a decisive effect on HDI prediction. By combining a comprehensive comparative evaluation with interpretability analyses, this study provides empirical evidence on the applicability of ensemble machine learning approaches to socio-economic risk assessment and human development forecasting. The resulting insights offer a data-driven framework for policymakers and researchers working to monitor and improve development processes, and highlight the transformative potential of machine learning in socio-economic research.
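The comparison and the feature importance analysis described in this abstract can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the author's pipeline: the data are synthetic (`make_classification` stands in for the INFORM indicators and four HDI tiers), the RBF kernel is assumed for the Nonlinear SVM, and all hyperparameters, split sizes, and seeds are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the indicator matrix (X) and four HDI tiers (y).
X, y = make_classification(n_samples=190, n_features=12, n_informative=6,
                           n_classes=4, n_clusters_per_class=1,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Distance- and margin-based models are scaled; tree-based models are not.
models = {
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Linear SVM": make_pipeline(StandardScaler(), SVC(kernel="linear")),
    # Assumption: RBF kernel; the source does not name the kernel used.
    "Nonlinear SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "CART": DecisionTreeClassifier(random_state=0),
    "Bagging": BaggingClassifier(random_state=0),  # bags decision trees
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Weighted F1 on both splits; a large train/test gap suggests overfitting.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_f1 = f1_score(y_train, model.predict(X_train), average="weighted")
    test_f1 = f1_score(y_test, model.predict(X_test), average="weighted")
    scores[name] = (train_f1, test_f1)
    print(f"{name:14s} train={train_f1:.2f} test={test_f1:.2f}")

# Impurity-based importances from the fitted forest, highest first.
importances = models["RF"].feature_importances_
ranking = sorted(enumerate(importances), key=lambda kv: kv[1], reverse=True)
for idx, value in ranking[:5]:
    print(f"feature {idx}: {value:.3f}")
```

The scores on synthetic data will not match the values reported in the article. The sketch only shows the shape of the workflow: fit each model, compare weighted F1 across the two splits, and then rank the indicators by importance in an ensemble model.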

There are 34 citations in total.

Details

Primary Language: English
Subjects: Supervised Learning, Machine Learning Algorithms, Data Mining and Knowledge Discovery, Data Analysis
Journal Section: Research Articles
Authors: Merve Doğruel (ORCID 0000-0003-2299-7182)

Publication Date: November 5, 2025
Submission Date: June 21, 2025
Acceptance Date: August 17, 2025
Published in Issue: Year 2026, Volume: 6, Issue: 1

Cite

APA Doğruel, M. (n.d.). Machine learning-enabled classification of global human development using INFORM risk indicators. Journal of Innovative Engineering and Natural Science, 6(1), 1-15.


Journal of Innovative Engineering and Natural Science by İdris Karagöz is licensed under CC BY 4.0