Research Article

MACHINE LEARNING AND VALIDATION STRATEGIES IN PANEL DATA-BASED GREENHOUSE GAS EMISSION MODELING

Volume: 27 Number: 1 March 27, 2026

Abstract

In this study, sector-based methane emissions of European countries were modeled using a Random Forest–based machine learning approach applied to a panel dataset covering the period 2014–2023 with country–sector–year dimensions. The primary objective of the study is not to maximize predictive accuracy, but to evaluate how different validation strategies affect model performance and generalization behavior. Accordingly, three validation strategies—random training–test split, temporal (time-based) validation, and country-based group validation—were comparatively analyzed. The dataset, obtained from Eurostat, comprises 29 countries, 5 sectors, and 1,449 observations. Model performance was evaluated using root mean square error and the coefficient of determination. Under random splitting, the model achieved very low errors (mean RMSE = 0.0126 ± 0.0025; mean R² = 0.9993 ± 0.0003), although these results may be optimistic due to information leakage. Temporal validation yielded stable near-future performance (RMSE = 0.0225, R² = 0.9975). In contrast, country-based group validation resulted in a substantial performance decline (average RMSE = 0.3132 ± 0.4061), indicating strong cross-country heterogeneity. Overall, the findings demonstrate that, in panel data settings, the choice of validation strategy is as critical as the machine learning algorithm for realistic generalization assessment.
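The three validation strategies compared above can be illustrated on a synthetic panel index. The following is a minimal sketch using only the Python standard library, not the study's actual pipeline: the country and sector labels, the 80/20 random split, the choice of 2022 as the first held-out year, and the six held-out countries are all assumptions made for illustration. The synthetic panel is balanced (29 × 5 × 10 = 1,450 rows), whereas the study reports 1,449 observations. The helper `shared_series` counts how many country–sector series appear on both sides of a split, which is one way to make the leakage concern concrete. The `rmse` and `r2` functions follow the standard definitions of the two reported metrics.

```python
import math
import random
from itertools import product

# Hypothetical panel index (labels are illustrative, not Eurostat codes).
countries = [f"C{i:02d}" for i in range(1, 30)]   # 29 countries
sectors = [f"S{j}" for j in range(1, 6)]          # 5 sectors
years = list(range(2014, 2024))                   # 2014-2023
panel = list(product(countries, sectors, years))  # rows: (country, sector, year)


def random_split(rows, test_frac=0.2, seed=0):
    """Shuffle all rows and cut once; series are not kept together."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_frac))
    return shuffled[:cut], shuffled[cut:]


def temporal_split(rows, first_test_year):
    """Train on years before first_test_year, test on that year onward."""
    train = [r for r in rows if r[2] < first_test_year]
    test = [r for r in rows if r[2] >= first_test_year]
    return train, test


def group_split(rows, test_countries):
    """Hold out whole countries so no test country is seen in training."""
    held = set(test_countries)
    train = [r for r in rows if r[0] not in held]
    test = [r for r in rows if r[0] in held]
    return train, test


def shared_series(train, test):
    """Number of (country, sector) series present in both train and test."""
    return len({r[:2] for r in train} & {r[:2] for r in test})


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


rand_train, rand_test = random_split(panel)
time_train, time_test = temporal_split(panel, first_test_year=2022)
grp_train, grp_test = group_split(panel, test_countries=countries[:6])

print("series shared under random split:", shared_series(rand_train, rand_test))
print("series shared under group split: ", shared_series(grp_train, grp_test))
print("last train year / first test year:",
      max(r[2] for r in time_train), "/", min(r[2] for r in time_test))
```

Under the random split, rows from the same country–sector series fall into both the training and the test set, so a model can score well by interpolating within a series it has already seen. The group split removes that overlap entirely, which is consistent with the much larger errors reported for country-based validation.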

Keywords

Panel data, Machine learning, Validation strategies, Greenhouse gas emissions

Supporting Institution

No external funding was received for this study.

Ethical Statement

No human-subjects data were collected; therefore, IRB/ethics committee approval was not required.

APA
Demircioğlu Diren, D. (2026). MACHINE LEARNING AND VALIDATION STRATEGIES IN PANEL DATA-BASED GREENHOUSE GAS EMISSION MODELING. Eskişehir Technical University Journal of Science and Technology A - Applied Sciences and Engineering, 27(1), 204-219. https://doi.org/10.18038/estubtda.1891746