Research Article

HypoGAN: A mode- and boundary-aware generative oversampling framework for imbalanced tabular data

Number: Advanced Online Publication
Early Pub Date: June 1, 2026

Abstract

While numerous oversampling strategies have emerged in recent years, synthesizing minority-class samples that preserve the intrinsic data structure while improving classifier performance remains challenging, especially under severe class imbalance. In this study, we propose HypoGAN, a generative oversampling framework for imbalanced numeric tabular data that integrates minority mode discovery, local minority-majority boundary information, latent-noise-driven candidate generation, and post-generation filtering to produce classification-useful minority samples. To evaluate the proposed framework, we adopted a two-phase experimental strategy: (i) simulation experiments under class imbalance ratios of 90:10, 95:5, and 98:2 with feature dimensionalities of n = 3, 5, 10, and 20, and (ii) real-world experiments on the Wisconsin Breast Cancer, Pima Indians Diabetes, and Credit Card Fraud Detection datasets. In all experiments, HypoGAN was compared with SMOTE, ADASYN, and Borderline-SMOTE within a nested train/validation/test framework with leakage protection. The results indicate that HypoGAN is a competitive oversampling framework across a range of challenging imbalance settings. In the simulation study, it achieved particularly strong performance in lower- and medium-dimensional scenarios, while remaining precision-oriented under more difficult conditions. In the real-world experiments, HypoGAN remained competitive on the Wisconsin Breast Cancer and Credit Card Fraud Detection datasets, achieving F1-scores of 0.9489 and 0.9028, respectively, compared to 0.9489 and 0.9037 for SMOTE. Additional results suggest that HypoGAN’s performance is scenario-dependent, with effectiveness influenced by the dataset’s structure and the characteristics of the imbalance.
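
The leakage protection mentioned above can be illustrated with a minimal sketch. It is not the HypoGAN method and not the paper's evaluation code: it assumes a simple SMOTE-style interpolation as a stand-in oversampler, and the function names (`smote_like`, `stratified_split`, `split_then_oversample`) and parameters are hypothetical. The only point it shows is the ordering: the test partition is held out first, and synthetic minority samples are generated from the training partition alone.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """SMOTE-style stand-in (assumption, not HypoGAN): interpolate between a
    minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority points to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
    return synthetic


def stratified_split(y, test_fraction, rng):
    """Split indices per class so both parts keep the class ratio."""
    train_idx, test_idx = [], []
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        n_test = int(len(idx) * test_fraction)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return train_idx, test_idx


def split_then_oversample(X, y, test_fraction=0.25, seed=0):
    """Hold out the test set first, then balance the training part only."""
    rng = random.Random(seed)
    train_idx, test_idx = stratified_split(y, test_fraction, rng)
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    # Minority class is labelled 1 in this sketch (assumption).
    minority = [x for x, label in zip(X_train, y_train) if label == 1]
    n_new = max(0, (len(y_train) - len(minority)) - len(minority))
    new_points = smote_like(minority, n_new, seed=seed)
    X_bal = X_train + new_points
    y_bal = y_train + [1] * len(new_points)
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]
    return X_bal, y_bal, X_test, y_test
```

In a nested setup, the same ordering would apply again inside the training partition: the validation fold would be split off before any synthetic samples are generated, so that neither validation nor test data ever influences the oversampler.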

References

  [1] N.V. Chawla, K.W. Bowyer, L.O. Hall and W.P. Kegelmeyer, SMOTE: Synthetic minority over-sampling technique, J. Artif. Intell. Res. 16, 321-357, 2002.

Details

Primary Language

English

Subjects

Adversarial Machine Learning, Classification Algorithms

Journal Section

Research Article

Early Pub Date

June 1, 2026

Publication Date

-

Submission Date

March 3, 2026

Acceptance Date

May 30, 2026

Published in Issue

Year 2026 Number: Advanced Online Publication

APA
Alpay, O. (2026). HypoGAN: A mode- and boundary-aware generative oversampling framework for imbalanced tabular data. Hacettepe Journal of Mathematics and Statistics, Advanced Online Publication, 1-24. https://doi.org/10.15672/hujms.1902068