Research Article

Uncertainty-Gated Dual-Branch Additive–Attention Network for Robust and Calibrated Tabular Classification Under Missingness

Volume: 13 Number: 2 June 30, 2026

Abstract

Tabular deep learning remains challenging in real-world settings. Many datasets mix numerical and categorical variables, contain substantial missingness, and call for not only strong classification performance but also interpretability and reliable probability estimates. DA2-Net is proposed to address these requirements through a dual-branch architecture that combines an interpretable additive pathway for feature-wise main effects with a selective self-attention pathway for higher-order interactions. Features are ranked using additive contribution magnitude, uncertainty, and missingness-aware scaling, and only a Top-k subset is passed to a single multi-head self-attention block. The final prediction is obtained through uncertainty-aware gated fusion, and training is further supported by sparsity, stability, and Brier-based calibration regularization. This design balances expressive interaction modeling with transparency and robustness under incomplete data. DA2-Net is evaluated on four public binary tabular benchmarks (AdultIncome, DefaultCredit, HeartDisease, and BankMarketing) under controlled Missing Completely At Random (MCAR) missingness levels of 0.0, 0.1, 0.2, and 0.3. The evaluation uses 5-fold stratified cross-validation repeated across three random seeds, which yields 15 runs for each dataset and missingness condition. With four datasets, four missingness levels, and eight metrics (AUC, AUPRC, ACC, F1, sensitivity, specificity, Brier score, and Expected Calibration Error (ECE)), this gives 16 blocks per metric and 128 evaluation blocks in total. Across this benchmark, DA2-Net achieves the best overall mean rank, 3.078 ± 2.044, ahead of SAINT-Lite at 3.980 ± 2.624. It achieves or shares the best result in all 16 AUC blocks, 13 of 16 AUPRC blocks, 10 of 16 ACC blocks, 11 of 16 Brier blocks, and 7 of 16 ECE blocks. These results indicate that its main strength lies in robust ranking-based discrimination and strong overall probability quality under missingness.
DA2-Net also shows a favorable practical-efficiency profile in the current benchmark, remaining more compact and inference-efficient than the main transformer-like baselines. Epoch-wise loss analysis shows stable convergence across all four datasets: the binary cross-entropy (BCE) term drives the optimization, while the auxiliary regularizers act as controlled refinements. The ablation study further confirms that the interaction branch is essential. Removing it in the AdditiveOnly variant causes the clearest degradation in both predictive and calibration metrics, whereas removing the gate or the auxiliary regularization terms leads only to minor changes. A sensitivity analysis supports the selected interaction subset size k=10 and spline knot count K=8 as balanced settings, and additive shape-function visualizations provide direct qualitative evidence for feature-wise interpretability.
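
The evaluation protocol described in the abstract relies on three standard ingredients: MCAR masking at a fixed rate, the Brier score, and ECE. The following is a minimal sketch of these ingredients only, not the author's implementation. The function names, the use of `None` as the missing-value marker, the equal-width binning on the positive-class probability, and the default of 10 bins are all assumptions made for illustration; the abstract does not state how ECE was binned.

```python
import random


def mcar_mask(rows, rate, seed=0):
    """Return a copy of `rows` in which each cell is independently
    replaced by None with probability `rate` (MCAR: the decision does
    not depend on any observed or unobserved value)."""
    rng = random.Random(seed)
    return [[None if rng.random() < rate else v for v in row] for row in rows]


def brier_score(y_true, y_prob):
    """Mean squared difference between the predicted positive-class
    probability and the 0/1 label (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Binary ECE with equal-width bins (an assumed variant): the
    sample-weighted mean of |observed positive rate - mean predicted
    probability| over the non-empty bins."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes to the last bin
        bins[idx].append((y, p))
    n = len(y_true)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        frac_pos = sum(y for y, _ in b) / len(b)
        mean_prob = sum(p for _, p in b) / len(b)
        ece += (len(b) / n) * abs(frac_pos - mean_prob)
    return ece


# Hypothetical toy data, used only to show the call pattern.
toy_rows = [[1.0, "a"], [2.0, "b"], [3.0, "c"]]
masked_rows = mcar_mask(toy_rows, rate=0.3, seed=42)

toy_labels = [0, 0, 1, 1]
toy_probs = [0.1, 0.4, 0.6, 0.9]
toy_brier = brier_score(toy_labels, toy_probs)
toy_ece = expected_calibration_error(toy_labels, toy_probs)
```

In a protocol like the one summarized above, masking of this kind would be applied once per missingness level (0.0, 0.1, 0.2, 0.3), and the two scores would be computed on held-out fold predictions. Other ECE variants, such as equal-frequency bins, are equally common and can give different values.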

Ethical Statement

This study uses publicly available datasets and does not involve human participants, animals, or any prospective data collection. All analyses were conducted on de-identified, publicly accessible data; therefore, ethical approval and informed consent were not required. The study was performed in accordance with applicable institutional and international research integrity principles.

Details

Primary Language: English
Subjects: Deep Learning, Neural Networks
Journal Section: Research Article
Publication Date: June 30, 2026
Submission Date: January 23, 2026
Acceptance Date: April 20, 2026
Published in Issue: Year 2026, Volume: 13, Number: 2

APA
Cihan, M. (2026). Uncertainty-Gated Dual-Branch Additive–Attention Network for Robust and Calibrated Tabular Classification Under Missingness. Gazi University Journal of Science Part A: Engineering and Innovation, 13(2), 865-905. https://doi.org/10.54287/gujsa.1870409
AMA
1. Cihan M. Uncertainty-Gated Dual-Branch Additive–Attention Network for Robust and Calibrated Tabular Classification Under Missingness. GU J Sci, Part A. 2026;13(2):865-905. doi:10.54287/gujsa.1870409
Chicago
Cihan, Mücahit. 2026. “Uncertainty-Gated Dual-Branch Additive–Attention Network for Robust and Calibrated Tabular Classification Under Missingness”. Gazi University Journal of Science Part A: Engineering and Innovation 13 (2): 865-905. https://doi.org/10.54287/gujsa.1870409.
EndNote
Cihan M (June 30, 2026) Uncertainty-Gated Dual-Branch Additive–Attention Network for Robust and Calibrated Tabular Classification Under Missingness. Gazi University Journal of Science Part A: Engineering and Innovation 13 2 865–905.
IEEE
[1] M. Cihan, “Uncertainty-Gated Dual-Branch Additive–Attention Network for Robust and Calibrated Tabular Classification Under Missingness”, GU J Sci, Part A, vol. 13, no. 2, pp. 865–905, June 2026, doi: 10.54287/gujsa.1870409.
ISNAD
Cihan, Mücahit. “Uncertainty-Gated Dual-Branch Additive–Attention Network for Robust and Calibrated Tabular Classification Under Missingness”. Gazi University Journal of Science Part A: Engineering and Innovation 13/2 (June 30, 2026): 865-905. https://doi.org/10.54287/gujsa.1870409.
JAMA
1. Cihan M. Uncertainty-Gated Dual-Branch Additive–Attention Network for Robust and Calibrated Tabular Classification Under Missingness. GU J Sci, Part A. 2026;13:865–905.
MLA
Cihan, Mücahit. “Uncertainty-Gated Dual-Branch Additive–Attention Network for Robust and Calibrated Tabular Classification Under Missingness”. Gazi University Journal of Science Part A: Engineering and Innovation, vol. 13, no. 2, June 2026, pp. 865-905, doi:10.54287/gujsa.1870409.
Vancouver
1. Mücahit Cihan. Uncertainty-Gated Dual-Branch Additive–Attention Network for Robust and Calibrated Tabular Classification Under Missingness. GU J Sci, Part A. 2026 Jun. 30;13(2):865-905. doi:10.54287/gujsa.1870409