Graph Neural Network-Based Prediction of Soft Error Vulnerability and Criticality of Functions in Scientific Applications

Sanem Arslan Yılmaz

doi:10.54287/gujsa.1766028

Research Article

Year 2025, Volume: 12 Issue: 4, 979 - 998, 31.12.2025

https://izlik.org/JA83SU27UZ

Abstract

Soft errors caused by transient hardware faults can lead to silent data corruptions (SDCs) in scientific applications, potentially compromising correctness and reliability. Traditional fault injection (FI) methods measure vulnerability accurately but are prohibitively time-consuming and resource-intensive. In this work, we propose a function-level prediction framework for SDC vulnerability and criticality in CPU-based scientific applications using Graph Neural Networks (GNNs). Static code features are extracted from the LLVM intermediate representation and used to construct function call graphs, enabling GCN, GAT, and GraphSAGE models to capture both intra-function characteristics and inter-function dependencies. The problem is formulated as both regression and classification: the models predict continuous vulnerability and criticality scores as well as binary labels. The evaluation is conducted on 30 applications (90 functions) from the PolyBench benchmark suite using leave-one-application-out cross-validation, so that each model is tested on an application it has not seen during training. Among the evaluated architectures, GraphSAGE achieves the highest performance (F1 = 0.80, MAE = 0.17), showing strong generalization across diverse workloads. Feature correlation and model-based importance analyses identify the most influential LLVM features. The results demonstrate that the proposed approach provides fine-grained, accurate predictions without the need for exhaustive FI campaigns, enabling more efficient and targeted fault-tolerance strategies.
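
The leave-one-application-out protocol mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the article's implementation. The sample records, their scores, the mean-score baseline that stands in for a GNN, and the 0.5 threshold for deriving binary labels are all hypothetical and chosen only for illustration. The sketch shows how every function of one application is held out together, and how MAE and F1 are computed on the held-out functions.

```python
from collections import defaultdict


def leave_one_application_out(samples):
    """Yield (held_out_app, train, test); test holds every function of one application."""
    by_app = defaultdict(list)
    for sample in samples:
        by_app[sample["app"]].append(sample)
    for app in sorted(by_app):
        test = by_app[app]
        train = [s for other, group in by_app.items() if other != app for s in group]
        yield app, train, test


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted scores."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """F1 for binary labels, with 1 as the positive (vulnerable) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: the scores are made up, not measured values.
samples = [
    {"app": "gemm", "function": "init_array", "score": 0.10},
    {"app": "gemm", "function": "kernel_gemm", "score": 0.70},
    {"app": "atax", "function": "init_array", "score": 0.20},
    {"app": "atax", "function": "kernel_atax", "score": 0.60},
    {"app": "2mm", "function": "kernel_2mm", "score": 0.80},
]

THRESHOLD = 0.5  # assumed cut-off for turning a score into a binary label

folds = list(leave_one_application_out(samples))
for app, train, test in folds:
    # Placeholder model: predict the mean training score for every test function.
    baseline = sum(s["score"] for s in train) / len(train)
    y_true = [s["score"] for s in test]
    y_pred = [baseline] * len(test)
    labels_true = [int(y >= THRESHOLD) for y in y_true]
    labels_pred = [int(y >= THRESHOLD) for y in y_pred]
    mae = mean_absolute_error(y_true, y_pred)
    f1 = f1_score(labels_true, labels_pred)
    print(f"held out {app}: MAE={mae:.2f}, F1={f1:.2f}")
```

Grouping the split by application rather than by function keeps functions of the same program from appearing in both the training and test sets, which is what makes the reported scores a measure of generalization to unseen applications.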

Keywords

Fault Tolerance, Reliability, GNN, Vulnerability Prediction

Details

Primary Language	English
Subjects	Dependable Systems, Deep Learning
Journal Section	Research Article
Authors	Sanem Arslan Yılmaz 0000-0003-3019-7070
Submission Date	August 19, 2025
Acceptance Date	November 14, 2025
Publication Date	December 31, 2025
DOI	https://doi.org/10.54287/gujsa.1766028
IZ	https://izlik.org/JA83SU27UZ
Published in Issue	Year 2025 Volume: 12 Issue: 4

Cite

APA	Arslan Yılmaz, S. (2025). Graph Neural Network-Based Prediction of Soft Error Vulnerability and Criticality of Functions in Scientific Applications. Gazi University Journal of Science Part A: Engineering and Innovation, 12(4), 979-998. https://doi.org/10.54287/gujsa.1766028
