Enhancing Software Requirement Classification via Dataset Fusion and Machine Learning

Muhammad Owais Raza; Vajeeha Mir; Jawad Rasheed; Mirsat Yesiltepe; Shtwai Alsubai

doi:10.26650/acin.1634472

Research Article

Year 2025, Volume: 9, Issue: 1, 275-292, 30.06.2025

https://izlik.org/JA76GC67PJ

Abstract

Software engineering involves numerous steps, and a successful software product follows each of them closely. One such step is requirements gathering, which is expensive in both time and money; a potential solution is to automate it. Automating requirements gathering requires separating requirements into types. One way to predict the type of a requirement is text classification with machine learning; however, this approach needs a large amount of data, which is not available for this use case. In this study, we perform dataset fusion to create a larger dataset. We apply vertical fusion, which increases the number of instances in the dataset. Machine learning algorithms are then applied to the fused dataset; in our experiments, performance after fusion improved substantially, reaching an F1-score of 87.78% with a support vector machine (SVM). This improvement shows the efficacy of data fusion for text classification and demonstrates that combining data from diverse sources can overcome the limitations of small datasets. Our approach also surpassed the highest recall scores reported over the previous four years for software requirement classification, achieving 94.20% with the fusion-based support vector classifier (SVC) and outperforming previous models even in non-fusion settings.
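As an illustration of the vertical fusion idea described above, the following is a minimal sketch of stacking two requirement datasets row-wise so that the number of instances grows. It is not the authors' pipeline: the record fields (`text`, `label`), the label aliases, the duplicate-removal step, and the toy requirements are all assumptions made for the example, since the abstract does not specify the dataset schemas.

```python
from typing import Dict, List

Record = Dict[str, str]

# Hypothetical label aliases: the abstract does not list the label sets
# used by the fused datasets, so these are assumptions for illustration.
LABEL_ALIASES = {"f": "functional", "nf": "non-functional"}


def normalize(record: Record) -> Record:
    """Map one record onto a shared (text, label) schema."""
    text = " ".join(record["text"].split())  # collapse repeated whitespace
    label = record["label"].strip().lower()
    return {"text": text, "label": LABEL_ALIASES.get(label, label)}


def fuse_vertically(*datasets: List[Record]) -> List[Record]:
    """Stack datasets row-wise, keeping the first copy of each requirement text.

    Dropping case-insensitive duplicate texts is an assumed cleaning step,
    not something stated in the abstract.
    """
    fused: List[Record] = []
    seen = set()
    for dataset in datasets:
        for record in dataset:
            row = normalize(record)
            key = row["text"].lower()
            if key not in seen:
                seen.add(key)
                fused.append(row)
    return fused


# Toy data, invented for the example.
dataset_a = [
    {"text": "The system shall export reports as PDF.", "label": "F"},
    {"text": "Pages shall load within two seconds.", "label": "NF"},
]
dataset_b = [
    {"text": "pages shall load within  two seconds.", "label": "non-functional"},
    {"text": "Users shall reset passwords by email.", "label": "functional"},
]

fused = fuse_vertically(dataset_a, dataset_b)
print(len(fused))  # 3: one row of dataset_b duplicates a row of dataset_a
```

The fused list can then be passed to any text classifier. The point of the sketch is only that vertical fusion adds rows (instances) under a common schema rather than adding columns (features).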

Keywords

Software Engineering, Machine Learning, Software Requirement Engineering, Hybrid Models, Ensemble Modeling

Details

Primary Language	English
Subjects	Automated Software Engineering
Section	Research Article
Authors	Muhammad Owais Raza 0000-0002-3065-385X, Vajeeha Mir 0009-0009-4935-9042, Jawad Rasheed 0000-0003-3761-1641, Mirsat Yesiltepe 0000-0003-4433-5606, Shtwai Alsubai 0000-0002-6584-7400
Submission Date	7 February 2025
Acceptance Date	3 June 2025
Publication Date	30 June 2025
DOI	https://doi.org/10.26650/acin.1634472
IZ	https://izlik.org/JA76GC67PJ
Published in Issue	Year 2025, Volume: 9, Issue: 1

Cite

APA	Raza, M. O., Mir, V., Rasheed, J., Yesiltepe, M., & Alsubai, S. (2025). Enhancing Software Requirement Classification via Dataset Fusion and Machine Learning. Acta Infologica, 9(1), 275-292. https://doi.org/10.26650/acin.1634472
AMA	1. Raza MO, Mir V, Rasheed J, Yesiltepe M, Alsubai S. Enhancing Software Requirement Classification via Dataset Fusion and Machine Learning. ACIN. 2025;9(1):275-292. doi:10.26650/acin.1634472
Chicago	Raza, Muhammad Owais, Vajeeha Mir, Jawad Rasheed, Mirsat Yesiltepe, and Shtwai Alsubai. 2025. “Enhancing Software Requirement Classification via Dataset Fusion and Machine Learning”. Acta Infologica 9 (1): 275-92. https://doi.org/10.26650/acin.1634472.
EndNote	Raza MO, Mir V, Rasheed J, Yesiltepe M, Alsubai S (01 June 2025) Enhancing Software Requirement Classification via Dataset Fusion and Machine Learning. Acta Infologica 9 1 275–292.
IEEE	[1] M. O. Raza, V. Mir, J. Rasheed, M. Yesiltepe, and S. Alsubai, “Enhancing Software Requirement Classification via Dataset Fusion and Machine Learning”, ACIN, vol. 9, no. 1, pp. 275–292, Jun. 2025, doi: 10.26650/acin.1634472.
ISNAD	Raza, Muhammad Owais - Mir, Vajeeha - Rasheed, Jawad - Yesiltepe, Mirsat - Alsubai, Shtwai. “Enhancing Software Requirement Classification via Dataset Fusion and Machine Learning”. Acta Infologica 9/1 (01 June 2025): 275-292. https://doi.org/10.26650/acin.1634472.
JAMA	1. Raza MO, Mir V, Rasheed J, Yesiltepe M, Alsubai S. Enhancing Software Requirement Classification via Dataset Fusion and Machine Learning. ACIN. 2025;9:275–292.
MLA	Raza, Muhammad Owais, et al. “Enhancing Software Requirement Classification via Dataset Fusion and Machine Learning”. Acta Infologica, vol. 9, no. 1, June 2025, pp. 275-92, doi:10.26650/acin.1634472.
Vancouver	1. Muhammad Owais Raza, Vajeeha Mir, Jawad Rasheed, Mirsat Yesiltepe, Shtwai Alsubai. Enhancing Software Requirement Classification via Dataset Fusion and Machine Learning. ACIN. 01 June 2025;9(1):275-92. doi:10.26650/acin.1634472