Enhanced Software Defect Prediction Using Deep Ensemble Learning for Proactive Quality Assurance

Iman Ismail Akkar, Mohammed Shabat Abdul, Alaa Jamal Jabbar
International Journal of Computational and Electronic Aspects in Engineering
Volume 6: Issue 3, September 2025, pp 202-217


Authors' Information
Iman Ismail Akkar1
Corresponding Author
1College of Medicine, Sumer University, Iraq
emaneesmail30@gmail.com

Mohammed Shabat Abdul2
2Director of Citizens Affairs Division, Sumer University, Iraq

Alaa Jamal Jabbar3
3Department of Computer Science, Directorate General of Dhi Qar Education, Iraq

Research Paper -- Peer Review
First online on – 6 September 2025

Open Access article under Creative Commons License

Cite this article – Iman Ismail Akkar, Mohammed Shabat Abdul, Alaa Jamal Jabbar, “Enhanced Software Defect Prediction Using Deep Ensemble Learning for Proactive Quality Assurance”, International Journal of Computational and Electronic Aspects in Engineering, RAME Publishers, Volume 6, Issue 3, pp. 202-217, 2025.
https://doi.org/10.26706/ijceae.6.3.20250811


Abstract:-

Software defect prediction (SDP) is a key step in identifying fault-prone components during the software development life cycle. Its main aim is not only to improve software quality but also to reduce long-term maintenance cost. Although a large body of research addresses SDP, current models still face practical limitations, notably high false-positive rates and a severe imbalance between the numbers of defective and non-defective modules. These inherent limitations impair the learning capability of predictive models and lower the accuracy and quality of the debugging process as a whole. To address these persistent issues, this study proposes a robust SDP model that combines several computational techniques aimed at precise prediction and good model generalization. The method begins with a thorough data preprocessing phase in which the features are normalized so that all of them are on a similar scale. This step reduces the impact of outliers and noisy data and so improves the quality of the training dataset. After preprocessing, class imbalance is addressed with the Minority Oversampling by Synthetic Data (MOSD) technique, which creates synthetic minority-class samples to obtain a more balanced distribution of defective and non-defective examples, a prerequisite for training classifiers well. Next, an Adaptive Sequential K-Best (ASKB) feature selection algorithm is applied to identify the most significant and informative features. This approach assesses the importance of each attribute dynamically, reducing the dimensionality of the dataset with negligible loss of predictive information.
The reduced feature set also helps build a model that is more interpretable and computationally economical. A Weighted Random Forest (WRF) is used for the classification task. This extension of the conventional Random Forest incorporates instance-based weighting, so that the model gives more weight to correctly labeling minority-class instances. The WRF classifier thereby improves overall predictive performance and reduces bias toward the majority class. The empirical evaluation of the proposed framework shows that it outperforms existing models. The method achieved accuracy, precision, recall and F1-score values of 99.11999, 99.43111, 99.12199 and 99.33333, respectively. These results support the usefulness of the combined methodology for improving defect prediction. Moreover, the proposed model has practical implications for proactive software quality assurance, enabling more reliable and cost-effective software engineering processes.
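
The four stages named in the abstract (normalization, minority oversampling, K-best feature selection, weighted random forest) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify MOSD, ASKB or WRF in detail, so the sketch swaps in simple stand-ins, namely random interpolation between minority samples for MOSD, scikit-learn's plain `SelectKBest` for ASKB, and a `RandomForestClassifier` fitted with per-instance `sample_weight` for WRF. The toy dataset, the function names `oversample_minority` and `run_pipeline`, the value `k=8` and the minority weight `2.0` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler


def oversample_minority(X, y, rng):
    """Stand-in for MOSD (assumption): add synthetic minority rows by
    interpolating between randomly chosen pairs of real minority rows."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_new = counts.max() - counts.min()  # rows needed to balance the classes
    X_min = X[y == minority]
    a = X_min[rng.integers(0, len(X_min), n_new)]
    b = X_min[rng.integers(0, len(X_min), n_new)]
    synthetic = a + rng.random((n_new, 1)) * (b - a)
    return np.vstack([X, synthetic]), np.concatenate([y, np.full(n_new, minority)])


def run_pipeline(seed=0, k=8, minority_weight=2.0):
    rng = np.random.default_rng(seed)
    # Synthetic imbalanced toy data (about 10% "defective"); not the paper's dataset.
    X, y = make_classification(n_samples=600, n_features=20, n_informative=6,
                               weights=[0.9, 0.1], random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)

    # 1. Normalize features to [0, 1], fitting the scaler on training data only.
    scaler = MinMaxScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

    # 2. Balance the training set only; the test set keeps its real distribution.
    X_bal, y_bal = oversample_minority(X_train, y_train, rng)

    # 3. Keep the k features with the highest ANOVA F-score (stand-in for ASKB).
    selector = SelectKBest(f_classif, k=k).fit(X_bal, y_bal)
    X_bal_sel, X_test_sel = selector.transform(X_bal), selector.transform(X_test)

    # 4. Random forest with heavier per-instance weights on the minority
    #    class, label 1 here (stand-in for WRF; the weight is hypothetical).
    weights = np.where(y_bal == 1, minority_weight, 1.0)
    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    clf.fit(X_bal_sel, y_bal, sample_weight=weights)
    y_pred = clf.predict(X_test_sel)

    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
    }
    return X_bal_sel, y_bal, metrics


if __name__ == "__main__":
    _, _, scores = run_pipeline()
    for name, value in scores.items():
        print(f"{name}: {value:.4f}")
```

In this sketch the scaler, oversampler and selector are fitted on the training split only, so the reported metrics come from untouched test data. Scores on this toy data say nothing about the figures reported in the abstract.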

Index Terms:-

Class imbalance, Defect prediction in software, Minority oversampling, Data preprocessing, Synthetic data, Adaptive sequential K-best, Weighted random forest, Feature selection, Predictive modelling

