Q264 : Proposing an Oversampling Method for Software Fault Prediction Using GAN Network
Thesis > Central Library of Shahrood University > Computer Engineering > MSc > 2024
Authors:
Abstract:
With the rapid growth of the software industry, the widening use of software products, and the increasing volume of data and users that software must handle, faults have become inevitable, and software fault prediction matters more than ever. One approach to fault prediction is binary classification, which determines whether a software module is faulty or not. A significant challenge in this field is class imbalance in software fault datasets: the faulty class has far fewer instances than the non-faulty class, and models trained on such data predict faults poorly. In addition, previous fault prediction models have used datasets derived from small and outdated projects. In this study, we developed a software fault prediction model using a new dataset called BugHunter, which is derived from large, up-to-date, open-source Java projects. To address the imbalance, we used Generative Adversarial Networks (GANs) to generate additional instances and balance the classes. After balancing, we applied machine learning algorithms, namely Decision Trees, Random Forest, K-Nearest Neighbors, and Logistic Regression, for modeling and fault prediction. Combining GANs with these algorithms improved the performance of software fault prediction. The highest F-measure achieved by the proposed model is 81%, which indicates better performance than similar methods and previous studies.
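As a rough illustration of the imbalance problem and of the F-measure reported above, the following minimal Python sketch uses hypothetical class counts (950 non-faulty and 50 faulty modules, not taken from BugHunter). It shows that a trivial classifier labeling every module as non-faulty scores well on raw accuracy but gets an F-measure of zero for the faulty class, and it computes how many synthetic faulty instances an oversampler would have to add to balance the classes. The function names are illustrative assumptions, and the GAN itself is not implemented here.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """F-measure (F1) for the faulty class; returns 0.0 when it is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all modules that were labeled correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def synthetic_needed(n_majority: int, n_minority: int) -> int:
    """Number of synthetic minority instances needed to equalize the classes."""
    return max(n_majority - n_minority, 0)


# Hypothetical counts, chosen only for illustration.
n_non_faulty, n_faulty = 950, 50

# A trivial classifier that labels every module as non-faulty:
# it finds no faulty module (tp = 0) and misses all of them (fn = n_faulty).
tp, fp, fn, tn = 0, 0, n_faulty, n_non_faulty

baseline_accuracy = accuracy(tp, tn, fp, fn)
baseline_f = f_measure(tp, fp, fn)
to_generate = synthetic_needed(n_non_faulty, n_faulty)

print(f"accuracy={baseline_accuracy:.2f}, F-measure={baseline_f:.2f}, "
      f"synthetic faulty instances needed={to_generate}")
```

In an oversampling setup of the kind the abstract describes, the count returned by `synthetic_needed` would be the number of faulty-class instances to draw from the trained generator before fitting the classifiers.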
Keywords:
#Software Fault Prediction #Generative Adversarial Network #Data Balancing #Software Testing #Machine Learning #BugHunter Dataset
Keeping place: Central Library of Shahrood University