Class Imbalance Reduction and Centroid based Relevant Project Selection for Cross Project Defect Prediction

Kiran Kumar Bejjanki; Sai Priyanka Kanchanapally; Mahesh Kumar Thota

doi:10.17762/ijritcc.v11i6s.6933

Published: Jun 11, 2023

Keywords:

Cross Project Defect Prediction, Class imbalance, PF-SMOTE, Software quality assurance

Kiran Kumar Bejjanki

Associate Professor, Dept. of Information Technology, Kakatiya Institute of Technology & Science, Warangal, India

Sai Priyanka Kanchanapally

M.Tech Scholar, Dept. of Information Technology, Kakatiya Institute of Technology & Science, Warangal, India

Mahesh Kumar Thota

Assistant Professor, Dept. of Information Technology, Kakatiya Institute of Technology & Science, Warangal, India

Abstract

Software Defect Prediction (SDP) is a widely studied topic in software engineering and plays a critical role in software quality assurance. Cross-Project Defect Prediction (CPDP) predicts defects in a target project using data from other (source) projects, which helps developers prioritize their testing effort and locate defects. Transfer Learning (TL) is frequently used in CPDP to improve prediction performance by reducing the difference in data distribution between the source and target projects. In this paper, we use Centroid-based PF-SMOTE to address the cross-project class imbalance problem by balancing the datasets, and centroid-based relevant data selection to choose source data for CPDP. Both methods compute the mean of all attributes in each dataset and then calculate the differences between the means of the datasets. For experimentation, the open-source software defect datasets AEEM, Re-Link, and NASA are considered.
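
The centroid-based selection idea described in the abstract (take the mean of every attribute in each dataset, then compare the means across datasets) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy project names, the use of Euclidean distance between centroids, and the "nearest centroid is most relevant" ranking rule are all assumptions made for illustration. In practice the attributes would probably need to be normalized first so that large-valued metrics do not dominate the distance.

```python
from math import sqrt
from statistics import fmean


def centroid(rows):
    """Mean of every attribute (column) over all instances of one project."""
    return [fmean(col) for col in zip(*rows)]


def centroid_distance(a, b):
    """Euclidean distance between two centroids (assumed distance measure)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_sources(target_rows, sources):
    """Order source project names from nearest to farthest centroid (assumed rule)."""
    target_centroid = centroid(target_rows)
    distances = {
        name: centroid_distance(centroid(rows), target_centroid)
        for name, rows in sources.items()
    }
    return sorted(distances, key=distances.get)


# Hypothetical toy data: two metric attributes per module.
target = [[10.0, 2.0], [12.0, 4.0]]
sources = {
    "proj_a": [[11.0, 3.0], [13.0, 3.0]],
    "proj_b": [[40.0, 9.0], [42.0, 11.0]],
    "proj_c": [[8.0, 3.0], [10.0, 3.0]],
}

ranking = rank_sources(target, sources)
print(ranking)
```

Under these assumptions, the source projects at the front of `ranking` would be the candidates for training data, and a cut-off (top-k or a distance threshold) would have to be chosen separately.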

How to Cite

Bejjanki, K. K., Kanchanapally, S. P., & Thota, M. K. (2023). Class Imbalance Reduction and Centroid based Relevant Project Selection for Cross Project Defect Prediction. International Journal on Recent and Innovation Trends in Computing and Communication, 11(6s), 293–302. https://doi.org/10.17762/ijritcc.v11i6s.6933

Issue

Vol. 11 No. 6s (2023): Advances in Computational Modeling and Simulation of Computing Systems

Section

Articles

References

T. Zimmermann, N. Nagappan, H. Gall, E. Giger, and B. Murphy, “Cross-project defect prediction: a large scale experiment on data vs. domain vs. process,” in FSE/ESEC’09. ACM, 2009, pp. 91–100.

B. Turhan, T. Menzies, A. B. Bener, and J. Di Stefano, “On the relative value of cross-company and within-company data for defect prediction,” Empirical Software Engineering, vol. 14, no. 5, pp. 540–578, 2009.

E. N. Akimova, A. Yu. Bersenev, A. A. Deikov, K. S. Kobylkin, A. V. Konygin, I. P. Mezentsev, and V. E. Misilov, “A survey on software defect prediction using deep learning.”

S. Lessmann, B. Baesens, C. Mues, and S. Pietsch, “Benchmarking classification models for software defect prediction: A proposed framework and novel findings,” IEEE Transactions on Software Engineering, vol. 34, no. 4, pp. 485–496, 2008.

X.-Y. Jing, S. Ying, Z.-W. Zhang, S.-S. Wu, and J. Liu, “Dictionary learning based software defect prediction,” in ICSE’14. ACM, 2014, pp. 414–423.

T. Wang, Z. Zhang, X.-Y. Jing, and L. Zhang, “Multiple kernel ensemble learning for software defect prediction,” Automated Software Engineering, vol. 23, no. 4, pp. 569–590, 2016.

Z. Xu, J. Liu, X. Luo, Z. Yang, Y. Zhang, P. Yuan, Y. Tang, and T. Zhang, “Software defect prediction based on kernel PCA and weighted extreme learning machine,” Information and Software Technology, vol. 106, pp. 182–200, 2019.

Z. Li, H. Zhang, X.-Y. Jing, J. Xie, M. Guo, and J. Ren, “DSSDPP: Data selection and sampling based domain programming predictor for cross-project defect prediction.”

Z. Li, X.-Y. Jing, and X. Zhu, “Progress on approaches to software defect prediction,” IET Software, vol. 12, no. 3, pp. 161–175, 2018.

M. Shepperd, D. Bowes, and T. Hall, “Researcher bias: The use of machine learning in software defect prediction,” IEEE Transactions on Software Engineering, vol. 40, no. 6, pp. 603–616, 2014.

Y. Zhou, Y. Yang, H. Lu, L. Chen, Y. Li, Y. Zhao, J. Qian, and B. Xu, “How far we have progressed in the journey? An examination of cross-project defect prediction,” ACM Transactions on Software Engineering and Methodology, vol. 27, no. 1, pp. 1–51, 2018.

S. Herbold, A. Trautsch, and J. Grabowski, “A comparative study to benchmark cross-project defect prediction approaches,” IEEE Transactions on Software Engineering, vol. 44, no. 9, pp. 811–833, 2018.

S. Hosseini, B. Turhan, and D. Gunarathna, “A systematic literature review and meta-analysis on cross project defect prediction,” IEEE Transactions on Software Engineering, vol. 45, no. 2, pp. 111–147, 2019.

S. Watanabe, H. Kaiya, and K. Kaijiri, “Adapting a fault prediction model to allow inter language reuse,” in PROMISE’08, 2008, pp. 19–24.

A. E. Camargo Cruz and K. Ochimizu, “Towards logistic regression models for predicting fault-prone code across software projects,” in ESEM’09, 2009, pp. 460–463.

C. Ni, W. S. Liu, X. Chen, Q. Gu, D. X. Chen, and Q. G. Huang, “A cluster based feature selection method for cross-project software defect prediction,” Journal of Computer Science and Technology, vol. 32, no. 6, pp. 1090–1107, 2017.

Y. Zhang, L. O. David, X. Xia, and J. Sun, “Combined classifier for cross-project defect prediction: an extended empirical study,” Frontiers of Computer Science, vol. 12, no. 2, pp. 280–296, 2018.

J. Nam, S. J. Pan, and S. Kim, “Transfer defect learning,” in ICSE’13. IEEE, 2013, pp. 382–391.

Y. Ma, G. Luo, X. Zeng, and A. Chen, “Transfer learning for cross company software defect prediction,” Information and Software Technology, vol. 54, no. 3, pp. 248–256, 2012.

C. Liu, D. Yang, X. Xia, M. Yan, and X. Zhang, “A two-phase transfer learning model for cross-project defect prediction,” Information and Software Technology, vol. 107, pp. 125–136, 2019.

Z. Li, J. Niu, X.-Y. Jing, W. Yu, and C. Qi, “Cross-project defect prediction via landmark selection-based kernelized discriminant subspace alignment,” IEEE Transactions on Reliability, vol. 70, no. 3, pp. 996–1013, 2021.

X. Xia, D. Lo, S. J. Pan, N. Nagappan, and X. Wang, “Hydra: massively compositional model for cross-project defect prediction,” IEEE Transactions on Software Engineering, vol. 42, no. 10, pp. 977–998, 2016.

L. Chen, B. Fang, Z. Shang, and Y. Tang, “Negative samples reduction in cross-company software defects prediction,” Information and Software Technology, vol. 62, pp. 67–77, 2015.

D. Ryu, J.-I. Jang, and J. Baik, “A transfer cost-sensitive boosting approach for cross-project defect prediction,” Software Quality Journal, vol. 25, no. 1, pp. 235–272, 2017.

L. Gong, S. Jiang, and L. Jiang, “An improved transfer adaptive boosting approach for mixed-project defect prediction,” Journal of Software: Evolution and Process, vol. 31, no. 10, pp. 1–28, 2019.

D. Ryu, O. Choi, and J. Baik, “Value-cognitive boosting with a support vector machine for cross-project defect prediction,” Empirical Software Engineering, vol. 21, no. 1, pp. 43–71, 2016.

M. R. Shankarpure and D. D. Patil, “A comprehensive survey on methods and techniques for automated fruit plucking,” International Journal of Intelligent Systems and Applications in Engineering, vol. 11, no. 1, pp. 156–168, 2023. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/2454.

X.-Y. Jing, F. Wu, X. Dong, and B. Xu, “An improved sda based defect prediction framework for both within-project and cross project class-imbalance problems,” IEEE Transactions on Software Engineering, vol. 43, no. 4, pp. 321–338, 2017.

D. Ryu, J. Jang, and J. Baik, “A hybrid instance selection using nearest-neighbor for cross-project defect prediction,” Journal of Computer Science and Technology, vol. 30, no. 5, pp. 969–980, 2015.

Z. Xu, S. Pang, T. Zhang, X. Luo, J. Liu, Y. Tang, X. Yu, and L. Xue, “Cross project defect prediction via balanced distribution adaptation based transfer learning,” Journal of Computer Science and Technology, vol. 34, no. 5, pp. 1039–1062, 2019.

A. Gretton, K. M. Borgwardt, M. J. Rasch, B. Schölkopf, and A. J. Smola, “A kernel method for the two-sample-problem,” in NIPS’07. MIT Press, 2007, pp. 513–520.

K. K. Bejjanki and S. P. Kanchanapally, “Centroid-based PF-SMOTE for imbalanced data,” in International Conference on Mathematical Sciences and Emerging Applications in Technology (ICMSEAT-2022) (in collaboration with APTSMS), September 9–11, 2022.

Contact Us:

Auricle Global Society of Education and Research

Y-18-A, Near Sanskar Play School, Sudarshana Nagar,

Bikaner, Rajasthan (India). Pin 334003

Email: editor@ijritcc.org
