Software Defect Prediction with Imbalanced Classes Using Resampling with J48 and J48 Consolidated
Abstract
Software defect prediction is an important part of software quality assurance: its goal is to identify and address potential defects before they surface in a production environment. This study addresses the problem of imbalanced class distribution in software defect prediction by combining resampling techniques with the J48 and J48 Consolidated algorithms. It also introduces J48 Consolidated, a variant of J48 that combines several decision trees into a single ensemble model to improve predictive performance. J48 Consolidated is compared with the traditional J48 algorithm on defect data with imbalanced classes, using datasets from the PROMISE repository. Both algorithms are evaluated with random oversampling (ROS) and random undersampling (RUS). The results show that RUS + J48 Consolidated is suitable for predicting software defects, with an average accuracy of 78% and an AUC of 0.783, and that it outperforms RUS + J48, which reaches an average accuracy of 77% and an AUC of 0.766.
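The two resampling strategies named in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the toy module metrics, and the fixed seed are all assumptions made for illustration, and no decision tree is trained here. It shows only how RUS shrinks the majority class and how ROS grows the minority class before a classifier such as J48 would be fitted.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_undersample(rows, labels, seed=0):
    """RUS: keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, target):  # sampling without replacement
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def random_oversample(rows, labels, seed=0):
    """ROS: duplicate random rows of each class until it matches the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: (lines of code, cyclomatic complexity) per module.
# Label 1 = defective, 0 = not defective; 8 vs 2 gives an imbalanced set.
rows = [(120, 4), (80, 2), (200, 7), (60, 1), (150, 5),
        (90, 3), (110, 4), (70, 2), (900, 35), (750, 28)]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

rus_rows, rus_labels = random_undersample(rows, labels)
ros_rows, ros_labels = random_oversample(rows, labels)

print("original:", Counter(labels))
print("after RUS:", Counter(rus_labels))
print("after ROS:", Counter(ros_labels))
```

RUS discards majority-class examples, so it trades information for balance. ROS keeps every example but repeats minority rows, which can encourage overfitting to the duplicates. Either resampled set would then be passed to the tree learner in place of the original training data.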
Copyright (c) 2024 Ilham Nurjabar, Lindung Parningotan Manik
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.