Mining version history to predict the class instability
Authors:
Shahid Hussain aff001; Humaira Afzal aff002; Muhammad Rafiq Mufti aff003; Muhammad Imran aff002; Amjad Ali aff004; Bashir Ahmad aff005
Author affiliations:
aff001: Department of Computer Science, COMSATS University, Islamabad, Pakistan
aff002: Department of Computer Science, Bahauddin Zakariya University, Multan, Pakistan
aff003: Department of Computer Science, COMSATS University, Vehari, Pakistan
aff004: Department of Computer, University of Swat, Swat, Pakistan
aff005: Department of Computer Science, Qurtaba University, DIK, Pakistan
Published in:
PLoS ONE 14(9)
Category:
Research Article
DOI:
https://doi.org/10.1371/journal.pone.0221780
Abstract
Most existing class stability assessors rely only on structural information retrieved from a single source-code snapshot. However, class stability is intrinsically characterized by the evolution of dependencies and change-propagation factors that promote the ripple effect. Identifying classes prone to the ripple effect (instable classes) by mining the version history of change-propagation factors can help developers reduce the effort needed to maintain and evolve the system. We propose Historical Information for Class Stability Prediction (HICSP), an approach that exploits change-history information to predict instable classes based on their correlation with change-propagation factors. We then performed two empirical studies. In the first, we evaluated HICSP on the version history of 10 open-source projects. In the second, replicated study, we evaluated the effectiveness of HICSP by tuning the parameters of its stability assessors. We observed a 4 to 16 percent improvement in F-measure when predicting instable classes with HICSP compared to existing class stability assessors. These promising results indicate that HICSP can identify instable classes and aid developers in their decision making.
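The core idea behind HICSP, mining co-change history as a proxy for change propagation, can be sketched as follows. This is a minimal illustration under assumed inputs (a list of commits, each a set of changed class names); the function names, scoring scheme, and threshold are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations
from collections import Counter

def cochange_counts(commits):
    """Count how often each pair of classes changes together across commits."""
    pair_counts = Counter()
    for changed in commits:
        for a, b in combinations(sorted(set(changed)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts

def ripple_scores(commits):
    """Score each class by its total co-changes (a rough change-propagation proxy)."""
    scores = Counter()
    for pair, n in cochange_counts(commits).items():
        for cls in pair:
            scores[cls] += n
    return scores

def flag_instable(commits, threshold=2):
    """Flag classes whose co-change score meets the threshold as instability candidates."""
    return {c for c, s in ripple_scores(commits).items() if s >= threshold}

# Hypothetical version history: each set is the classes touched by one commit.
commits = [
    {"Order", "Invoice"},
    {"Order", "Invoice", "Customer"},
    {"Customer"},
    {"Order", "Shipping"},
]
print(sorted(flag_instable(commits)))  # → ['Customer', 'Invoice', 'Order']
```

In HICSP such history-derived signals would feed a classifier whose predictions are evaluated with the F-measure; here a fixed threshold stands in for the learned model.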
Keywords:
Computer and information sciences – Computer software – Artificial intelligence – Machine learning – Support vector machines – Software engineering – Source code – Biology and life sciences – Organisms – Eukaryota – Animals – Vertebrates – Amniotes – Mammals – Camels – Physical sciences – Mathematics – Applied mathematics – Algorithms – Machine learning algorithms – Research and analysis methods – Simulation and modeling – Decision analysis – Decision trees – Engineering and technology – Management engineering