Automated Classification of Class Role-Stereotypes via Machine Learning
Arif Nurwidyantoro, Truong Ho-Quang, and Michel R. V. Chaudron
In Proceedings of the Evaluation and Assessment on Software Engineering, 2019
Role stereotypes indicate generic roles that classes play in the design of software systems (e.g. controller, information holder, or interfacer). Knowledge about the role-stereotypes can help in various tasks in software development and maintenance, such as program understanding, program summarization, and quality assurance. This paper presents an automated machine learning-based approach for classifying the role-stereotype of classes in Java. We analyse the performance of this approach against a manually labelled ground truth for a sizable open source project (of 770+ Java classes) for the Android platform. Moreover, we compare our approach to an existing rule-based classification approach. The contributions of this paper include an analysis of which machine learning algorithms and which features provide the best classification performance. This analysis shows that the Random Forest algorithm yields the best classification performance. We find however, that the performance of the ML-classifier varies a lot for classifying different role-stereotypes. In particular its performs degrades for rare role-types. Our ML-classifier improves over the existing rule-based classification method in that the ML-approach classifies all classes, while rule-based approaches leave a significant number of classes unclassified.