Investigating Order Information in API-Usage Patterns: A Benchmark and Empirical Study

Ervina Çergani, Sebastian Proksch, Sarah Nadi, Mira Mezini
2018 Proceedings of the 13th International Conference on Software Technologies  
Many approaches have been proposed for learning Application Programming Interface (API) usage patterns from code repositories. Depending on the underlying technique, the mined patterns may (1) be strictly sequential, (2) consider partial order between method calls, or (3) not consider order information. Understanding the trade-offs between these pattern types with respect to real code is important in many applications (e.g., code recommendation or misuse detection). In this work, we present a benchmark consisting of an episode mining algorithm that can be configured to learn all three types of patterns mentioned above. Running our benchmark on an existing dataset of 360 C# code repositories, we empirically study the resulting API usage patterns per pattern type. Our results show practical evidence that not only do partial-order patterns represent a generalized superset of sequential-order patterns, but partial-order mining also finds additional patterns missed by sequence mining, which are used by a larger number of developers across code repositories. Additionally, our study empirically quantifies the importance of the order information encoded in sequential and partial-order patterns for representing correct co-occurrences of code elements in real code. Furthermore, our benchmark can be used by other researchers to explore additional properties of API patterns.
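
The three pattern types named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's episode mining algorithm; it only checks whether an already given pattern occurs in a sequence of method calls. The helper `occurs`, the method names, and the example usages are all hypothetical. A pattern is modeled as a set of events plus a set of (before, after) constraints: a sequential pattern constrains every consecutive pair, a partial-order pattern constrains only some pairs, and a no-order pattern has no constraints.

```python
from itertools import product


def occurs(stream, events, order=frozenset()):
    """Return True if every event in `events` appears in `stream` at
    positions that respect each (before, after) pair in `order`."""
    # Candidate positions for each event; an empty list means the event is absent.
    positions = [[i for i, call in enumerate(stream) if call == e] for e in events]
    # Brute-force search over position choices (fine for short examples only).
    for choice in product(*positions):
        index = dict(zip(events, choice))
        if all(index[a] < index[b] for a, b in order):
            return True
    return False


# Hypothetical method names, for illustration only.
events = ("Open", "Read", "Log", "Close")

# (1) Strictly sequential: Open, then Read, then Log, then Close.
sequential = {("Open", "Read"), ("Read", "Log"), ("Log", "Close")}
# (2) Partial order: Read and Log may appear in either order between Open and Close.
partial = {("Open", "Read"), ("Open", "Log"), ("Read", "Close"), ("Log", "Close")}
# (3) No order: only co-occurrence matters.
unordered = set()

usage_a = ["Open", "Read", "Log", "Close"]
usage_b = ["Open", "Log", "Read", "Close"]
usage_c = ["Read", "Open", "Close", "Log"]

for name, usage in [("a", usage_a), ("b", usage_b), ("c", usage_c)]:
    print(name,
          occurs(usage, events, sequential),
          occurs(usage, events, partial),
          occurs(usage, events, unordered))
```

In this toy setting the partial-order pattern accepts everything the sequential pattern accepts plus the reordered usage `usage_b`, while the no-order pattern also accepts `usage_c`, where `Read` precedes `Open`. That last case is the kind of co-occurrence that order information is meant to rule out.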
doi:10.5220/0006839000910102 dblp:conf/icsoft/CerganiPNM18 fatcat:eitwod3u3rgthch5l2w7mtz5wq