DeepSCC: Source Code Classification Based on Fine-Tuned RoBERTa (S)
Proceedings of the 33rd International Conference on Software Engineering and Knowledge Engineering
Classifying the programming language of a code snippet is a common step in software engineering tasks, such as programming language tag prediction for code snippets from Stack Overflow. In this study, we propose a novel method, DeepSCC, which uses a fine-tuned RoBERTa model to classify the programming language of source code. In our empirical study, we choose a corpus collected from Stack Overflow, which contains 224,445 pairs of code snippets and corresponding …
doi:10.18293/seke2021-005 fatcat:nw7ghjuehvhtpagvose5vnmx3m