SSNCSE_NLP@Authorship Identification of SOurce COde (AI-SOCO) 2020

Nitin Nikamanth Appiah Balaji, B. Bharathi
2020 Forum for Information Retrieval Evaluation  
As the amount of data and the number of software applications increase, it becomes important to identify the true authors of a work for ownership and liability purposes. Problems such as plagiarism in academic activities and open-source contributions, and the identification of the creators of malware applications, can be addressed using automatic authorship identification models. In this work, the performance of character count vectorization and TF-IDF models is studied on the AI-SOCO dataset. We achieved a significant ... from the baseline, with 85% accuracy on the test set and 92% accuracy on the dev set.
dblp:conf/fire/BalajiB20b fatcat:y4kfkbcnnfgbbdjljdywkuaurq
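
The abstract names two feature representations, character count vectorization and TF-IDF. The following is a minimal sketch of both ideas in plain Python, not the authors' implementation. The single-character granularity, the smoothed IDF formula, the cosine nearest-neighbour classifier, the function names, and the toy snippets are all assumptions made for illustration; the record does not state which settings or classifier the paper used.

```python
import math
from collections import Counter


def char_counts(source):
    """Character count vector: how often each character occurs in the source."""
    return Counter(source)


def fit_tfidf(docs):
    """Return (TF-IDF vectors, IDF table) for a list of source-code strings.

    Assumption: IDF is smoothed as log((1 + n) / (1 + df)) + 1.
    """
    counts = [char_counts(d) for d in docs]
    n = len(docs)
    df = Counter()
    for c in counts:
        df.update(c.keys())  # document frequency of each character
    idf = {ch: math.log((1 + n) / (1 + df[ch])) + 1.0 for ch in df}
    vectors = [weight(c, idf) for c in counts]
    return vectors, idf


def weight(counts, idf):
    """Scale relative character frequencies by IDF; unseen characters get 0."""
    total = sum(counts.values())
    return {ch: (k / total) * idf.get(ch, 0.0) for ch, k in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def predict_author(snippet, train_vectors, train_authors, idf):
    """Hypothetical classifier: pick the author of the most similar training file."""
    vec = weight(char_counts(snippet), idf)
    scores = [cosine(vec, tv) for tv in train_vectors]
    return train_authors[scores.index(max(scores))]


# Toy data (invented): two authors with different whitespace habits.
train_docs = [
    "for(int i=0;i<n;i++){s+=a[i];}",
    "for (int i = 0; i < n; ++i)\n{\n    sum += arr[i];\n}",
]
train_authors = ["author_a", "author_b"]
train_vectors, idf = fit_tfidf(train_docs)

print(predict_author("while(x>0){x--;y+=x;}", train_vectors, train_authors, idf))
print(predict_author("while (x > 0)\n{\n    x -= 1;\n}", train_vectors, train_authors, idf))
```

Character-level features of this kind capture formatting habits such as spacing, brace placement, and operator style, which is one plausible reason they suit source-code authorship tasks.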