Intrinsic Plagiarism Detection using N-gram Classes

Imene Bensalem, Paolo Rosso, Salim Chikhi
2014 Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)  
When the suspicious document cannot be compared with the source document(s) from which text was plagiarized, evidence of plagiarism has to be sought intrinsically, in the document itself. In this paper, we introduce a novel language-independent intrinsic plagiarism detection method based on a new text representation that we call n-gram classes. The proposed method was evaluated on three publicly available standard corpora. The obtained results are comparable to the ones obtained by the best state-of-the-art methods.
doi:10.3115/v1/d14-1153 dblp:conf/emnlp/BensalemRC14 fatcat:dqvatxnw3ja6zpy463q7izitim