Mahak Samim: A Corpus of Persian Academic Texts for Evaluating Plagiarism Detection Systems
2016
Forum for Information Retrieval Evaluation
In this paper we introduce Mahak Samim, a plagiarism detection corpus of Persian academic texts in which plagiarism cases are embedded. The corpus, which can be used to evaluate plagiarism detection systems, contains more than five thousand artificial plagiarism cases of various lengths and degrees of obfuscation. The development process and the features of the corpus are described here.
CCS Concepts • Information systems ➝ Information retrieval ➝ Retrieval tasks
dblp:conf/fire/SharifabadiE16
fatcat:ke3usghg6veq3m24p6wetcuvxy