Plagiarism Detection Based on Citing Sentences [chapter]

Sidik Soleman, Atsushi Fujii
2017 Lecture Notes in Computer Science  
With the advent of the internet and easy access to digital libraries, plagiarism has become a major issue. One plagiarism detection technique converts plagiarism patterns into search queries and submits them to a search engine. Generating suitable queries is the heart of this technique, and existing methods fall short in query accuracy, in precision, and in the speed with which results are retrieved. This research work proposes a framework called ParaMaker. It generates accurate
... s of any sentence, similar to human behavior, and sends them to a search engine to find the plagiarism patterns. For English, ParaMaker is evaluated against six known methods on the standard PAN2014 datasets. The results show a 34% improvement in recall, while precision and speed are maintained. For Persian, statements from suspicious documents are examined and the results are compared with an exact-search approach. ParaMaker shows an improvement of at least 42% in recall, while precision and speed are maintained.
doi:10.1007/978-3-319-67008-9_38 fatcat:vjf67csttbg4fch7rra427xh5u