Unsupervised Model for Detecting Plagiarism in Internet-based Handwritten Arabic Documents

Mahmoud Zaher, Abdulaziz Shehab, Mohamed Elhoseny, Farahat Farag Farahat
2020 Journal of Organizational and End User Computing  
Due to the rapid increase of internet-based data, there is an urgent need for a robust, intelligent document security mechanism. Although there have been many attempts to build plagiarism detection systems for natural language documents, the unlimited variation and differing writing styles of each character in Arabic documents make building such systems challenging. Depending on its position in a word, the same Arabic letter can be written in three different ways, which makes handwritten character recognition a cumbersome process. This article proposes an intelligent unsupervised model, called ASTAP, to detect plagiarism in these documents. First, a handwritten Arabic character recognition system is proposed using the Grey Wolf Optimization (GWO) algorithm. Then, a modified Abstract Syntax Tree (AST) is used to match the contents of the Arabic documents to detect any similarity. Compared to state-of-the-art methods, ASTAP improves the effectiveness of plagiarism detection in terms of the matched similarity ratio, the precision ratio, and the processing time.
doi:10.4018/joeuc.2020040103 fatcat:tmecmdln6nflvpv4jkqtl5ch3u