A Novel Approach for Authorship Verification using Similarity Measure

2020 International Journal for Innovative Engineering and Management Research
Authorship verification is the task of determining whether two text documents were written by the same author by evaluating the veracity and authenticity of the writings. It is used in applications such as the analysis of anonymous emails in forensic investigations, verification of historical literature, continuous authentication in cyber-security, and detection of changes in writing style. The authorship verification problem depends primarily on the similarity among the documents. In this work, a new approach is proposed based on the similarity between the known documents of an author and an anonymous document. The approach extracts the most frequent terms from the dataset and uses them to represent the training and test documents as vectors. A term weight measure assigns each term its value in the vector representation. The cosine similarity measure is used to determine the similarity between the training and test documents. The similarity score is then compared against a threshold value to decide whether the test document was written by the suspected author. The PAN 2014 competition Authorship Verification dataset is used in the experiments. The proposed approach achieved the best results for authorship verification when compared with various solutions proposed in this domain.
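
The pipeline described in the abstract (most frequent terms, term weighting, cosine similarity, threshold decision) can be illustrated with a minimal Python sketch. The abstract does not specify the term weight measure, the number of terms, or the threshold, so everything concrete here is an assumption: relative term frequency stands in for the term weight measure, the threshold of 0.5 is arbitrary, and the function names (`most_frequent_terms`, `to_vector`, `verify_author`) are hypothetical. It is not the authors' implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def most_frequent_terms(documents, k):
    """Return the k most frequent terms across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return [term for term, _ in counts.most_common(k)]


def to_vector(text, vocabulary):
    """Represent a document over the vocabulary.

    Assumption: relative term frequency is used as the term weight,
    because the abstract does not name the measure.
    """
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty text
    return [counts[term] / total for term in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # no shared evidence if either vector is all zeros
    return dot / (norm_u * norm_v)


def verify_author(known_docs, unknown_doc, vocabulary, threshold=0.5):
    """Decide whether unknown_doc matches the author of known_docs.

    Assumption: the known documents are concatenated into one profile,
    and the threshold value is a placeholder, not taken from the paper.
    """
    known_vec = to_vector(" ".join(known_docs), vocabulary)
    unknown_vec = to_vector(unknown_doc, vocabulary)
    score = cosine_similarity(known_vec, unknown_vec)
    return score >= threshold, score
```

In use, the vocabulary would be built once from the whole dataset with `most_frequent_terms`, and `verify_author` would then be called per verification problem. A threshold of this kind would normally be tuned on training problems rather than fixed in advance.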
doi:10.48047/ijiemr/v09/i12/83 fatcat:fri65pdlkfaulog7xftoryzg3i