A systematic study of knowledge graph analysis for cross-language plagiarism detection

Marc Franco-Salvador, Paolo Rosso, Manuel Montes-y-Gómez
2016 Information Processing & Management  
Elsevier
Franco-Salvador, M.; Rosso, P.; Montes Gomez, M. (2016). A Systematic Study of Knowledge Graph Analysis for Cross-language Plagiarism Detection. Information Processing and Management. 52(4):550-570.

Abstract
Cross-language plagiarism detection aims to detect plagiarised fragments of text among documents in different languages. In this paper, we perform a systematic examination of Cross-language Knowledge Graph Analysis, an approach that represents text fragments using knowledge graphs as a language-independent content model. We analyse the contributions to cross-language plagiarism detection of the different aspects covered by knowledge graphs: word sense disambiguation, vocabulary expansion, and representation by similarities with a collection of concepts. In addition, we study both the relevance of concepts and their relations when detecting plagiarism. Finally, as a key component of the knowledge graph construction, we present a new weighting scheme of relations between concepts based on distributed representations of concepts. Experimental results in Spanish-English and German-English plagiarism detection show state-of-the-art performance and provide interesting insights on the use of knowledge graphs.
doi:10.1016/j.ipm.2015.12.004 fatcat:v44kzogvmffdvaego56qijxzwq