Graph-Based Siamese Network for Authorship Verification

Daniel Embarcadero-Ruiz, Helena Gómez-Adorno, Alberto Embarcadero-Ruiz, Gerardo Sierra
2022 Mathematics  
In this work, we propose a novel approach to solve the authorship identification task in a cross-topic and open-set scenario. Authorship verification is the task of determining whether or not two texts were written by the same author. We model each document as a graph, and a graph neural network then extracts relevant features from these graph representations. We present three strategies to represent the texts as graphs based on the co-occurrence of the POS labels of words. We propose a Siamese Network architecture composed of graph convolutional networks along with pooling and classification layers. We present different variants of the architecture and discuss the performance of each one. To evaluate our approach, we used a collection of fanfiction texts provided by the PAN@CLEF 2021 shared task in two settings: a "small" corpus and a "large" corpus. Our graph-based approach achieved average scores (over AUC ROC, F1, Brier score, F0.5u, and C@1) of 90% and 92.83% when trained on the "small" and "large" corpora, respectively. Our model obtains results comparable to the state of the art in this task and better than those of traditional baselines.
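
The abstract does not spell out the three graph-construction strategies, so the following is only a minimal sketch of the general idea of a POS co-occurrence graph, not the authors' construction. It assumes that nodes are distinct POS labels and that a directed, weighted edge counts how often one label is followed by another within a small window. The function names, the `window` parameter, and the sample tag sequence are all hypothetical.

```python
from collections import Counter


def pos_cooccurrence_graph(pos_tags, window=1):
    """Build a directed co-occurrence graph over POS labels.

    Assumption (not from the paper): an edge (a, b) counts how many times
    label b appears within `window` positions after label a.
    Returns (nodes, edges), where nodes is a sorted list of distinct labels
    and edges maps (source_label, target_label) to a count.
    """
    edges = Counter()
    for i, src in enumerate(pos_tags):
        for dst in pos_tags[i + 1 : i + 1 + window]:
            edges[(src, dst)] += 1
    nodes = sorted(set(pos_tags))
    return nodes, dict(edges)


def adjacency_matrix(nodes, edges):
    """Turn the edge counts into a dense adjacency matrix (list of lists)."""
    index = {label: k for k, label in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for (src, dst), count in edges.items():
        matrix[index[src]][index[dst]] = count
    return matrix


# Hypothetical pre-tagged sentence; a real pipeline would obtain these
# labels from a POS tagger.
tags = ["DET", "NOUN", "VERB", "DET", "ADJ", "NOUN"]
nodes, edges = pos_cooccurrence_graph(tags)
adj = adjacency_matrix(nodes, edges)
```

An adjacency matrix of this kind, together with node features, is the usual input to a graph convolutional layer. In a Siamese setup, both texts of a pair would pass through the same graph encoder before their pooled representations are compared.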
doi:10.3390/math10020277 fatcat:b45k7hhb6fgjpioku74d6tnrai