Towards Authorship Attribution for Bibliometrics using Stylometric Features

Andi Rexha, Stefan Klampfl, Mark Kröll, Roman Kern
2015 International Conference on Scientometrics and Informetrics  
The overwhelming majority of scientific publications are authored by multiple persons; yet, bibliographic metrics are only assigned to individual articles as single entities. In this paper, we aim at a more fine-grained analysis of scientific authorship. We therefore adapt a text segmentation algorithm to identify potential author changes within the main text of a scientific article, which we obtain by using existing PDF extraction techniques. To capture stylistic changes in the text, we adopt ... number of stylometric features. We evaluate our approach on a small subset of PubMed articles consisting of an approximately equal number of research articles written by a varying number of authors. Our results indicate that the more authors an article has, the more potential author changes are identified. These results can be considered an initial step towards a more detailed analysis of scientific authorship, thereby extending the repertoire of bibliometrics.
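The general idea of detecting potential author changes from stylometric features can be illustrated with a minimal sketch. It is not the segmentation algorithm or the feature set used in the paper, neither of which is specified in this abstract. The feature choices, the function-word list, the per-paragraph granularity, the distance threshold, and all function names below are assumptions made for illustration only.

```python
import math
import re
import statistics

# Hypothetical, deliberately tiny function-word list (an assumption of this sketch).
FUNCTION_WORDS = ("the", "of", "and", "to", "in", "that", "we", "is")


def stylometric_features(text):
    """Return a small vector of simple style measurements for one text block."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    features = [
        sum(len(w) for w in words) / n_words,         # mean word length
        len(words) / max(len(sentences), 1),          # mean sentence length in words
        len(set(words)) / n_words,                    # type-token ratio
        sum(text.count(c) for c in ",;:") / n_words,  # punctuation marks per word
    ]
    # Relative frequency of each function word.
    features += [words.count(fw) / n_words for fw in FUNCTION_WORDS]
    return features


def standardize(vectors):
    """Z-score each feature across all blocks so no single feature dominates."""
    scaled_columns = []
    for column in zip(*vectors):
        mean = statistics.fmean(column)
        std = statistics.pstdev(column) or 1.0  # guard against constant features
        scaled_columns.append([(value - mean) / std for value in column])
    return [list(row) for row in zip(*scaled_columns)]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def potential_author_changes(paragraphs, threshold):
    """Return indices of paragraphs that start after a large stylistic gap."""
    vectors = standardize([stylometric_features(p) for p in paragraphs])
    gaps = [euclidean(vectors[i], vectors[i + 1]) for i in range(len(vectors) - 1)]
    return [i + 1 for i, gap in enumerate(gaps) if gap > threshold]
```

A call such as `potential_author_changes(paragraphs, threshold=1.0)` returns the positions at which the style of adjacent paragraphs differs by more than the chosen threshold. In practice the threshold and the unit of comparison (paragraphs, sentences, or sliding windows) would need tuning on real data.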
dblp:conf/issi/RexhaKKK15 fatcat:p4xq2fonnfgsfdyks3id4jyxw4