A systematic metadata harvesting workflow for analysing scientific networks

Bilal H. Butt, Muhammad Rafi, Muhammad Sabih
2021 PeerJ Computer Science  
One of the disciplines behind the science of science is the study of scientific networks. This work treats scientific networks as social networks with different kinds of nodes and connections. Nodes can represent authors, articles or journals, while connections can represent citation, co-citation or co-authorship. One of the challenges in creating scientific networks is the lack of a publicly available, comprehensive data set, which limits the variety of analyses that can be run on the same set of nodes across different scientific networks. To support such analyses, we have worked on publicly available citation metadata from Crossref and OpenCitations. Using this data, a workflow is developed to create scientific networks. Analysis of these networks gives insights into academic research and scholarship. Different techniques of social network analysis have been applied in the literature to study these networks, including centrality analysis, community detection, and clustering coefficient. We use the metadata of the journal Scientometrics as a case study to present our workflow, and we did a sample run of the proposed workflow to identify prominent authors using centrality analysis. This work is not a bibliometric study of any field; rather, it presents replicable Python scripts to perform network analysis. With the growing popularity of open access and open metadata, we hypothesise that this workflow will provide an avenue for understanding scientific scholarship in multiple dimensions.
doi:10.7717/peerj-cs.421 pmid:33817056 pmcid:PMC7959659 fatcat:s27z2jycx5fqnlu4dj2tnoyp4i
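
As a rough illustration of the kind of analysis the abstract describes, the sketch below builds a co-authorship network from a few made-up author lists and ranks authors by degree centrality. It is not the authors' published script: the records, the author names and the function names are all assumptions made for this example, and degree centrality stands in for whichever centrality measures the paper actually applies.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy records: each entry is the author list of one article.
# These names are placeholders, not data from the paper.
articles = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author D"],
    ["Author B", "Author C"],
    ["Author E"],
]


def build_coauthorship_graph(author_lists):
    """Return an undirected graph as {author: set of co-authors}."""
    graph = defaultdict(set)
    for authors in author_lists:
        unique = sorted(set(authors))
        for name in unique:
            graph[name]  # register the author even without co-authors
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def degree_centrality(graph):
    """Degree centrality: number of neighbours divided by (n - 1)."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


graph = build_coauthorship_graph(articles)
centrality = degree_centrality(graph)

# Sort by descending score, breaking ties alphabetically.
ranking = sorted(centrality.items(), key=lambda item: (-item[1], item[0]))
for author, score in ranking:
    print(f"{author}: {score:.2f}")
```

In a real run, the author lists would come from harvested metadata rather than a hand-written list, and a graph library could replace the hand-rolled dictionary; the plain-Python version is used here only to keep the example self-contained.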