The information value of early career productivity in mathematics: a ROC analysis of prediction errors in bibliometricly informed decision making

Jonas Lindahl, Rickard Danell
2016 Scientometrics  
The aim of this study was to provide a framework to evaluate bibliometric indicators as decision support tools from a decision making perspective and to examine the information value of early career publication rate as a predictor of future productivity. We used ROC analysis to evaluate a bibliometric indicator as a tool for binary decision making. The dataset consisted of 451 early career researchers in the mathematical sub-field of number theory. We investigated the effect of three different
more » ... efinitions of top performance groups-top 10, top 25, and top 50 %; the consequences of using different thresholds in the prediction models; and the added prediction value of information on early career research collaboration and publications in prestige journals. We conclude that early career performance productivity has an information value in all tested decision scenarios, but future performance is more predictable if the definition of a high performance group is more exclusive. Estimated optimal decision thresholds using the Youden index indicated that the top 10 % decision scenario should use 7 articles, the top 25 % scenario should use 7 articles, and the top 50 % should use 5 articles to minimize prediction errors. A comparative analysis between the decision thresholds provided by the Youden index which take consequences into consideration and a method commonly used in evaluative bibliometrics which do not take consequences into consideration when determining decision thresholds, indicated that differences are trivial for the top 25 and the 50 % groups. However, a statistically significant difference between the methods was found for the top 10 % group. Information on early career collaboration and publication strategies did not add any prediction value to the bibliometric indicator publication rate in any of the models. The key contributions of this research is the focus on consequences in terms of prediction errors and the notion of transforming uncertainty into risk when we are choosing decision thresholds in bibliometricly informed decision making. The significance of our results are discussed from the point of view of a science policy and management.
doi:10.1007/s11192-016-2097-9 pmid:27942088 pmcid:PMC5124050 fatcat:7dl64vyvhrcmjh4xqqmyrvszze