Comparing Supervised Machine Learning Techniques for Genre Analysis in Software Engineering Research Articles

Felipe A. de Britto (FUMEC University, Brazil), Thiago C. Ferreira (Federal University of Minas Gerais, Brazil), Leonardo P. Nunes (Federal University of Minas Gerais, Brazil), Fernando S. Parreiras (FUMEC University, Brazil)
2021 Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Methods and Applications (unpublished)
Written communication is of utmost importance to the progress of scientific research. The speed of that progress, however, may be affected by the scarcity of reviewers available to referee the quality of research articles. In this context, automatic approaches that can query linguistic segments in written contributions by detecting the presence or absence of common rhetorical patterns have become a necessity in the refereeing process. This paper compares supervised machine learning techniques tested to accomplish genre analysis in Introduction sections of software engineering articles. A semi-supervised approach was used to augment the number of annotated sentences in SciSents. Two supervised approaches, using SVM and logistic regression, were undertaken to assess the F-score for genre analysis in the corpus. A technique based on logistic regression and BERT was found to perform genre analysis highly satisfactorily, with an average F-score of 88.25 when retrieving patterns at an overall level.
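The abstract reports an average F-score across rhetorical patterns. As a minimal sketch of how such an average can be computed, the snippet below derives a per-class F1 and an unweighted (macro) mean from gold and predicted sentence labels. The label names and toy data are hypothetical, and macro-averaging is an assumption; the abstract does not state which averaging scheme the authors used.

```python
from collections import Counter


def f1_per_class(gold, predicted):
    """Return {label: F1} for every label seen in gold or predicted."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was missed
    scores = {}
    for label in sorted(set(gold) | set(predicted)):
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, predicted):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(gold, predicted)
    return sum(scores.values()) / len(scores)


# Hypothetical rhetorical-pattern labels for six sentences (not from the paper).
gold = ["background", "gap", "purpose", "gap", "background", "purpose"]
pred = ["background", "gap", "gap", "gap", "background", "purpose"]

per_class = f1_per_class(gold, pred)
overall = macro_f1(gold, pred)
print(per_class)
print(round(100 * overall, 2))
```

Multiplying the macro average by 100 puts it on the same scale as the 88.25 figure quoted above, assuming that figure is a percentage.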
doi:10.26615/978-954-452-072-4_008 fatcat:tvedi7ab6ngajfjqldetuyytr4