DeepPaperComposer: A Simple Solution for Training Data Preparation for Parsing Research Papers
2020
Proceedings of the First Workshop on Scholarly Document Processing
unpublished
We present DeepPaperComposer, a simple solution for preparing highly accurate (100%) training data without manual labeling to extract content from scholarly articles using convolutional neural networks (CNNs). We used our approach to generate data and trained CNNs to extract eight categories of both textual (titles, abstracts, authors, headers, figure and table captions, and body texts) and nontextual content (figures and tables) from 2916 IEEE VIS conference papers spanning 30 years, of which a
doi:10.18653/v1/2020.sdp-1.10
fatcat:zx3n7qxtnbepthr5u5ouiwxhze