pdfPapers: shell-script utilities for frequency-based multi-word phrase extraction from PDF documents
[article]
2021
arXiv
pre-print
Biomedical research relies heavily on processing information from previously published papers. Over the past decade, this has motivated considerable effort to provide tools for text mining and information extraction from PDF documents. The *nix (Unix/Linux) operating systems offer many tools for working with text files; however, very few such tools are available for processing the contents of PDF files. This paper reports our effort to develop shell script utilities for *nix systems with the core
arXiv:2101.10554v1
fatcat:gpd4z6rxp5c5tesiidx4qjzmyu