A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020; you can also visit the original URL.
The file type is
The R Journal
Recent advances in natural language processing have produced libraries that extract lowlevel features from a collection of raw texts. These features, known as annotations, are usually stored internally in hierarchical, tree-based data structures. This paper proposes a data model to represent annotations as a collection of normalized relational data tables optimized for exploratory data analysis and predictive modeling. The R package cleanNLP, which calls one of two state of the art NLPdoi:10.32614/rj-2017-035 fatcat:55i4l3urxnfdnesqvrlxixmuea