Automatic Detection of Text Genre [article]

Brett Kessler, Geoffrey Nunberg, Hinrich Schuetze (Xerox PARC and Stanford University)
1997 arXiv pre-print
As the text databases available to users become larger and more heterogeneous, genre becomes increasingly important for computational linguistics as a complement to topical and structural principles of classification. We propose a theory of genres as bundles of facets, which correlate with various surface cues, and argue that genre detection based on surface cues is as successful as detection based on deeper structural properties.
arXiv:cmp-lg/9707002v1 fatcat:5o6lbf2tlrfezd6dijeyawmqba