Can We Quantify Domainhood? Exploring Measures to Assess Domain-Specificity in Web Corpora
[chapter]
2018
Communications in Computer and Information Science
Web corpora are a cornerstone of modern Language Technology. Corpora built from the web are convenient because their creation is fast and inexpensive. Several studies have assessed the representativeness of general-purpose web corpora by comparing them to traditional corpora. Less attention has been paid to assessing the representativeness of specialized or domain-specific web corpora. In this paper, we focus on the assessment of domain representativeness of web corpora and we
doi:10.1007/978-3-319-99133-7_17
fatcat:ncso5ksl5vfwvkrdeqehfqbuze