SCAP-TT: Tagging and lemmatising Spanish tourism discourse, and beyond

Patrick Goethals, Els Lefever, Lieve Macken
2017 Ibérica  
In this research note we report the first results of SCAP, the Spanish Corpus Annotation Project, as applied to tourism discourse (SCAP_tur). In particular, we present and assess a new TreeTagger parameter set for Spanish (SCAP-TT), trained for Part-of-Speech tagging (POS-tagging) and lemmatisation of Spanish promotional tourism texts. Although SCAP-TT was trained for specialised tourism discourse, we also show promising results for the annotation of other text genres, such as essays and literary texts.
doaj:717b4f30537d484688ca487c03925aae fatcat:xkdfylp555dkxl57vsnlthbyja