Computational Text Analysis within the Humanities [chapter]

2020 Reflektierte algorithmische Textanalyse  
This position paper is based on a keynote presentation at the COLING 2016 Workshop on Language Technology for Digital Humanities (LT4DH) in Osaka, Japan. It takes as its starting point observations about working practices in Humanities disciplines that follow a hermeneutic tradition of text interpretation, as opposed to the method-oriented research strategies in Computational Linguistics (CL). The respective praxeological traditions are quite different. Yet more and more researchers are willing to open up towards truly transdisciplinary collaborations, trying to exploit advanced methods from CL within research that ultimately addresses questions from the traditional humanities disciplines and the social sciences. The article identifies two central workflow-related issues for this type of collaborative project in the Digital Humanities (DH) and Computational Social Science: (i) a scheduling dilemma, which affects the point in the course of the project when specifications of the core analysis task are fixed (as early as possible from the computational perspective, but as late as possible from the Humanities perspective) and (ii) the subjectivity problem, which concerns the degree of intersubjective stability of the target categories of analysis. CL methodology demands high inter-annotator agreement and theory-independent categories, while the categories in hermeneutic reasoning are often tied to a particular interpretive approach (viz. a theory of literary interpretation) and may bear a non-trivial relation to a reader's pre-understanding. Building a comprehensive methodological framework that helps overcome these issues requires considerable time and patience. The established computational methodology has to be gradually opened up to more hermeneutically oriented research questions; resources and tools for the relevant categories of analysis have to be constructed. This article does not call into question that well-targeted efforts along this path are worthwhile. Yet it makes the following additional programmatic point regarding directions for future research: It might be fruitful to explore, in parallel, the potential lying in DH-specific variants of the concept of rapid
Note: This article is a slightly revised version of: Jonas Kuhn (2019).
doi:10.1515/9783110693973-004 fatcat:bmv3fasbu5hwlfx3vncsmcwnx4