Word Clustering for Historical Newspapers Analysis

Lidia Pivovarova, University of Helsinki, Finland, Jani Marjanen, Elaine Zosa
2019 Proceedings of the Workshop on Language Technology for Digital Historical Archives - with a Special Focus on Central-, (South-)Eastern Europe, Middle East and North Africa   unpublished
This paper is part of a collaboration between computer scientists and historians aimed at developing novel methods for the analysis of historical newspapers. We present a case study of ideological terms ending with the suffix -ism in nineteenth-century Finnish newspapers. We propose a two-step procedure to trace differences in word usage over time: training diachronic embeddings on several time slices, and then clustering the embeddings of selected words together with their neighbours to obtain historical context. The obtained clusters turn out to be useful for historical studies. The paper also discusses specific difficulties related to the development of historian-oriented tools.
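The second step of the procedure (clustering selected words together with their neighbours, separately for each time slice) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the two-dimensional vectors are hand-made toy values rather than trained embeddings, the English words stand in for Finnish vocabulary, the slice labels are invented, and the similarity-threshold clustering is a simple stand-in, since the abstract does not name the clustering algorithm used.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def nearest_neighbours(word, embeddings, k):
    """Return the k words most similar to `word` within one time slice."""
    scored = [(cosine(embeddings[word], vec), other)
              for other, vec in embeddings.items() if other != word]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]

def context_clusters(targets, embeddings, k=2, threshold=0.9):
    """Cluster target words together with their nearest neighbours.

    Words whose cosine similarity is at least `threshold` are linked, and
    each connected group of linked words forms one cluster (an assumed,
    deliberately simple clustering rule).
    """
    # Collect targets and their neighbours, keeping first-seen order.
    words = []
    for target in targets:
        for w in [target] + nearest_neighbours(target, embeddings, k):
            if w not in words:
                words.append(w)

    # Union-find over the collected words.
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if cosine(embeddings[a], embeddings[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for w in words:
        groups.setdefault(find(w), []).append(w)
    return sorted(sorted(group) for group in groups.values())

# Hypothetical toy embeddings for two time slices (not real data).
slices = {
    "early": {
        "socialism": [0.9, 0.1], "worker": [0.8, 0.2], "strike": [1.0, 0.0],
        "liberalism": [0.1, 0.9], "reform": [0.2, 0.8], "parliament": [0.0, 1.0],
    },
    "late": {
        "socialism": [0.9, 0.1], "worker": [0.8, 0.2], "strike": [1.0, 0.0],
        "liberalism": [0.1, 0.9], "reform": [0.7, 0.3], "parliament": [0.0, 1.0],
    },
}

targets = ["socialism", "liberalism"]
clusters_by_slice = {name: context_clusters(targets, emb) for name, emb in slices.items()}

for name, clusters in clusters_by_slice.items():
    print(name, clusters)
```

In this toy data the word "reform" sits in the cluster around "liberalism" in the first slice and in the cluster around "socialism" in the second, which is the kind of shift in context that comparing clusters across time slices is meant to surface.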
doi:10.26615/978-954-452-059-5_002 fatcat:i5hxvssjaffkbi4hs3ker5a43i