Extracting perceived landscape properties from text sources

Olga Koblet
2020
In parallel with the emergence of new data sources and the re-discovery of existing sources, such as written first-person narratives available in travel reports and diaries, is an increasing realisation of the importance of capturing bottom-up ways of experiencing landscapes. This recognition is reflected in different policy works, including overarching frameworks such as the European Landscape Convention and the Millennium Ecosystem Assessment, and local ones, such as Landscape Character Assessment in England
and Scotland (LCA) and the Swiss Landscape Monitoring Program. Important challenges for these frameworks are how to include multiple perspectives of landscape perception and how to integrate different senses including sound and smell experiences, memories and associations, and experiential perceptions such as touch and feel. The proliferation of new data in the form of natural language has brought with it a need for robust and reproducible workflows allowing extraction and classification of descriptions referring to perceived landscape properties. Therefore, the overall aim of this thesis is to explore the potential of written first-person narratives for landscape assessment and to develop methodological workflows that can extract and classify information containing visual, aural and olfactory perception as well as tranquillity from natural language. To approach this aim, we set out a series of experiments in Great Britain and the English Lake District: first, demonstrating to what degree landscape scenicness can be modelled purely as a function of language (Publication 1); second, extracting and classifying information of other senses from written first-person narratives (Publications 2, 3, 4) and exploring temporal changes in landscapes, in perception and in their polarity (Publication 3). Lastly, we created a spatial corpus of written first-person narratives and assessed whether the collected information is useful for practice (Publication 4). Our model based on written first-person narratives was able to explain 52% of the variation in scenicness, comparable to models using more traditional approaches of interviews and participatory methods, land cover data and social media, demonstrating that textual descriptions are feasible to use in studies of landscape perception. From these descriptions we were able to extract more than 8000 explicit references to aural perception in Great Britain, which accounts for a small percentage of all descriptions (ca.
0.25%), but in its absolute value exceeds what can be collected using interviews and does not intrude into the experiences of people. Estimation of polarity gave an additional level of understanding of descriptions classified into different types of sound emitter, with no clear distinction of, for example, anthrophony being more negative than biophony or geophony, contrary to the statements in literature that natural sounds have positive connotations. The majority of extracted descriptions in Great Britain (ca. 59%) referred to the perceived absence of sound; therefore, we undertook a detailed study of intertwined visual and aural perception reflected in the concept of tranquillity, concentrating on the Lake District region. To do so we used historical and contemporary corpora and a combination of macro- and micro-analysis, allowing us, first, to develop a taxonomy of tranquillity as encoded in natural language and, second, to explore the changes in descriptions of the Lake District, such as an overall decline in mentions of total silence and an increase in references to tranquillity as a contrast to anthropogenic intrusions. By mapping our results we were able to demonstrate that spatial modelling based on proximity to potential noise emitters as a proxy of tranquillity disturbance does not reveal tranquillity pockets close to transportation arteries, which emerged through our analysis. The aforementioned results were obtained from textual resources partly unique to Great Britain (ScenicOrNot and Geograph datasets). To demonstrate the transferability of our results to other territories, we created a workflow allowing collection of first-person narratives for a region of interest. Using this workflow we were able to collect almost 7000 rich first-person narratives of the Lake District, comprising ca. 8 million words.
From these descriptions we extracted, classified and linked to space more than 28000 references to visual perception, almost 1500 to aural perception and tranquillity and 78 explicit references to olfactory experiences. We explored these descriptions using four levels of granularity: Great Britain, the Lake District, areas of distinctive character as used in LCA, and individual named landscape elements. We presented the resulting dataset to the Lake District National Park Authority and an important local pressure group, Friends of the Lake District, who gave their feedback on strengths and weaknesses of our approach and explicitly confirmed its value for LCA and other monitoring activities of the National Park. Overall, our results demonstrated that written first-person narratives are a valuable source of landscape perception complementary to field-based studies, since they contain information about different types of perceived landscape properties that can be extracted and classified, and they extend temporal coverage, as demonstrated through analysis of the historical corpus. However, the possibility to 'go back in time' should not necessarily mean several centuries; it can be useful also for shorter time periods, when, for example, no interviews were conducted for a certain area. Important limitations of our approach from the data source point of view include a potential bias towards people who enjoy writing and a potential over/under-representation of certain groups based on other criteria. From the methodological point of view, despite the advances of natural language processing tools and techniques, natural language remains a challenging source of information for analysis, including problems related to disambiguation of words that can be used in different senses, detection of metaphorical and ironical phrases and differentiation between mentioned locations that are visited or simply seen.
Further research has to be done, first, to improve these methods and, second, to develop clear criteria that allow the assessment of how balanced a corpus of written first-person narratives is. We see great potential of our results for several fields of study including GIScience, landscape studies, digital humanities and tourism studies. Information available in textual sources can be scaled up to cover large spatial extents, offering GIScience the opportunity, first, to add an additional dimension of human experiences in the research related to, for example, sense of place and delineation of cognitive regions, and, second, to increase the participation in the production of spatial information and knowledge. Methods used in this work can be extended to explore other landscape-related concepts, which are likely to be captured in natural language, including concepts of wilderness and naturalness. Thus, we see value in integrating our results into a more general landscape preference model based on written first-person narratives and textual analysis. The corpus of first-person perception in the Lake District created in this work contains a plethora of writers and viewpoints, and allows researchers in the digital humanities to continue exploring the words (e.g., 'sublime', 'picturesque') and concepts they refer to (e.g., 'scenery', 'manner') as selected by contemporary authors to describe their affections towards landscapes. This information, as shown throughout our work, gives an additional level of understanding of the ways in which the writing of the forebears has influenced our landscape perception today, and invites deeper exploration of the extent to which modern-day tourism follows the foundations laid by Dorothy and William Wordsworth in the Romantic era and by Alfred Wainwright throughout the 20th century. This thesis is presented in two parts: a synthesis, describing the project as a whole, and the following four publications, which are found in the appendix.
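The extraction and classification of sense-related descriptions summarised above can be illustrated with a minimal sketch. It is not the workflow used in the thesis: it assumes a simple keyword-lexicon approach, and the lexicon contents and the function names (`SENSE_LEXICON`, `split_sentences`, `classify_sentence`, `extract_references`) are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical seed lexicons (assumption): the abstract does not list the
# vocabularies actually used, so these words are illustrative only.
SENSE_LEXICON = {
    "visual": {"view", "vista", "panorama", "scenery"},
    "aural": {"sound", "noise", "roar", "birdsong"},
    "olfactory": {"smell", "scent", "aroma", "stench"},
    "tranquillity": {"tranquil", "peaceful", "quiet", "silence"},
}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def classify_sentence(sentence):
    """Return the sorted list of sense labels whose lexicon words occur in the sentence."""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return sorted(sense for sense, words in SENSE_LEXICON.items() if tokens & words)


def extract_references(text):
    """Keep only sentences that match at least one sense, paired with their labels."""
    results = []
    for sentence in split_sentences(text):
        senses = classify_sentence(sentence)
        if senses:
            results.append((sentence, senses))
    return results


if __name__ == "__main__":
    sample = (
        "We climbed the fell at dawn. The view over the lake was superb. "
        "There was no sound but the scent of wet bracken."
    )
    for sentence, senses in extract_references(sample):
        print(senses, "<-", sentence)
```

A plain lexicon match of this kind cannot handle the difficulties noted above, such as words used in different senses, metaphor and irony, or the difference between perceived sound and its perceived absence, which is why it should be read only as a starting point.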
Publication 1: Chesnokova, O., Nowak, M., and Purves, R.S., 2017. A crowdsourced model of landscape preference. In:
doi:10.5167/uzh-193201 fatcat:nojlrh7vlnfybnvv5sx6lpzkfq