Exploration and Exploitation in Natural Viewing Behavior

Ricardo Ramos Gameiro, Kai Kaspar, Sabine U. König, Sontje Nordholt, Peter König
2017 Scientific Reports  
Many eye-tracking studies investigate visual behavior with a focus on image features and the semantic content of a scene. A wealth of results on these aspects is available, and our understanding of the decision process of where to look has reached a mature stage. However, the temporal aspect, whether to stay and further scrutinize a region (exploitation) or to move on and explore image regions that were not yet in the focus of attention (exploration), is less well understood. Here, we investigate
the trade-off between these two processes across stimuli with varying properties and sizes. In a free viewing task, we examined gaze parameters in humans, involving the central tendency, entropy, saccadic amplitudes, number of fixations and duration of fixations. The results revealed that the central tendency and entropy scaled with stimulus size. The mean saccadic amplitudes showed a linear increase that originated from an interaction between the distribution of saccades and the spatial bias. Further, larger images led to spatially more extensive sampling, as indicated by a higher number of fixations at the expense of reduced fixation durations. These results demonstrate a profound shift from exploitation to exploration as an adaptation of main gaze parameters with increasing image size.
Vision is the key modality by which humans interact with the environment. However, our processing capacity is limited regarding attention [1-5]. In fact, visual attention is an integral part of our interaction with the environment. By directing the line of sight through eye movements, humans actively select regions of interest for in-depth processing with high spatial resolution [6-8]. Therefore, investigating the visual system with an emphasis on overt visual attention has developed into a highly active research topic in cognitive science [9]. Although vision unfolds as an alternation of saccades and fixations, overt visual attention is a continuous process. We constantly have to decide whether to move on to sample another image region or to linger in the currently fixated region for in-depth processing. In analogy to other areas of science, here we label these two processes exploration and exploitation, respectively [10]. Thus, each decision to fixate on a new location terminates the scrutiny of the currently fixated region and establishes a classic exploration-exploitation dilemma [11, 12].
In visual behavior, the number and spatial distribution of fixations characterize the exploration of a scene [13]. By contrast, the time spent at a fixated location (i.e., fixation duration) reflects the degree of in-depth processing of what is observed and hence characterizes the exploitation aspect [14-16]. However, given time constraints for image observation and interpretation, exploration of the whole visual scene and exploitation of local image regions impose conflicting requirements. Consequently, while scanning a scene, overt attention in visual behavior consists of a continuous interplay between exploration and exploitation.
Vision research has identified several factors that influence eye movement behavior. These factors can be classified as top-down and bottom-up influences [2, 17-20] as well as spatial viewing biases [21-25]. Top-down effects are aspects of the observing agent, the task, and the context. In particular, top-down factors comprise the observer's current motivational state and time-independent personality traits [26, 27]. Furthermore, the observer's current emotional state [28, 29] as well as the emotional valence of external objects [30-32] are strong top-down influences on exploration and exploitation. Top-down factors also cover specific personal interests [33] that may differ depending on the current task performed by the observer [34, 35]. Overall, such top-down factors play a major role in viewing behavior and can explain a large part of the variance in eye movements. By contrast, bottom-up factors comprise the properties of the stimulus that influence the selection of fixation locations. These properties may relate to primary contrasts (e.g., luminance, color, and saturation). For instance, edge information and high contrast of image regions play a role in attracting fixations [23, 36-38].
In fact, models based on the concept of a salience map that incorporates such basic image properties can predict human visual
doi:10.1038/s41598-017-02526-1 pmid:28536434 pmcid:PMC5442137 fatcat:cwo4qwl7xngcpnzht5442yuzfe