Special Issue on Automated Perception of Human Affect from Longitudinal Behavioral Data

Pablo Barros, Stefan Wermter, Ognjen Rudovic, Hatice Gunes
2021 IEEE Transactions on Affective Computing  
Research trends within artificial intelligence and cognitive sciences are still heavily based on computational models that attempt to imitate human perception in various behavior categorization tasks. However, most research in the field focuses on instantaneous categorization and interpretation of human affect, such as the inference of the six basic expressions of emotion from face images, affective dimensions (valence-arousal), and stress and engagement from multi-modal (e.g., video, audio, autonomic physiology) data. This diverges from the developmental aspect of emotional behavior perception and learning, where human behavior and expressions of affect evolve and change over time.

This calls for a new perspective when designing computational models for the analysis and interpretation of human affective behavior: models that can adapt timely and efficiently to different contexts and individuals over time, and that incorporate existing neurophysiological and psychological aspects (prior knowledge). The long-term goal is thus to create life-long, personalized learning and inference systems for the analysis and perception of human affect. Such systems would benefit from long-term contextual information (including demographics and social aspects) as well as individual characteristics. This, in turn, would allow building intelligent agents (such as mobile and robot technologies) capable of adapting their behavior continuously and online to the target contexts and individuals.

This special issue is aimed at contributions from computational neuroscience and psychology, artificial intelligence, machine learning, and affective computing that challenge and expand current research on the interpretation and estimation of human affective behavior from longitudinal data, i.e., single or multiple modalities captured over extended periods of time, allowing efficient representation of behavior and inference in terms of affect and other socio-cognitive dimensions.

In this regard, Hang, Zhang, Ren and Schuller present a novel method for improving the accuracy of emotion recognition from single modalities by extracting contextual information from auxiliary modalities. Du, Wu, Huang, Li and Wang introduce a convolution-based encoder/decoder structure for the recognition of video-based emotional expressions. Their model leverages the capability of convolutional neural networks to encode audio and visual information.
They propose a novel temporal correlation model, built on the learned encodings, to extract time-related features for video-based dimensional affect recognition.

Modeling temporal information is also the key aspect of the paper by Ong, Wu, Zhi-Xuan, Reddan, Kahhale, Mattek and Zaki. Their focus, however, is on extracting the contextual dynamics of dialogue-rich interactions. They collected and present the Stanford Emotional Narratives Dataset (SENDv1) and conducted several affect-recognition experiments with recurrent neural networks that achieve human-level performance. In the same context, Kollias and Zafeiriou present a study on multi-CNN features used in a CNN-RNN architecture to recognize affect from monologues. They investigate different feature-fusion and multi-task training schemes and provide an understanding of the impact of feature recombination and feature choice for longitudinal affect recognition.

Another important factor when recognizing subjective information from videos is the observer's natural bias, which is likely to affect the interpretation and modelling of computational solutions. Principi, Palmero, Jacques Junior and Escalera present an interesting study in which they identify different subjective biases in an apparent personality analysis task. They propose a deep neural network, combining auditory and visual encoders, to model and identify such biases and reduce their impact on the task. Chen and Epps propose event-based behavior modelling, based on wearable devices, to recognize task load level in a temporally longitudinal task. Their main contribution lies in adapting, processing and integrating physiological and behavioral information to achieve a dynamic decision-making process.

Human perception is subjective, and to model and recognize it with artificial solutions, it is also important to understand the underlying factors that shape our perception.
McDuff, Jun, Rowan and Czerwinski investigate the impact of emotion-regulation strategies on the expression of affect in temporal tasks. One of their findings is that negative emotions tend to increase over the course of a day, which may inform the development of future affective computing solutions.
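Several of the contributions above share a common recipe for longitudinal affect inference: per-frame features (e.g., from a convolutional encoder) are aggregated over time by a recurrent model that emits dimensional affect estimates such as valence and arousal. As a purely illustrative sketch (not any specific paper's model; all dimensions, weights and names here are hypothetical), a minimal Elman-style recurrence over a sequence of frame features looks like this:

```python
import math
import random

def tanh_vec(v):
    return [math.tanh(x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def vadd(a, b):
    return [x + y for x, y in zip(a, b)]

class SimpleRNN:
    """Minimal Elman recurrence: h_t = tanh(Wx x_t + Wh h_{t-1}).

    The inputs x_t stand in for per-frame CNN encodings; the final
    hidden state is mapped to a dimensional affect estimate.
    """
    def __init__(self, in_dim, hid_dim, out_dim, seed=0):
        rng = random.Random(seed)
        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]
        self.Wx = mat(hid_dim, in_dim)   # input-to-hidden weights
        self.Wh = mat(hid_dim, hid_dim)  # hidden-to-hidden weights
        self.Wo = mat(out_dim, hid_dim)  # hidden-to-output weights
        self.hid_dim = hid_dim

    def forward(self, frames):
        h = [0.0] * self.hid_dim
        for x in frames:                 # aggregate the sequence over time
            h = tanh_vec(vadd(matvec(self.Wx, x), matvec(self.Wh, h)))
        # tanh keeps the two outputs in [-1, 1], matching the usual
        # valence-arousal range.
        return tanh_vec(matvec(self.Wo, h))

# Usage: ten "frames" of 8-dimensional features standing in for CNN outputs.
rnn = SimpleRNN(in_dim=8, hid_dim=16, out_dim=2)
frames = [[0.1 * t] * 8 for t in range(10)]
valence, arousal = rnn.forward(frames)
```

In practice the encoder and recurrence are trained jointly (e.g., CNN-RNN stacks as in the Kollias and Zafeiriou study), but the sketch conveys the core idea the editorial highlights: inference is made over an extended observation window rather than a single instant.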
doi:10.1109/taffc.2021.3079535