Physiological Inspired Deep Neural Networks for Emotion Recognition

Pedro M. Ferreira, Filipe Marques, Jaime S. Cardoso, Ana Rebelo
2018 IEEE Access  
Facial expression recognition (FER) is currently one of the most active research topics due to its wide range of applications in the human-computer interaction field. An important part of the recent success of automatic FER was achieved thanks to the emergence of deep learning approaches. However, training deep networks for FER is still a very challenging task, since most of the available FER data sets are relatively small. Although transfer learning can partially alleviate the issue, the performance of deep models is still below its full potential, as deep features may contain redundant information from the pre-trained domain. Instead, we propose a novel end-to-end neural network architecture along with a well-designed loss function based on the strong prior knowledge that facial expressions are the result of the motions of some facial muscles and components. The loss function is defined to regularize the entire learning process so that the proposed neural network is able to explicitly learn expression-specific features. Experimental results demonstrate the effectiveness of the proposed model in both lab-controlled and wild environments. In particular, the proposed neural network provides promising results, outperforming the current state-of-the-art methods in most cases.
INDEX TERMS: Facial expressions recognition, convolutional neural networks, regularization, domain-knowledge.
ISSN 2169-3536. VOLUME 6, 2018. 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
FIGURE 1. Illustration of the main challenges of FER. These challenges are mainly related to: (a) physical factors such as pose, viewing angle, occlusions and illumination; and (b) psychological factors such as the inter-individual variability in facial expressiveness.
FIGURE 2. The six basic facial expressions. From left to right: surprise, sadness, fear, anger, disgust and happy.
... be universal across cultures and subgroups, namely: happy, surprise, fear, anger, sadness, and disgust (see Figure 2); some systems also recognize the neutral and the contempt expressions [3].
Fewer works follow the dimensional approach, in which FER is treated as a regression problem in a continuous two-dimensional space, usually arousal and valence [4], [5]. One example is the work of Kosti et al. [5], who proposed a comprehensive database annotated with both discrete emotional categories and continuous emotional dimensions. The higher dimensionality of the arousal/valence space potentially allows describing more complex and subtle emotions. However, this richer representation of expressions is more difficult to use in practice, since linking such a dimensional representation to a specific emotion is not straightforward [3]. Other works attempt to recognize micro-expressions, which are brief involuntary facial expressions that reveal the emotions a person tries to conceal, especially in high-stakes situations [6]. A comprehensive and recent survey on FER can be found in [3].
FIGURE 3. Block diagram of a typical FER system, where I denotes the input image and ŷ represents the predicted facial expression (FE).
doi:10.1109/access.2018.2870063 fatcat:6euqkvfmkfgajdhzmcfrvmgqnm