The Role of Context in Affective Behavior Understanding [chapter]

Louis-Philippe Morency
2013 Social Emotions in Nature and Artifact  
Face-to-face communication is highly interactive. Even when only one person speaks at a time, the other participants exchange information continuously amongst themselves and with the speaker through gesture, gaze, posture and facial expressions. Such affective feedback is an essential and predictable aspect of natural conversation, and its absence can significantly disrupt participants' ability to communicate [2, 20]. During multi-party interactions such as meetings, information is exchanged between participants using both audio and visual channels. Visual feedback can range from a simple eye glance to a large arm gesture or posture change. One important visual cue during conversation is the head nod. Head nods are used to display agreement, ground information, and manage turn-taking [7, 8]. Recognizing these affective gestures is important for understanding all the information exchanged during a meeting or conversation, and can be particularly crucial for identifying more subtle factors such as the effectiveness of communication [17], points of confusion, status relationships between participants [18], or the diagnosis of social disorders [15].

This chapter argues that it is possible to significantly improve state-of-the-art recognition techniques by exploiting regularities in how people communicate. People do not provide affective feedback at random. Rather, they react to the current topic, previous utterances and the speaker's current verbal and nonverbal behavior [1]. For example, listeners are far more likely to nod or shake their head if the speaker has just asked them a question, and incorporating such dialogue context can improve recognition performance during human-robot interaction [11]. More generally, speakers and listeners co-produce a range of lexical, prosodic, and nonverbal patterns. Our goal is to automatically discover these patterns using only easily observable features of human face-to-face interaction (e.g., prosodic features and eye gaze), and exploit them to improve recognition accuracy. This chapter shows that the recognition of affective gestures can be improved by modeling this conversational context.
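To make the idea of exploiting conversational context concrete, the Python sketch below pairs a listener's visual head-motion features with contextual features drawn from the speaker's channel (e.g., whether a question was just asked, a speaker pause, a prosodic pitch cue) and feeds the combined vector to a standard logistic-regression classifier to label head-nod frames. This is a minimal illustration only, not the chapter's actual model: the feature names, the synthetic data, and the choice of classifier are assumptions made for the example.

# Minimal sketch: fusing listener visual features with speaker context
# to classify head-nod frames. All feature names and data are hypothetical;
# the chapter's actual features and models may differ.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
n_frames = 2000

# Listener-only visual features (e.g., vertical head velocity, gaze-at-speaker flag).
visual = rng.normal(size=(n_frames, 2))

# Contextual features from the speaker's channel (assumed and simplified):
# whether a question was just asked, whether the speaker is pausing, a pitch slope.
context = np.column_stack([
    rng.integers(0, 2, n_frames),   # question just asked (0/1)
    rng.integers(0, 2, n_frames),   # speaker pause (0/1)
    rng.normal(size=n_frames),      # prosodic pitch slope
])

# Synthetic labels: nods are more likely when the head moves AND a question was asked.
logits = 1.5 * visual[:, 0] + 2.0 * context[:, 0] - 1.0
labels = (rng.random(n_frames) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

def evaluate(features, name):
    # Train/test split and a plain logistic-regression classifier.
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, labels, test_size=0.3, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print(f"{name}: F1 = {f1_score(y_te, clf.predict(X_te)):.3f}")

evaluate(visual, "visual features only")
evaluate(np.hstack([visual, context]), "visual + dialogue context")

On this toy data the context-augmented classifier scores higher only because the labels were generated to depend on the question cue; the point is to show the feature-fusion pattern, not to reproduce the chapter's results.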
doi:10.1093/acprof:oso/9780195387643.003.0009