
Affect corpus 2.0

Ricardo A. Calix, Gerald M. Knapp
2011 Proceedings of the second annual ACM conference on Multimedia systems - MMSys '11  
Need for annotation at the actor level that includes: actors per story; actor presence in a sentence and location in a story; emotion magnitudes per actor. Praat TextGrids were used to annotate sentence ... Objectives: interest in emotion detection and emotion prediction for advertisements, human-computer interaction, and social media mining; resources are needed to train and test emotion prediction models ... Applications: speech/text-to-scene processing, text-to-speech processing, HCI, calibration of emotion recognition within multimedia systems, social media content analysis and Twitter dialog censoring ...
doi:10.1145/1943552.1943570 dblp:conf/mmsys/CalixK11 fatcat:yhuhbg4wh5dg3ktzujznvfbpv4

Feature Extraction from Speech Data for Emotion Recognition

S. Demircan, H. Kahramanlı
2014 Journal of Advances in Computer Networks  
In this paper we perform the pre-processing necessary for emotion recognition from speech data and extract features from the speech signal. ... In recent years, work requiring human-machine interaction, such as speech recognition and emotion recognition from speech, has been increasing. ... Nwe et al. [4] propose a text-independent method for emotion classification of speech. ...
doi:10.7763/jacn.2014.v2.76 fatcat:psvgkmmbrbaghpeltcezpviufa
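The abstract above describes extracting features from the speech signal as a pre-processing step. As a minimal, illustrative sketch (not the paper's actual pipeline), two classic low-level descriptors, short-time energy and zero-crossing rate, can be computed per frame in plain Python:

```python
import math

def frame_features(signal, frame_len=400, hop=160):
    """Split a mono signal into overlapping frames and compute two
    classic low-level descriptors: short-time energy and
    zero-crossing rate (fraction of sign changes per frame)."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        zcr = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        ) / (frame_len - 1)
        feats.append((energy, zcr))
    return feats

# Toy input: 0.1 s of a 200 Hz tone sampled at 16 kHz.
tone = [math.sin(2 * math.pi * 200 * n / 16000) for n in range(1600)]
feats = frame_features(tone)
```

With a 400-sample window and 160-sample hop this yields eight frames; real systems would typically window the frames and add spectral features such as MFCCs on top.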

A multimodal approach of generating 3D human-like talking agent

Minghao Yang, Jianhua Tao, Kaihui Mu, Ya Li, Jianfeng Che
2011 Journal on Multimodal User Interfaces  
A simplified high-level Multimodal Marker Language (MML), in which only a few fields are used to coordinate the agent channels, is introduced to drive the agent. ... In this framework, lip movements are obtained by searching and matching acoustic features, represented by Mel-frequency cepstral coefficients (MFCCs), in an audio-visual bimodal database. ... The actions in the same column belong to the same emotional state and differ in action speed and magnitude. ...
doi:10.1007/s12193-011-0073-5 fatcat:34jcjjmdqbhwtkx3balxs5laci

Recognizing emotions of characters in movies

Ruchir Srivastava, Shuicheng Yan, Terence Sim, Sujoy Roy
2012 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
... and occlusions in scenes. ... This work presents an investigation into recognizing emotions of people in near-real-life scenarios. ... We are interested in the individual and collective emotions of actors in a movie, which may or may not have a predictable effect on viewers. For example, some action-movie scenes make people laugh. ...
doi:10.1109/icassp.2012.6288052 dblp:conf/icassp/SrivastavaYSR12 fatcat:dq32djdzwjbttankmqgfp7dcpe

A Study on a Speech Emotion Recognition System with Effective Acoustic Features Using Deep Learning Algorithms

Sung-Woo Byun, Seok-Pil Lee
2021 Applied Sciences  
In this work, we constructed a Korean emotional speech database for speech emotion analysis and proposed a feature combination that can improve emotion recognition performance using a recurrent neural ... In speech emotion recognition, the most important issue is the effective parallel use of proper speech-feature extraction and an appropriate classification engine. ... Linear Prediction Cepstrum Coefficients (LPCCs), MFCCs, and F0 have been widely used for the recognition of speech emotion. ...
doi:10.3390/app11041890 fatcat:p6vw34pr2ngp5igtskwy3mc2ym
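The snippet lists F0 among the widely used features. As a rough, illustrative sketch (an assumption-level toy, not the authors' method), F0 can be estimated by picking the autocorrelation peak within a plausible pitch-lag range:

```python
import math

def estimate_f0(signal, sr, fmin=50.0, fmax=500.0):
    """Rough F0 estimate: find the lag in [sr/fmax, sr/fmin]
    that maximizes the signal's autocorrelation."""
    lo = int(sr / fmax)          # shortest lag to consider
    hi = int(sr / fmin)          # longest lag to consider
    best_lag, best_r = lo, float("-inf")
    for lag in range(lo, hi + 1):
        r = sum(signal[n] * signal[n + lag]
                for n in range(len(signal) - lag))
        if r > best_r:
            best_r, best_lag = r, lag
    return sr / best_lag

sr = 8000
tone = [math.sin(2 * math.pi * 200 * n / sr) for n in range(800)]
f0 = estimate_f0(tone, sr)
```

Production systems use more robust trackers (e.g. normalized autocorrelation with voicing decisions), but the lag-search idea is the same.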

Feature Optimization of Speech Emotion Recognition

Chunxia Yu, Ling Xie, Weiping Hu
2016 Journal of Biomedical Science and Engineering  
Speech emotion is divided into four categories in this paper: fear, happy, neutral, and surprise. Traditional features and their statistics are generally applied to recognize speech emotion. ... What is more, two new characteristics of speech emotion are proposed: an MFCC feature extracted from the fundamental frequency curve (MFCCF0) and amplitude perturbation parameters extracted from the short-time average magnitude ... Four actors (two female and two male) read 50 different texts in each of the six emotions. ...
doi:10.4236/jbise.2016.910b005 fatcat:zstjq5uudffxnfssuotksgv4ty
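The abstract mentions amplitude perturbation parameters extracted from the short-time average magnitude. A hypothetical sketch of that contour, paired with a simple frame-to-frame perturbation measure (not the paper's exact definition), could look like:

```python
def short_time_average_magnitude(signal, frame_len=400, hop=160):
    """Short-time average magnitude: mean |s[n]| per frame, a cheap
    energy-like contour of the signal."""
    mags = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        mags.append(sum(abs(s) for s in frame) / frame_len)
    return mags

def perturbation(mags):
    """Mean absolute frame-to-frame change of the magnitude contour,
    a simple stand-in for an amplitude perturbation parameter."""
    return sum(abs(a - b) for a, b in zip(mags, mags[1:])) / (len(mags) - 1)

# A perfectly steady signal has a flat contour and zero perturbation.
pulse = [1.0] * 1600
mags = short_time_average_magnitude(pulse)
```

Emotional speech tends to show larger magnitude perturbation than neutral speech, which is why such parameters are candidate features.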

Acoustic correlates for perceived effort levels in male and female acted voices

Mary Pietrowicz, Mark Hasegawa-Johnson, Karrie G. Karahalios
2017 Journal of the Acoustical Society of America  
In this exploratory, open-ended, mixed-methods study, approximately 60% of all responses described emotion, and the remainder split evenly between voice quality (including effort levels) and ... Perception-grounded male and female acoustic feature sets that tracked the actors' expressive effort levels through the continuum of whispered, breathy, modal, and resonant speech are presented and validated ... ACKNOWLEDGMENTS: This work was funded in part by grant R21HS022948 from AHRQ. All findings and opinions are those of the authors and are not endorsed by AHRQ. ...
doi:10.1121/1.4997189 pmid:28863599 fatcat:tehetg7chfalnaxzm4gc4qbvgq

Joint Learning of Speech-Driven Facial Motion with Bidirectional Long-Short Term Memory [chapter]

Najmeh Sadoughi, Carlos Busso
2017 Lecture Notes in Computer Science  
The inputs to the models are features extracted from speech that convey the verbal and emotional states of the speakers. ... The face conveys a blend of verbal and nonverbal information, playing an important role in daily interaction. ... The IEMOCAP corpus is emotionally annotated at the speaking-turn level by three annotators in terms of nine emotional categories (neutral, anger, happiness, sadness, fear, frustration, surprise, disgust ...
doi:10.1007/978-3-319-67401-8_49 fatcat:6j52e3odbfekbmr6piot2lhy54

Mistaking minds and machines: How speech affects dehumanization and anthropomorphism

Juliana Schroeder, Nicholas Epley
2016 Journal of experimental psychology. General  
We predicted that paralinguistic cues in speech are particularly likely to convey the presence of a humanlike mind, such that removing voice from communication (leaving only text) would increase the likelihood  ...  that defining features of personhood may be conveyed more clearly in speech (Experiments 1 and 2).  ...  We had no a priori prediction about the effect of emotional valence; we manipulated valence only as a robustness check for the magnitude of our predicted effect of speech (vs. text).  ... 
doi:10.1037/xge0000214 pmid:27513307 fatcat:2d363rs2nnc4nc3viltiz4triq

A Review on Emotion Recognition Algorithms using Speech Analysis

Teddy Surya Gunawan, Muhammad Fahreza Alghifari, Malik Arman Morshidi, Mira Kartiwi
2018 Indonesian Journal of Electrical Engineering and Informatics (IJEEI)  
In recent years, there has been growing interest in speech emotion recognition (SER) by analyzing input speech. ... SER can be considered a pattern-recognition task comprising feature extraction, a classifier, and a speech emotion database. ... The EMO-DB and the LDC Emotional Prosody Speech and Transcripts are two examples of actor-based databases. ...
doi:10.11591/ijeei.v6i1.409 fatcat:rqye5oo4gvaytnvbdprcm6dzdm

Emotion Recognition from Speech using Prosodic and Linguistic Features

Mahwish Pervaiz, Tamim Ahmed
2016 International Journal of Advanced Computer Science and Applications  
The speech signal can be used to extract emotions. However, variability in the speech signal can make emotion extraction a challenging task. ... Separately, prosodic/temporal and linguistic features of speech do not provide results with adequate accuracy. We can also infer emotions from linguistic features if we can identify the content. ... Therefore, confusion arises between the recognition of these two sets of emotions, because emotion is extracted directly and only from the speech signal or text, and because of the feature set used for recognition ...
doi:10.14569/ijacsa.2016.070813 fatcat:lp5bqyxjezbx7cgtzfer7b7kry

A CNN-Assisted Enhanced Audio Signal Processing for Speech Emotion Recognition

Mustaqeem, Soonil Kwon
2019 Sensors  
A SoftMax classifier is used for the classification of emotions in speech. ... The proposed technique is evaluated on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) and Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) datasets, improving accuracy by 7.85% ... [26] utilized a complicated model: a DBN was used for feature learning to obtain hidden features from speech, and an SVM classifier was utilized for emotion prediction, achieving high accuracy in SER using ...
doi:10.3390/s20010183 pmid:31905692 pmcid:PMC6982825 fatcat:wmf5dbicqza7ndkdxt6ufgw7ny
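The abstract states that a SoftMax classifier performs the final emotion classification. As a generic sketch rather than the paper's implementation, a numerically stable softmax maps raw per-emotion scores to a probability distribution:

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max logit before
    exponentiating, then normalize to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical per-emotion scores from some upstream network.
probs = softmax([2.0, 1.0, 0.1, -1.0])
predicted = probs.index(max(probs))  # index of the most likely emotion
```

Subtracting the maximum logit changes nothing mathematically but prevents overflow when logits are large, which is why frameworks implement softmax this way.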

Expressive speech synthesis: a review

D. Govind, S. R. Mahadeva Prasanna
2012 International Journal of Speech Technology  
The review provided in this paper includes the various approaches to text-to-speech synthesis, studies on the analysis and estimation of expressive parameters, and studies on ... In this approach, ESS is achieved by modifying the parameters of the neutral speech synthesized from the text. ... The present work is also supported by an ongoing DIT-funded project on the development of text-to-speech synthesis systems in the Assamese and Manipuri languages. ...
doi:10.1007/s10772-012-9180-2 fatcat:syjgawdjbbdapmdq6d6h5ulzni

Multimodal emotion estimation and emotional synthesize for interaction virtual agent

Minghao Yang, Jianhua Tao, Hao Li, Kaihui Mu
2012 2012 IEEE 2nd International Conference on Cloud Computing and Intelligence Systems  
For the output module of the agent, the voice is generated by a TTS (text-to-speech) system from freely given text. ... In this study, we create a 3D interactive virtual character based on multimodal emotion recognition and rule-based emotion synthesis techniques. ... The actions in a column belong to the same emotional state and differ in speed and magnitude. Finally, we obtain 42 action units for all emotions. ...
doi:10.1109/ccis.2012.6664394 dblp:conf/ccis/YangTLM12 fatcat:fiyjm7wdojfyhflr2lnrnuczwa

Voice Feature Extraction for Gender and Emotion Recognition

Madhu M. Nashipudimath, Pooja Pillai, Anupama Subramanian, Vani Nair, Sarah Khalife, V.A. Vyawahare, M.D. Patil
2021 ITM Web of Conferences  
The speech signal consists of semantic information and speaker information (gender, age, emotional state), accompanied by noise. ... This led to the development of a model for "voice feature extraction for emotion and gender recognition". ... [The authors thank] Sharvari Govilkar and Principal Dr. Sandeep M. Joshi, Pillai College of Engineering, New Panvel, for their support. ...
doi:10.1051/itmconf/20214003008 fatcat:5e54lfohjvdolmkteqccba6srm
Showing results 1–15 out of 7,411 results