Speech Emotion Recognition using Neural Networks
2019
International journal of recent technology and engineering
Speech emotion recognition is an area concerned with identifying emotions from human verbal expression. ...
Implementation of Speech Emotion Recognition may involve several learning models, classification methods, feature extraction and pattern recognition. ...
The spectral and cepstral features are Mel frequency cepstral coefficients (MFCC) and linear prediction cepstral coefficients (LPCC); acoustic features such as pitch, energy, and formants are also used in speech ...
doi:10.35940/ijrte.b1432.0982s1119
fatcat:zxocwlr2tbel3n7i5iombaeinq
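Several entries in this listing rely on MFCC features. As background, the standard MFCC pipeline (framing, power spectrum, mel filterbank, log, DCT) can be sketched in NumPy/SciPy; the frame sizes and filter counts below are illustrative defaults, not values from any of the listed papers.

```python
# Minimal MFCC extraction sketch (NumPy/SciPy only); parameter values
# are common illustrative choices, not taken from the papers above.
import numpy as np
from scipy.fftpack import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_fft=512,
         n_mels=26, n_ceps=13):
    # Slice the signal into overlapping, Hamming-windowed frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular mel filterbank between 0 Hz and the Nyquist frequency.
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / max(right - center, 1)
    # Log filterbank energies, then DCT-II to decorrelate (cepstrum).
    feats = np.log(power @ fbank.T + 1e-10)
    return dct(feats, type=2, axis=1, norm='ortho')[:, :n_ceps]
```

In practice a library such as librosa or python_speech_features would replace this hand-rolled version.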
Deep Learning Approach for Spoken Digit Recognition in Gujarati Language
2022
International Journal of Advanced Computer Science and Applications
To implement a deep learning approach, Convolutional Neural Network (CNN) with MFCC is used to analyze audio clips to generate spectrograms. ...
With this approach, a maximum accuracy of 98.7% is achieved for spoken digits in the Gujarati language, with 98% precision and 98% recall. ...
Sen [10] proposed a framework for digit recognition using a neural network. For feature extraction they used Mel Frequency Cepstral Coefficient (MFCC) and Filter Bank (FB) coefficients. ...
doi:10.14569/ijacsa.2022.0130450
fatcat:yjaajwbuzfdtjg2jrbrcf22sje
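The Gujarati digit paper feeds spectrograms generated from audio clips into a CNN. The spectrogram step itself can be sketched with SciPy; the sample rate and window parameters here are assumptions for illustration.

```python
# Sketch of turning an audio clip into a log-power spectrogram "image"
# for a CNN; parameter values are illustrative assumptions.
import numpy as np
from scipy.signal import spectrogram

def audio_to_spectrogram(signal, sr=16000, nperseg=256, noverlap=128):
    # Short-time Fourier analysis: frequency bins x time frames.
    f, t, sxx = spectrogram(signal, fs=sr, nperseg=nperseg,
                            noverlap=noverlap)
    # Log compression keeps the large dynamic range CNN-friendly.
    return np.log(sxx + 1e-10)
```

A fixed-size input for the CNN would then be obtained by cropping or padding the time axis to a constant number of frames.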
Human-Computer Interaction with Detection of Speaker Emotions Using Convolution Neural Networks
2022
Computational Intelligence and Neuroscience
The suggested classification model, a 1D convolutional neural network (1D CNN), outperforms traditional machine learning approaches in classification. ...
Emotions play an essential role in human relationships, and many real-time applications rely on interpreting the speaker's emotion from their words. ...
[46] also used a recurrent neural network (RNN) to extract relationships from 3D spectrograms across timesteps and frequencies. Lee et al. ...
doi:10.1155/2022/7463091
pmid:35401731
pmcid:PMC8989588
fatcat:aeyjth2glfe6vpffkxejvdbdke
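The 1D CNN classifier in this entry stacks convolutional layers over feature sequences. As a sketch of that building block (not the paper's architecture), one 1D convolution layer with ReLU can be written directly in NumPy:

```python
# Minimal forward pass of one 1D convolution layer with ReLU (NumPy),
# a sketch of the building block a 1D CNN classifier stacks.
import numpy as np

def conv1d_relu(x, kernels, bias):
    # x: (in_channels, length); kernels: (out_ch, in_ch, k); "valid" mode.
    out_ch, in_ch, k = kernels.shape
    length = x.shape[1] - k + 1
    out = np.zeros((out_ch, length))
    for o in range(out_ch):
        for t in range(length):
            # Cross-correlation (what deep-learning "convolution" computes).
            out[o, t] = np.sum(kernels[o] * x[:, t:t + k]) + bias[o]
    return np.maximum(out, 0.0)  # ReLU
```

A real model would use a framework layer (e.g. a Keras `Conv1D`) rather than explicit loops.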
Speech Recognition using Multiscale Scattering of Audio Signals and Long Short-Term Memory of Neural Networks
2019
Volume-8, Issue-10, August 2019, Regular Issue
This method provides higher accuracy than other standard methods that use Mel-frequency cepstral coefficients (MFCC) and an LSTM network to recognize digits. ...
To understand the audio language used by humans, machines use different techniques to convert speech into machine-readable form, a process called speech recognition. ...
There are various time-frequency representations to measure energy (or power) from a signal like Mel Frequency Cepstral Coefficients (MFCC), Fourier-based coefficients, wavelet scattering coefficients ...
doi:10.35940/ijitee.k2270.0981119
fatcat:ex7zwzgwzbhllmltn7wdgqag4y
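This entry pairs scattering features with an LSTM network. As a sketch of the recurrent unit involved (standard LSTM equations, not the paper's code), one time step can be written in NumPy; the weights here are random placeholders, not trained parameters.

```python
# One step of a standard LSTM cell in NumPy; gates stacked as [i, f, o, g].
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    # W: (4H, D), U: (4H, H), b: (4H,).
    H = h.shape[0]
    z = W @ x + U @ h + b
    i = sigmoid(z[0:H])          # input gate
    f = sigmoid(z[H:2 * H])      # forget gate
    o = sigmoid(z[2 * H:3 * H])  # output gate
    g = np.tanh(z[3 * H:4 * H])  # candidate cell state
    c_new = f * c + i * g        # leaky memory update
    h_new = o * np.tanh(c_new)   # gated hidden output
    return h_new, c_new
```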
Emotion Recognition with Capsule Neural Network
2022
Computer systems science and engineering
Among the models and the classifiers used to recognize emotions, neural networks appear to be promising due to the network's ability to learn and the diversity in configuration. ...
Following the convolutional neural network, a capsule neural network (CapsNet) with inputs and outputs that are not scalar quantities but vectors allows the network to determine the part-whole relationships ...
A capsule network for low-resource spoken language understanding was proposed for command-and-control applications in [5]. ...
doi:10.32604/csse.2022.021635
fatcat:odikrqtbovhc3kerlqyw3vgxta
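The CapsNet entry notes that capsule inputs and outputs are vectors rather than scalars. The nonlinearity that makes this work is the "squash" function, whose length encodes probability; a minimal NumPy sketch (not code from the paper):

```python
# CapsNet "squash" nonlinearity: v = (|s|^2 / (1 + |s|^2)) * s / |s|.
# Shrinks short vectors toward 0 and long ones toward unit length,
# preserving direction.
import numpy as np

def squash(s, eps=1e-9):
    norm2 = np.sum(s * s, axis=-1, keepdims=True)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)
```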
Sentiment analysis in non-fixed length audios using a Fully Convolutional Neural Network
2021
Biomedical Signal Processing and Control
Mel spectrogram and Mel Frequency Cepstral Coefficients are used as audio description methods and a Fully Convolutional Neural Network architecture is proposed as a classifier. ...
The results have been validated using three well known datasets: EMODB, RAVDESS and TESS. The results obtained were promising, outperforming the state-of-the-art methods. ...
Mel-frequency cepstral coefficients (MFCCs) are coefficients derived from a type of cepstral representation of the audio clip (a nonlinear "spectrum-of-a-spectrum") and are the most used representation ...
doi:10.1016/j.bspc.2021.102946
fatcat:wdlxldvtubeu5j76qtkex74fwi
Arabic Speech Emotion Recognition from Saudi Dialect Corpus
2021
IEEE Access
The first model combined a convolutional neural network (CNN), bi-directional long short-term memory (BLSTM), and deep neural networks (DNN) for the attention-based CNN-LSTM-DNN model, and the second ...
Researchers have applied machine learning algorithms to detect emotions from English speech, such as [11], which used the SVM as a classifier to train the data and applied mel-frequency cepstral coefficients ...
doi:10.1109/access.2021.3110992
fatcat:c73knoukoradles6fmny6sgffq
Speech Emotion Recognition Systems: Review
2020
International Journal for Research in Applied Science and Engineering Technology
In emotion detection, the domain has two types of features, i.e., important utterance and prosodic features. ...
In human-machine interface applications, emotion recognition from the speech signal has been a research topic for many years. ...
After choosing useful features such as Mel-Frequency Cepstral Coefficients (MFCC) and their transient parameters, a better performance with the application of Back Propagation Neural Networks (BPNNs) ...
doi:10.22214/ijraset.2020.1007
fatcat:ih3klw2nsneo5dpjfgdkzw6nei
Isolated Telugu Speech Recognition On TDSCC And DNN Techniques
2019
Volume-8, Issue-10, August 2019, Regular Issue
This research recognizes speaker-independent data with good results using the TDSCC (Teager energy operator delta spectral cepstral coefficients) feature extraction technique and the DNN (Deep Neural Network) classification technique. ...
Mel Frequency Cepstral Coefficients (MFCC) features are most commonly used for speech as well as emotion recognition obtaining a good appreciation rate. ...
doi:10.35940/ijitee.k2544.0981119
fatcat:uler6m6uhfa6zbhwdmdwk2ebti
SPECTRAL FEATURES ANALYSIS FOR HINDI SPEECH RECOGNITION SYSTEM
2016
International Journal of Research in Engineering and Technology
So, this paper deals with the various speech features that can be used for Hindi speech and that have been tested for many other languages. ...
In this work, MFCC, PLP, EFCC and LPC have been tested against a Hindi speech corpus using the HMM toolkit HTK 3.4.1. These features have been evaluated in a common environment. ...
Acoustic models can be implemented using Hidden Markov Models (HMM), Support Vector Machines (SVM), Deep Neural Networks (DNN) etc. ...
doi:10.15623/ijret.2016.0507058
fatcat:fsa3icolgrbovnrccnaijpqpdu
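This entry lists HMMs among the acoustic-model options. The core likelihood computation of a discrete HMM is the forward algorithm, sketched below in NumPy with toy parameters (not the paper's models):

```python
# Forward algorithm for a discrete HMM acoustic model (NumPy sketch):
# computes P(observation sequence | model) by summing over state paths.
import numpy as np

def hmm_forward(obs, pi, A, B):
    # pi: (S,) initial probs; A: (S, S) transitions; B: (S, V) emissions.
    alpha = pi * B[:, obs[0]]          # initialise with first observation
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate and emit
    return alpha.sum()
```

Toolkits such as HTK implement this (with log-space scaling) as part of training and decoding.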
Deep Multimodal Learning for Emotion Recognition in Spoken Language
[article]
2018
arXiv pre-print
Second, we fuse all features by using a three-layer deep neural network to learn the correlations across modalities and train the feature extraction and fusion modules together, allowing optimal global ...
In this paper, we present a novel deep multimodal framework to predict human emotions based on sentence-level spoken language. Our architecture has two distinctive characteristics. ...
ACKNOWLEDGEMENTS: We would like to thank the reviewers for the valuable feedback and SAIL-USC for providing us the IEMOCAP dataset. ...
arXiv:1802.08332v1
fatcat:hyvzt6wrnbdedir3m3y7ld7fym
EMOTION DETECTION USING AUDIO DATA SAMPLES
2019
International Journal of Advanced Research in Computer Science
Several machine learning algorithms including K-nearest neighbours (KNN) and decision trees were implemented, based on acoustic features such as Mel Frequency Cepstral Coefficient (MFCC). ...
Our evaluation shows that the proposed approach yields accuracies of 98%, 92% and 99% using KNN, Decision Trees and Extra-Tree Classifiers, respectively, for 7 emotions using Toronto Emotional Speech Set ...
are extracted from the audio samples: Zero Crossing Rate (ZCR), Mel Frequency Cepstral Coefficient (MFCC), Tonnetz, Contrast, Mel, Chroma. ...
doi:10.26483/ijarcs.v10i6.6489
fatcat:pbc5qb5c3ff4vj4k75mqbam4hm
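This entry classifies MFCC-based features with KNN and decision trees. The KNN step can be sketched with scikit-learn; the two synthetic Gaussian "emotion" clusters below are stand-ins for real TESS features, not the paper's data.

```python
# KNN classification sketch on MFCC-style feature vectors; the synthetic
# clusters are hypothetical stand-ins for real extracted features.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Two fake "emotion" classes of 13-dim MFCC mean vectors.
happy = rng.normal(loc=0.0, scale=0.3, size=(50, 13))
angry = rng.normal(loc=2.0, scale=0.3, size=(50, 13))
X = np.vstack([happy, angry])
y = np.array([0] * 50 + [1] * 50)

# Fit KNN and classify two query vectors at the cluster centres.
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
pred = knn.predict([[0.0] * 13, [2.0] * 13])
```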
Speech Emotion Recognition System Using Recurrent Neural Network in Deep Learning
2022
International Journal for Research in Applied Science and Engineering Technology
Keywords: Deep Learning, Recurrent Neural Networks, Emotion Recognition, Speech Recognition, SER, RNN, Catatonia. ...
In this context, we also present an approach of using the Recurrent Neural Network which is a part of Deep learning algorithms. ...
Also, time-dependent acoustic features and different spectral features such as linear predictor coefficients (LPC), linear predictor cepstral coefficients (LPCC), and Mel-frequency cepstral coefficients ...
doi:10.22214/ijraset.2022.41112
fatcat:gbg7jfik6rff3k23inl6hvqsfa
A Review on Automatic Speech Recognition Architecture and Approaches
2016
International Journal of Signal Processing, Image Processing and Pattern Recognition
Speech recognition interfaces in a native language will enable illiterate/semi-literate people to use the technology to a greater extent without knowledge of operating a computer keyboard or stylus ...
Speech recognition applications enable people to use speech as another input mode to interact with applications with ease and effectively. ...
The output of the DCT is 13th-order Mel-cepstral coefficients. Delta MFCC features: in order to capture the changes in speech from frame to frame, the first and second derivatives of the MFCC coefficients ...
doi:10.14257/ijsip.2016.9.4.34
fatcat:xbagvt7qc5a2dbxbwcjsofp7y4
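The delta features described in this review are usually computed with a regression over a +/-N frame window. A NumPy sketch of that standard formula (an assumption about the exact variant; papers differ in window size and edge handling):

```python
# Regression-based delta coefficients over a +/-N frame window:
# d_t = sum_n n * (c_{t+n} - c_{t-n}) / (2 * sum_n n^2).
import numpy as np

def delta(feats, N=2):
    # feats: (frames, coeffs); edge frames are replicated before the
    # weighted difference so the output keeps the same shape.
    padded = np.pad(feats, ((N, N), (0, 0)), mode='edge')
    denom = 2 * sum(n * n for n in range(1, N + 1))
    d = np.zeros_like(feats, dtype=float)
    for n in range(1, N + 1):
        d += n * (padded[N + n:len(feats) + N + n]
                  - padded[N - n:len(feats) + N - n])
    return d / denom
```

Applying `delta` twice gives the second-derivative ("delta-delta") features the text mentions.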
Arabic Speech Classification Method Based on Padding and Deep Learning Neural Network
2021
Baghdad Science Journal
The performance of the proposed method with the padding technique is on par with the spectrogram but better than the mel-spectrogram and mel-frequency cepstral coefficients. ...
Deep learning convolution neural network has been widely used to recognize or classify voice. ...
Acknowledgment: Funding for this research was from the Universitas Muhammadiyah Yogyakarta, Indonesia and work was conducted in collaboration with the Universiti Utara Malaysia, Malaysia. ...
doi:10.21123/bsj.2021.18.2(suppl.).0925
fatcat:jlona462yvbbpinkdjpci7mj5m
Showing results 1 — 15 out of 398 results