Densely Connected Convolutional Networks for Speech Recognition
[article]
2018
arXiv
pre-print
This paper presents our latest investigation on Densely Connected Convolutional Networks (DenseNets) for acoustic modelling (AM) in automatic speech recognition. ...
DenseNets are very deep, compact convolutional neural networks, which have demonstrated incredible improvements over the state-of-the-art results on several data sets in computer vision. ...
Figure 1: The examples of 3-layer traditional convolutional networks (top) and 3-layer dense block DenseNet (bottom).
Figure 2: A deep DenseNet with two dense blocks for Speech Recognition. ...
arXiv:1808.03570v1
fatcat:jvy42m6qjzchpeiq5weof3sjti
Investigation of Densely Connected Convolutional Networks with Domain Adversarial Learning for Noise Robust Speech Recognition
[article]
2021
arXiv
pre-print
We investigate densely connected convolutional networks (DenseNets) and their extension with domain adversarial training for noise robust speech recognition. ...
Our experimental results reveal that DenseNets are more robust against noise than other neural network based models such as deep feed forward neural networks and convolutional neural networks. ...
[13] for unsupervised domain adaptation in natural language processing and was then applied to deep feed forward neural networks (DNNs) for noise robust speech recognition [14, 15]. ...
arXiv:2112.10108v1
fatcat:fzxluw7xp5djfiahukz5xhczqu
Acoustic Modeling with Densely Connected Residual Network for Multichannel Speech Recognition
2018
Interspeech 2018
Motivated by recent advances in computer vision research, this paper proposes a novel acoustic model called Densely Connected Residual Network (DenseRNet) for multichannel speech recognition. ...
It adopts the basic "building blocks" of ResNet with different convolutional layers, receptive field sizes and growth rates as basic components that are densely connected to form so-called denseR blocks ...
DenseNet: Densely connected convolutional network. DenseNet is composed of multiple dense blocks. Each block can be further divided into several densely connected convolution layers (see Fig. 1a). ...
doi:10.21437/interspeech.2018-1089
dblp:conf/interspeech/TangSDM18
fatcat:v2g6licrpraqzfe6ciqn7wlciq
Exploiting Nontrivial Connectivity for Automatic Speech Recognition
[article]
2017
arXiv
pre-print
In this paper we make a comparison between residual networks, densely-connected networks and highway networks on an image classification task. ...
Next, we show that these methodologies can easily be deployed into automatic speech recognition and provide significant improvements to existing models. ...
networks in the context of speech recognition is justified. ...
arXiv:1711.10271v1
fatcat:nr4ely2kn5bmpjrm3sxgfpfiny
Dense Prediction on Sequences with Time-Dilated Convolutions for Speech Recognition
[article]
2016
arXiv
pre-print
When doing dense prediction we pay specific attention to strided pooling in time and introduce an asymmetric dilated convolution, called time-dilated convolution, that allows for efficient and elegant ...
Convolutional neural networks achieve good performance on this task, while being computationally efficient. ...
Convolutional networks are also used in end-to-end models for speech recognition. ...
arXiv:1611.09288v2
fatcat:pkkkbkkudzbfpgbp2btvl5xm34
Bandwidth Embeddings for Mixed-bandwidth Speech Recognition
[article]
2019
arXiv
pre-print
Furthermore, we propose to use parallel convolutional layers to handle the mismatch between the narrow and wideband speech better, where separate convolution layers are used for each type of input speech ...
Our best system achieves 13% relative improvement on narrowband speech, while not degrading on wideband speech. ...
Convolutional layers are used to reduce the spectral variations in the features and have shown to perform well for speech recognition [20, 21]. ... the AM which uses an embedding layer connected to all dense ...
arXiv:1909.02667v1
fatcat:acq3on567va3tpw65anvw4fxa4
Bandwidth Embeddings for Mixed-Bandwidth Speech Recognition
2019
Interspeech 2019
Furthermore, we propose to use parallel convolutional layers to handle the mismatch between the narrow and wideband speech better, where separate convolution layers are used for each type of input speech ...
Our best system achieves 13% relative improvement on narrowband speech, while not degrading on wideband speech. ...
Note that we use shared parameters for the weights connecting the convolutional layers and the dense layer. ...
doi:10.21437/interspeech.2019-2589
dblp:conf/interspeech/MantenaKAM19
fatcat:mjrr3bbujrg4jpctd5rdrz7wce
Deep Neural Network Architectures for Modulation Classification
[article]
2018
arXiv
pre-print
In this work, we investigate the value of employing deep learning for the task of wireless signal modulation recognition. ...
We then develop architectures based on the recently introduced ideas of Residual Networks (ResNet [2]) and Densely Connected Networks (DenseNet [3]) to achieve high SNR accuracies of approximately 83.5% ...
RNNs are neural networks with memory that are suitable for learning sequence tasks such as speech recognition and handwritten recognition. ...
arXiv:1712.00443v3
fatcat:qetjoxg3ivhgbpb5wypxduyc3e
Speech Synthesis from ECoG using Densely Connected 3D Convolutional Neural Networks
[article]
2018
bioRxiv
pre-print
The proposed method uses a densely connected convolutional neural network topology which is well-suited to work with the small amount of data available from each participant. ...
Approach: Here we show that deep neural networks can be used to map ECoG from speech production areas onto an intermediate representation of speech (logMel spectrogram). ...
Here, we show that densely-connected convolutional neural networks can be trained on limited training data to map ECoG dynamics directly to a speech spectrogram. ...
doi:10.1101/478644
fatcat:5eizj7fu4fc3tm2o4737wubg6a
A Hybrid Technique using CNN+LSTM for Speech Emotion Recognition
2020
International Journal of Engineering and Advanced Technology
This paper is motivated by using spectrograms as inputs to the hybrid deep convolutional LSTM for speech emotion recognition. ...
Automatic speech emotion recognition is a very necessary activity for effective human-computer interaction. ...
[15] used auto-encoders followed by single-layer Convolution Neural Network (CNN) for speech emotion recognition and achieved good performance. ...
doi:10.35940/ijeat.e1027.069520
fatcat:rflenzwfcjcfblygfn3lxt5vzy
Densely Connected Networks for Conversational Speech Recognition
2018
Interspeech 2018
We propose densely connected LSTMs (namely, dense LSTMs), inspired by the densely connected convolutional neural networks recently introduced for image classification tasks. ...
With RNN-LM rescoring and lattice combination on the 5 systems (including 2 dense LSTM based systems) trained across three different phone sets, Capio's conversational speech recognition system has obtained ...
Dense connection can be easily applied to existing LSTM-based neural network architectures for speech recognition, thanks to a simple connectivity pattern. ...
doi:10.21437/interspeech.2018-1486
dblp:conf/interspeech/HanCKL18
fatcat:yupvfdlforethijj47yxlmefkm
Deep neural network architectures for modulation classification
2017
2017 51st Asilomar Conference on Signals, Systems, and Computers
We then develop architectures based on the recently introduced ideas of Residual Networks (ResNet, He et al. (2015)) and Densely Connected Networks (DenseNet, Huang et al. (2016)) and achieve high SNR ...
In this work, we investigate the value of employing statistical machine learning in general and deep learning in particular for the task of wireless signal modulation recognition. ...
In this paper, we present our experiments of the deep neural network application on modulation recognition using optimized CNN, Densely connected network and CLDNN. ...
doi:10.1109/acssc.2017.8335483
dblp:conf/acssc/LiuYG17
fatcat:hzwto7ywdnbrtnuktuktxeg3qe
Convolutional Attention-based Seq2Seq Neural Network for End-to-End ASR
[article]
2017
arXiv
pre-print
Finally the proposed model proved its effectiveness for speech recognition achieving 15.8% phoneme error rate on TIMIT dataset. ...
It also describes various neural network algorithms including Batch normalization, Dropout and Residual network which constitute the convolutional attention-based seq2seq neural network. ...
In this thesis, I proposed the convolutional attention-based seq2seq neural network for end-to-end automatic speech recognition. ...
arXiv:1710.04515v1
fatcat:dzuhksgzejdxpi2idsecbnnmg4
Multi-Modal Emotion recognition on IEMOCAP Dataset using Deep Learning
[article]
2019
arXiv
pre-print
In this paper we attempt to exploit this effectiveness of Neural networks to enable us to perform multimodal Emotion recognition on IEMOCAP dataset using data from Speech, Text, and Motion capture data ...
With the advancement of technology, our understanding of emotions is advancing, and there is a growing need for automatic emotion recognition systems. ...
Our ensemble consists of Long Short Term Memory networks, Convolution Neural Networks, fully connected Multi-Layer Perceptrons and we complement them using techniques such as Dropout, adaptive optimizers ...
arXiv:1804.05788v3
fatcat:5bxu2yszsjcjbli5ec3ju3lwky
CONVOLUTIONAL NEURAL NETWORK FOR ARABIC SPEECH RECOGNITION
2020
The Egyptian Journal of Language Engineering
The convolutional neural network (CNN) is mainly used to execute feature learning and classification process. CNN achieved performance enhancement in automatic speech recognition (ASR). ...
This work is focused on single word Arabic automatic speech recognition (AASR). ...
HESHAM: Convolutional Neural Network for Arabic Speech Recognition
Egyptian Journal of Language Engineering, Vol.8, No. 1, 2021 ...
doi:10.21608/ejle.2020.47685.1015
fatcat:jdhgag25vbgqpl3ucywpird5tu