Densely Connected Convolutional Networks for Speech Recognition [article]

Chia Yu Li, Ngoc Thang Vu
2018 arXiv   pre-print
This paper presents our latest investigation on Densely Connected Convolutional Networks (DenseNets) for acoustic modelling (AM) in automatic speech recognition.  ...  DenseNets are very deep, compact convolutional neural networks, which have demonstrated incredible improvements over the state-of-the-art results on several data sets in computer vision.  ...  Figure 1: The examples of 3-layer traditional convolutional networks (top) and 3-layer dense block DenseNet (bottom). Figure 2: A deep DenseNet with two dense blocks for Speech Recognition.  ... 
arXiv:1808.03570v1 fatcat:jvy42m6qjzchpeiq5weof3sjti

Investigation of Densely Connected Convolutional Networks with Domain Adversarial Learning for Noise Robust Speech Recognition [article]

Chia Yu Li, Ngoc Thang Vu
2021 arXiv   pre-print
We investigate densely connected convolutional networks (DenseNets) and their extension with domain adversarial training for noise robust speech recognition.  ...  Our experimental results reveal that DenseNets are more robust against noise than other neural network based models such as deep feed forward neural networks and convolutional neural networks.  ...  [13] for unsupervised domain adaptation in natural language processing and was then applied to deep feed forward neural networks (DNNs) for noise robust speech recognition [14, 15].  ... 
arXiv:2112.10108v1 fatcat:fzxluw7xp5djfiahukz5xhczqu

Acoustic Modeling with Densely Connected Residual Network for Multichannel Speech Recognition

Jian Tang, Yan Song, Lirong Dai, Ian McLoughlin
2018 Interspeech 2018  
Motivated by recent advances in computer vision research, this paper proposes a novel acoustic model called Densely Connected Residual Network (DenseRNet) for multichannel speech recognition.  ...  It adopts the basic "building blocks" of ResNet with different convolutional layers, receptive field sizes and growth rates as basic components that are densely connected to form so-called denseR blocks  ...  DenseNet: Densely connected convolutional network DenseNet is composed of multiple dense blocks. Each block can be further divided into several densely connected convolution layers (see Fig. 1a).  ... 
doi:10.21437/interspeech.2018-1089 dblp:conf/interspeech/TangSDM18 fatcat:v2g6licrpraqzfe6ciqn7wlciq

Exploiting Nontrivial Connectivity for Automatic Speech Recognition [article]

Marius Paraschiv, Lasse Borgholt, Tycho Max Sylvester Tax, Marco Singh, Lars Maaløe
2017 arXiv   pre-print
In this paper we make a comparison between residual networks, densely-connected networks and highway networks on an image classification task.  ...  Next, we show that these methodologies can easily be deployed into automatic speech recognition and provide significant improvements to existing models.  ...  networks in the context of speech recognition is justified.  ... 
arXiv:1711.10271v1 fatcat:nr4ely2kn5bmpjrm3sxgfpfiny

Dense Prediction on Sequences with Time-Dilated Convolutions for Speech Recognition [article]

Tom Sercu, Vaibhava Goel
2016 arXiv   pre-print
When doing dense prediction we pay specific attention to strided pooling in time and introduce an asymmetric dilated convolution, called time-dilated convolution, that allows for efficient and elegant  ...  Convolutional neural networks achieve good performance on this task, while being computationally efficient.  ...  Convolutional networks are also used in end-to-end models for speech recognition.  ... 
arXiv:1611.09288v2 fatcat:pkkkbkkudzbfpgbp2btvl5xm34

Bandwidth Embeddings for Mixed-bandwidth Speech Recognition [article]

Gautam Mantena, Ozlem Kalinli, Ossama Abdel-Hamid, Don McAllaster
2019 arXiv   pre-print
Furthermore, we propose to use parallel convolutional layers to handle the mismatch between the narrow and wideband speech better, where separate convolution layers are used for each type of input speech  ...  Our best system achieves 13% relative improvement on narrowband speech, while not degrading on wideband speech.  ...  Convolutional layers are used to reduce the spectral variations in the features and have shown to perform well for speech recognition [20, 21]. the AM which uses an embedding layer connected to all dense  ... 
arXiv:1909.02667v1 fatcat:acq3on567va3tpw65anvw4fxa4

Bandwidth Embeddings for Mixed-Bandwidth Speech Recognition

Gautam Mantena, Ozlem Kalinli, Ossama Abdel-Hamid, Don McAllaster
2019 Interspeech 2019  
Furthermore, we propose to use parallel convolutional layers to handle the mismatch between the narrow and wideband speech better, where separate convolution layers are used for each type of input speech  ...  Our best system achieves 13% relative improvement on narrowband speech, while not degrading on wideband speech.  ...  Note that we use shared parameters for the weights connecting the convolutional layers and the dense layer.  ... 
doi:10.21437/interspeech.2019-2589 dblp:conf/interspeech/MantenaKAM19 fatcat:mjrr3bbujrg4jpctd5rdrz7wce

Deep Neural Network Architectures for Modulation Classification [article]

Xiaoyu Liu, Diyu Yang, Aly El Gamal
2018 arXiv   pre-print
In this work, we investigate the value of employing deep learning for the task of wireless signal modulation recognition.  ...  We then develop architectures based on the recently introduced ideas of Residual Networks (ResNet [2]) and Densely Connected Networks (DenseNet [3]) to achieve high SNR accuracies of approximately 83.5%  ...  RNNs are neural networks with memory that are suitable for learning sequence tasks such as speech recognition and handwritten recognition.  ... 
arXiv:1712.00443v3 fatcat:qetjoxg3ivhgbpb5wypxduyc3e

Speech Synthesis from ECoG using Densely Connected 3D Convolutional Neural Networks: [article]

Miguel Angrick, Christian Herff, Emily Mugler, Matthew C. Tate, Marc W. Slutzky, Dean J. Krusienski, Tanja Schultz
2018 bioRxiv   pre-print
The proposed method uses a densely connected convolutional neural network topology which is well-suited to work with the small amount of data available from each participant.  ...  Approach: Here we show that deep neural networks can be used to map ECoG from speech production areas onto an intermediate representation of speech (logMel spectrogram).  ...  Here, we show that densely-connected convolutional neural networks can be trained on limited training data to map ECoG dynamics directly to a speech spectrogram.  ... 
doi:10.1101/478644 fatcat:5eizj7fu4fc3tm2o4737wubg6a

A Hybrid Technique using CNN+LSTM for Speech Emotion Recognition

2020 International Journal of Engineering and Advanced Technology  
This paper is motivated by using spectrograms as inputs to the hybrid deep convolutional LSTM for speech emotion recognition.  ...  Automatic speech emotion recognition is a very necessary activity for effective human-computer interaction.  ...  [15] used auto-encoders followed by single-layer Convolution Neural Network (CNN) for speech emotion recognition and achieved good performance.  ... 
doi:10.35940/ijeat.e1027.069520 fatcat:rflenzwfcjcfblygfn3lxt5vzy

Densely Connected Networks for Conversational Speech Recognition

Kyu Han, Akshay Chandrashekaran, Jungsuk Kim, Ian Lane
2018 Interspeech 2018  
We propose densely connected LSTMs (namely, dense LSTMs), inspired by the densely connected convolutional neural networks recently introduced for image classification tasks.  ...  With RNN-LM rescoring and lattice combination on the 5 systems (including 2 dense LSTM based systems) trained across three different phone sets, Capio's conversational speech recognition system has obtained  ...  Dense connection can be easily applied to existing LSTM-based neural network architectures for speech recognition, thanks to a simple connectivity pattern.  ... 
doi:10.21437/interspeech.2018-1486 dblp:conf/interspeech/HanCKL18 fatcat:yupvfdlforethijj47yxlmefkm

Deep neural network architectures for modulation classification

Xiaoyu Liu, Diyu Yang, Aly El Gamal
2017 2017 51st Asilomar Conference on Signals, Systems, and Computers  
We then develop architectures based on the recently introduced ideas of Residual Networks (ResNet, He et al. (2015)) and Densely Connected Networks (DenseNet, Huang et al. (2016)) and achieve high SNR  ...  In this work, we investigate the value of employing statistical machine learning in general and deep learning in particular for the task of wireless signal modulation recognition.  ...  In this paper, we present our experiments of the deep neural network application on modulation recognition using optimized CNN, Densely connected network and CLDNN.  ... 
doi:10.1109/acssc.2017.8335483 dblp:conf/acssc/LiuYG17 fatcat:hzwto7ywdnbrtnuktuktxeg3qe

Convolutional Attention-based Seq2Seq Neural Network for End-to-End ASR [article]

Dan Lim
2017 arXiv   pre-print
Finally the proposed model proved its effectiveness for speech recognition achieving 15.8% phoneme error rate on TIMIT dataset.  ...  It also describes various neural network algorithms including Batch normalization, Dropout and Residual network which constitute the convolutional attention-based seq2seq neural network.  ...  In this thesis, I proposed the convolutional attention-based seq2seq neural network for end-to-end automatic speech recognition.  ... 
arXiv:1710.04515v1 fatcat:dzuhksgzejdxpi2idsecbnnmg4

Multi-Modal Emotion recognition on IEMOCAP Dataset using Deep Learning [article]

Samarth Tripathi, Sarthak Tripathi, Homayoon Beigi
2019 arXiv   pre-print
In this paper we attempt to exploit this effectiveness of Neural networks to enable us to perform multimodal Emotion recognition on IEMOCAP dataset using data from Speech, Text, and Motion capture data  ...  With the advancement of technology our understanding of emotions are advancing, there is a growing need for automatic emotion recognition systems.  ...  Our ensemble consists of Long Short Term Memory networks, Convolution Neural Networks, fully connected Multi-Layer Perceptrons and we complement them using techniques such as Dropout, adaptive optimizers  ... 
arXiv:1804.05788v3 fatcat:5bxu2yszsjcjbli5ec3ju3lwky

CONVOLUTIONAL NEURAL NETWORK FOR ARABIC SPEECH RECOGNITION

Engy Abdelmaksoud, Arafa Hassen, Nabila Hassan, Mohamed Hesham
2020 The Egyptian Journal of Language Engineering  
The convolutional neural network (CNN) is mainly used to execute feature learning and classification process. CNN achieved performance enhancement in automatic speech recognition (ASR).  ...  This work is focused on single word Arabic automatic speech recognition (AASR).  ...  HESHAM: Convolutional Neural Network for Arabic Speech Recognition Egyptian Journal of Language Engineering, Vol.8, No. 1, 2021  ... 
doi:10.21608/ejle.2020.47685.1015 fatcat:jdhgag25vbgqpl3ucywpird5tu