Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition

Titouan Parcollet, Ying Zhang, Mohamed Morchid, Chiheb Trabelsi, Georges Linares, Renato de Mori, Yoshua Bengio
2018 Interspeech 2018  
Recently, the connectionist temporal classification (CTC) model coupled with recurrent (RNN) or convolutional neural networks (CNN), made it easier to train speech recognition systems in an end-to-end  ...  This paper proposes to integrate multiple feature views in quaternion-valued convolutional neural network (QCNN), to be used for sequence-to-sequence mapping with the CTC model.  ...  The authors would like to thank Kyle Kastner and Mirco Ravanelli for their helpful comments.  ... 
doi:10.21437/interspeech.2018-1898 dblp:conf/interspeech/ParcolletZMTLMB18 fatcat:4pykru2h2raelk7osfng6kylme

Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition [article]

Titouan Parcollet, Ying Zhang, Mohamed Morchid, Chiheb Trabelsi, Georges Linarès, Renato De Mori, Yoshua Bengio
2018 arXiv   pre-print
Recently, the connectionist temporal classification (CTC) model coupled with recurrent (RNN) or convolutional neural networks (CNN), made it easier to train speech recognition systems in an end-to-end  ...  This paper proposes to integrate multiple feature views in quaternion-valued convolutional neural network (QCNN), to be used for sequence-to-sequence mapping with the CTC model.  ...  The authors would like to thank Kyle Kastner and Mirco Ravanelli for their helpful comments. References  ... 
arXiv:1806.07789v1 fatcat:u5gmmshx7fdbxpcxiupa3i2k4q

Speech recognition with quaternion neural networks [article]

Titouan Parcollet, Mirco Ravanelli, Mohamed Morchid, Georges Linarès, Renato De Mori
2018 arXiv   pre-print
Neural network architectures are at the core of powerful automatic speech recognition systems (ASR).  ...  We propose to investigate modern quaternion-valued models such as convolutional and recurrent quaternion neural networks in the context of speech recognition with the TIMIT dataset.  ...  Introduction During the last decade, deep neural networks (DNN) have encountered a wide success in automatic speech recognition.  ... 
arXiv:1811.09678v1 fatcat:xdcc23u5xfcmxgzvxiqot7mywu

Learning Speech Emotion Representations in the Quaternion Domain [article]

Eric Guizzo, Tillman Weyde, Simone Scardapane, Danilo Comminiello
2022 arXiv   pre-print
Our method, named RH-emo, is a novel semi-supervised architecture aimed at extracting quaternion embeddings from real-valued monaural spectrograms, enabling the use of quaternion-valued networks for speech  ...  RH-emo is a hybrid real/quaternion autoencoder network that consists of a real-valued encoder in parallel to a real-valued emotion classifier and a quaternion-valued decoder.  ...  convolutional neural network (QCNN) is an extension of the real-valued convolutional neural network to the quaternion domain.  ... 
arXiv:2204.02385v1 fatcat:zyb5zzxzcrgq3lsq6xrwfu4kg4

2021 Index IEEE Transactions on Cognitive and Developmental Systems Vol. 13

2021 IEEE Transactions on Cognitive and Developmental Systems  
Departments and other items may also be covered if they have been judged to have archival value. The Author Index contains the primary entry for each item, listed under the first author's name.  ...  ., An End-to-End Mammo- Deep Variational Autoencoder for Mapping Functional Brain Networks.  ...  ., +, TCDS Sept. 2021 631-644 Color Facial Expression Recognition by Quaternion Convolutional Neural Network With Gabor Attention.  ... 
doi:10.1109/tcds.2021.3137068 fatcat:r2zbw6js65fpnenn4kybim3kw4

Bidirectional Quaternion Long-Short Term Memory Recurrent Neural Networks for Speech Recognition [article]

Titouan Parcollet, Mohamed Morchid, Georges Linarès, Renato De Mori
2018 arXiv   pre-print
Recurrent neural networks (RNN) are at the core of modern automatic speech recognition (ASR) systems.  ...  In particular, long-short term memory (LSTM) recurrent neural networks have achieved state-of-the-art results in many speech recognition tasks, due to their efficient representation of long and short term  ...  In particular, a deep quaternion network [11, 12], a deep quaternion convolutional network [13, 14], or a quaternion recurrent neural network [15] have been successfully employed for challenging  ... 
arXiv:1811.02566v1 fatcat:ky3qlz3jwrgwta4nx5jz56q3zu

Quaternion Recurrent Neural Networks [article]

Titouan Parcollet, Mirco Ravanelli, Mohamed Morchid, Georges Linarès, Chiheb Trabelsi, Renato De Mori, Yoshua Bengio
2019 arXiv   pre-print
We show that both QRNN and QLSTM achieve better performances than RNN and LSTM in a realistic application of automatic speech recognition.  ...  We propose a novel quaternion recurrent neural network (QRNN), alongside with a quaternion long-short term memory neural network (QLSTM), that take into account both the external relations and these internal  ...  Quaternion convolutional neural networks for end-to-end automatic speech recognition.  ... 
arXiv:1806.04418v3 fatcat:kraoh6r2freo5mrrmhjk4dc7d4

Quaternion Convolutional Neural Network for Color Image Classification and Forensics

Qilin Yin, Jinwei Wang, Xiangyang Luo, Jiangtao Zhai, Sunil Kr. Jha, Yun-Qing Shi
2019 IEEE Access  
The convolutional neural network is widely popular for solving the problems of color image feature extraction.  ...  Therefore, a novel quaternion convolutional neural network (QCNN) is proposed in this paper, which always treats color triples as a whole to avoid information loss.  ...  Focusing on the question above, we propose a quaternion convolutional neural network (QCNN) model. According to Evans et al.  ... 
doi:10.1109/access.2019.2897000 fatcat:4j6bgwvwazatzkps6viarpuqum

Deep Neural Networks for Automatic Speech Processing: A Survey from Large Corpora to Limited Data [article]

Vincent Roger, Jérôme Farinas, Julien Pinquier
2020 arXiv   pre-print
Most state-of-the-art speech systems are using Deep Neural Networks (DNNs). Those systems require a large amount of data to be learned.  ...  In this paper we position ourselves for the following speech processing tasks: Automatic Speech Recognition, speaker identification and emotion recognition.  ...  AUTOMATIC SPEECH RECOGNITION SYSTEMS In this section, we will review SOTA ASR systems using multi-models and end-to-end models.  ... 
arXiv:2003.04241v1 fatcat:mdtry5jdozfkbhpow72ui67a7u

A Quaternion Gated Recurrent Unit Neural Network for Sensor Fusion

Uche Onyekpe, Vasile Palade, Stratis Kanarachos, Stavros-Richard G. Christopoulos
2021 Information  
Recurrent Neural Networks (RNNs) are known for their ability to learn relationships within temporal sequences.  ...  GRUs are also known to be more computationally efficient than their variant, the Long Short-Term Memory neural network (LSTM), due to their less complex structure and as such, are more suitable for applications  ...  The UCI-HAR dataset is located at http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones (accessed on 30 December 2020) and described in [33].  ... 
doi:10.3390/info12030117 fatcat:4a4qk5fq5bfndi6f6rggslje4e

Quaternion-Based Graph Convolution Network for Recommendation [article]

Yaxing Fang, Pengpeng Zhao, Guanfeng Liu, Yanchi Liu, Victor S. Sheng, Lei Zhao, Xiaofang Zhou
2021 arXiv   pre-print
To this end, in this paper, we propose a simple yet effective Quaternion-based Graph Convolution Network (QGCN) recommendation model.  ...  Graph Convolution Network (GCN) has been widely applied in recommender systems for its representation learning capability on user and item embeddings.  ...  convolutional and recurrent quaternion neural networks in the context of speech recognition.  ... 
arXiv:2111.10536v1 fatcat:k3rekddj5japxgaa774k5trjam

Lightweight Convolutional Neural Networks By Hypercomplex Parameterization [article]

Eleonora Grassucci, Aston Zhang, Danilo Comminiello
2021 arXiv   pre-print
Such a malleability allows processing multidimensional inputs in their natural domain without annexing further dimensions, as done, instead, in quaternion neural networks for 3D inputs like color images  ...  Hypercomplex neural networks have proved to reduce the overall number of parameters while ensuring valuable performances by leveraging the properties of Clifford algebras.  ...  Deep quaternion neural networks for spoken language understanding. In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 504-511, Okinawa, Japan, December 2017. T. Parcollet, M.  ... 
arXiv:2110.04176v1 fatcat:4lfrmkzywrfadmzy54qmns7bwq

Comparative Analysis of Performance of Deep Learning Classification Approach based on LSTM-RNN for Textual and Image Datasets

Alaa Sahl Gaafar, Jasim Mohammed Dahr, Alaa Khalaf Hamoud
2022 Informatica (Ljubljana, Tiskana izd.)  
The need for generating faster and effective decisions about systems, processes, and applications gave rise to many artificial intelligences motivated approaches such as convolutional neural networks (  ...  CNNs), recurrent neural networks (RNNs), fuzzy analytics, etc.  ...  Another deep quaternion network is put forward by [19] in which its quaternion convolution basically substitutes the real multiplications, and its quaternion kernel is not parameterized further.  ... 
doi:10.31449/inf.v46i5.3872 dblp:journals/informaticaSI/GaafarDH22 fatcat:igza2z4vyfduxeaep2qcvxglle

End-to-end Learning for 3D Facial Animation from Raw Waveforms of Speech [article]

Hai X. Pham, Yuting Wang, Vladimir Pavlovic
2017 arXiv   pre-print
Our deep neural network directly maps an input sequence of speech audio to a series of micro facial action unit activations and head rotations to drive a 3D blendshape face model.  ...  We present a deep learning framework for real-time speech-driven 3D facial animation from just raw waveforms.  ...  DEEP END-TO-END LEARNING FOR 3D FACE SYNTHESIS FROM SPEECH A.  ... 
arXiv:1710.00920v2 fatcat:bis4z3hys5dxhg2tf3hx24b7eq

IEEE Access Special Section Editorial: Advanced Data Mining Methods for Social Computing

Yongqiang Zhao, Shirui Pan, Jia Wu, Huaiyu Wan, Huizhi Liang, Haishuai Wang, Huawei Shen
2020 IEEE Access  
., ''Exploring deep spectrum representations via attention-based recurrent and convolutional neural networks for speech emotion recognition,'' develops a model which leverages a parallel combination of  ...  ., ''MV-GCN: Multi-view graph convolutional networks for link prediction,'' proposes a novel multiview graph convolutional neural network (MV-GCN) model based on the Matrix Completion method by simultaneously  ... 
doi:10.1109/access.2020.3043060 fatcat:qbqk5f4ojvadlazhk2mc343sra