Voice conversion using speaker-dependent conditional restricted Boltzmann machine

Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
2015 EURASIP Journal on Audio, Speech, and Music Processing  
This paper presents a voice conversion (VC) method that utilizes conditional restricted Boltzmann machines (CRBMs) for each speaker to obtain high-order speaker-independent spaces where voice features  ...  Through voice-conversion experiments, we confirmed the high performance of our method especially in terms of objective evaluation, comparing it with conventional GMM, NN, RNN, and our previous work, speaker-dependent  ...  Experiments Conditions In our experiments, we conducted voice conversion using the ATR Japanese speech database [42], comparing our method (speaker-dependent restricted Boltzmann machines; say 'SD-CRBM  ... 
doi:10.1186/s13636-014-0044-3 fatcat:m5ktckwrxbfspa7rdl2cejxphe

Sparse nonlinear representation for voice conversion

Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
2015 2015 IEEE International Conference on Multimedia and Expo (ICME)  
In our approach, we follow the latter approach and systematically estimate the parallel dictionaries using a joint-density restricted Boltzmann machine with sparse constraints.  ...  In these approaches, voice conversion is achieved by estimating a sparse vector that determines which dictionaries of the target speaker should be used, calculated from the matching of the input vector  ...  (they use a restricted Boltzmann machine (RBM)), and in [22] by Wu et al. (they use a conditional restricted Boltzmann machine (CRBM [23] )).  ... 
doi:10.1109/icme.2015.7177437 dblp:conf/icmcs/NakashikaTA15 fatcat:r5bn6swx4bgnjeoc5arofxsnky

Whisper-to-speech conversion using restricted Boltzmann machine arrays

Jing-jie Li, Li-Rong Dai, Ian V. McLoughlin, Zhen-hua Ling
2014 Electronics Letters  
To address these issues, the novel use of multiple restricted Boltzmann machines (RBMs) is reported as a statistical conversion model between whisper and speech spectral envelopes.  ...  Moreover, the accuracy of estimated pitch is improved using machine learning techniques for pitch estimation within only voiced (V) regions.  ...  New approach: This Letter advances the state-of-the art in statistical whisper-to-speech conversion in three areas: (i) restricted Boltzmann machines (RBMs) [4] and deep learning techniques [5] are  ... 
doi:10.1049/el.2014.1645 fatcat:inbb23uae5agdghzhszwkgrphi

Voice Conversion Based on Speaker-Dependent Restricted Boltzmann Machines

Toru NAKASHIKA, Tetsuya TAKIGUCHI, Yasuo ARIKI
2014 IEICE transactions on information and systems  
method and an ordinary NN. Key words: voice conversion, restricted Boltzmann machine, deep learning, speaker individuality  ...  Toru NAKASHIKA †a), Nonmember, Tetsuya TAKIGUCHI †b), and Yasuo ARIKI †c), Members. SUMMARY: This paper presents a voice conversion technique using speaker-dependent Restricted Boltzmann Machines (RBM  ...  This paper investigates the voice conversion approach using restricted Boltzmann machines (RBMs) [24] or their stacked version (deep belief networks; DBNs), for capturing the latent representation.  ... 
doi:10.1587/transinf.e97.d.1403 fatcat:dvwyr24h7zau3lktd74ph2emlq

Voice conversion in time-invariant speaker-independent space

Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
2014 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
In this paper, we present a voice conversion (VC) method that utilizes conditional restricted Boltzmann machines (CRBMs) for each speaker to obtain time-invariant speaker-independent spaces where voice  ...  Through voice-conversion experiments, we confirmed the high performance of our method in terms of objective and subjective evaluations, comparing it with conventional GMM, NN, and speaker-dependent DBN  ...  PRELIMINARIES Our voice conversion system uses conditional restricted Boltzmann machines (CRBMs) to capture high-order conversion-friendly features.  ... 
doi:10.1109/icassp.2014.6855136 dblp:conf/icassp/NakashikaTA14 fatcat:fzjzxqrvuzegll25tppoq2i37q

Generative Acoustic-Phonemic-Speaker Model Based on Three-Way Restricted Boltzmann Machine

Toru Nakashika, Yasuhiro Minami
2016 Interspeech 2016  
In this paper, we argue the way of modeling speech signals based on three-way restricted Boltzmann machine (3WRBM) for separating phonetic-related information and speaker-related information from an observed  ...  In our experiments, we discuss the effectiveness of the speech modeling through a speaker recognition, a speech (continuous phone) recognition, and a voice conversion tasks.  ...  three-way restricted Boltzmann machine (3WRBM) [1] .  ... 
doi:10.21437/interspeech.2016-1105 dblp:conf/interspeech/NakashikaM16 fatcat:swa6kejk7rhznhbwyzr7bzd3ee

Speaker-adaptive-trainable Boltzmann machine and its application to non-parallel voice conversion

Toru Nakashika, Yasuhiro Minami
2017 EURASIP Journal on Audio, Speech, and Music Processing  
In this paper, we present a voice conversion (VC) method that does not use any parallel data while training the model.  ...  Speech signals are represented using a probabilistic model based on the Boltzmann machine that defines phonological information and speaker-related information explicitly.  ...  We attempted the non-parallel training using another probabilistic model named the adaptive restricted Boltzmann machine (ARBM) [25] in our previous work.  ... 
doi:10.1186/s13636-017-0112-6 fatcat:2x4w7mc7rva6fadhixs4o66cgy

Non-Parallel Training in Voice Conversion Using an Adaptive Restricted Boltzmann Machine

Toru Nakashika, Tetsuya Takiguchi, Yasuhiro Minami
2016 IEEE/ACM Transactions on Audio Speech and Language Processing  
Index Terms: Restricted Boltzmann machine, speaker adaptation, unsupervised training, voice conversion. Toru Nakashika (M'11) received the B.E., M.E., and Dr. Eng. degrees in computer  ...  In this paper, we present a voice conversion (VC) method that does not use any parallel data while training the model.  ...  ADAPTIVE RESTRICTED BOLTZMANN MACHINE A.  ... 
doi:10.1109/taslp.2016.2593263 fatcat:dw6xc25uergylnncwjk36zm444

Modeling spectral envelopes using deep conditional restricted Boltzmann machines for statistical parametric speech synthesis

Xiang Yin, Zhen-Hua Ling, Ya-Jun Hu, Li-Rong Dai
2016 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
This paper proposes a spectral modeling method using a deep conditional restricted Boltzmann machine (DCRBM) for statistical parametric speech synthesis.  ...  In this method, a DCRBM, which combines a deep neural network (DNN) with a conditional restricted Boltzmann machine (CRBM), is utilized to describe the conditional distribution of spectral envelopes given  ...  METHODS Conditional restricted Boltzmann machine The conditional restricted Boltzmann machine (CRBM) was originally proposed to model the temporal dependency of human motion features [10] and was later  ... 
doi:10.1109/icassp.2016.7472654 dblp:conf/icassp/YinLHD16 fatcat:qxl6ptwssvdkrolg47tiim7kfm

Vowels and Prosody Contribution in Neural Network Based Voice Conversion Algorithm with Noisy Training Data

Olaide Ayodeji Agbolade
2020 European Journal of Engineering Research and Science  
This research presents a neural network based voice conversion model.  ...  determine the contributions of the voiced, unvoiced and supra-segmental components of sounds to the voice conversion model.  ...  A Recurrent Temporal Restricted Boltzmann Machine based voice conversion model was proposed by [19] and [20] .  ... 
doi:10.24018/ejers.2020.5.3.1802 fatcat:3tgb5f6vzvbk7chqul4lh6boqu

Simultaneous Conversion of Speaker Identity and Emotion Based on Multiple-Domain Adaptive RBM

Takuya Kishida, Shin Tsukamoto, Toru Nakashika
2020 Interspeech 2020  
In this paper, we propose a multiple-domain adaptive restricted Boltzmann machine (MDARBM) for simultaneous conversion of speaker identity and emotion.  ...  Index Terms: voice conversion, emotion conversion, speaker recognition, emotional speech recognition, generative model 20] probabilistic models are used for representing latent features that cannot be  ...  An MDARBM-based VC is an expansion of the adaptive restricted Boltzmann machine (ARBM)-based VC [12] .  ... 
doi:10.21437/interspeech.2020-2262 dblp:conf/interspeech/KishidaTN20 fatcat:vbjp6zwhgncnjnsnvtv6t3trry

Weighted Generative Adversarial Network for many-to-many Voice Conversion

Dipjyoti Paul, Yannis Pantazis, Yannis Stylianou
2019 Proceedings of the ICA congress  
The goal of voice conversion (VC) is to convert speech from a source speaker to that of a target, without changing phonetic contents.  ...  Over the time, several non-linear spectral mapping techniques based on restricted Boltzmann machine (RBM) [3] , feed-forward deep neural networks (DNNs) [4] and recurrent DNNs [5] were also proposed  ...  INTRODUCTION Voice conversion (VC) modifies the para/non-linguistic information contained in the speech uttered by a source speaker, while keeping the linguistic contents unchanged.  ... 
doi:10.18154/rwth-conv-239420 fatcat:uizpygmhifeitjvmmurvgii2cy

A brief survey on deep belief networks and introducing a new object oriented toolbox (DeeBNet) [article]

Mohammad Ali Keyvanrad, Mohammad Mehdi Homayounpour
2016 arXiv pre-print
Deep Belief Networks (DBNs) are deep architectures that use stack of Restricted Boltzmann Machines (RBM) to create a powerful generative model using training data.  ...  Nowadays, this is very popular to use the deep architectures in machine learning.  ...  It is explained how DBNs are constructed using Restricted Boltzmann Machines (RBMs). The Boltzmann Machine is a type of MRF.  ... 
arXiv:1408.3264v7 fatcat:3bwvoxgpvjbgfnaqx5dahmtwpy

Non-parallel dictionary learning for voice conversion using non-negative Tucker decomposition

Yuki Takashima, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
2019 EURASIP Journal on Audio, Speech, and Music Processing  
the conventional NMF-based dictionary-learning method [14], and an adaptive restricted Boltzmann machine (ARBM)-based method [20] that does not use parallel data.  ...  Other VC methods, such as approaches based on non-negative matrix factorization (NMF) [13] [14] [15], neural networks [16], deep learning [17, 18], restricted Boltzmann machines [19] [20] [21],  ...  Availability of data and materials: All data used in this study are included in the ATR Japanese speech database [41]. Ethics approval and consent to participate: Not applicable.  ... 
doi:10.1186/s13636-019-0160-1 fatcat:elgwsqt73ndzzotbnoplnwjx54

Modeling deep bidirectional relationships for image classification and generation

Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
2016 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
As with restricted Boltzmann machines (RBMs) and deep Boltzmann machines (DBMs), all connections (weights) between two adjacent layers are undirected.  ...  [4] , speech recognition [5] , and voice conversion [6] .  ...  The DRM will be defined as an energy-based model, which resembles a restricted Boltzmann machine (RBM) and a deep Boltzmann machine (DBM).  ... 
doi:10.1109/icassp.2016.7471892 dblp:conf/icassp/NakashikaTA16 fatcat:rmqaspf22bcabatyjt33flhwgm