
Music Similarity Estimation with the Mean-Covariance Restricted Boltzmann Machine

J. Schluter, C. Osendorfer
2011 2011 10th International Conference on Machine Learning and Applications and Workshops  
We train a recently proposed model, the mean-covariance Restricted Boltzmann Machine [1], on music spectrogram excerpts and employ it for music similarity estimation.  ...  Existing content-based music similarity estimation methods largely build on complex hand-crafted feature extractors, which are difficult to engineer.  ...  A recently proposed method for unsupervised feature extraction from images, the mean-covariance Restricted Boltzmann Machine (mcRBM) [1], has successfully been applied to spectrogram excerpts to model  ...
doi:10.1109/icmla.2011.102 dblp:conf/icmla/SchluterO11 fatcat:yftfkxqnjbchzpae7t3j3d4fi4

Autotagging music with conditional restricted Boltzmann machines [article]

Michael Mandel, Razvan Pascanu, Hugo Larochelle, Yoshua Bengio
2011 arXiv   pre-print
This paper describes two applications of conditional restricted Boltzmann machines (CRBMs) to the task of autotagging music.  ...  The second is the use of a discriminative RBM, a type of CRBM, to autotag music.  ...  Figure 5: Comparison of autotagging retrieval performance with and without conditional restricted Boltzmann machine-based smoothing for discriminative restricted Boltzmann machine (DRBM), multi-layer  ...
arXiv:1103.2832v1 fatcat:yynwmustszhehczs4n3rae4wdq

Complex-Valued Restricted Boltzmann Machine for Direct Speech Parameterization from Complex Spectra [article]

Toru Nakashika, Shinji Takaki, Junichi Yamagishi
2018 arXiv   pre-print
The proposed model, the complex-valued restricted Boltzmann machine (CRBM), is designed to deal with complex-valued visible units as an extension of the well-known restricted Boltzmann machine (RBM).  ...  Like the RBM, the CRBM learns the relationships between visible and hidden units without having connections between units in the same layer, which dramatically improves training efficiency by using Gibbs  ...  These restrictions make it exceedingly easy to estimate the parameters using Gibbs sampling or CD [2], which cannot be seen in an extension of a Boltzmann machine (directional-unit Boltzmann machine  ...
arXiv:1803.09946v1 fatcat:fiozg5v7ezewhms4queswqrwpe

Learning Musical Relations using Gated Autoencoders [article]

Stefan Lattner, Maarten Grachten, Gerhard Widmer
2017 arXiv   pre-print
In this preliminary work, we study the potential of two unsupervised learning techniques - Restricted Boltzmann Machines (RBMs) and Gated Autoencoders (GAEs) - to capture pre-defined transformations from  ...  We believe these results show that models such as GAEs may provide the basis for more encompassing music analysis systems, by endowing them with a better understanding of the structures underlying music  ...  Acknowledgments This work is supported by the European Research Council (ERC) under the EU's Horizon 2020 Framework Programme (ERC Grant Agreement number 670035, project CON ESPRESSIONE).  ... 
arXiv:1708.05325v1 fatcat:6zthgbdpvrbhfemrzi2idwpove

Learning Transformations of Musical Material using Gated Autoencoders

Stefan Lattner, Maarten Grachten, Gerhard Widmer
2017 Zenodo  
In this preliminary work, we study the potential of two unsupervised learning techniques—Restricted Boltzmann Machines (RBMs) and Gated Autoencoders (GAEs)—to capture pre-defined transformations from constructed  ...  We believe these results show that models such as GAEs may provide the basis for more encompassing music analysis systems, by endowing them with a better understanding of the structures underlying music  ...  Acknowledgments This work is supported by the European Research Council (ERC) under the EU's Horizon 2020 Framework Programme (ERC Grant Agreement number 670035, project CON ESPRESSIONE).  ... 
doi:10.5281/zenodo.4285594 fatcat:utcmpfk46nbxff3np4bc4s3anm

Speaker-adaptive-trainable Boltzmann machine and its application to non-parallel voice conversion

Toru Nakashika, Yasuhiro Minami
2017 EURASIP Journal on Audio, Speech, and Music Processing  
Speech signals are represented using a probabilistic model based on the Boltzmann machine that defines phonological information and speaker-related information explicitly.  ...  In the conversion stage, a given speech signal is decomposed into phonological and speaker-related information, the speaker-related information is replaced with that of the desired speaker, and then voice-converted  ...  Competing interests The authors declare that they have no competing interests.  ... 
doi:10.1186/s13636-017-0112-6 fatcat:2x4w7mc7rva6fadhixs4o66cgy

Contextual tag inference

Michael I. Mandel, Razvan Pascanu, Douglas Eck, Yoshua Bengio, Luca M. Aiello, Rossano Schifanella, Filippo Menczer
2011 ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)  
We show that users agree more on tags applied to clips temporally "closer" to one another; that conditional restricted Boltzmann machine models of tags can more accurately predict related tags when they  ...  This paper examines the use of two kinds of context to improve the results of content-based music taggers: the relationships between tags and between the clips of songs that are tagged.  ...  This work was partly supported by the project Social Integration of Semantic Annotation Networks for Web Applications funded by National Science Foundation award IIS-0811994.  ... 
doi:10.1145/2037676.2037689 fatcat:6j45wigqqvbb7pvjwwqtalyxoa

Learning Sparse Feature Representations For Music Annotation And Retrieval

Juhan Nam, Jorge Herrera, Malcolm Slaney, Julius O. Smith
2012 Zenodo  
Likewise, Schlüter et al. compared Restricted Boltzmann Machine (RBM), mean-covariance RBM and DBN on similarity-based music classification [17]. Our approach is also similar to these pipelines.  ...  Sparse Restricted Boltzmann Machine (sparse RBM): The Restricted Boltzmann Machine is a bipartite undirected graphical model that consists of visible nodes x and hidden nodes h [18].  ...
doi:10.5281/zenodo.1415202 fatcat:awjob44rmfdaplcskckqyeqvwm

Polyphonic Music Modelling with LSTM-RTRBM

Qi Lyu, Zhiyong Wu, Jun Zhu
2015 Proceedings of the 23rd ACM international conference on Multimedia - MM '15  
Our model integrates the ability of Long Short-Term Memory (LSTM) in memorizing and retrieving useful history information, together with the advantage of Restricted Boltzmann Machine (RBM) in high dimensional  ...  Our approach greatly improves the performance of polyphonic music sequence modelling, achieving the state-of-the-art results on multiple datasets.  ...  For example, the system in [5] could learn entire songs given a melody and the associated chord sequence, and the RNNs combined with restricted Boltzmann machines (RBMs) for feature representation [  ... 
doi:10.1145/2733373.2806383 dblp:conf/mm/LyuWZ15 fatcat:yop3t3tcw5apxbhk6gmn4puo5i

An Assessment of Learned Score Features for Modeling Expressive Dynamics in Music

Maarten Grachten, Florian Krebs
2014 IEEE transactions on multimedia  
For feature learning we use Restricted Boltzmann machines, and contrast this with features learned using matrix decomposition methods.  ...  The study of musical expression is an ongoing and increasingly data-intensive endeavor, in which machine learning techniques can play an important role.  ...  Restricted Boltzmann machines Boltzmann machines are stochastic neural networks, whose global state is characterized by an energy function (that depends on the activation of units, their biases and the  ... 
doi:10.1109/tmm.2014.2311013 fatcat:quo5uwabdrcblltoqiimtlvoii

DVAE#: Discrete Variational Autoencoders with Relaxed Boltzmann Priors [article]

Arash Vahdat, Evgeny Andriyash, William G. Macready
2018 arXiv   pre-print
We propose two approaches for relaxing Boltzmann machines to continuous distributions that permit training with importance-weighted bounds.  ...  Experiments on the MNIST and OMNIGLOT datasets show that these relaxations outperform previous discrete VAEs with Boltzmann priors.  ...  The underlying Boltzmann distribution is a restricted Boltzmann machine (RBM) with bipartite connectivity which allows for parallel Gibbs updates.  ... 
arXiv:1805.07445v4 fatcat:dimyikfoabgshejk3ub4qkhqke

Deep Learning Approach in DOA Estimation: A Systematic Literature Review

Shengguo Ge, Kuo Li, Siti Nurulain Binti Mohd Rum, Sang-Bing Tsai
2021 Mobile Information Systems  
Finally, various evaluation criteria (root-mean-squared error, accuracy, and mean absolute error) are used to evaluate the DL technology in DOA estimation, and various factors (signal-to-noise ratio, number  ...  In array signal processing, the direction of arrival (DOA) of the signal source has drawn broad research interests with its wide applications in fields such as sonar, radar, communications, medical detection  ...  Figure 2: General Boltzmann machine and restricted Boltzmann machine. Figure 3: Deep belief network and deep Boltzmann machine. Figure 6: Publications per year.  ...
doi:10.1155/2021/6392875 fatcat:jtmyuje6zff5bnonpui5qc2vym

A Review on Machine Learning for Audio Applications

Nagesh B (R V College of Engineering, Bengaluru, India), Dr. M. Uttara Kumari (R V College of Engineering, Bengaluru, India)
2021 Journal of University of Shanghai for Science and Technology  
There is a need for machine learning or deep learning algorithms which can be implemented so that the audio signal processing can be achieved with good results and accuracy.  ...  It deals with the manipulation of the audio signals to achieve a task like filtering, data compression, speech processing, noise suppression, etc. which improves the quality of the audio signal.  ...  Deep belief networks are a sort of generative graphical model that can be thought of as a stack of basic restricted Boltzmann machines. A DBN has both directed and undirected edges between layers.  ... 
doi:10.51201/jusst/21/06508 fatcat:iwj523grmnfm3awyiks6ebncgm

Representation Learning: A Review and New Perspectives

Y. Bengio, A. Courville, P. Vincent
2013 IEEE Transactions on Pattern Analysis and Machine Intelligence  
representation learning, density estimation and manifold learning.  ...  The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory  ...  Acknowledgments The author would like to thank David Warde-Farley, Razvan Pascanu and Ian Goodfellow for useful feedback, as well as NSERC, CIFAR and the Canada Research Chairs for funding.  ... 
doi:10.1109/tpami.2013.50 pmid:23787338 fatcat:2ozfdsn2bjaa5jwtdovszpr5xa

Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription [article]

Nicolas Boulanger-Lewandowski, Yoshua Bengio
2012 arXiv   pre-print
We investigate the problem of modeling symbolic sequences of polyphonic music in a completely general piano-roll representation.  ...  We show how our musical language model can serve as a symbolic prior to improve the accuracy of polyphonic transcription.  ...  Acknowledgments The authors would like to thank NSERC, CIFAR and the Canada Research Chairs for funding, and Compute Canada/Calcul Québec for computing resources.  ... 
arXiv:1206.6392v1 fatcat:64ynoazx2baofbjig6qm33qsze