Using weakly aligned score–audio pairs to train deep chroma models for cross-modal music retrieval

Frank Zalkow, Meinard Müller
2020 Zenodo  
In our study, we use weakly aligned score–audio pairs for training, where only the beginning and end of a score excerpt are annotated in an audio recording, without aligned correspondences in between.  ...  To exploit such weakly aligned data, we employ the Connectionist Temporal Classification (CTC) loss to train a deep learning model for computing an enhanced chroma representation.  ...  The International Audio Laboratories Erlangen are a joint institution of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Fraunhofer Institute for Integrated Circuits IIS.  ... 
doi:10.5281/zenodo.4245399 fatcat:rtla4xknznb7fbgl4hgsf4itqa

CTC-Based Learning of Chroma Features for Score–Audio Music Retrieval

Frank Zalkow, Meinard Müller
2021 IEEE/ACM Transactions on Audio Speech and Language Processing  
As one contribution, we show how to apply the Connectionist Temporal Classification (CTC) loss in the training procedure, which only uses weakly aligned training pairs.  ...  This paper deals with a score-audio music retrieval task where the aim is to find relevant audio recordings of Western classical music, given a short monophonic musical theme in symbolic notation as a  ...  Typical cross-modal retrieval strategies (e.g., for score-audio retrieval) employ a common mid-level representation to compare the different modalities.  ... 
doi:10.1109/taslp.2021.3110137 fatcat:ozokra6olbh7xfz6hnv3yjstx4

Training Deep Pitch-Class Representations With a Multi-Label CTC Loss

Christof Weiss, Geoffroy Peeters
2021 Zenodo  
For this reason, training a chroma representation using deep learning ("deep chroma") has become an interesting strategy.  ...  on corresponding score–audio segment pairs.  ...  We thank Curtis Wigington for advice on implementation and Meinard Müller and Frank Zalkow for fruitful discussions.  ... 
doi:10.5281/zenodo.5624359 fatcat:lmqwdtinzjfzdf6cv4qteagtqi

MULTIMODAL ANALYSIS: Informed content estimation and audio source separation [article]

Gabriel Meseguer-Brocal
2021 arXiv   pre-print
Among the many text sources related to music that can be used (e.g. reviews, metadata, or social network feedback), we concentrate on lyrics.  ...  Our study focuses on the audio and lyrics interaction for targeting source separation and informed content estimation.  ...  Our Teacher student Our goal is to improve our SVD system. We use the teacher to select the retrieved audio and align the annotation to it. This new data is used for training a new SVD system.  ... 
arXiv:2104.13276v3 fatcat:wirjfj4iwjgfteejmeujydey7u

Artificial Musical Intelligence: A Survey [article]

Elad Liebman, Peter Stone
2020 arXiv   pre-print
Computers have been used to analyze and create music since they were first introduced in the 1950s and 1960s.  ...  Beginning in the late 1990s, the rise of the Internet and large scale platforms for music recommendation and retrieval have made music an increasingly prevalent domain of machine learning and artificial  ...  Kapanci and Pfeffer treated the melody extraction problem from an audio-to-score matching perspective, and trained a graphical model to align an audio recording to a score, recovering melodic lines in  ... 
arXiv:2006.10553v1 fatcat:2j6i27wrsfawpgcr2unxdgngd4

Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis [article]

Alexander Schindler
2020 arXiv   pre-print
This thesis combines audio-analysis with computer vision to approach Music Information Retrieval (MIR) tasks from a multi-modal perspective.  ...  Evaluations range from low-level visual features to high-level concepts retrieved by means of Deep Convolutional Neural Networks.  ...  A common approach in audio-to-score alignment systems is to convert both music modalities into a comparable representation.  ... 
arXiv:2002.00251v1 fatcat:6cz6rivc3fbg7fahdsnokxfrk4

Affective Computing for Large-scale Heterogeneous Multimedia Data

Sicheng Zhao, Shangfei Wang, Mohammad Soleymani, Dhiraj Joshi, Qiang Ji
2019 ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)  
., image, music, and video), resulting in a great demand for managing, retrieving, and understanding these data.  ...  ., images, music, videos, and multimodal data, with the focus on both handcrafted features-based methods and deep learning methods.  ...  The more recent work mainly uses deep learning models [55, 93]. Language is a commonly used modality in addition to vision and audio.  ... 
doi:10.1145/3363560 fatcat:m56udtjlxrauvmj6d5z2r2zdeu

Table of Contents

2021 IEEE/ACM Transactions on Audio Speech and Language Processing  
Prawda Harmonic-Temporal Factor Decomposition for Unsupervised Monaural Separation of Harmonic Sounds ... Kameoka CTC-Based Learning of Chroma Features for Score-Audio Music Retrieval  ...  Inui Multimodal Processing of Language End-to-End Recurrent Cross-Modality Attention for Video Dialogue ... Y.-W. Chu, K.-Y. Lin, C.-C. Hsu, and L.  ... 
doi:10.1109/taslp.2021.3137066 fatcat:ocit27xwlbagtjdyc652yws4xa

A Comprehensive Survey on Deep Music Generation: Multi-level Representations, Algorithms, Evaluations, and Future Directions [article]

Shulei Ji, Jing Luo, Xinyu Yang
2020 arXiv   pre-print
This paper attempts to provide an overview of various composition tasks under different music generation levels, covering most of the currently popular music generation tasks using deep learning.  ...  levels of music generation: score generation produces scores, performance generation adds performance characteristics to the scores, and audio generation converts scores with performance characteristics  ...  In addition, they also created a new drum dataset named Groove MIDI Dataset with paired scores and performance for training above models.  ... 
arXiv:2011.06801v1 fatcat:cixou3d2jzertlcpb7kb5x5ery

Music Information Retrieval: Recent Developments and Applications

Markus Schedl, Emilia Gómez, Julián Urbano
2014 Foundations and Trends in Information Retrieval  
We first elaborate on well-established and proven methods for feature extraction and music indexing, from both the audio signal and contextual data sources about music items, such as web pages or collaborative  ...  We provide a survey of the field of Music Information Retrieval (MIR), in particular paying attention to latest developments, such as semantic auto-tagging and user-centric retrieval and recommendation  ...  Bosch for his comments on the manuscript. Furthermore, the authors would like to express their gratitude to the anonymous reviewers for their highly valuable suggestions for improving the manuscript.  ... 
doi:10.1561/1500000042 fatcat:c5tjdcy3xrfqvp6isnktbr6lpy

Musical Score Following and Audio Alignment [article]

Lin Hao Lee
2022 arXiv   pre-print
Real-time tracking of the position of a musical performance on a musical score, i.e. score following, can be useful in music practice, performance and production.  ...  implemented; it is shown that this CQT-based approach consistently and significantly outperforms a commonly used FFT-based approach in extracting audio features for score following.  ...  Acknowledgements I would like to thank my supervisor, Professor Patrick A. Naylor, for his invaluable guidance and suggestions.  ... 
arXiv:2205.03247v1 fatcat:oqpzybu66bfcvklidhoz4k3qca

Introduction [chapter]

2016 Music Data Analysis  
The interface between the computer and statistical sciences is increasing, as each discipline seeks to harness the power and resources of the other.  ...  This series aims to foster the integration between the computer sciences and statistical, numerical, and probabilistic methods by publishing a broad range of reference works, textbooks, and handbooks.  ...  Instead of finding similar artists, the problem of the Audio Music Similarity and Retrieval task in the annual Music Information Retrieval eXchange (MIREX) is to retrieve a set of suitable tracks, i.e.  ... 
doi:10.1201/9781315370996-5 fatcat:avooqogcpnbjngqmzuonil3exq

Colour Association with Music Is Mediated by Emotion: Evidence from an Experiment Using a CIE Lab Interface and Interviews

PerMagnus Lindborg, Anders K. Friberg, Xiaoang Wan
2015 PLoS ONE  
Using partial least squares regression, we tested models for predicting colour patch responses from audio features and ratings of perceived emotion in the music.  ...  The CIE Lab interface promises to be a useful tool in perceptual ratings of music and other sounds.  ...  It was used in [27] in an experiment aimed at validating perceptual features in music information retrieval.  ... 
doi:10.1371/journal.pone.0144013 pmid:26642050 pmcid:PMC4671663 fatcat:ygu3iqzusbbhth2hcwtzqea5ma

Proceedings of eNTERFACE 2015 Workshop on Intelligent Interfaces [article]

Matei Mancas, Christian Frisson, Joëlle Tilmanne, Nicolas d'Alessandro, Petr Barborka, Furkan Bayansar, Francisco Bernard, Rebecca Fiebrink, Alexis Heloir, Edgar Hemery, Sohaib Laraba, Alexis Moinet (+58 others)
2018 arXiv   pre-print
The 11th Summer Workshop on Multimodal Interfaces eNTERFACE 2015 was hosted by the Numediart Institute of Creative Technologies of the University of Mons from August 10th to September 2015.  ...  During the four weeks, students and researchers from all over the world came together in the Numediart Institute of the University of Mons to work on eight selected projects structured around intelligent  ...  The team would like to thank the Musée des Beaux-Arts of Lille for allowing us to stay several days to test the setup and perform the experiments.  ... 
arXiv:1801.06349v1 fatcat:qauytivdq5axxis2xlknp3r2ne

Nonlinear acoustics of water‐saturated marine sediments

Leif Bjørnø
1976 Journal of the Acoustical Society of America  
and Development to develop an on-train system for train accident reduction (STAR).  ...  language permitting about 10 •z possible sentences for an Information Retrieval task using a computerized data base.  ...  the scores for words that should be verified correctly versus those scores for false word hypotheses.  ... 
doi:10.1121/1.2003625 fatcat:xzwtedhhn5b2rfurl3f246z6m4