Using weakly aligned score–audio pairs to train deep chroma models for cross-modal music retrieval
2020
Zenodo
In our study, we use weakly aligned score–audio pairs for training, where only the beginning and end of a score excerpt are annotated in an audio recording, without aligned correspondences in between. ...
To exploit such weakly aligned data, we employ the Connectionist Temporal Classification (CTC) loss to train a deep learning model for computing an enhanced chroma representation. ...
The International Audio Laboratories Erlangen are a joint institution of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Fraunhofer Institute for Integrated Circuits IIS. ...
doi:10.5281/zenodo.4245399
fatcat:rtla4xknznb7fbgl4hgsf4itqa
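The training idea summarized in this entry, CTC over weakly aligned score–audio pairs, can be sketched in a few lines of PyTorch. This is a minimal illustration under assumed shapes (a 120-bin log-frequency spectrogram input, 12 pitch classes plus the CTC blank), not the authors' implementation; the tiny convolutional model is a placeholder.

```python
# Hedged sketch: CTC training on weakly aligned score–audio pairs.
# Shapes, model, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

NUM_CLASSES = 13  # 12 pitch classes + 1 CTC blank (index 0)

model = nn.Sequential(  # placeholder for the actual deep chroma network
    nn.Conv1d(120, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv1d(64, NUM_CLASSES, kernel_size=1),
)
ctc_loss = nn.CTCLoss(blank=0)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def training_step(spectrogram, pitch_classes):
    """spectrogram: float tensor (batch, 120, frames);
    pitch_classes: long tensor (batch, score_len) with values 0..11,
    taken from the score excerpt; no frame-level alignment is needed."""
    logits = model(spectrogram)                         # (batch, 13, frames)
    log_probs = logits.permute(2, 0, 1).log_softmax(2)  # (frames, batch, 13)
    batch, frames = spectrogram.size(0), log_probs.size(0)
    input_lengths = torch.full((batch,), frames, dtype=torch.long)
    target_lengths = torch.full((batch,), pitch_classes.size(1), dtype=torch.long)
    # shift targets by +1 so that 0 stays reserved for the blank symbol
    loss = ctc_loss(log_probs, pitch_classes + 1, input_lengths, target_lengths)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

CTC marginalizes over all monotonic alignments between the frame-wise predictions and the score's pitch-class sequence, which is what makes the weak, start/end-only annotation sufficient.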
CTC-Based Learning of Chroma Features for Score–Audio Music Retrieval
2021
IEEE/ACM Transactions on Audio, Speech, and Language Processing
As one contribution, we show how to apply the Connectionist Temporal Classification (CTC) loss in the training procedure, which only uses weakly aligned training pairs. ...
This paper deals with a score-audio music retrieval task where the aim is to find relevant audio recordings of Western classical music, given a short monophonic musical theme in symbolic notation as a ...
Typical cross-modal retrieval strategies (e.g., for score-audio retrieval) employ a common mid-level representation to compare the different modalities. ...
doi:10.1109/taslp.2021.3110137
fatcat:ozokra6olbh7xfz6hnv3yjstx4
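As a rough illustration of the mid-level strategy this entry describes, the sketch below maps a query and each database recording to chroma sequences and ranks recordings by subsequence-DTW matching cost. All names and parameters are assumptions; in the paper the chroma features are learned, whereas librosa's hand-crafted chroma_cqt stands in here.

```python
# Illustrative retrieval via a common chroma mid-level representation.
import librosa

def audio_chroma(path):
    y, sr = librosa.load(path)
    return librosa.feature.chroma_cqt(y=y, sr=sr)  # (12, frames)

def retrieve(query_chroma, recording_paths):
    """Rank recordings by subsequence-DTW cost against the query chroma.
    query_chroma (12, frames) would come from the symbolic theme."""
    ranked = []
    for path in recording_paths:
        D, _ = librosa.sequence.dtw(X=query_chroma, Y=audio_chroma(path),
                                    metric="cosine", subseq=True)
        # with subseq=True, the last row of D holds matching costs for
        # every possible end position within the recording
        ranked.append((D[-1, :].min() / query_chroma.shape[1], path))
    return sorted(ranked)  # lowest normalized cost first
```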
Training Deep Pitch-Class Representations With a Multi-Label CTC Loss
2021
Zenodo
For this reason, training a chroma representation using deep learning ("deep chroma") has become an interesting strategy. ...
on corresponding score–audio segment pairs. ...
We thank Curtis Wigington for advice on implementation and Meinard Müller and Frank Zalkow for fruitful discussions. ...
doi:10.5281/zenodo.5624359
fatcat:lmqwdtinzjfzdf6cv4qteagtqi
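The "multi-label" part means several pitch classes can sound simultaneously, so each target symbol is a pitch-class set rather than a single class. As a hypothetical sketch of how such targets could be derived from a score (pretty_midi and MIDI input are assumptions; the paper's own target construction may differ):

```python
# Hypothetical derivation of multi-label (pitch-class set) CTC targets.
import pretty_midi

def pitch_class_set_sequence(midi_path, hop=0.1):
    """For every hop-second frame, collect the set of sounding pitch classes,
    then collapse consecutive repeats into a label sequence for CTC."""
    pm = pretty_midi.PrettyMIDI(midi_path)
    frames = []
    t = 0.0
    while t < pm.get_end_time():
        active = frozenset(note.pitch % 12
                           for inst in pm.instruments if not inst.is_drum
                           for note in inst.notes
                           if note.start <= t < note.end)
        frames.append(active)
        t += hop
    seq = frames[:1]
    for f in frames[1:]:
        if f != seq[-1]:
            seq.append(f)
    return seq
```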
Multimodal Analysis: Informed Content Estimation and Audio Source Separation
[article]
2021
arXiv
pre-print
Among the many text sources related to music that can be used (e.g. reviews, metadata, or social network feedback), we concentrate on lyrics. ...
Our study focuses on the audio and lyrics interaction for targeting source separation and informed content estimation. ...
Teacher–student setup: our goal is to improve our SVD system. We use the teacher to select the retrieved audio and align the annotation to it; this new data is then used to train a new SVD system. ...
arXiv:2104.13276v3
fatcat:wirjfj4iwjgfteejmeujydey7u
Artificial Musical Intelligence: A Survey
[article]
2020
arXiv
pre-print
Computers have been used to analyze and create music since they were first introduced in the 1950s and 1960s. ...
Beginning in the late 1990s, the rise of the Internet and large-scale platforms for music recommendation and retrieval have made music an increasingly prevalent domain of machine learning and artificial ...
Kapanci and Pfeffer treated the melody extraction problem from an audio-to-score matching perspective, and trained a graphical model to align an audio recording to a score, recovering melodic lines in ...
arXiv:2006.10553v1
fatcat:2j6i27wrsfawpgcr2unxdgngd4
Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis
[article]
2020
arXiv
pre-print
This thesis combines audio-analysis with computer vision to approach Music Information Retrieval (MIR) tasks from a multi-modal perspective. ...
Evaluations range from low-level visual features to high-level concepts retrieved by means of Deep Convolutional Neural Networks. ...
A common approach in audio-to-score alignment systems is to convert both music modalities into a comparable representation. ...
arXiv:2002.00251v1
fatcat:6cz6rivc3fbg7fahdsnokxfrk4
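The comparable-representation idea in this entry usually means converting both the score (e.g., by synthesizing it) and the recording into chromagrams, then aligning them. A minimal sketch, assuming both sides have already been reduced to (12, frames) chroma arrays:

```python
# Minimal chroma-based audio-to-score alignment sketch (assumed inputs).
import numpy as np
import librosa

def align(score_chroma, audio_chroma):
    """Return frame-index pairs (score_frame, audio_frame) in forward order."""
    D, wp = librosa.sequence.dtw(X=score_chroma, Y=audio_chroma, metric="cosine")
    return np.flip(wp, axis=0)  # librosa reports the warping path end-to-start
```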
Affective Computing for Large-scale Heterogeneous Multimedia Data
2019
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
... (e.g., image, music, and video), resulting in a great demand for managing, retrieving, and understanding these data. ...
... e.g., images, music, videos, and multimodal data, with a focus on both handcrafted-feature-based methods and deep learning methods. ...
More recent work mainly uses deep learning models [55, 93]. Language is a commonly used modality in addition to vision and audio. ...
doi:10.1145/3363560
fatcat:m56udtjlxrauvmj6d5z2r2zdeu
Table of Contents
2021
IEEE/ACM Transactions on Audio, Speech, and Language Processing
... Prawda; Harmonic-Temporal Factor Decomposition for Unsupervised Monaural Separation of Harmonic Sounds ... Kameoka; CTC-Based Learning of Chroma Features for Score–Audio Music Retrieval ...
... Inui; Multimodal Processing of Language: End-to-End Recurrent Cross-Modality Attention for Video Dialogue ... Y.-W. Chu, K.-Y. Lin, C.-C. Hsu, and L. ...
doi:10.1109/taslp.2021.3137066
fatcat:ocit27xwlbagtjdyc652yws4xa
A Comprehensive Survey on Deep Music Generation: Multi-level Representations, Algorithms, Evaluations, and Future Directions
[article]
2020
arXiv
pre-print
This paper attempts to provide an overview of various composition tasks under different music generation levels, covering most of the currently popular music generation tasks using deep learning. ...
levels of music generation: score generation produces scores, performance generation adds performance characteristics to the scores, and audio generation converts scores with performance characteristics ...
In addition, they also created a new drum dataset named Groove MIDI Dataset with paired scores and performance for training above models. ...
arXiv:2011.06801v1
fatcat:cixou3d2jzertlcpb7kb5x5ery
Music Information Retrieval: Recent Developments and Applications
2014
Foundations and Trends in Information Retrieval
We first elaborate on well-established and proven methods for feature extraction and music indexing, from both the audio signal and contextual data sources about music items, such as web pages or collaborative ...
We provide a survey of the field of Music Information Retrieval (MIR), in particular paying attention to latest developments, such as semantic auto-tagging and user-centric retrieval and recommendation ...
Bosch for his comments on the manuscript. Furthermore, the authors would like to express their gratitude to the anonymous reviewers for their highly valuable suggestions for improving the manuscript. ...
doi:10.1561/1500000042
fatcat:c5tjdcy3xrfqvp6isnktbr6lpy
Musical Score Following and Audio Alignment
[article]
2022
arXiv
pre-print
Real-time tracking of the position of a musical performance on a musical score, i.e. score following, can be useful in music practice, performance and production. ...
implemented; it is shown that this CQT-based approach consistently and significantly outperforms a commonly used FFT-based approach in extracting audio features for score following. ...
Acknowledgements I would like to thank my supervisor, Professor Patrick A. Naylor, for his invaluable guidance and suggestions. ...
arXiv:2205.03247v1
fatcat:oqpzybu66bfcvklidhoz4k3qca
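The CQT-versus-FFT comparison reported in this entry is easy to set up for one's own experiments; a sketch using librosa, where the file name and hop length are placeholders:

```python
# Compute FFT-based and CQT-based chroma features for the same recording.
import librosa

y, sr = librosa.load("performance.wav")  # placeholder input file
chroma_fft = librosa.feature.chroma_stft(y=y, sr=sr, hop_length=512)
chroma_cqt = librosa.feature.chroma_cqt(y=y, sr=sr, hop_length=512)
# Both are (12, frames); feeding each variant into the same score follower
# reproduces the kind of FFT-vs-CQT comparison the thesis describes.
```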
Introduction
[chapter]
2016
Music Data Analysis
The interface between the computer and statistical sciences is increasing, as each discipline seeks to harness the power and resources of the other. ...
This series aims to foster the integration between the computer sciences and statistical, numerical, and probabilistic methods by publishing a broad range of reference works, textbooks, and handbooks. ...
Instead of finding similar artists, the problem of the Audio Music Similarity and Retrieval task in the annual Music Information Retrieval eXchange (MIREX) is to retrieve a set of suitable tracks, i.e. ...
doi:10.1201/9781315370996-5
fatcat:avooqogcpnbjngqmzuonil3exq
Colour Association with Music Is Mediated by Emotion: Evidence from an Experiment Using a CIE Lab Interface and Interviews
2015
PLoS ONE
Using partial least squares regression, we tested models for predicting colour patch responses from audio features and ratings of perceived emotion in the music. ...
The CIE Lab interface promises to be a useful tool in perceptual ratings of music and other sounds. ...
It was used in [27] in an experiment aimed at validating perceptual features in music information retrieval. ...
doi:10.1371/journal.pone.0144013
pmid:26642050
pmcid:PMC4671663
fatcat:ygu3iqzusbbhth2hcwtzqea5ma
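The modeling step named in this entry, partial least squares regression from audio and emotion features to colour-patch responses, looks roughly like the following in scikit-learn; the random arrays are placeholders, not the study's data:

```python
# Placeholder PLS regression from feature vectors to CIE Lab responses.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))  # e.g. audio features plus emotion ratings
Y = rng.normal(size=(100, 3))   # CIE L*, a*, b* colour-patch coordinates

pls = PLSRegression(n_components=5)
pls.fit(X, Y)
predicted_lab = pls.predict(X)  # (100, 3) predicted colour coordinates
```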
Proceedings of eNTERFACE 2015 Workshop on Intelligent Interfaces
[article]
2018
arXiv
pre-print
The 11th Summer Workshop on Multimodal Interfaces eNTERFACE 2015 was hosted by the Numediart Institute of Creative Technologies of the University of Mons from August 10th to September 2015. ...
During the four weeks, students and researchers from all over the world came together in the Numediart Institute of the University of Mons to work on eight selected projects structured around intelligent ...
The team would like to thank the Musée des Beaux-Arts of Lille for allowing us to stay several days to test the setup and perform the experiments. ...
arXiv:1801.06349v1
fatcat:qauytivdq5axxis2xlknp3r2ne
Nonlinear acoustics of water‐saturated marine sediments
1976
Journal of the Acoustical Society of America
and Development to develop an on-train System for Train Accident Reduction (STAR). ...
a language permitting about 10^12 possible sentences for an information retrieval task using a computerized database. ...
the scores for words that should be verified correctly versus the scores for false word hypotheses. ...
doi:10.1121/1.2003625
fatcat:xzwtedhhn5b2rfurl3f246z6m4
Showing results 1 — 15 out of 20 results