Score-informed syllable segmentation for a cappella singing voice with convolutional neural networks [article]

Jordi Pons, Rong Gong, Xavier Serra
2017 arXiv   pre-print
This paper introduces a new score-informed method for the segmentation of jingju a cappella singing phrases into syllables.  ...  Then, we identify the challenges that jingju a cappella singing poses. Further, we investigate how to improve the syllable ODF estimation with convolutional neural networks (CNNs).  ...  "Score-informed syllable segmentation for a cappella singing voice with convolutional neural networks", 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017.  ... 
arXiv:1707.03544v1 fatcat:3pnipfudefgsnh3mdojpy7vi5y

Score-Informed Syllable Segmentation For A Cappella Singing Voice With Convolutional Neural Networks

Jordi Pons, Rong Gong, Xavier Serra
2017 Zenodo  
"Score-informed syllable segmentation for a cappella singing voice with convolutional neural networks", 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017.  ...  In this paper we tackle the problem of score-informed automatic syllable segmentation for a cappella singing (bold rectangle in Figure 1).  ... 
doi:10.5281/zenodo.1415631 fatcat:b4wo7fug2bajvhoe5pekrtc55u

Score-Informed Syllable Segmentation For Jingju A Cappella Singing Voice With Mel-Frequency Intensity Profiles

Rong Gong, Nicolas Obin, Georgi Dzhambazov, Xavier Serra
2017 Zenodo  
These are the presentation slides for the paper "Score-Informed Syllable Segmentation for Jingju a Cappella Singing Voice with Mel-Frequency Intensity Profiles", presented in Session 4 of the International Workshop on Folk  ...  Topics covered include background on jingju (Beijing opera) singing, a traditional Chinese art form; intensity-based and convolutional neural network (CNN) based onset detection functions; and score-informed onset sequence decoding, illustrated with a laosheng aria class in The National  ... 
doi:10.5281/zenodo.814800 fatcat:w32fkjqpw5cl7mhiccllgzyjxy

Automatic Assessment Of Singing Voice Pronunciation: A Case Study With Jingju Music

Rong Gong, Xavier Serra
2018 Zenodo  
Automatic singing voice assessment, as an important task in Music Information Research (MIR), aims to extract musically meaningful information and measure the quality of learners' singing voice.  ...  However, online music learning cannot be extended to a large scale unless there is an automatic system to provide assessment feedback on the students' music performances.  ...  Acknowledgements I am grateful to my stuttering -my lifelong companion, who teaches me compassion for the weak, patience, modesty and to never give up.  ... 
doi:10.5281/zenodo.1490343 fatcat:f3mrhstkdff6ppmdadeasfuo7m

Audio to score matching by combining phonetic and duration information [article]

Rong Gong, Jordi Pons, Xavier Serra
2017 arXiv   pre-print
We approach the singing phrase audio to score matching problem by using phonetic and duration information - with a focus on studying the jingju a cappella singing case.  ...  Three acoustic models are investigated: (i) convolutional neural networks (CNNs), (ii) deep neural networks (DNNs) and (iii) Gaussian mixture models (GMMs).  ...  ACKNOWLEDGEMENTS We are grateful for the GPUs donated by NVidia.  ... 
arXiv:1707.03547v1 fatcat:cnihiu72rbcm3g2sfou67u3r5u

Knowledge-Based Probabilistic Modeling For Tracking Lyrics In Music Audio Signals

Georgi Dzhambazov, Xavier Serra
2017 Zenodo  
This confirms that music-specific knowledge is an important stepping stone for computationally tracking lyrics, especially in the challenging case of singing with instrumental accompaniment.  ...  In one model we exploit the fact that the expected syllable durations depend on their position within a lyrics line.  ...  However, a shortcoming of convolutional neural networks is the necessity of a big amount of clean singing voice training data, which was not available for OTMM.  ... 
doi:10.5281/zenodo.841980 fatcat:tohf6dcvobhe3ei77nvp3wg3ba

Peking Opera Synthesis via Duration Informed Attention Network [article]

Yusong Wu, Shengchen Li, Chengzhu Yu, Heng Lu, Chao Weng, Liqiang Zhang, Dong Yu
2020 arXiv   pre-print
In this work, we propose to deal with this issue and synthesize expressive Peking Opera singing from the music score based on the Duration Informed Attention Network (DurIAN) framework.  ...  This inconsistency poses a great challenge in Peking Opera singing voice synthesis from a music score.  ...  As a typical case among such systems, the Neural Parametric Singing Synthesizer (NPSS) [2] uses a phoneme timing model, a pitch model and a timbre model, each consisting of a set of neural networks.  ... 
arXiv:2008.03029v1 fatcat:3g2jurc2sbbxlg46nebjprbvbi

Peking Opera Synthesis via Duration Informed Attention Network

Yusong Wu, Shengchen Li, Chengzhu Yu, Heng Lu, Chao Weng, Liqiang Zhang, Dong Yu
2020 Interspeech 2020  
In this work, we propose to deal with this issue and synthesize expressive Peking Opera singing from the music score based on the Duration Informed Attention Network (DurIAN) framework.  ...  This inconsistency poses a great challenge in Peking Opera singing voice synthesis from a music score.  ...  As a typical case among such systems, the Neural Parametric Singing Synthesizer (NPSS) [2] uses a phoneme timing model, a pitch model and a timbre model, each consisting of a set of neural networks.  ... 
doi:10.21437/interspeech.2020-1724 dblp:conf/interspeech/WuLYLWZY20 fatcat:4cmlb5kdzvet7eofmlgtoijtey

Towards an efficient deep learning model for musical onset detection [article]

Rong Gong, Xavier Serra
2018 arXiv   pre-print
Our experiments are conducted using two different datasets: one mainly consists of instrumental music excerpts, and the other, developed by ourselves, includes only solo singing voice excerpts.  ...  a different dataset.  ...  Score-informed syllable segmentation for a cappella singing voice with convolutional neural networks.  ... 
arXiv:1806.06773v2 fatcat:4d36evxcwbcdxcgity5eas5umi

Motivic Pattern Classification of Music Audio Signals Combining Residual and LSTM Networks

Aitor Arronte Alvarez, Francisco Gómez
2021 International Journal of Interactive Multimedia and Artificial Intelligence  
Convolutional Neural Networks (CNNs) have proven to be very effective algorithms in image classification.  ...  Recent work in large-scale audio classification has shown that CNN architectures, originally developed for image problems, can be applied successfully to audio event recognition and classification with  ...  A state-of-the-art Convolutional Recurrent Neural Network (CRNN) architecture developed specifically for music classification is also used as a baseline [26].  ... 
doi:10.9781/ijimai.2021.01.003 fatcat:bchjriosjfgylgu626dvet3y5a

Catch-A-Waveform: Learning to Generate Audio from a Single Short Example [article]

Gal Greshler, Tamar Rott Shaham, Tomer Michaeli
2021 arXiv   pre-print
This enables a long line of interesting applications, including generating new jazz improvisations or new a-cappella rap variants based on a single short example, producing coherent modifications to famous  ...  Models for audio generation are typically trained on hours of recordings.  ...  In the multi-scale generation scheme of Eq. (2), z_n is white Gaussian noise, G_n is a convolutional neural network generator, α_n = d_{n+1}/d_n is the resolution ratio between scales n+1 and n, and (·)↑α stands for up-sampling by α.  ... 
arXiv:2106.06426v2 fatcat:7m442n2fj5h2li4xwk2pmsk2ie

The musical dimension of Chinese traditional theatre: An analysis from computer aided musicology

Rafael Caro Repetto, Manel Ollé, Xavier Serra
2018 Zenodo  
I propose a novel approach based on computer-aided musicology. A corpus of machine-readable music scores for 92 arias is created, covering 899 melodic lines.  ...  To support and expand these results, a series of computational tools are developed to computationally extract statistical and quantitative information.  ...  In a more recent implementation of this method, developed in collaboration with Jordi Pons, the authors draw on Convolutional Neural Networks (CNNs) to improve the ODT, since a high accuracy for segment  ... 
doi:10.5281/zenodo.2030599 fatcat:v4sbggmrzjb5tg5wlsess7kie4

D2.1 Libraries and tools for multimodal content analysis

David Doukhan, Danny Francis, Benoit Huet, Sami Keronen, Mikko Kurimo, Jorma Laaksonen, Tiina Lindh-Knuutila, Bernard Merialdo, Mats Sjöberg, Umut Sulubacak, Jörg Tiedemann, Kim Viljanen
2018 Zenodo  
This deliverable describes a joint collection of libraries and tools for multimodal content analysis created by the MeMAD project partners.  ...  As part of this deliverable, the open source components have been gathered into a joint software collection of tools and libraries publicly available on GitHub.  ...  Acknowledgements Computational resources were provided by the Aalto Science-IT project and the CSC - IT Center for Science, Finland.  ... 
doi:10.5281/zenodo.3697989 fatcat:bde5x3yggzb2jk2fh2mu6t5wxy

Data-driven Vocal Pitch Extraction for Indian Art Music Melodic Analysis

Genís Plaja I Roglans, Xavier Serra, Marius Miron
2021 Zenodo  
and testing of a state-of-the-art vocal pitch extraction method to obtain a trained model that outperforms existing proposals for Indian Art Music signals.  ...  In this work, we aim at overcoming this issue by addressing two main contributions: (1) the creation of a dataset of properly annotated vocal melody for Indian Art Music, and (2) the training, evaluation  ...  The model also includes a fully convolutional branch that acts as a voicing detection system to improve the VR and VFA scores.  ... 
doi:10.5281/zenodo.5554703 fatcat:poiog55knrhfjncg7ri5lxyjiy

Proceedings of the 10th International Workshop on Folk Music Analysis (FMA2022) [article]

Andre Holzapfel, Ali-MacLachlan
2022 Zenodo  
We would like to thank Xavier Serra for the keynote talk that built bridges to the AAWM community. We would like to thank the musicians who created a great atmosphere during the event.  ...  information retrieval researchers.  ...  We thank Gakuto Chiba, Melody Ho, and Jiei Kuroyanagi for early help with recording curation and analysis, and PQP's research assistants for providing ground-truth ratings of pitch discreteness.  ... 
doi:10.5281/zenodo.7100287 fatcat:dyptbi4xzjhnxgl2kydolgbvna