1,069 Hits in 9.0 sec

Using Sequence Alignment And Voting To Improve Optical Music Recognition From Multiple Recognizers

Esben Paul Bugge, Kim Lundsteen Juncher, Brian Søborg Mathiasen, Jakob Grue Simonsen
2011 Zenodo  
Our results confirm the earlier work of Byrd et al. suggesting that recognizers may be improved somewhat by sequence alignment and voting, but that more elaborate methods may be needed to obtain substantial  ...  We have shown that a simple OMR system based on multiple recognizers and sequence alignment can outperform the commercially available tools.  ...  Progressive sequence alignment of symbolic music data and voting: We briefly outline the method for multiple sequence alignment below.  ... 
doi:10.5281/zenodo.1418174 fatcat:fbewd7gy5bhwrma3qss23w6q6a
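The align-and-vote idea behind the entry above can be illustrated with a minimal sketch. This is a hypothetical example, not the authors' implementation: it assumes note symbols are plain string labels, aligns each recognizer's output to the first one with textbook Needleman-Wunsch scoring, and takes a per-position majority vote.

```python
from collections import Counter

GAP = "-"

def align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment of two symbol sequences."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back to recover the two gapped sequences.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + \
                (match if a[i - 1] == b[j - 1] else mismatch):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append(GAP); i -= 1
        else:
            out_a.append(GAP); out_b.append(b[j - 1]); j -= 1
    return out_a[::-1], out_b[::-1]

def vote(outputs):
    """Align every recognizer output to the first one, then take a
    per-position majority vote (gap symbols lose the vote)."""
    reference = outputs[0]
    columns = [[sym] for sym in reference]
    for other in outputs[1:]:
        ref_aln, oth_aln = align(reference, other)
        k = 0
        for r, o in zip(ref_aln, oth_aln):
            if r == GAP:
                continue  # simplification: symbols inserted by `other` are dropped
            columns[k].append(o)
            k += 1
    result = []
    for col in columns:
        sym, _ = Counter(col).most_common(1)[0]
        if sym != GAP:
            result.append(sym)
    return result

# Three noisy recognizer outputs for the same bar of music:
r1 = ["C4", "E4", "G4", "C5"]
r2 = ["C4", "F4", "G4", "C5"]   # one substitution error
r3 = ["C4", "E4", "C5"]         # one deletion error
print(vote([r1, r2, r3]))       # → ['C4', 'E4', 'G4', 'C5']
```

A real system would use a full progressive multiple alignment rather than star alignment against one arbitrary reference, but the voting step is the same: each aligned column is decided by the symbol most recognizers agree on.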

Improving Optical Music Recognition By Combining Outputs From Multiple Sources

Victor Padilla, Alex McLean, Alan Marsden, Kia Ng
2015 Zenodo  
One potential solution to this is to use Optical Music Recognition (OMR) software to generate symbolic data such as MusicXML from score images.  ...  Bottom-Up Alignment and Correction, Single Parts Bottom-up alignment is applied to sequences of symbols from a single staff or a sequence of staves that correspond to a single part in the music.  ... 
doi:10.5281/zenodo.1416258 fatcat:bkavoupgfffzjhtr3m3kgazyom

MIDI-assisted egocentric optical music recognition

Liang Chen, Kun Duan
2016 2016 IEEE Winter Conference on Applications of Computer Vision (WACV)  
We formulate the problem as a structured sequence alignment problem as opposed to the blind recognition in traditional OMR systems.  ...  We view our work as the first step towards egocentric optical music recognition, and believe it will bring insights for next-generation music pedagogy and music entertainment.  ...  We prune the least voted hypothesized models and only keep those satisfying two different criteria  ...  We model egocentric optical music recognition as a note sequence alignment  ... 
doi:10.1109/wacv.2016.7477714 dblp:conf/wacv/ChenD16 fatcat:of42g3zb3vbgvathkeqc2szsvi

Prospects For Improving Omr With Multiple Recognizers

Donald Byrd, Megan Schindele
2006 Zenodo  
Mellon Foundation; we are particularly grateful to Suzanne Lodato and Don Waters of the Foundation for their interest in our work.  ...  Acknowledgements: It is a pleasure to acknowledge the assistance of Michael Droettboom and Ichiro Fujinaga (information on OMR technology, especially Gamut/Gamera); Bill Clemmons, Kelly Demoline, Andy Glick  ...  Recognizers and Multiple Recognizers in Text and Music: This report describes research on an approach to improved OMR (Optical Music Recognition) that, to our knowledge, has never been tried with music  ... 
doi:10.5281/zenodo.1418306 fatcat:7rgrsnalv5dhdfu2gk4znr3klm

Tools For Annotating Musical Measures In Digital Music Editions

Yevgen Mexin, Aristotelis Hadjakos, Axel Berndt, Simon Waloschek, Anastasia Wawilow, Gerd Szwillus
2017 Proceedings of the SMC Conferences  
(Abstract to follow)  ...  The majority voting approach is used to decide which elements are correctly recognized.  ...  The optical measure recognition for lute tablatures can be performed relatively robustly because they do not contain complex systems of multiple voices such as in other music scores.  ... 
doi:10.5281/zenodo.1401941 fatcat:clknpglbyjcergz5nkhgbtfkjm

Image Quality Estimation For Multi-Score Omr

Dan Ringwalt, Roger B. Dannenberg
2015 Zenodo  
INTRODUCTION Optical music recognition (OMR) is the problem of converting scanned music scores into a symbolic format such as MIDI.  ...  Furthermore, alignment-based MS-OMR systems require a multiple sequence alignment, and finding the globally optimal such alignment is NP-complete [26] .  ... 
doi:10.5281/zenodo.1414890 fatcat:3hf5yxuz2rgypp3alnvqcbjeji

Beat Estimation from Musician Visual Cues

Sutirtha Chakraborty, Senem Aktaş, William Clifford, Joseph Timoney
2021 Zenodo  
Decomposition and filtering algorithms were used to clean and fuse multiple signals.  ...  dataset was used to carry out a comparative study of two different approaches: (a) motiongram, and (b) pose estimation, to detect phase from body sway.  ...  We used OpenPose, an improved and robust pose estimation model that applies Part Affinity Fields (PAFs) to predict body keypoints for multiple humans [17].  ... 
doi:10.5281/zenodo.5045006 fatcat:gr4emnloarhqfchyowjnuozwfe

Curvature: A signature for Action Recognition in Video Sequences [article]

He Chen, Gregory S. Chirikjian
2019 arXiv   pre-print
The video sequence, viewed as a curve in pixel space, is aligned by reparameterization using the arclength of the curve in pixel space.  ...  Moreover, we see latent capacity in transferring this idea into other sequence-based recognition applications such as speech recognition, machine translation, and text generation.  ...  We also appreciate Mengdi Xu, Thomas Mitchel, Weixiao Liu, and Sipu Ruan for discussion.  ... 
arXiv:1904.13003v2 fatcat:sguopc43inewpedemgkqok462i

Optical music recognition: state-of-the-art and open issues

Ana Rebelo, Ichiro Fujinaga, Filipe Paszkiewicz, Andre R. S. Marcal, Carlos Guedes, Jaime S. Cardoso
2012 International Journal of Multimedia Information Retrieval  
Programs analogous to optical character recognition systems called optical music recognition (OMR) systems have been under intensive development for many years.  ...  However, the results to date are far from ideal. Each of the proposed methods emphasizes different properties and therefore makes it difficult to effectively evaluate its competitive advantages.  ...  sequences are time-aligned using algorithms based on dynamic time warping (DTW).  ... 
doi:10.1007/s13735-012-0004-6 fatcat:bdzaxwyzcvgajhz3nsk26zbwfa
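The snippet above mentions time-aligning sequences with dynamic time warping (DTW). As a minimal illustration (a generic textbook DTW, not any surveyed system's implementation), the following computes the warped distance between two pitch sequences using absolute pitch difference as the local cost; a time-stretched performance of the same melody aligns at zero cost:

```python
def dtw_distance(x, y):
    """Classic dynamic time warping distance between two 1-D sequences,
    with absolute difference as the local cost."""
    n, m = len(x), len(y)
    INF = float("inf")
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of: diagonal match, insertion, deletion.
            d[i][j] = cost + min(d[i - 1][j - 1], d[i - 1][j], d[i][j - 1])
    return d[n][m]

# MIDI pitches of a score and a performance where notes are held longer:
score_pitches = [60, 62, 64, 65]
performance   = [60, 60, 62, 64, 64, 65]
print(dtw_distance(score_pitches, performance))   # → 0.0
```

Because DTW allows one element to match a run of elements in the other sequence, tempo variation is absorbed for free, which is why it is the standard tool for score-to-performance alignment.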

Emotion Recognition from Multiple Modalities: Fundamentals and Methodologies [article]

Sicheng Zhao, Guoli Jia, Jufeng Yang, Guiguang Ding, Kurt Keutzer
2021 arXiv   pre-print
Enabling machines to have emotional intelligence, i.e., recognizing, interpreting, processing, and simulating emotions, is becoming increasingly important.  ...  In this tutorial, we discuss several key aspects of multi-modal emotion recognition (MER). We begin with a brief introduction on widely used emotion representation models and affective modalities.  ...  Some methods extract optical flow from gait videos and then extract sequence representations using these networks.  ... 
arXiv:2108.10152v1 fatcat:hwnq7hoiqba3pdf6aakcxjq33i

A Data Fusion Perspective on Human Motion Analysis Including Multiple Camera Applications [chapter]

Rodrigo Cilla, Miguel A. Patricio, Antonio Berlanga, José M. Molina
2013 Lecture Notes in Computer Science  
This paper presents a view of human motion analysis from the viewpoint of data fusion. The JDL process model and Dasarathy's input-output hierarchy are employed to categorize the works in the area.  ...  A survey of the literature in human motion analysis from multiple cameras is included. Future research directions in the area are identified after this review.  ...  Section 4 surveys the area of human action recognition from multiple cameras.  ... 
doi:10.1007/978-3-642-38622-0_16 fatcat:7xlfqdxp3nftpinstim5vzavdm

Foley Music: Learning to Generate Music from Videos [chapter]

Chuang Gan, Deng Huang, Peihao Chen, Joshua B. Tenenbaum, Antonio Torralba
2020 Lecture Notes in Computer Science  
We first identify two key intermediate representations for a successful video to music generator: body keypoints from videos and MIDI events from audio recordings.  ...  More importantly, the MIDI representations are fully interpretable and transparent, thus enabling us to perform music editing flexibly.  ...  This work is supported by ONR MURI N00014-16-1-2007, the Center for Brain, Minds, and Machines (CBMM, NSF STC award CCF-1231216), and IBM Research.  ... 
doi:10.1007/978-3-030-58621-8_44 fatcat:7rcvic77mjbkxmrmx4r6vgvw3i

Foley Music: Learning to Generate Music from Videos [article]

Chuang Gan, Deng Huang, Peihao Chen, Joshua B. Tenenbaum, Antonio Torralba
2020 arXiv   pre-print
We first identify two key intermediate representations for a successful video to music generator: body keypoints from videos and MIDI events from audio recordings.  ...  More importantly, the MIDI representations are fully interpretable and transparent, thus enabling us to perform music editing flexibly.  ...  This work is supported by ONR MURI N00014-16-1-2007, the Center for Brain, Minds, and Machines (CBMM, NSF STC award CCF-1231216), and IBM Research.  ... 
arXiv:2007.10984v1 fatcat:a5ktcxsufnftvdtqb7j4rmnc44

Sports Video Analysis: Semantics Extraction, Editorial Content Creation and Adaptation

Changsheng Xu, Jian Cheng, Yi Zhang, Yifan Zhang, Hanqing Lu
2009 Journal of Multimedia  
generation, tactic analysis, player action recognition, virtual content insertion, and mobile sports video adaptation.  ...  Advances in computing, networking, and multimedia technologies have led to a tremendous growth of sports video content and accelerated the need of analysis and understanding of sports video content.  ...  We used it to label the visual feature sequence extracted from the video in order to determine the exact boundaries of events. B.  ... 
doi:10.4304/jmm.4.2.69-79 fatcat:xytusontr5cyxlxpyqgljnhkqu

Multimodal music information processing and retrieval: survey and future challenges [article]

Federico Simonetta, Stavros Ntalampiras, Federico Avanzini
2019 arXiv   pre-print
Towards improving the performance in various music information processing tasks, recent studies exploit different modalities able to capture diverse aspects of music.  ...  Subsequently, we analyze existing information fusion approaches, and we conclude with the set of challenges that Music Information Retrieval and Sound and Music Computing research communities should focus  ...  About the first point, more effort should be devoted to the development of algorithms for the alignment of various sequences.  ... 
arXiv:1902.05347v1 fatcat:i2indkxk3vcmxajn6ajkh56wva
Showing results 1 — 15 out of 1,069 results