Multi-modal Speech Processing Methods: An Overview and Future Research Directions Using a MATLAB Based Audio-Visual Toolbox [chapter]

Andrew Abel, Amir Hussain
2009 Lecture Notes in Computer Science  
This paper presents an overview of the main multi-modal speech enhancement methods reported to date.  ...  Finally, some future research directions in the area of multi-modal speech processing are outlined, including future research that the authors aim to carry out with the aid of this newly developed audio-visual  ...  Acknowledgements This work was funded with the aid of a euCognition travel grant and a research studentship from the University of Stirling. Thanks are also due to Prof.  ... 
doi:10.1007/978-3-642-00525-1_12 fatcat:isqp4onudzesjl36ivziw26hc4

Learning Deep and Wide: A Spectral Method for Learning Deep Networks

Ling Shao, Di Wu, Xuelong Li
2014 IEEE Transactions on Neural Networks and Learning Systems  
Chapter 6: Conclusion and Future Directions. This chapter briefly summarises the contributions and discusses possible future research directions.  ...  It begins with a brief technical overview of Restricted Boltzmann Machines (RBMs) from an energy-based model aspect.  ...  An illustration of the RGB, depth (with user segmentation) and skeletal modalities is shown in Fig. 6.2.  ... 
doi:10.1109/tnnls.2014.2308519 pmid:25420251 fatcat:4mnl6tv2xnf3jpzwhp76cvl4ti

An Overview on Perceptually Motivated Audio Indexing and Classification

Gael Richard, Shiva Sundaram, Shrikanth Narayanan
2013 Proceedings of the IEEE  
including affect-based audio retrieval.  ...  Since the resulting audio classification and indexing is meant for direct human consumption, it is highly desirable that it produces perceptually relevant results.  ...  Another direction of research is to use multiple sources of information, for example coming from multiple modalities.  ... 
doi:10.1109/jproc.2013.2251591 fatcat:myywr5bztzeezi7mity4gwnpha

Multimodal Data Fusion: An Overview of Methods, Challenges, and Prospects

Dana Lahat, Tulay Adali, Christian Jutten
2015 Proceedings of the IEEE  
We use the term "modality" for each such acquisition framework.  ...  In order to address the second question, "diversity" is introduced as a key concept, and a number of data-driven solutions based on matrix and tensor decompositions are discussed, emphasizing how they account  ...  , whose expertise, insightful remarks and feedback have greatly helped extend the scope of this paper; and the anonymous reviewers, for their careful reading and valuable remarks.  ... 
doi:10.1109/jproc.2015.2460697 fatcat:ve7t3be66zfnnahgb7xrlt2lri

A 3D Simplification Method based on Dual Point Sampling

Juan Cao, Yitian Zhao, Ran Song, Yingchun Zhang
2013 Journal of Multimedia  
The watermark information is embedded in a halftone process for image screening with phase modulation method. The watermark is extracted by a template, which is optimized using the PSO algorithm.  ...  A watermarking algorithm against print-and-scan attack based on PSO (Particle Swarm Optimization) is proposed in this paper.  ...  We also thank the anonymous reviewers for their suggestions and feedback that helped us to improve the quality of this paper.  ... 
doi:10.4304/jmm.8.3.191-197 fatcat:z5gmzeiwqfadlakggntajifdru

Modulation Spectral Signal Representation for Quality Measurement and Enhancement of Wearable Device Data: A Technical Note

Abhishek Tiwari, Raymundo Cassani, Shruti Kshirsagar, Diana P. Tobon, Yi Zhu, Tiago H. Falk
2022 Sensors  
We conclude with a discussion on possible future research directions, such as context awareness, signal compression, and improved input representations for deep learning algorithms.  ...  In this technical note, we overview a signal processing representation called the modulation spectrum.  ...  The University of Washington Modulation Matlab Toolbox [94] has been available since the early 2000s and has been widely used by the authors, especially in earlier work involving speech and audio signals  ... 
doi:10.3390/s22124579 pmid:35746361 pmcid:PMC9229858 fatcat:wa6feesbz5gitgxbmqynhnvw2e

Multimodal Affect Models: An Investigation of Relative Salience of Audio and Visual Cues for Emotion Prediction

Jingyao Wu, Ting Dang, Vidhyasaharan Sethu, Eliathamby Ambikairajah
2021 Frontiers in Computer Science  
People perceive emotions via multiple cues, predominantly speech and visual cues, and a number of emotion recognition systems utilize both audio and visual cues.  ...  In this paper, we investigate the relative salience of audio and video modalities to emotion state prediction and emotion change prediction using a Multimodal Markovian affect model.  ...  Backend Implementation The OMSVM subsystem for AOL prediction is implemented using ClassificationECOC MATLAB toolbox (an error correction output code multi-class classifier) (Escalera et al., 2009).  ... 
doi:10.3389/fcomp.2021.767767 fatcat:mndainbz2raj7ce4iubnwqqolm

Data Analysis Methods for Software Systems

Jolita Bernatavičienė
2021 Vilnius University Proceedings  
It started as the workshop and has now grown into a well-known conference.  ...  DAMSS-2021 is the 12th international conference on data analysis methods for software systems, organized in Druskininkai, Lithuania. The same place and the same time every year.  ...  In this work, we present a method to compare music based entirely on its audio signal properties.  ... 
doi:10.15388/damss.12.2021 fatcat:iefv6bz3drcrfpcwxoaqmu3gra

REPP: A robust cross-platform solution for online sensorimotor synchronization experiments

Manuel Anglada-Tort, Peter M. C. Harrison, Nori Jacoby
2022 Behavior Research Methods  
We also show that REPP is fully automated and customizable, enabling researchers to monitor experiments in real time and to implement a wide variety of SMS paradigms.  ...  Here we present REPP (Rhythm ExPeriment Platform), a novel technology for measuring SMS in online experiments that can work efficiently using the built-in microphone and speakers of standard laptop computers  ...  Availability of data and materials The datasets generated and analyzed in this study are available in an OSF repository: https://osf.io/r2pxd/  ... 
doi:10.3758/s13428-021-01722-2 pmid:35149980 pmcid:PMC8853279 fatcat:ekor6wefxrdhnl5t4qnq25enxu

16th Sound and Music Computing Conference SMC 2019 (28–31 May 2019, Malaga, Spain)

Lorenzo J. Tardón, Isabel Barbancho, Ana M. Barbancho, Alberto Peinado, Stefania Serafin, Federico Avanzini
2019 Applied Sciences  
The 16th Sound and Music Computing Conference (SMC 2019) took place in Malaga, Spain, 28–31 May 2019 and it was organized by the Application of Information and Communication Technologies Research group  ...  The SMC 2019 TOPICS OF INTEREST included a wide selection of topics related to acoustics, psychoacoustics, music, technology for music, audio analysis, musicology, sonification, music games, machine learning  ...  Committee and other collaborators.  ... 
doi:10.3390/app9122492 fatcat:tcacoupffjewnpjhpw4oy7x6h4

Perceptual-based quality assessment for audio–visual services: A survey

Junyong You, Ulrich Reiter, Miska M. Hannuksela, Moncef Gabbouj, Andrew Perkis
2010 Signal Processing: Image Communication  
We consider emerging trends in audio-visual quality assessment, and propose feasible solutions for future work in perceptual-based audio-visual quality metrics.  ...  In this paper, we survey state-of-the-art signal-driven perceptual audio and video quality assessment methods independently, and investigate relevant issues in developing joint audio-visual quality metrics  ...  Acknowledgement The authors would like to thank three anonymous reviewers and the editor for their valuable and constructive comments on this paper.  ... 
doi:10.1016/j.image.2010.02.002 fatcat:u7hfnefpafbwjm4tmgp55mdlwu

Classification Study of Sound and Image Events Using Event Detection Systems

Dr. G. Murugaboopathy, R. Rajan
2016 International Journal of Engineering and Computer Science  
For intelligent systems to make best use of the audio modality, it is important that they can recognize not just speech and music, which have been researched as specific tasks, but also general sounds  ...  To stimulate research in this field we conducted a public research challenge: the IEEE Audio and Acoustic Signal Processing Technical Committee challenge on Detection and Classification of Acoustic Scenes  ...  Firstly, various frequency-based and time-based features are extracted. The audio stream is subsequently segmented using a speech segmenter that uses energy-based features.  ... 
doi:10.18535/ijecs/v5i2.8 fatcat:56xkk4xk5vezbayvx5v2u6cbca

Audio Features for Music Emotion Recognition: a Survey

Renato Panda, Ricardo Manuel Malheiro, Rui Pedro Paiva
2020 IEEE Transactions on Affective Computing  
Based on this review, current gaps and needs are identified and strategies for future research on feature engineering for MER are proposed, namely ideas for computational audio features that capture elements  ...  of musical form, texture and expressivity that should be further researched.  ...  ACKNOWLEDGMENT This work was supported by the MERGE project financed by Fundação para Ciência e a Tecnologia (FCT) -Portugal.  ... 
doi:10.1109/taffc.2020.3032373 fatcat:2dtcdqmyffcbna3bhw4bj242hm

AMT 1.x: A toolbox for reproducible research in auditory modeling

Piotr Majdak, Clara Hollomey, Robert Baumgartner
2022 Acta Acustica  
The Auditory Modeling Toolbox (AMT) is a MATLAB/Octave toolbox for the development and application of computational auditory models with a particular focus on binaural hearing.  ...  The motivation is to provide a toolbox able to reproduce the model predictions and allowing students and researchers to work with and to advance existing models.  ...  Acknowledgments We would like to thank Peter Søndergaard for initiating the AMT as a project for facilitating reproducible research by collecting auditory models and making them available in a common framework  ... 
doi:10.1051/aacus/2022011 fatcat:u62sojabwjaafbqkhjsf4fcpku

A Multifaceted Investigation into Speech Reading [chapter]

Trent W. Lewis, David M. W. Powers
2002 Hybrid Information Systems  
We have been following a line of research that started with a simple audio-visual speech recognition system to what is now a multifaceted investigation into speech reading.  ...  Speech reading is the act of speech perception using both acoustic and visual information.  ...  To extract the mel-cepstrum coefficients from the speech signal the Matlab speech processing toolbox VOICEBOX was used [3], exploiting the first 12 cepstral coefficients, 12 delta-cepstral coefficients  ... 
doi:10.1007/978-3-7908-1782-9_7 fatcat:rjjofwths5hnbbjutuej3227k4