Open Set Audio Classification Using Autoencoders Trained on Few Data

Javier Naranjo-Alcazar, Sergi Perez-Castanos, Pedro Zuccarello, Fabio Antonacci, Maximo Cobos
2020 Sensors  
Another problem arising in practical scenarios is few-shot learning (FSL), which appears when there is no availability of a large number of positive samples for training a recognition system.  ...  An extensive set of experiments is carried out considering multiple combinations of openness factors (OSR condition) and number of shots (FSL condition), showing the validity of the proposed approach and  ...  Acoustic event classification (AEC) and acoustic scene classification (ASC) are two areas that have grown significantly in the last years [1] [2] [3] [4], often included within the machine listening  ... 
doi:10.3390/s20133741 pmid:32635378 pmcid:PMC7374438 fatcat:nl3iuuijpjhnxic6nyvkgm57ba

Segmentation to Sound Conversion

Anindita Chatterjee, Himadri Nath Moulick
2014 IOSR Journal of Computer Engineering  
Our aim is unsupervised topic segmentation of speech data, operating over raw acoustic information.  ...  This approach uses the signal to model itself, and thus neither relies on particular acoustic cues nor requires training.  ...  Speech and Music Segmentation A variety of supervised and unsupervised methods have been employed to segment speech input.  ... 
doi:10.9790/0661-16354448 fatcat:xfxj7nwhdbazzgzgfp6hdkncfu

An Incremental Class-Learning Approach with Acoustic Novelty Detection for Acoustic Event Recognition

Barış Bayram, Gökhan İnce
2021 Sensors  
The self-learning on different types of audio features extracted from the acoustic signals of various events occurs without human supervision.  ...  In this study, a self-learning-based ASA for acoustic event recognition (AER) is presented to detect and incrementally learn novel acoustic events by tackling catastrophic forgetting.  ...  the development of a semi-supervised AND method to detect new acoustic events for ICL, 5.  ... 
doi:10.3390/s21196622 pmid:34640943 fatcat:jezml7fmeja4lkqfk3hg5mcsvq

Unsupervised Contrastive Learning of Sound Event Representations [article]

Eduardo Fonseca, Diego Ortego, Kevin McGuinness, Noel E. O'Connor, Xavier Serra
2020 arXiv   pre-print
We evaluate the learned representations using linear evaluation, and in two in-domain downstream sound event classification tasks, namely, using limited manually labeled data, and using noisy labeled data  ...  Self-supervised representation learning can mitigate the limitations in recognition tasks with few manually labeled data but abundant unlabeled data, a common scenario in sound event research.  ...  Alternatives to conventional supervised learning include the paradigms of semi-supervised learning [3], few-shot learning [4], learning from noisy labels [5, 6], or self-supervision [7].  ... 
arXiv:2011.07616v1 fatcat:nqufzefujfgrjofdeqzfe2b2gu

Directional Embedding Based Semi-supervised Framework For Bird Vocalization Segmentation [article]

Anshul Thakur, Padmanabhan Rajan
2019 arXiv   pre-print
The framework employs supervised information only for obtaining the reference directional model and avoids the background modeling. Hence, it can be regarded as semi-supervised in nature.  ...  This paper proposes a data-efficient, semi-supervised, two-pass framework for segmenting bird vocalizations.  ...  The semi-supervised NMF features (see Fig. 8) obtained for spectrograms shown in Fig. 4(c) and 6(a) show the significant presence of background acoustic events in the feature domain.  ... 
arXiv:1902.09765v1 fatcat:whkdav7kcndxlpnwiorl64oj3y

Semantic video analysis for psychological research on violence in computer games

Markus Mühling, Ralph Ewerth, Thilo Stadelmann, Bernd Freisleben, Rene Weber, Klaus Mathiak
2007 Proceedings of the 6th ACM international conference on Image and video retrieval - CIVR '07  
This system requires manual annotations for a single video only to facilitate the semi-supervised learning process.  ...  To investigate this question, the extraction of meaningful content from computer games is required to gain insights into the interrelationship of violent game events and the underlying neurophysiologic  ...  The following parts of our system are discussed in more detail in sections 3.2-3.4: audiovisual feature extraction, feature selection, classification, and a semi-supervised classification approach.  ... 
doi:10.1145/1282280.1282367 dblp:conf/civr/MuhlingESFWM07 fatcat:pbocyglulraytg57wpuu4pjdba

ToyADMOS: A Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection [article]

Yuma Koizumi, Shoichiro Saito, Hisashi Uematsu, Noboru Harada, Keisuke Imoto
2019 arXiv   pre-print
To the best of our knowledge, no large-scale datasets are available for ADMOS, although large-scale datasets have contributed to recent advancements in acoustic signal processing.  ...  Each sub-dataset includes over 180 hours of normal machine-operating sounds and over 4,000 samples of anomalous sounds collected with four microphones at a 48-kHz sampling rate.  ...  Uematsu, “SNIPER: Few-shot Learning for Anomaly Detection to Min-  ...  Detection and Classification of Acoustic Scenes and Events challenge (DCASE), 2017.  ... 
arXiv:1908.03299v1 fatcat:zic4x7lnx5c77i52sukkz7t5ky

Multimodal Co-learning: Challenges, Applications with Datasets, Recent Advances and Future Directions [article]

Anil Rahate, Rahee Walambe, Sheela Ramanna, Ketan Kotecha
2021 arXiv   pre-print
In the current state of multimodal machine learning, the assumptions are that all modalities are present, aligned, and noiseless during training and testing time.  ...  However, in real-world tasks, typically, it is observed that one or more modalities are missing, noisy, lacking annotated data, have unreliable labels, and are scarce in training or testing or both  ...  audio event classification, and zero-shot video retrieval.  ... 
arXiv:2107.13782v2 fatcat:s4spofwxjndb7leqbcqnwbifq4

Ensemble deep learning: A review [article]

M.A. Ganaie, Minghui Hu, A.K. Malik, M. Tanveer, P.N. Suganthan
2022 arXiv   pre-print
ensemble, decision fusion strategies, unsupervised, semi-supervised, reinforcement learning and online/incremental, multilabel based deep ensemble models.  ...  Currently, deep learning models with multilayer processing architecture are showing better performance as compared to the shallow or traditional classification models.  ...  Broadly speaking, there are different approaches to classification, such as supervised, unsupervised, few-shot, and one-shot classification.  ... 
arXiv:2104.02395v2 fatcat:lq73jqso5vadvnqfnnmw4zul4q

2021 Index IEEE Transactions on Multimedia Vol. 23

2021 IEEE transactions on multimedia  
The primary entry includes the coauthors' names, the title of the paper or other item, and its location, specified by the publication abbreviation, year, month, and inclusive pagination.  ...  The Subject Index contains entries describing the item under all appropriate subject headings, plus the first author's name, the publication abbreviation, month, and year, and inclusive pages.  ...  Liu, Q., +, TMM 2021 2114-2126 Learning Dual-Pooling Graph Neural Networks for Few-Shot Video Classification.  ... 
doi:10.1109/tmm.2022.3141947 fatcat:lil2nf3vd5ehbfgtslulu7y3lq

Event Mining in Multimedia Streams

Lexing Xie, H. Sundaram, M. Campbell
2008 Proceedings of the IEEE  
The review includes detection of events and actions in one or more continuous sequences, events in edited video streams, unsupervised event discovery, events in a collection of media objects, and a discussion  ...  This paper contains a survey on the problems and solutions in event mining, approached from three aspects: event description, event-modeling components, and current event mining systems.  ...  Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the U.S. Government.  ... 
doi:10.1109/jproc.2008.916362 fatcat:b3utldtbwvehjo4brlnteetdbq

On the Applicability of Speaker Diarization to Audio Indexing of Non-Speech and Mixed Non-Speech/Speech Video Soundtracks

Robert Mertens, Po-Sen Huang, Luke Gottlieb, Gerald Friedland, Ajay Divakaran, Mark Hasegawa-Johnson
2012 International Journal of Multimedia Data Engineering and Management  
It also discusses how diarization can be tuned in order to better reflect the acoustic properties of general sounds as opposed to speech and introduces a proof-of-concept system for multimedia event classification  ...  This paper explores how unsupervised audio segmentation systems like speaker diarization can be adapted to automatically identify low-level sound concepts similar to annotator defined concepts and how  ...  Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied  ... 
doi:10.4018/jmdem.2012070101 fatcat:5hq7lg4frrdm3j2yagf4bpd7li

Training Neural Audio Classifiers with Few Data

Jordi Pons, Joan Serra, Xavier Serra
2019 ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
To this end, we evaluate (i-iv) for the tasks of acoustic event recognition and acoustic scene classification, considering from 1 to 100 labeled examples per class.  ...  We investigate supervised learning strategies that improve the training of neural network audio classifiers on small annotated collections.  ...  Another way to approach the problem is by leveraging additional data sources, like in unsupervised and semi-supervised frameworks where non-labelled data is also utilized [14, 15, 16].  ... 
doi:10.1109/icassp.2019.8682591 dblp:conf/icassp/PonsSS19 fatcat:lygsemjtevfw3czs23lmfhqlg4

Training neural audio classifiers with few data [article]

Jordi Pons, Joan Serrà, Xavier Serra
2018 arXiv   pre-print
To this end, we evaluate (i-iv) for the tasks of acoustic event recognition and acoustic scene classification, considering from 1 to 100 labeled examples per class.  ...  We investigate supervised learning strategies that improve the training of neural network audio classifiers on small annotated collections.  ...  Another way to approach the problem is by leveraging additional data sources, like in unsupervised and semi-supervised frameworks where non-labelled data is also utilized [14, 15, 16].  ... 
arXiv:1810.10274v3 fatcat:kryfs2j5dzeeva55v6q7eyo6fy

A Closer Look At Feature Space Data Augmentation For Few-Shot Intent Classification

Varun Kumar, Hadrien Glaude, Cyprien de Lichy, William Campbell
2019 Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)  
In this paper, we study six feature space data augmentation methods to improve classification performance in FSI setting in combination with both supervised and unsupervised representation learning methods  ...  We formulate it as a Few-Shot Integration (FSI) problem where a few examples are used to introduce a new intent.  ...  ., 2018), semi-supervised learning (Cho et al., 2019b) are used to improve the performance of existing functionalities, performance for new functionalities suffers from the data scarcity problem.  ... 
doi:10.18653/v1/d19-6101 dblp:conf/acl-deeplo/KumarGLC19 fatcat:2h5y4hzu3rcpzhc2mppsvgfgam