721 Hits in 7.4 sec

General audio tagging with ensembling convolutional neural network and statistical features [article]

Kele Xu, Boqing Zhu, Qiuqiang Kong, Haibo Mi, Bo Ding, Dezhi Wang, Huaimin Wang
2018 arXiv   pre-print
The contributions of our solution include: We investigated a variety of convolutional neural network architectures to solve the audio tagging task.  ...  In this paper, we describe our solution for the DCASE 2018 Task 2 general audio tagging challenge.  ...  Recently, deep learning approaches such as convolutional neural networks (CNNs) have achieved state-of-the-art performance for the audio tagging task [4, 5].  ... 
arXiv:1810.12832v1 fatcat:a25wpy4y3fhujg2eeqxwvuiqfq

Multi-Representation Knowledge Distillation For Audio Classification [article]

Liang Gao and Kele Xu and Huaimin Wang and Yuxing Peng
2020 arXiv   pre-print
audio tagging tasks.  ...  The framework takes multiple representations as the input to train the models in parallel. The complementary information provided by different representations is shared by knowledge distillation.  ...  is apparent that the promotion is greater on the general-purpose audio tagging task.  ... 
arXiv:2002.09607v1 fatcat:uprprvdifrf5jkxfgaslvhqvwm

Combining High-Level Features of Raw Audio Waves and Mel-Spectrograms for Audio Tagging [article]

Marcel Lederle, Benjamin Wilhelm
2018 arXiv   pre-print
These features are learned separately by two Convolutional Neural Networks, one for each input type, and then combined by densely connected layers within a single network.  ...  This relatively simple approach along with data augmentation ranks among the best two percent in the Freesound General-Purpose Audio Tagging Challenge on Kaggle.  ...  METHOD Our audio-tagging system comprises two separately trained Convolutional Neural Networks on raw audio and mel-spectrogram, respectively.  ... 
arXiv:1811.10708v1 fatcat:inwblmo7irgz5gfxskemxejlye

SAM-GCNN: A Gated Convolutional Neural Network with Segment-Level Attention Mechanism for Home Activity Monitoring [article]

Yu-Han Shen, Ke-Xin He, Wei-Qiang Zhang
2018 arXiv   pre-print
To tackle this task, we propose a gated convolutional neural network with segment-level attention mechanism (SAM-GCNN).  ...  Furthermore, we adopted model ensemble to enhance the capability of generalization of our model.  ...  After the gated convolutional neural network, the features on multiple channels are flattened into frequency axis.  ... 
arXiv:1810.03986v2 fatcat:dwpwuodjjzehxoe75mcod343ay

Music Genre Classification using Machine Learning Techniques [article]

Hareesh Bahuleyan
2018 arXiv   pre-print
The experiments are conducted on the Audio set data set and we report an AUC value of 0.894 for an ensemble classifier which combines the two proposed approaches.  ...  The features that contribute the most towards this multi-class classification task are identified.  ...  The first model described in this paper uses convolutional neural networks (Krizhevsky et al., 2012), which is trained end-to-end on the MEL spectrogram of the audio signal.  ... 
arXiv:1804.01149v1 fatcat:towb7oaoefba3ai66fqwki6sa4

An Adversarial Feature Distillation Method for Audio Classification

Liang Gao, Haibo Mi, Boqing Zhu, Dawei Feng, Yicong Li, Yuxing Peng
2019 IEEE Access  
INDEX TERMS Convolutional neural networks, audio tagging, knowledge distillation, model compression.  ...  The extensive experiments are conducted on three audio classification tasks, audio scene classification, general audio tagging, and speech command recognition.  ...  Classifiers based on convolutional neural networks (CNN) [7], [8] have shown remarkable capabilities in multiple acoustic tasks.  ... 
doi:10.1109/access.2019.2931656 fatcat:5i5eahayznahvaumgoizefibxq

MuSLCAT: Multi-Scale Multi-Level Convolutional Attention Transformer for Discriminative Music Modeling on Raw Waveforms [article]

Kai Middlebrook, Shyam Sudhakaran, David Guy Brizan
2021 arXiv   pre-print
We validate the proposed MuSLCAT and MuSLCAN architectures by comparing them to state-of-the-art networks on four benchmark datasets for music tagging and genre recognition.  ...  Both MuSLCAT and MuSLCAN model features from multiple scales and levels by integrating a frontend-backend architecture.  ...  Yet, on medium to large-scale datasets, MuSLCAT generally outperforms waveform-based models.  ... 
arXiv:2104.02309v1 fatcat:rldn5djakng6bohslqzujhhpu4

Rare Sound Event Detection Using Deep Learning and Data Augmentation

Yanping Chen, Hongxia Jin
2019 Interspeech 2019  
A convolutional neural network (CNN) was combined with a feed-forward neural network (FNN) to improve the detection performance, and a dynamic time warping based data augmentation (DA) method was proposed  ...  Sound event detection aims to detect multiple target sound events that may happen simultaneously.  ...  Our method consists of three main blocks, audio feature extraction, neural network classifiers and classifier ensemble.  ... 
doi:10.21437/interspeech.2019-1985 dblp:conf/interspeech/0005J19 fatcat:5pvuve5y7zfoniy3oc5urtdlle

An Xception Residual Recurrent Neural Network For Audio Event Detection And Tagging

Tomas Gajarsky, Hendrik Purwins
2018 Proceedings of the SMC Conferences  
Purwins are with Sound and Music Computing and Audio Analysis Lab, Aalborg University Copenhagen.  ...  The system that ranked 3rd [11] in SED is based on two deep neural network methods. One is training sample-level Deep Convolutional Neural Networks (DCNN) on raw waveforms.  ...  by a 1 × 1 convolution, followed by stacked residual recurrent neural networks.  ... 
doi:10.5281/zenodo.1422562 fatcat:w5neax7j25bunluq4jseuskz2i

Weakly Labelled AudioSet Tagging with Attention Neural Networks [article]

Qiuqiang Kong, Changsong Yu, Turab Iqbal, Yong Xu, Wenwu Wang, Mark D. Plumbley
2019 arXiv   pre-print
We bridge the connection between attention neural networks and multiple instance learning (MIL) methods, and propose decision-level and feature-level attention neural networks for audio tagging.  ...  Experiments on AudioSet show that the feature-level attention neural network achieves a state-of-the-art mean average precision (mAP) of 0.369, outperforming the best multiple instance learning (MIL) method  ...  The DCASE 2018 Challenge includes acoustic scene classification [22], general purpose audio tagging [23] and bird audio detection [24] tasks.  ... 
arXiv:1903.00765v5 fatcat:rwuwkydj2famfij4wepbceeawa

Weakly Labelled AudioSet Tagging with Attention Neural Networks

Qiuqiang Kong, Changsong Yu, Turab Iqbal, Yong Xu, Wenwu Wang, Mark D. Plumbley
2019 IEEE/ACM Transactions on Audio Speech and Language Processing  
We bridge the connection between attention neural networks and multiple instance learning (MIL) methods, and propose decision-level and feature-level attention neural networks for audio tagging.  ...  Experiments on AudioSet show that the feature-level attention neural network achieves a state-of-the-art mean average precision (mAP) of 0.369, outperforming the best multiple instance learning (MIL) method  ...  The DCASE 2018 Challenge includes acoustic scene classification [22], general purpose audio tagging [23] and bird audio detection [24] tasks.  ... 
doi:10.1109/taslp.2019.2930913 fatcat:ftfkz37vl5fw7j5zkiryxr6tfu

Weakly supervised CRNN system for sound event detection with large-scale unlabeled in-domain data [article]

Dezhi Wang, Lilun Zhang, Changchun Bao, Kele Xu, Boqing Zhu, Qiuqiang Kong
2018 arXiv   pre-print
On the other hand, a weakly supervised architecture based on the convolutional recurrent neural network (CRNN) is developed to solve the strong annotations of sound events with the aid of the unlabeled  ...  In particular, a state-of-the-art general audio tagging model is first employed to predict weak labels for unlabeled data.  ...  Our contributions In this paper, we aim to develop a scalable system based on a CRNN framework using the well-developed neural networks for weakly-supervised SED.  ... 
arXiv:1811.00301v1 fatcat:jxjiqfzjevep7kr34owqxiv7ua

HOG-ESRs Face Emotion Recognition Algorithm Based on HOG Feature and ESRs Method

Yuanchang Zhong, Lili Sun, Chenhao Ge, Huilian Fan
2021 Symmetry  
At present, although convolutional neural networks have achieved great success in face emotion recognition algorithms, there is still room for improvement in effective feature extraction and recognition accuracy.  ...  The experimental results on the FER2013 dataset show that the new algorithm can not only effectively extract features and reduce the residual generalization error, but also improve the accuracy and robustness  ...  By changing the branch level of ESR, the ensembles with shared representations (ESRs) based on convolutional neural network can reduce the computational complexity and redundancy without losing the generalization  ... 
doi:10.3390/sym13020228 fatcat:efzsm7qs55ejfehkwhwdw3vecq

Music Genre Classification using Deep Learning

Sheeba Fathima
2021 International Journal for Research in Applied Science and Engineering Technology  
In this paper, we propose two methods for boosting music genre classification with convolutional neural networks: 1) using a process inspired by residual learning to combine peak- and average pooling to  ...  provide more statistical information to higher level neural networks; and 2) To bypass one or more layers, use shortcut connections.  ...  The first approach uses a Convolutional Neural Network that is trained from beginning to end using audio signal Spectrogram features (images).  ... 
doi:10.22214/ijraset.2021.36087 fatcat:2tkneqaehrb6nd6atng3ifgziq

Convolutional Neural Networks Based System for Urban Sound Tagging with Spatiotemporal Context [article]

Jisheng Bai, Jianfeng Chen, Mou Wang, Xiaolei Zhang
2020 arXiv   pre-print
In this paper, we proposed a convolutional neural network (CNN) based system for UST with spatiotemporal context.  ...  The goal of UST is to tag a recording, which is collected by the sensors from urban environment, and return whether noise pollution is audible or not.  ...  neural networks [19].  ... 
arXiv:2011.00175v1 fatcat:3cw5saynhbbo5c2gsy7rucnd34