CNN-LTE: A class of 1-X pooling convolutional neural networks on label tree embeddings for audio scene classification
2017
2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
A class of simple 1-X (i.e. 1-max, 1-mean, and 1-mix) pooling convolutional neural networks, which are tailored for the task at hand, are finally learned on top of the image features for scene recognition ...
Firstly, given the label set of the scenes, a label tree is automatically constructed where the labels are grouped into meta-classes. ...
Afterward, we train different 1-X pooling convolutional neural networks (CNN), including 1-max, 1-mean, and 1-mix pooling CNNs, on top of these images for classification. ...
doi:10.1109/icassp.2017.7952133
dblp:conf/icassp/PhanKHMMM17
fatcat:42wa2ygnfrgqnl2qg6dxwg6mkq
CNN-LTE: a Class of 1-X Pooling Convolutional Neural Networks on Label Tree Embeddings for Audio Scene Recognition
[article]
2016
arXiv
pre-print
Different convolutional neural networks, which are tailored for the task at hand, are finally learned on top of the image features for scene recognition. ...
This category taxonomy is then used in the feature extraction step in which an audio scene instance is represented by a label tree embedding image. ...
Afterward, we trained different 1-X pooling convolutional neural networks (CNN) [9], including 1-max, 1-mean, and 1-mix pooling CNNs, on top of these images for recognition. ...
arXiv:1607.02303v2
fatcat:u7xgv6sn7vaubart6rmewpnyeq
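
The snippets above name three pooling variants (1-max, 1-mean, 1-mix) without defining them. As a rough illustration only, the sketch below reduces a whole feature map to a single value per filter, which is the usual reading of "1-max" and "1-mean" pooling. Everything here is an assumption rather than the authors' implementation: the function names are hypothetical, the convolution is simplified to one dimension, and `one_mix_pool` simply concatenates the max and mean outputs because the snippets do not say how the mix is formed.

```python
from typing import List, Sequence


def conv1d_valid(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide `kernel` over `signal` without padding and return the feature map."""
    width = len(kernel)
    if width == 0 or width > len(signal):
        raise ValueError("kernel must be non-empty and no longer than the signal")
    return [
        sum(signal[i + j] * kernel[j] for j in range(width))
        for i in range(len(signal) - width + 1)
    ]


def one_max_pool(feature_map: Sequence[float]) -> float:
    """1-max pooling: keep only the largest activation of the whole map."""
    return max(feature_map)


def one_mean_pool(feature_map: Sequence[float]) -> float:
    """1-mean pooling: keep only the average activation of the whole map."""
    return sum(feature_map) / len(feature_map)


def one_mix_pool(feature_map: Sequence[float]) -> List[float]:
    """ASSUMPTION: 'mix' is modelled here as [max, mean] concatenated.

    The source snippets do not define 1-mix pooling; this is a placeholder.
    """
    return [one_max_pool(feature_map), one_mean_pool(feature_map)]


# Toy input: one row of a hypothetical embedding "image" and one filter.
signal = [1.0, 2.0, 4.0, 3.0]
kernel = [1.0, -1.0]
feature_map = conv1d_valid(signal, kernel)
pooled_max = one_max_pool(feature_map)
pooled_mean = one_mean_pool(feature_map)
pooled_mix = one_mix_pool(feature_map)
```

Because each filter contributes a fixed number of values regardless of the input length, pooled outputs from several filters can be concatenated into a fixed-size vector before a classification layer.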