Mask Detection and Breath Monitoring from Speech: on Data Augmentation, Feature Representation and Modeling [article]

Haiwei Wu, Lin Zhang, Lin Yang, Xuyang Wang, Junjie Wang, Dong Zhang, Ming Li
2020 arXiv   pre-print
This paper introduces our approaches for the Mask and Breathing Sub-Challenge in the Interspeech COMPARE Challenge 2020. For the mask detection task, we train deep convolutional neural networks with filter-bank energies, gender-aware features, and speaker-aware features. Support Vector Machines follows as the back-end classifiers for binary prediction on the extracted deep embeddings. Several data augmentation schemes are used to increase the quantity of training data and improve our models'
more » ... ustness, including speed perturbation, SpecAugment, and random erasing. For the speech breath monitoring task, we investigate different bottleneck features based on the Bi-LSTM structure. Experimental results show that our proposed methods outperform the baselines and achieve 0.746 PCC and 78.8% UAR on the Breathing and Mask evaluation set, respectively.
arXiv:2008.05175v2 fatcat:5icoyftgkza6xik4bsk3pj6yra