Continuous Metric Learning For Transferable Speech Emotion Recognition and Embedding Across Low-resource Languages

Sneha Das, Nicklas Leander Lund, Nicole Nadine Lønfeldt, Anne Katrine Pagsberg, Line H. Clemmensen
2022 Proceedings of the Northern Lights Deep Learning Workshop  
Speech emotion recognition (SER) refers to the technique of inferring the emotional state of an individual from speech signals. SERs continue to garner interest due to their wide applicability. While the domain is mainly founded on signal processing, machine learning and deep learning methods, generalizing over languages continues to remain a challenge. To improve performance over languages, in this paper we propose a denoising autoencoder with semi-supervision using a continuous metric loss.
more » ... e novelty of this work lies in our proposal for continuous metric learning, which is among the first proposals on the topic to the best of our knowledge. Furthermore, we contribute labels corresponding to the dimensional model, that were used to evaluate the quality of embedding (the labels will be made available by the time of the publication). We show that the proposed method consistently outperforms the baseline method in terms of the classification accuracy and correlation with respect to the dimensional variables.
doi:10.7557/18.6300 fatcat:na4emlzzhvdy5jfnc6mumbacwi