Model-Agnostic Fast Adaptive Multi-Objective Balancing Algorithm for Multilingual Automatic Speech Recognition Model Training
Conference of the International Speech Communication Association
This paper regards multilingual automatic speech recognition (ASR) model training as a multi-objective problem because learning different languages may conflict, necessitating a trade-off. Most previous work on multilingual ASR model training used data sampling to balance performance across languages but ignored the conflicts between languages, resulting in imbalanced performance across languages. The language-specific parameters of the multilingual ASR model are updated by the
... gle language gradients, while the update of the shared parameters is jointly determined by the gradient of every language on the shared parameters, namely the shared gradient. Therefore, we propose a model-agnostic fast adaptive (MAFA) multi-objective balancing algorithm that balances multiple languages by avoiding mutual interference between their shared gradients. In the algorithm, based on the decrease in the training loss, we dynamically normalize the shared gradient magnitudes, which represent the speed of learning, to balance the learning speed across languages. To learn multiple languages evenly, the language with the worst performance is selected, and a balancing gradient that is nearest to the normalized gradient of the selected language and positively correlated with the other normalized gradients is obtained to eliminate the mutual interference. The model trained by MAFA outperforms the baseline model on the Common Voice corpus.
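The balancing step described above can be illustrated with a minimal sketch. It is not the paper's implementation, and it rests on several assumptions: each shared gradient is simply scaled to unit norm (the abstract's loss-decrease-based normalization is not specified here), the "worst" language is taken to be the one with the highest current training loss, and "positively correlated" is read as a non-negative inner product. Under these assumptions, the balancing gradient is the Euclidean projection of the selected language's normalized gradient onto the cone of directions that do not conflict with the other languages, computed here by projected gradient descent on the dual problem. The names `balancing_gradient`, `shared_grads`, and `losses` are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def unit(g):
    """Scale a gradient to unit norm (assumed stand-in for the paper's normalization)."""
    norm = math.sqrt(dot(g, g))
    return [x / norm for x in g] if norm > 0 else list(g)


def balancing_gradient(shared_grads, losses, iters=500):
    """Return (index of worst language, balancing gradient).

    Solves  min_d ||d - g_k||^2  subject to  g_i . d >= 0 for all i != k,
    where g_k is the normalized shared gradient of the worst language.
    The solution has the form d = g_k + sum_i v_i * g_i with v_i >= 0.
    """
    normed = [unit(g) for g in shared_grads]
    # Assumption: worst-performing language = highest training loss.
    worst = max(range(len(losses)), key=lambda i: losses[i])
    target = normed[worst]
    others = [g for i, g in enumerate(normed) if i != worst]

    def combine(v):
        # d = g_k + sum_i v_i * g_i
        return [t + sum(vi * o[j] for vi, o in zip(v, others))
                for j, t in enumerate(target)]

    v = [0.0] * len(others)
    # Vectors in `others` have norm at most 1, so their count bounds the
    # Lipschitz constant of the dual gradient; 1 / count is a safe step size.
    step = 1.0 / max(len(others), 1)
    for _ in range(iters):
        d = combine(v)
        # Projected gradient step on the dual variables (kept non-negative).
        v = [max(0.0, vi - step * dot(o, d)) for vi, o in zip(v, others)]
    return worst, combine(v)


if __name__ == "__main__":
    # Two languages whose shared gradients conflict (negative inner product).
    worst, d = balancing_gradient([[1.0, 0.0], [-1.0, 1.0]], [2.0, 1.0])
    print(worst, d)
```

In this toy case the first language has the higher loss, so its gradient is selected, and the returned direction is approximately `[0.5, 0.5]`: the closest vector to `[1, 0]` that no longer has a negative inner product with the second language's gradient. When no gradients conflict, the selected gradient is returned unchanged.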