Recognizing Bird Species in Audio Files Using Transfer Learning
Conference and Labs of the Evaluation Forum
This paper presents a method for identifying bird species in audio recordings. A pre-trained Inception-v3 convolutional neural network was fine-tuned on 36,492 audio recordings representing 1,500 bird species in the context of the BirdCLEF 2017 task. The audio recordings were transformed into spectrograms and further processed by applying bandpass filtering, noise filtering, and silent region removal. For data augmentation, time shifting, time
... hing, pitch shifting, and pitch stretching were applied. The results show that fine-tuning a pre-trained convolutional neural network performs better than training a neural network from scratch, and that domain adaptation from the image domain to the audio domain can be applied successfully. In the BirdCLEF 2017 evaluation, the networks achieved an official mean average precision (MAP) score of 0.567 for traditional records and a MAP score of 0.496 for records with background species on the test dataset.
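To make the reported metric concrete, the following minimal sketch computes mean average precision in its standard textbook form: for each recording, the average of the precision values at the ranks where a correct species appears, averaged over all recordings. It is not the official BirdCLEF evaluation script, and the function names, recording identifiers, and species labels are hypothetical placeholders chosen for illustration.

```python
from typing import Dict, List, Set


def average_precision(ranked_species: List[str], true_species: Set[str]) -> float:
    """Average precision of one ranked prediction list against the true labels."""
    if not true_species:
        return 0.0
    hits = 0
    precision_sum = 0.0
    seen: Set[str] = set()
    rank = 0
    for species in ranked_species:
        if species in seen:
            continue  # ignore repeated predictions of the same species
        seen.add(species)
        rank += 1
        if species in true_species:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(true_species)


def mean_average_precision(
    predictions: Dict[str, List[str]],
    ground_truth: Dict[str, Set[str]],
) -> float:
    """Mean of the per-recording average precision over all labelled recordings."""
    if not ground_truth:
        return 0.0
    total = sum(
        average_precision(predictions.get(recording, []), labels)
        for recording, labels in ground_truth.items()
    )
    return total / len(ground_truth)


# Hypothetical example: one recording with a single species and one
# recording that also contains a background species.
example_predictions = {
    "rec_a": ["species_1", "species_2", "species_3"],
    "rec_b": ["species_2", "species_1", "species_3"],
}
example_truth = {
    "rec_a": {"species_1"},
    "rec_b": {"species_1", "species_3"},
}
example_map = mean_average_precision(example_predictions, example_truth)
```

In this example the first recording scores 1.0 and the second scores 7/12, because its correct species appear at ranks 2 and 3. A recording with several correct species, as in the background-species setting, only reaches a high score when all of them are ranked near the top.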