Modeling perception with hierarchical prediction: Auditory segmentation with deep predictive coding locates candidate evoked potentials in EEG

André Ofner, Sebastian Stober
2020 Zenodo  
The human response to music combines low-level expectations that are driven by the perceptual characteristics of audio with high-level expectations from the context and the listener's expertise. This paper discusses surprisal based music representation learning with a hierarchical predictive neural network. In order to inspect the cognitive validity of the network's predictions along their time-scales, we use the network's prediction error to segment electroencephalograms (EEG) based on the
more » ... o signal. Using the NMED-T dataset on passive natural music listening we explore the automatic segmentation of audio and EEG into events using the suggested model. By averaging only the EEG signal at predicted locations, we were able to visualize auditory evoked potentials connected to local and global musical structures. This indicates the potential of unsupervised predictive learning with deep neural networks as means to retrieve musical structure from audio and as a basis to uncover the corresponding cognitive processes in the human brain.
doi:10.5281/zenodo.4245495 fatcat:qhrdh5e7mzh75cnvffx3gayuiu