Deep Learning for Emotion Recognition in Affective Virtual Reality and Music Applications

2019, International Journal of Recent Technology and Engineering
This paper presents a deep learning approach to emotion recognition applied to virtual reality and music predictive analytics. Firstly, it investigates deep parameter tuning of multi-hidden-layer neural networks (commonly called deep networks) used for emotion detection in virtual reality (VR) electroencephalography (EEG) predictive analytics. Deep networks have been studied extensively over the last decade and have proven to be among the most accurate methods for predictive analytics in image recognition and speech processing. However, most deep network studies focus on shallow parameter tuning when attempting to boost prediction accuracy, covering parameters such as the number of hidden layers, the number of hidden nodes per layer and the activation functions used in the hidden nodes. Much less effort has gone into tuning deep parameters such as the input dropout ratio and the L1 (lasso) and L2 (ridge) regularization parameters of the network. The goal of this study is therefore to investigate the tuning of these deep parameters for predicting emotions in a virtual reality environment from EEG signals recorded while the user is exposed to immersive content. The results show that deep tuning of deep networks in VR-EEG can improve emotion prediction accuracy: the best accuracy rose to over 96% after the input dropout ratio and the L1 and L2 regularization parameters were tuned.

Secondly, the paper investigates a similar approach applied to 4-quadrant music emotion recognition (MER). Recent studies have characterized music by genre, and various classification techniques have been used to achieve the best accuracy rate, while several deep learning studies have shown strong results on dimensional music emotion recognition. Yet there is no concrete and concise description for expressing music. To address this gap, this study uses more detailed metadata: two-dimensional emotion annotations based on Russell's model. Rather than feeding music genres or lyrics into the machine learning algorithm for MER, a higher-level representation of the music, its acoustic features, is used. For the four-class classification problem, the available AMG1608 dataset is fed into a training model built from a deep neural network. The dataset is first preprocessed so that all variables are accessible before any machine learning is done, and the classification rate is then obtained by running the scripts in the R environment. The preliminary result shows a classification rate of 46.0%.
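To make the "deep parameter" tuning for the VR-EEG part concrete, the following is a minimal sketch of a grid search over input dropout ratio, L1 and L2 regularization in R. The abstract does not name a toolkit; the H2O deep learning package is assumed here because it exposes exactly these parameters, and the feature file eeg_features.csv with an "emotion" label column is hypothetical.

```r
# Sketch only: assumes the h2o R package and a hypothetical
# eeg_features.csv with EEG-derived features plus an "emotion" label column.
library(h2o)
h2o.init()

eeg <- h2o.importFile("eeg_features.csv")      # hypothetical feature file
eeg$emotion <- as.factor(eeg$emotion)          # emotion class label
predictors <- setdiff(colnames(eeg), "emotion")

# Grid over the "deep" parameters discussed in the paper:
# input dropout ratio, L1 (lasso) and L2 (ridge) regularization.
hyper_params <- list(
  input_dropout_ratio = c(0.0, 0.1, 0.2),
  l1 = c(0, 1e-5, 1e-4),
  l2 = c(0, 1e-5, 1e-4)
)

grid <- h2o.grid(
  algorithm      = "deeplearning",
  grid_id        = "vr_eeg_deep_tuning",
  x              = predictors,
  y              = "emotion",
  training_frame = eeg,
  hidden         = c(100, 100),                # shallow parameters held fixed
  activation     = "RectifierWithDropout",
  epochs         = 50,
  nfolds         = 5,
  hyper_params   = hyper_params
)

# Rank the grid by cross-validated log loss (lower is better).
h2o.getGrid("vr_eeg_deep_tuning", sort_by = "logloss", decreasing = FALSE)
```

The grid dimensions and value ranges above are illustrative only; the paper's reported 96% accuracy refers to its own tuned configuration, not to this sketch.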
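For the 4-quadrant MER part, a comparable sketch is shown below. It assumes a hypothetical flat file amg1608_features.csv exporting the AMG1608 acoustic features together with per-clip "valence" and "arousal" annotations; quadrant labels are derived from the signs of valence and arousal per Russell's model, and the same assumed H2O deep network is trained as a four-class classifier.

```r
# Sketch only: assumes a hypothetical amg1608_features.csv with acoustic
# features plus per-clip "valence" and "arousal" columns centered at 0.
library(h2o)
h2o.init()

amg <- h2o.importFile("amg1608_features.csv")

# Map Russell's valence-arousal plane onto four quadrants:
# Q1 = +V +A, Q2 = -V +A, Q3 = -V -A, Q4 = +V -A.
amg$quadrant <- as.factor(
  1 * (amg$valence >= 0 & amg$arousal >= 0) +
  2 * (amg$valence <  0 & amg$arousal >= 0) +
  3 * (amg$valence <  0 & amg$arousal <  0) +
  4 * (amg$valence >= 0 & amg$arousal <  0)
)

predictors <- setdiff(colnames(amg), c("valence", "arousal", "quadrant"))

# Hold out a test split, then train a deep network on the four-class problem.
splits <- h2o.splitFrame(amg, ratios = 0.8, seed = 1234)
model <- h2o.deeplearning(
  x = predictors, y = "quadrant",
  training_frame   = splits[[1]],
  validation_frame = splits[[2]],
  hidden = c(64, 64),
  activation = "RectifierWithDropout",
  input_dropout_ratio = 0.1,
  l1 = 1e-5, l2 = 1e-5,
  epochs = 100
)

# Overall classification rate on the held-out split.
pred <- h2o.predict(model, splits[[2]])
mean(pred$predict == splits[[2]]$quadrant)
```

The network size, regularization values and split ratio here are placeholders; the preliminary 46.0% classification rate quoted in the abstract comes from the authors' own preprocessing and model, which this sketch does not reproduce.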
doi:10.35940/ijrte.b1030.0782s219