Infant Cry Signal Processing, Analysis, and Classification with Artificial Neural Networks [thesis]

Chunyan Ji
As a special type of speech and environmental sound, infant cry has been a growing research area over the past two decades, covering infant cry reason classification, pathological infant cry identification, and infant cry detection. In this dissertation, we build a new dataset, explore new feature extraction methods, and propose novel classification approaches to improve infant cry classification accuracy and identify diseases by learning from infant cry signals. We propose a method through [...]ing weighted prosodic features combined with acoustic features for a deep learning model to improve the performance of asphyxiated infant cry identification. The combined feature matrix captures the diversity of variations within infant cries, and the result outperforms all other related studies on asphyxiated baby crying classification. We propose a fast, non-invasive method that uses infant cry signals with convolutional neural network (CNN) based age classification to diagnose abnormal infant vocal tract development as early as 4 months of age. Experiments reveal the pattern and tendency of vocal tract changes and predict abnormality of the infant vocal tract by classifying the cry signals into a younger age category. We propose an approach that generates a hybrid feature set and uses prior knowledge in a multi-stage CNN model for robust infant sound classification. The dominant and auxiliary features within the set help enlarge the coverage while keeping good resolution for modeling the diversity of variations within infant sounds, and the experimental results show encouraging improvements on two related databases. We propose a graph convolutional network (GCN) approach with transfer learning for robust infant cry reason classification. Non-fully connected graphs based on the similarities among the relevant nodes are built to consider the short-term and long-term effects of infant cry signals related to intra-class and inter-class messages. With as little as 20% of labeled training data, [...]
doi:10.57709/25943253 fatcat:cbeaarjlmzawfivxx6falbtpb4