WDSAE-DNDT BASED SPEECH FLUENCY DISORDER CLASSIFICATION
Malaysian Journal of Computer Science
In this paper, a novel hybrid model, Weight Decorrelated Stacked Autoencoder-Deep Neural Decision Trees (WDSAE-DNDT), is proposed for automating the assessment of children's speech fluency disorders by discerning their disfluencies. In fluency disorder classification, it is more important to know how each feature contributes to the classification than to obtain the diagnosis alone, so the depth-modified DNDT, which is interpretable by its very nature, is the most suitable discriminator. The
... SAE presents DNDT with a high-level latent representation of the disfluent speech. A fusion feature vector was built by combining the prosodic cues from disfluent speech segments with the WDSAE-based bottleneck features. The performance of the proposed hybrid model was compared with that of the baseline models evaluated in this study. Further analysis was carried out to examine how the number of tree cut points per feature and the number of training epochs affect the prediction accuracy of the hybrid model. When trained on the fusion feature set, the proposed hybrid model shows appreciable improvement in the area under the Receiver Operating Characteristic (ROC) curve, classification accuracy, the Kappa statistic, and the Jaccard similarity index. The WDSAE-DNDT demonstrates higher precision than the baseline models in setting a clinical benchmark to distinguish subjects with dysphemia from those with Specific Language Impairment.
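For readers unfamiliar with the four evaluation measures named above, the following is a minimal sketch of how they can be computed for a binary classifier. It is an illustration only, not the authors' evaluation code: the function names, the label encoding (1 for one class, 0 for the other), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example and are not taken from the paper.

```python
# Illustrative sketch only: standard definitions of the four measures,
# applied to made-up binary labels (1 = positive class, 0 = negative class).

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)

def cohen_kappa(y_true, y_pred):
    """Agreement between truth and prediction, corrected for chance agreement."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    # Chance agreement from the marginal totals of the confusion matrix.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if p_expected == 1.0:
        return float("nan")  # undefined when only one class occurs
    return (p_observed - p_expected) / (1.0 - p_expected)

def jaccard_index(y_true, y_pred):
    """Overlap of true and predicted positives: TP / (TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp + fn)

def roc_auc(y_true, scores):
    """Area under the ROC curve via its rank interpretation: the probability
    that a random positive outscores a random negative (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy data (invented for illustration, not results from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(accuracy(y_true, y_pred))       # 0.75
print(cohen_kappa(y_true, y_pred))    # 0.5
print(jaccard_index(y_true, y_pred))  # 0.6
print(roc_auc(y_true, scores))        # 0.9375
```

Accuracy, Kappa, and the Jaccard index depend on the chosen decision threshold, whereas the area under the ROC curve is computed from the raw scores and summarises performance across all thresholds.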