Twitter Data Clustering on issues of Children with Special Needs using Hybrid Topic Models with Multi-viewpoints Similarity Metric

Noorullah R.M
2020 International Journal of Early Childhood Special Education  
Social networks are an excellent source for users to share or exchange information on topics. Twitter is the social network that users turn to most for topics related to the issues of children with special needs. Extracting good-quality topics from a Twitter corpus depends on the quality of text pre-processing and on finding the optimal cluster tendency. With traditional topic models, cluster tendency identification is difficult because they use less frequent words in tweets. In [...]ional topic models, the k value (number of clusters) is decided manually, and most methods use the Euclidean distance metric while some use cosine distance metrics. Proper visualization of cluster tendency is also essential, as the corpus consists of a large number of documents and billions of words. In this paper, hybrid topic models with a multi-viewpoints based similarity metric are proposed to visualize topic clouds and to find the cluster tendency of various topics in Twitter datasets related to issues of children with special needs. The proposed hybrid models are experimentally evaluated and compared against other distance metrics. Empirical analysis is performed on convergence speed and computational complexity. Cluster validity of the proposed models is assessed with external validity indices, to quantify the quality of the clusters, and with internal validity indices, to evaluate the clustering structure. Visual Non-negative Matrix Factorization (VIS NMF) under the multi-viewpoints similarity metric performed better than the other models, with a more informative assessment.
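The contrast the abstract draws between cosine distance and a multi-viewpoints similarity metric can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the common formulation in which the similarity of two unit-normalized documents is averaged over difference vectors taken from several viewpoint documents, whereas cosine similarity uses the origin as the only viewpoint. The function names and the toy term-weight vectors are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def cosine_similarity(a, b):
    """Cosine similarity: the two documents are viewed from the origin only."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def multi_viewpoint_similarity(a, b, viewpoints):
    """Average of (a - h) . (b - h) over every viewpoint document h.

    Assumption: viewpoints are documents other than a and b (for example,
    documents outside their cluster); this is an illustrative formulation.
    """
    a, b = normalize(a), normalize(b)
    total = 0.0
    for h in viewpoints:
        h = normalize(h)
        total += sum((x - z) * (y - z) for x, y, z in zip(a, b, h))
    return total / len(viewpoints)

# Hypothetical term-weight vectors for four tweets over a three-word vocabulary.
docs = [
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 0.0, 3.0],
    [0.0, 1.0, 2.0],
]

# Tweets 0 and 1 share vocabulary, so they should score higher than tweets 0 and 2.
mvs_01 = multi_viewpoint_similarity(docs[0], docs[1], docs[2:])
mvs_02 = multi_viewpoint_similarity(docs[0], docs[2], [docs[1], docs[3]])
print(mvs_01 > mvs_02)
```

With the origin as the single viewpoint, this sketch reduces to cosine similarity, which is one way to see the multi-viewpoints metric as a generalization of it.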
doi:10.9756/int-jecse/v12i1.201003 fatcat:tvzv7qr4obef7bi6z3sj4ed4fa