Exploiting YouTube's Video ASR scripts to Extend Educational Videos Textual Representative Tags Based on Gibb's Sampling Technique

Ambele Robert Mtafya, Dongjun Huang, Gaudence Uwamahoro
2015 International Journal of Multimedia and Ubiquitous Engineering  
Given the importance of textual information in content retrieval, it is desirable that the textual representations of educational video content on social media platforms such as YouTube capture the semantics of the content they represent. Such coherent textual representations are important for objective video content retrieval, repurposing, reuse, and sense-making. In this study, the Automatic Speech Recognition (ASR) scripts of the video tracks were leveraged to supplement ... insufficient video content representations based on the video title alone. The Latent Dirichlet Allocation (LDA) topic modeling approach, implemented with Gibbs sampling, was used to evaluate the suitability of various textual representations for YouTube educational videos and to extract the candidate topic that best extends the original YouTube keywords. The results show that, in topic space, the YouTube ASR script performs better as a representative textual source for the dominant topic than the combined textual representations. The automatic keyword extension obtained using our method adds value to applications that use tags for content discovery or retrieval.
... textual representation's proportional contribution to the dominant topic is seen to be superior to that of YouTube's native metadata and keywords (Figure 5); this observation validates the hypothesis that native YouTube metadata are sparse and do not represent the actual video content well, which adds to the significance of this study. On the other hand, it is surprising that the combined textual representations score low compared to ASR, even the union representation, which has three components, including the ASR keywords (Figure 6).
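The abstract names LDA with Gibbs sampling and a "dominant topic" used to extend keywords, but gives no implementation details. The following is a minimal sketch of that general idea, not the authors' code: a collapsed Gibbs sampler over tokenised transcripts, followed by a helper that proposes new tags from a video's dominant topic. The function names `lda_gibbs` and `extend_keywords`, the hyperparameters `alpha` and `beta`, and the choice of "dominant topic = topic with the most token assignments in the document" are all assumptions made for illustration.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch, not the paper's code).

    docs: list of token lists, e.g. tokenised ASR transcripts.
    Returns (doc_topic, topic_word): per-document topic counts and
    per-topic word counts after the final sweep.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                        # total tokens in topic k

    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Full conditional, up to a constant:
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta)
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def extend_keywords(doc_index, doc_topic, topic_word, existing, top_n=5):
    """Suggest up to top_n new tags from the document's dominant topic.

    "Dominant" here means the topic with the most token assignments in the
    document; this definition is an assumption of the sketch.
    """
    counts = doc_topic[doc_index]
    dominant = max(range(len(counts)), key=counts.__getitem__)
    ranked = [
        w for w, c in topic_word[dominant].most_common()
        if c > 0 and w not in existing
    ]
    return dominant, ranked[:top_n]
```

In practice each entry of `docs` would be a stop-word-filtered transcript, and `existing` would hold the video's current YouTube keywords so that only genuinely new terms are suggested. A production setup would more likely rely on an established topic-modeling library than on a hand-written sampler.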
doi:10.14257/ijmue.2015.10.5.33 fatcat:uxptgmhg7ngt3eamiqmmsyfzvy