Multimedia web information fusion and analysis [thesis]

Jiang Tao
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
Information on the World Wide Web appears in diverse forms, including text, image, audio, and video. Faced with this wide range of information, a user must often spend considerable effort to correlate and track online information related to specific topics of interest. Fusion of multimedia information in a unified framework is thus needed for efficiently understanding and further analyzing semantically related information. This thesis addresses the problem of multimedia web information fusion and analysis by presenting an approach for modelling multimedia information in a unified semantic framework, based on which cross-media information analysis and mining is realized.
As multimedia data are heterogeneous in their contents and formats, we employ a strategy for multimedia information fusion based on the semantics of the data. Specifically, we develop two methods, one using a statistical vague transformation technique and the other employing a self-organizing neural network, to associate web images with related surrounding texts, based on which the semantics of the media objects can be extracted. Our experiments show that the proposed methods can identify associated image and text pairs with good accuracy and outperform a state-of-the-art method for image annotation using a statistical relevance model.
To support cross-media analysis, this thesis develops a semantic representation schema that combines MPEG-7 multimedia description, RDF language specification, and conceptual graph based knowledge representation techniques for modelling multimedia information. In addition, we develop a semantic metadata extraction algorithm that uses a range of natural language processing (NLP) techniques to automatically extract concepts and relations from text content. The extracted concepts are formally represented as bags of WordNet senses, based on which an incremental clustering approach is applied for organizing the concepts into a taxonomy. The constructed taxonomy, encoded in the form of RDF metadata, is subsequently used for facilitating semantics-based multimedia analysis.
For multimedia analysis, this thesis presents an algorithm, called GP-Close, for discovering generalized concept-relation association patterns from a collection of RDF semantic metadata. By adopting the notion of generalization closure, the proposed GP-Close algorithm can eliminate redundant over-generalized patterns during the mining process. We evaluate the GP-Close algorithm on two synthetic data sets and one real-world data set. In addition, a case study is conducted for analyzing an online terror attack document collection. Our experiments show that the proposed method can efficiently identify interesting patterns, effectively remove pattern redundancies, and significantly outperform the existing algorithms in terms of time efficiency.
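The idea of discarding over-generalized patterns can be illustrated with a small sketch. This is not the GP-Close algorithm, whose details the abstract does not give. It assumes one common reading of redundancy: a generalized pattern is redundant when replacing one of its concepts with a more specific child concept leaves its support unchanged. The taxonomy, the transactions, and all function names below are hypothetical.

```python
# Hypothetical toy taxonomy, stored as child -> parent.
PARENT = {
    "bombing": "attack",
    "shooting": "attack",
    "attack": "event",
    "city": "location",
    "location": "entity",
}


def ancestors_or_self(concept):
    """Return the concept together with all of its ancestors."""
    out = {concept}
    while concept in PARENT:
        concept = PARENT[concept]
        out.add(concept)
    return out


def children(concept):
    """Return the direct children of a concept in the taxonomy."""
    return [c for c, p in PARENT.items() if p == concept]


def support(pattern, transactions):
    """Count transactions in which every pattern concept is matched by
    an item that equals it or is one of its descendants."""
    count = 0
    for items in transactions:
        covered = set().union(*(ancestors_or_self(c) for c in items))
        if pattern <= covered:
            count += 1
    return count


def is_over_generalized(pattern, transactions):
    """Assumed redundancy test: some one-step specialization of the
    pattern has exactly the same support, so the more general pattern
    adds no information. Meant for patterns with non-zero support."""
    base = support(pattern, transactions)
    for concept in pattern:
        for child in children(concept):
            specialized = (pattern - {concept}) | {child}
            if support(specialized, transactions) == base:
                return True
    return False


# Hypothetical metadata: each transaction is a set of concepts.
TRANSACTIONS = [
    frozenset({"bombing", "city"}),
    frozenset({"bombing", "city"}),
    frozenset({"shooting", "city"}),
]

general = frozenset({"attack", "location"})
specific = frozenset({"attack", "city"})
print(is_over_generalized(general, TRANSACTIONS))
print(is_over_generalized(specific, TRANSACTIONS))
```

In this toy data every location is a city, so the pattern {attack, location} carries no more information than {attack, city} and is flagged, while {attack, city} is kept because specializing attack to bombing or shooting lowers the support.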
doi:10.32657/10356/2602 fatcat:wk7plmeta5bndfg4q4igtcstrq