Distributed Learning over Massive XML Documents in ELM Feature Space

Xin Bi, Xiangguo Zhao, Guoren Wang, Zhen Zhang, Shuang Chen
2015 Mathematical Problems in Engineering  
As the volume of XML data grows exponentially, centralized learning solutions can no longer meet the requirements of mining applications with massive training samples. This paper proposes a solution for distributed learning over massive XML documents. It provides parallel, MapReduce-based conversion of XML documents into a representation model, together with a distributed learning component based on Extreme Learning Machine for classification or clustering tasks.
Within this framework, training samples are converted from raw XML datasets with better efficiency and information representation ability, and are then passed to distributed learning algorithms in Extreme Learning Machine (ELM) feature space. Extensive experiments on massive XML document datasets verify the effectiveness and efficiency of the approach for both classification and clustering applications.
doi:10.1155/2015/923097 fatcat:2qxlvlx2pfesbnsgxzsgaplaea