A Plausible Comprehensive Web Intelligent System for Investigation of Web User Behaviour Adaptable to Incremental Mining
International Journal of Database Management Systems
With the continued increase in the usage of the World Wide Web (WWW), Web mining has been established as an important area of research. The WWW is a vast repository of unstructured information, in the form of interrelated files, distributed on numerous web servers over wide geographical regions. Web mining deals with the discovery and analysis of useful information from the WWW. Web usage mining focuses on investigating the potential knowledge from the browsing patterns of users and on finding
... e correlation between the pages on analysis. To proceed towards web intelligence and obviate the need for human interaction, artificial intelligence needs to be incorporated and embedded into web tools. Before mining techniques are applied, the data in the web log has to be pre-processed, integrated and transformed. Data pre-processing is the most important phase in the process of web mining, and it is both critical and complex for the successful extraction of useful data. The web log is non-scalable, impractical and distributed in nature; conventional data pre-processing techniques have therefore proved unsuitable, as they assume that the data is static. Hence an intelligent system capable of pre-processing the web log efficiently is required. Because the web log grows incrementally, web miners need incremental mining techniques to extract usage patterns and study the visiting characteristics of users, which calls for a comprehensive algorithm that significantly reduces the computing cost. This paper introduces an intelligent system, IPS, for pre-processing the web log; in addition, a learning algorithm, the IFP-Tree model, is proposed for pattern recognition. The Intelligent Pre-processing System (IPS) can intelligently differentiate human user accesses from web search engine accesses in less time, and it discards the search engine accesses. The system reduces the error rate and significantly improves the learning performance of the algorithm. The Incremental Frequent Pattern Tree (IFP-Tree) is designed to suit a continuously growing web log and is based on association rule mining with an incremental technique. The IFP-Tree stores user-specific browsing path information in a condensed way. The algorithm is more efficient because it avoids the generation of candidates, reduces the number of scans and allows interactive mining with different supports. Experimental results supporting this claim are given in the paper.
KEYWORDS Web usage mining, intelligent pre-processing system, incremental frequent pattern tree.
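The two ideas summarized in the abstract, discarding search engine accesses and then storing browsing paths in a tree that can grow with the log, can be illustrated with a minimal sketch. This is not the paper's IPS or IFP-Tree algorithm: the robot heuristic (`BOT_MARKERS`, a request for `/robots.txt`), the grouping of sessions by IP address within a batch, and all class and function names are assumptions made for illustration. The tree is a plain prefix tree with counts, without the item ordering and header table of a full frequent-pattern tree.

```python
from collections import defaultdict

# Hypothetical crawler markers; the paper's IPS criteria are not given in the abstract.
BOT_MARKERS = ("bot", "crawler", "spider")


def is_robot(entry):
    """Heuristic (assumed): robots.txt requests or crawler-like user agents."""
    agent = entry["agent"].lower()
    return entry["url"] == "/robots.txt" or any(m in agent for m in BOT_MARKERS)


class Node:
    def __init__(self, page):
        self.page = page
        self.count = 0        # number of sessions whose path passes through this node
        self.children = {}    # page -> Node


class PrefixTree:
    """Condensed store of browsing paths: shared prefixes share nodes."""

    def __init__(self):
        self.root = Node(None)

    def insert(self, path):
        node = self.root
        for page in path:
            node = node.children.setdefault(page, Node(page))
            node.count += 1

    def update(self, entries):
        """Add a new batch of log entries without rebuilding the tree."""
        robots = {e["ip"] for e in entries if is_robot(e)}
        sessions = defaultdict(list)  # assumption: one session per IP per batch
        for e in entries:
            if e["ip"] not in robots:
                sessions[e["ip"]].append(e["url"])
        for path in sessions.values():
            self.insert(path)

    def frequent_paths(self, min_support):
        """Return every path prefix whose count reaches min_support."""
        result = {}
        stack = [(self.root, ())]
        while stack:
            node, prefix = stack.pop()
            for child in node.children.values():
                if child.count >= min_support:
                    path = prefix + (child.page,)
                    result[path] = child.count
                    stack.append((child, path))
        return result
```

Because counts are kept in the tree, a later batch only extends or increments existing branches, and `frequent_paths` can be called repeatedly with different support thresholds without rescanning the log.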