Web mining: Machine learning for web applications

Hsinchun Chen, Michael Chau
2005 Annual Review of Information Science and Technology  
This creates a unique problem for performing text classification and clustering of Web documents because the format of HTML documents and the structure of the Web provide additional information for analysis  ...  Similarly, Web retrieval and Web mining share many similarities. Web document clustering has been studied both in the context of Web retrieval and of Web mining.  ... 
doi:10.1002/aris.1440380107 fatcat:wdqwbszj7valbnyjfysbb4ap4y

Web mining in soft computing framework: relevance, state of the art and future directions

S.K. Pal, V. Talwar, P. Mitra
2002 IEEE Transactions on Neural Networks  
The reason for considering web mining, a separate field from data mining, is explained.  ...  The limitations of some of the existing web mining methods and tools are enunciated, and the significance of soft computing (comprising fuzzy logic (FL), artificial neural networks (ANNs), genetic algorithms  ...  Web document retrieval by genetic learning of importance factors of HTML tags has been described in [77] . Here, the method learns the importance of tags from a training text set.  ... 
doi:10.1109/tnn.2002.1031947 pmid:18244512 fatcat:a2ea5nfnczgjlpwsbwe6ebt5hi

Information retrieval in the Web: beyond current search engines

Ricardo Baeza-Yates
2003 International Journal of Approximate Reasoning  
In this paper we briefly explore the challenges to expand information retrieval (IR) on the Web, in particular other types of data, Web mining and issues related to crawling.  ...  We also mention the main relations of IR and soft computing and how these techniques address these challenges.  ...  Acknowledgements We are grateful to the editors of this issue for the invitation to write this paper and their helpful comments to improve it.  ... 
doi:10.1016/j.ijar.2003.07.002 fatcat:4httknhv4fairag2vloeblwbhq

A Survey Study on Relation Extraction for Web Pages

Ghada Alsaigh, Ghayda Al-Talib, Alaa Y. Taqa
2020 Journal of education and science  
The natural language for the web pages consists of many semantic relations between entities. Discovering significant types of relations from the web is challenging because of its open nature.  ...  Three relation extraction algorithms are discussed: Support Vector Machine (SVM), Genetic algorithm and Naive Bayes classifier. This survey would be useful for three kinds of readers. First, the newcomers  ...  In this case the web is for human use because of the displaying content as syntax based HTML. Query ambiguity reduces HTML retrieval quality.  ... 
doi:10.33899/edusj.2020.164377 fatcat:lununwxf4ndwrgyyn4vtpo5hum

Review on Applicability of Genetic Algorithm to Web Search

S.Siva Sathya, Philomina Simon
2009 Journal of clean energy technologies  
The applicability of Genetic algorithms in the field of web search and a review on how a GA is applied to different problem domains in web search is discussed.  ...  Information Retrieval (IR) is concerned with searching and retrieving information within the documents and also searching the online databases and internet.  ...  Ali proposed a framework for web mining, the applications of data mining and knowledge discovery techniques to data collected in World Wide Web (WWW), and a genetic search [10] for search engines.  ... 
doi:10.7763/ijcte.2009.v1.73 fatcat:zaqf5wcifvgdxnob2wbm2sbu7q

Information Retrieval Techniques based on Ontology for High Effectiveness

Komal ShivajiMule, Arti Waghmare
2015 International Journal of Computer Applications  
Basic methods for information retrieval include Boolean Retrieval, Fuzzy retrieval, Vector Space model. Searching depends on matching keywords between user-query and document.  ...  In software engineering and information science, ontology is a formal naming and meaning of the types, properties, and interrelationships of the elements that truly or in a broad sense exist for a specific  ...  An expanding number of databases have ended up web open through HTML structure based search interfaces.  ... 
doi:10.5120/20748-3138 fatcat:rybtjricdvbbxccdlwbgokksoe

Web Mining Functions in an Academic Search Application

2009 Informatică economică  
This paper deals with Web mining and the different categories of Web mining like content, structure and usage mining.  ...  The application of Web mining in an academic search application has been discussed. The paper concludes with open problems related to Web mining.  ...  The other kind of the web structure mining is mining the document structure.  ... 
doaj:e93d9dbdfa5e460f811cf288fc583850 fatcat:t6r4yeiq75fx5epaqtlbrac6ru

Web Content Classification: A Survey

Prabhjot Kaur
2014 International Journal of Computer Trends and Technology  
transform it into an understandable structure for further use.  ...  Classification of web page content is essential to many tasks in web information retrieval such as maintaining web directories and focused crawling. The uncontrolled type of nature of web content presents  ...  The web content is semi structured and contains formatting information in form of HTML tags. A web page consists of hyperlinks to point to other pages.  ... 
doi:10.14445/22312803/ijctt-v10p117 fatcat:b6jugy7kb5gnnewakvpwy4fjxq

Automatic Document Collection

Shashikant Shashikant, Mukesh Rawat
2013 International Journal of Computer Applications  
Now a day's classification of document is an important area for research, as large amount of electronic documents are available in form of unstructured, semi structured and structured information.  ...  Document classification will be applicable for World Wide Web, electronic book sites, online forums, electronic mails, online blogs, digital libraries and online government repositories.  ...  Today web is the main resource for the text documents.  ... 
doi:10.5120/12221-8137 fatcat:pfkiknygmban7diqvxa2dzrtjq


Arpit Deo
2018 International Journal of Advanced Research in Computer Science  
In presented system, to extract the text from web documents, all html tags are removed.  ...  Due to information overloading, there is a need for better techniques to retrieve most relevant information from web. This paper presents the information retrieval system by using the PSO algorithm.  ...  [12] proposed a genetic algorithm based novel approach for information retrieval system to provide the web pages effectively and accurately.  ... 
doi:10.26483/ijarcs.v9i1.5505 fatcat:7bqrf25rejfabkjqb6nibuga4e

A Systematic Review Web Content Mining Tools and its Applications

Manjunath Pujar, Monica R Mundada
2021 International Journal of Advanced Computer Science and Applications  
Web content mining tools were needed to scan text, images and HTML documents and provide results to the search engine.  ...  Keywords-Web content mining; web structure mining; web usage mining; data mining; information retrieval; information extraction  ...  Where Database method will retrieve semi-structured data from the web document.  ... 
doi:10.14569/ijacsa.2021.0120886 fatcat:qqoyefg5hjgutiqaasxmbwhoq4

An Integrated Set of Web Mining Tools for Research

D Aravind
2018 Zenodo  
This paper describes the design and implementation of web data mining research. It gives a method of identifying, extracting, filtering and analyzing data for web resources.  ...  In a sense, the web data mining with its design and implementation provides well bound utilizing information from the web for research.  ...  The phase of web mining research is to identify web resources for a specific research topic. Providing an efficient and effective web information retrieval tool is important in such a system.  ... 
doi:10.5281/zenodo.1410994 fatcat:fgeuwa7mlnfbnhmlewdkrmh3x4

An Effective Web Ontology Using Web Crawler Systems to Measures Web Similarities

Florence Dayana M, Dr.Chidambaram M
2017 International Journal of Engineering and Technology  
Web mining is an information mining strategies which naturally find information from web documents.  ...  Information mining is the process of extraction of hidden predictive information from the colossal databases.  ...  Web structure mining alludes to mining information about link structure connecting Web pages and other Web objects.  ... 
doi:10.21817/ijet/2017/v9i3/170903077 fatcat:2l5ltxfrfjhghgvg5x43vtv6xq

Soft Computing for Information Retrieval in the WEB

Enrique Herrera-Viedma, María J. Martín-Bautista, Sergio Guadarrama, Alejandro Sobrino, José A. Olivas
2005 European Society for Fuzzy Logic and Technology  
The main existing differences between Web retrieval and traditional IR, highlighting the following ones: (1) The HTML-based nature of Web documents, that make them present a structure defined by the HTML  ...  and Web mining, inductive query by example and relevance feedback, textual and Web document classification and clustering, and information filtering and recommendation systems.  ...  -Problems of the information retrieval and access in the web. (Enrique Herrera-Viedma)  ... 
dblp:conf/eusflat/Herrera-ViedmaMGSO05 fatcat:auv422dl6bdq3p2ywy55ertmeq


Rasha Hany Salman, Mahmood Zaki, Nadia A. Shiltag
2020 Al-Qadisiyah Journal Of Pure Science  
The fundamental employments of web content mining are to gather, sort out, classify, providing the best data accessible on the web for the client who needs to get it.  ...  The web today has become an archive of information in any structure such content, sound, video, designs, and multimedia, with the progression of time overall web, the world wide web is now crowded with  ...  Conclusion and Future Work The web data mining tools are primordial to scanning the many HTML documents, images, and text provided on Web pages.  ... 
doi:10.29350/qjps.2020.25.2.1067 fatcat:3b2th6byojbi3dgrkzemzjzqf4