Effective Performance of Information Retrieval on Web by Using Web Crawling

Sk. AbdulNabi
<span title="2012-04-30">2012</span> <i title="Academy and Industry Research Collaboration Center (AIRCC)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/4htmhd4uqrcpzgx3ecmyrpcufq" style="color: black;">International journal of Web &amp; Semantic Technology</a> </i> &nbsp;
Due to this explosion in size, the effective information retrieval system or search engine can be used to access the information.  ...  We have also proposed to use the data structure concepts for implementation of scheduler and circular Queue to improve the performance of our web crawler. (Abstract)  ...  ACKNOWLEDGEMENTS We would like to thank everyone who has motivated and supported us for preparing this Manuscript.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.5121/ijwest.2012.3205">doi:10.5121/ijwest.2012.3205</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/nmxpdk2ilza75aqkhiccjaq4wu">fatcat:nmxpdk2ilza75aqkhiccjaq4wu</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20160626181305/http://airccse.org:80/journal/ijwest/papers/3212ijwest05.pdf" title="fulltext PDF download">Web Archive [PDF]</a>

Effective Performance of Information Retrieval by using Domain Based Crawler

Sk. Abdul, Dr. P.
<span title="">2013</span> <i title="The Science and Information Organization"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/2yzw5hsmlfa6bkafwsibbudu64" style="color: black;">International Journal of Advanced Computer Science and Applications</a> </i> &nbsp;
Due to this explosion in size, the information retrieval system or Search Engines are being upgraded day by day and it can be used to access the information effectively and efficiently.  ...  It is an extension of Effective Performance of Web Crawler (EPOW) System [2], in which it has two Crawler modules. The first one is Basic Crawler.  ...  DOMAIN BASED INFORMATION RETRIEVAL (DBIR) SYSTEM It is an extension of our earlier Effective performance of Web crawler (EPOW). In this proposed system we have added Ranking adaption with pattern matching  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.14569/ijacsa.2013.040713">doi:10.14569/ijacsa.2013.040713</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/iialnvq7mvelxktwu2pwlpeyce">fatcat:iialnvq7mvelxktwu2pwlpeyce</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170922012104/http://www.thesai.org/Downloads/Volume4No7/Paper_13-Effective_Performance_of_Information_Retrieval.pdf" title="fulltext PDF download">Web Archive [PDF]</a>

A Scalable Lightweight Distributed Crawler for Crawling with Limited Resources

Milly Kc, Markus Hagenbuchner, Ah Chung Tsoi
<span title="">2008</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/wl6t75bdqrdlrgsukwo2coosly" style="color: black;">2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology</a> </i> &nbsp;
All currently known crawlers implement approximations or have limitations so as to maximize the throughput of the crawl, and hence, maximize the number of pages that can be retrieved within a given time  ...  A set of experiments, and comparisons highlight the effectiveness of the proposed approach.  ...  Crawling precision is calculated by dividing the number of unique pages by the total number of web pages retrieved by the crawler.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/wiiat.2008.234">doi:10.1109/wiiat.2008.234</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/iat/KcHT08.html">dblp:conf/iat/KcHT08</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/qyo64iremvezbk2wgzxltoytqi">fatcat:qyo64iremvezbk2wgzxltoytqi</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20190430172144/https://ro.uow.edu.au/cgi/viewcontent.cgi?article=2692&amp;context=infopapers" title="fulltext PDF download">Web Archive [PDF]</a>

Design and Implementation of Agricultural Information Resources Vertical Search Engine Based on Nutch

E.J. Ding
<span title="">2016</span> <i title="AIDIC Servizi S.r.l."> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/5re6yojrfja7da72zowrdasgca" style="color: black;">Chemical Engineering Transactions</a> </i> &nbsp;
So Nutch-based agricultural vertical search engine is designed, with implementation of agricultural domain ontology on the information acquisition and filtering, retrieval and similar terms recommending  ...  The experimental results show that our agricultural search engine can improve the precision of user retrieval and satisfies the professional demand of user.  ...  Acknowledgment This work is supported by the Fundamental Research Funds for the Central Universities (XDJK2013C071).  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.3303/cet1651104">doi:10.3303/cet1651104</a> <a target="_blank" rel="external noopener" href="https://doaj.org/article/93e6aaaad7e247e1851143842d996f52">doaj:93e6aaaad7e247e1851143842d996f52</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/hvome4z4orh2pgmujv2zhcwsh4">fatcat:hvome4z4orh2pgmujv2zhcwsh4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20210605053722/https://www.aidic.it/cet/16/51/104.pdf" title="fulltext PDF download">Web Archive [PDF]</a>

An Ontology-based Web Crawling Approach for the Retrieval of Materials in the Educational Domain

Mohammed Ibrahim, Yanyan Yang
<span title="">2019</span> <i title="SCITEPRESS - Science and Technology Publications"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/rrqbmymjsrc75j6bbyyyprde5e" style="color: black;">Proceedings of the 11th International Conference on Agents and Artificial Intelligence</a> </i> &nbsp;
In this paper, among other kinds of crawling, we focus on those techniques that extract the content of a web page based on the relations of ontology concepts.  ...  Ontology is a promising technique by which to access and crawl only related data within specific web pages or a domain.  ...  Also, we would like to thank the School of Engineering and School of Computing at the University of Portsmouth for their contribution in participating in the experiment.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.5220/0007692009000906">doi:10.5220/0007692009000906</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/icaart/IbrahimY19.html">dblp:conf/icaart/IbrahimY19</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/jo3uu7zh5faadpj3ftjrxjgch4">fatcat:jo3uu7zh5faadpj3ftjrxjgch4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200505031948/https://researchportal.port.ac.uk/ws/files/12830367/ICAART_2019_193.pdf" title="fulltext PDF download">Web Archive [PDF]</a>

Crawler with Search Engine based Simple Web Application System for Forum Mining

M. Maheswari, N. Tharminie
<span title="">2014</span> <i title="IOSR Journals"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/vabuspdninc75epczdurccts4u" style="color: black;">IOSR Journal of Computer Engineering</a> </i> &nbsp;
The designed crawler performs two functions, URL Crawling (structure mining) by page classification and Content Crawling (content mining) by Pattern clustering.  ...  Now-a-days the growth of online users increased infinitely depending upon the information in web sources.  ...  The knowledge extracted from the Web can be used to raise the performances for Web information retrievals, question and answering, and Web based data warehousing.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.9790/0661-16287982">doi:10.9790/0661-16287982</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/mc47cagwmneivpkaszcsn4gwou">fatcat:mc47cagwmneivpkaszcsn4gwou</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20180602131756/http://www.iosrjournals.org/iosr-jce/papers/Vol16-issue2/Version-8/M016287982.pdf" title="fulltext PDF download">Web Archive [PDF]</a>

Self-Adaptive Ontology Technique based on Crawler History

Shwetha Jog
<span title="2015-08-19">2015</span> <i title="ESRSA Publications Pvt. Ltd."> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/3j6n6lpsjndinobmibtprywohe" style="color: black;">International Journal of Engineering Research and</a> </i> &nbsp;
Self-Adaptive Ontology Based on Crawler History retrieves the pages by searching logically related keywords instead of using keyword search method.  ...  Search Engine uses this intelligent system. There are different techniques available for retrieving most important and relevant information from web. Keyword search is most used technique.  ...  Search engines built by using retrieval techniques which are capable of handling large scale web collections.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.17577/ijertv4is080399">doi:10.17577/ijertv4is080399</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/rgo2zqy7dbfifamrsnfj3lamlu">fatcat:rgo2zqy7dbfifamrsnfj3lamlu</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200323132726/https://www.ijert.org/research/self-adaptive-ontology-technique-based-on-crawler-history-IJERTV4IS080399.pdf" title="fulltext PDF download">Web Archive [PDF]</a>

A Parametric Layered Approach to Perform Web Page Ranking

Ratika Goel, Anchal Garg
<span title="2013-04-18">2013</span> <i title="Foundation of Computer Science"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/b637noqf3vhmhjevdfk3h5pdsu" style="color: black;">International Journal of Computer Applications</a> </i> &nbsp;
The presented work will provide a recommendation-based web page indexing so that effective web crawling will be performed.  ...  Web crawling is the foremost step to perform the effective and efficient web content search so that the user will get the specific web pages initially in an indexed form.  ...  The agents defined in the work will not only perform the information retrieval but also perform the analysis on the process of information fetching.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.5120/11467-7251">doi:10.5120/11467-7251</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/p4wl56r6vrcydk3qa4pezsb3da">fatcat:p4wl56r6vrcydk3qa4pezsb3da</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170706134244/http://research.ijcaonline.org/volume67/number14/pxc3887251.pdf" title="fulltext PDF download">Web Archive [PDF]</a>

An Effective Parallel Web Crawler based on Mobile Agent and Incremental Crawling

Md. Abu Kausar, V. S. Dhaka, Sanjeev Kumar Singh
<span title="">2013</span> <i title="EJournal Publishing"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/i2kbowktuvc7vfvdaoxyvly5hy" style="color: black;">Journal of Industrial and Intelligent Information</a> </i> &nbsp;
A huge amount of new information is placed on the Web every day.  ...  These crawlers also effect load on the remote server by using its CPU cycles and memory, these loads must be taken into account in order to get high performance at a reasonable cost.  ...  Hence, there is a need of a more effective way of retrieving information from the web.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.12720/jiii.1.1.86-90">doi:10.12720/jiii.1.1.86-90</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/3mb2h7xugjhsfhhchxo4x6lxf4">fatcat:3mb2h7xugjhsfhhchxo4x6lxf4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170808042515/http://www.jiii.org/uploadfile/2013/1213/20131213054618231.pdf" title="fulltext PDF download">Web Archive [PDF]</a>

An Effective Parallel Web Crawler based on Mobile Agent and Incremental Crawling

Md. Abu Kausar, V. S. Dhaka, Sanjeev Kumar Singh
<span title="">2013</span> <i title="EJournal Publishing"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/i2kbowktuvc7vfvdaoxyvly5hy" style="color: black;">Journal of Industrial and Intelligent Information</a> </i> &nbsp;
A huge amount of new information is placed on the Web every day.  ...  These crawlers also effect load on the remote server by using its CPU cycles and memory, these loads must be taken into account in order to get high performance at a reasonable cost.  ...  Hence, there is a need of a more effective way of retrieving information from the web.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.12720/jiii.1.2.86-90">doi:10.12720/jiii.1.2.86-90</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/h3aa3i75m5dbnlyo42ovsbs3by">fatcat:h3aa3i75m5dbnlyo42ovsbs3by</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170808042515/http://www.jiii.org/uploadfile/2013/1213/20131213054618231.pdf" title="fulltext PDF download">Web Archive [PDF]</a>

Performance Optimization of Focused Web Crawling Using Content Block Segmentation

Bireshwar Ganguly, Devashri Raich
<span title="">2014</span> <i title="IEEE"> 2014 International Conference on Electronic Systems, Signal Processing and Computing Technologies </i> &nbsp;
This paper basically focuses on study of the various techniques of data mining for finding the relevant information from World Wide Web using web crawler.  ...  Thus a focused crawler solves this issue of relevancy by focusing on web pages for some given topic or a set of topics.  ...  TECHNIQUES USED IN WEB CRAWLING Lu LIU et al presents a novel clustering-based topical Web Crawling for domain-specific information retrieval guided by link-context.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/icesc.2014.69">doi:10.1109/icesc.2014.69</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/mxm7tsjlqzekvmpgbacpx5gj3e">fatcat:mxm7tsjlqzekvmpgbacpx5gj3e</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170830050352/http://www.ijirst.org/articles/IJIRSTV1I7061.pdf" title="fulltext PDF download">Web Archive [PDF]</a>

Efficient Proposed Framework for Semantic Search Engine using New Semantic Ranking Algorithm

M. M., N.Mekky, A. Atwan
<span title="">2015</span> <i title="The Science and Information Organization"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/2yzw5hsmlfa6bkafwsibbudu64" style="color: black;">International Journal of Advanced Computer Science and Applications</a> </i> &nbsp;
The amount of information raises billions of databases every year and there is an urgent need to search for that information by a specialize tool called search engine.  ...  This semantic framework operates over a sorting RDF by using efficient proposed ranking algorithm and enhanced crawling algorithm.  ...  RELATED WORKS Information recovery and retrieval by searching on the web is not a fresh idea but has different problems when it is evaluated to general information retrieval.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.14569/ijacsa.2015.060818">doi:10.14569/ijacsa.2015.060818</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/6bdu6u7ycbhkdbh6jatlwmow4m">fatcat:6bdu6u7ycbhkdbh6jatlwmow4m</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20180719055142/http://thesai.org/Downloads/Volume6No8/Paper_18-Efficient_Proposed_Framework_for_Semantic_Search_Engine.pdf" title="fulltext PDF download">Web Archive [PDF]</a>

Web Crawler: A Review

Md. AbuKausar, V. S. Dhaka, Sanjeev Kumar Singh
<span title="2013-02-15">2013</span> <i title="Foundation of Computer Science"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/b637noqf3vhmhjevdfk3h5pdsu" style="color: black;">International Journal of Computer Applications</a> </i> &nbsp;
Based on the type of knowledge, web crawler is usually divided in three types of crawling techniques: General Purpose Crawling,  ...  that will index the downloaded pages that help in quick searches.  ...  These pages are retrieved by a Web crawler that is an automated Web browser that follows each link it sees.  ...  CRAWLING TECHNIQUES There are a few crawling techniques used by Web Crawlers, mainly used are: A.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.5120/10440-5125">doi:10.5120/10440-5125</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/xnkxug5yt5g7rg5ycejfjd4bo4">fatcat:xnkxug5yt5g7rg5ycejfjd4bo4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20180604094159/https://research.ijcaonline.org/volume63/number2/pxc3885125.pdf" title="fulltext PDF download">Web Archive [PDF]</a>

Web Pages Retrieval by Using Proposed Focused Crawler

Dunia Hamid Hameed, Soukaena Hassan Hashem
<span title="">2016</span> <i title="Journal of Al-Nahrain University-Science"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/ygmvmohmsngkjbyylafphua5om" style="color: black;">Al-Nahrain Journal of Science</a> </i> &nbsp;
In this paper, we will explain two methods to retrieve web pages by using traditional crawler and proposed focused crawler.  ...  adaptive with each user, needing for a tool to change the searching strategy, keeping the freshness of the web pages and filtering the links to keep track focusing on the user's preference.  ...  The most important ones are: Proposed Focused Crawler increases the performance of crawling, since it produces related web pages to the information need, this method makes visiting the links not random  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.22401/jnus.19.2.20">doi:10.22401/jnus.19.2.20</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/mdd6q4zs6vevdoxao7hg47evam">fatcat:mdd6q4zs6vevdoxao7hg47evam</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20180604044416/http://www.jnus.org/pdf/1/2016/1/1194.pdf" title="fulltext PDF download">Web Archive [PDF]</a>

DwCB - Architecture Specification of Deep Web Crawler Bot with Rules Based on FORM Values for Domain Specific Web Site [chapter]

S. G. Shaila, A. Vadivel, R. Devi Mahalakshmi, J. Karthika
<span title="">2014</span> <i title="Springer International Publishing"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/jajl7qtqc5cy7oavratsldrv2y" style="color: black;">Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering</a> </i> &nbsp;
In this paper, we have proposed architecture specification of a deep web crawler with effective FORM filling strategy using rules.  ...  For each successful FORM submission, the deep web data is extracted and indexed suitably for information retrieval applications.  ...  A set of keywords is used as query and searched into both documents crawled by surface web crawler and DwCB.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-319-11629-7_28">doi:10.1007/978-3-319-11629-7_28</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/2gjm3td4vzhzbiwoi3qpsbd2dm">fatcat:2gjm3td4vzhzbiwoi3qpsbd2dm</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20190302182416/http://pdfs.semanticscholar.org/a787/68a9123534d286d5b771744437f9a36e737c.pdf" title="fulltext PDF download">Web Archive [PDF]</a>