Geographically focused collaborative crawling

Weizheng Gao, Hyun Chul Lee, Yingbo Miao
2006 Proceedings of the 15th international conference on World Wide Web - WWW '06  
We first propose several collaborative crawling strategies for the geographically focused crawling, whose goal is to collect web pages about specified geographic locations, by considering features like  ...  We study the problem of collecting geographically-aware pages using collaborative crawling strategies.  ...  The diversity values of geographically focused collaborative crawling strategies suggest that most of the geographically focused collaborative crawling strategies tend to favor those pages which are found  ... 
doi:10.1145/1135777.1135822 dblp:conf/www/GaoLM06 fatcat:dog2y2naljgxdi3r772w3avnn4

Geographical partition for distributed web crawling

José Exposto, Joaquim Macedo, António Pina, Albano Alves, José Rufino
2005 Proceedings of the 2005 workshop on Geographic information retrieval - GIR '05  
This paper evaluates scalable distributed crawling by means of the geographical partition of the Web.  ...  The work considers a distributed crawler where the assignment of pages to visit is based on page content geographical scope.  ...  The crawl criteria are given by arbitrary predicates. Web space partitioning based on page classifier tools is another work that is based on collaborative focused crawling [6].  ... 
doi:10.1145/1096985.1096999 dblp:conf/gir/ExpostoMPAR05 fatcat:x2xy6hxponarjor3pln2nyhoye

Topic-oriented collaborative crawling

Chiasen Chung, Charles L. A. Clarke
2002 Proceedings of the eleventh international conference on Information and knowledge management - CIKM '02  
This is not the case with topic-oriented collaborative crawling.  ...  [18] recognize that the target pages of a focused crawl do not necessarily link directly to one another and describe focused crawlers that learn to identify apparently off-topic pages that reliably  ... 
doi:10.1145/584792.584802 dblp:conf/cikm/ChungC02 fatcat:a5aatkrugbdonkjed3aty6exvu

Topic-oriented collaborative crawling

Chiasen Chung, Charles L. A. Clarke
2002 Proceedings of the eleventh international conference on Information and knowledge management - CIKM '02  
This is not the case with topic-oriented collaborative crawling.  ...  [18] recognize that the target pages of a focused crawl do not necessarily link directly to one another and describe focused crawlers that learn to identify apparently off-topic pages that reliably  ... 
doi:10.1145/584800.584802 fatcat:rgcoz7oxuzdivj6tsivbzvnkeu

Crowdcrawling approach for community based plagiarism detection service

Sergey Butakov
2014 Proceedings of the 23rd International Conference on World Wide Web - WWW '14 Companion  
As an example, some of these issues could be solved with proper geographical allocation of crawling units [5] [11] .  ...  CONCLUSION The crowdcrawling mechanism proposed in this paper focuses on the issue of scalability of crawling efforts for the internet plagiarism detection projects.  ... 
doi:10.1145/2567948.2580057 dblp:conf/www/Butakov14 fatcat:w6ibsd75avhhhiiqjeaandrx5i

Semantic Web Data Mining & Analysis

Abhishek Yadav, Gaurav Srivastava
2014 IOSR Journal of Computer Engineering  
Singh et al. in [11] proposed an ontology agent based focused crawler (O-ABFC) which improves existing agent based focused crawlers by using ontology and contextual information in crawling.  ...  Content mining agent works in collaboration with descriptive metadata agent and semantic metadata agent.  ... 
doi:10.9790/0661-16515760 fatcat:ryle2cebwvfr3jltq7ji7il66y

Location and the Web

Susanne Boll, Christopher Jones, Eric Kansa, Puneet Kishor, Mor Naaman, Ross Purves, Arno Scharl, Erik Wilde
2008 Proceedings of the first international workshop on Location and the web - LOCWEB '08  
The World Wide Web has become the world's largest networked information resource, but references to geographical locations remain unstructured and typically implicit in nature.  ...  At present, spatial knowledge is hidden in many small information fragments such as addresses on Web pages, annotated photos with GPS coordinates, geographic mapping applications, and geotags in user-generated  ...  Topics cover geographically focused search and crawling, ranking for geographical search, understanding and modeling location and location-based features, harvesting and mining location from different Web  ... 
doi:10.1145/1367798.1367799 dblp:conf/www/BollJKKNPSW08a fatcat:zlpxzdw2pfdd3c6j6ogo6363p4

Location and the web (LocWeb 2008)

Susanne Boll, Christopher Jones, Eric Kansa, Puneet Kishor, Mor Naaman, Ross Purves, Arno Scharl, Erik Wilde
2008 Proceeding of the 17th international conference on World Wide Web - WWW '08  
The World Wide Web has become the world's largest networked information resource, but references to geographical locations remain unstructured and typically implicit in nature.  ...  At present, spatial knowledge is hidden in many small information fragments such as addresses on Web pages, annotated photos with GPS coordinates, geographic mapping applications, and geotags in user-generated  ...  Topics cover geographically focused search and crawling, ranking for geographical search, understanding and modeling location and location-based features, harvesting and mining location from different Web  ... 
doi:10.1145/1367497.1367758 dblp:conf/www/BollJKKNPSW08 fatcat:wd5umm7mejb53ovovwswiq567e

GIRPharma

Francisco M. Rangel Pardo, Loli Rangel Pardo, Davide Buscaldi, Paolo Rosso
2010 Proceedings of the 1st International Conference and Exhibition on Computing for Geospatial Research & Application - COM.Geo '10  
It is a novel investigation, which requires collaboration between multidisciplinary teams and that is beginning to show the first progress.  ...  This paper describes an approximation based on geographic information retrieval with the purpose to give some solutions to the problem of searching pharmacies on duty in the Spanish territory.  ...  Our project is focused especially on inventorying every web site that publish information about pharmacies on duty, executing a geographic and temporal information retrieval about them on the geographical  ... 
doi:10.1145/1823854.1823892 dblp:conf/comgeo/PardoPBR10 fatcat:ycmujsfmkzeexoplec5euak32q

Conceptual considerations for comprehensive and cooperative crawling and indexing the Web

Stefan Voigt, Michael Granitzer
2021 Zenodo  
Candidate factors are: division by language, TLD, geographic region, topic/application field, network topology/latency, hosting facilities etc.  ...  Vertical splits involve coordinated crawling of sub-parts of the Web, storage (and access) to crawls (i.e. via WARC files), processing and enriching crawls as well as the coordinated generation of indices  ... 
doi:10.5281/zenodo.6148346 fatcat:oidkiv3d2nazrnq6tevxl33fmq

A Study on Web usage Data Mining in Online Sales and SASF Crawler in Online Advertisement

M. Anisha, P. Joyce
2015 International Journal of Computer Applications  
With the combination of technologies like semantic focused crawling and ontology learning, it's possible to solve internet issues, i.e. whereby semantic focused crawling technology solves the issues of heterogeneity  ...  SASF framework uses Semantic Focused Crawling to solve the above problem of heterogeneity, ubiquity, ambiguity in service discovery.  ... 
doi:10.5120/ijca2015907070 fatcat:ao74ztoro5d7fmwmpkg5xs7xhq

Visualising the south yorkshire floods of '07

Paul D. Clough, Rob Pasley, Stefan Siersdorfer, Jose San Pedro, Mark Sanderson
2007 Proceedings of the 4th ACM workshop on Geographical information retrieval - GIR '07  
To develop and test the application we have focused on the flooding in June 2007 that devastated large areas of South Yorkshire (UK).  ...  This information consists of various sources including news, blogs, wikis, all of which are generated increasingly in an interactive and collaborative way.  ... 
doi:10.1145/1316948.1316972 dblp:conf/gir/CloughPSPS07 fatcat:oek4zdv34zbjzlekcd4fsfavpm

Measuring the validity of peer-to-peer data for information retrieval applications

Noam Koenigstein, Yuval Shavitt, Ela Weinsberg, Udi Weinsberg
2012 Computer Networks  
However, some applications, like trend prediction, mandate collection of the data from the "long tail", hence a much more exhaustive crawl is needed.  ...  Furthermore, we show that content and search queries are highly localized, indicating that location-crossing conclusions require a wide spread spatial crawl.  ...  with geographic location [16, 17] .  ... 
doi:10.1016/j.comnet.2011.10.026 fatcat:auvuzsuwyzcghnpgfyw2zqgw4e

PositionRank: An Unsupervised Approach to Keyphrase Extraction from Scholarly Documents

Corina Florescu, Cornelia Caragea
2017 Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)  
Geographically 0.274 Focused 0.134 Collaborative 0.142 Crawling 0.165 by Weizheng Gao, Hyun Chul Lee and Yingbo Miao A collaborative 0.142 crawler 0.165 is a group 0.025 of crawling 0.165 nodes 0.033 ,  ...  Author-input keyphrases: collaborative crawling, geographically focused crawling, geographic entities Figure 6 : The title and abstract of a WWW paper by Gao et al. (2006) and the author-input keyphrases  ... 
doi:10.18653/v1/p17-1102 dblp:conf/acl/FlorescuC17 fatcat:pmh5n6zcvbdkpmjufxfftrnj5q

Text/Conference Paper

Simon Reichhuber
2019 Jahrestagung der Gesellschaft für Informatik  
The first, collaborative crawling, is an information retrieval task, hence it deals with knowledge distributed over multiple websites.  ...  Whereas the latter is designed to run in a virtual space, the second, denoted as machine park collaboration, can be implemented in industrial 4.0 fields of the real world.  ...  Collaborative crawling In this scenario the agents are usually called crawlers or spiders (see Figure 2 ) and a knowledge source is associated with a certain Uniform Resource Locator (URL).  ... 
doi:10.18420/inf2019_ws54 dblp:conf/gi/Reichhuber19 fatcat:5ropw5uokbc2xe5sthze7tsdl4