
Self-Adaptive Semantic Focused Crawler for Mining Services Information Discovery

Hai Dong, F. K. Hussain
2014 IEEE Transactions on Industrial Informatics  
In this paper, we present the framework of a novel self-adaptive semantic focused crawler, the SASF crawler, with the purpose of precisely and efficiently discovering, formatting, and indexing mining service  ...  The innovations of this research lie in the design of an unsupervised framework for vocabulary-based ontology learning, and a hybrid algorithm for matching semantically relevant concepts and metadata.  ...  The StSM algorithm follows an unsupervised training paradigm aimed at finding the maximum probability that and co-occur in the Web pages.  ... 
doi:10.1109/tii.2012.2234472 fatcat:hotreldsujernfadxiuoka2rq4

PolarHub: A large-scale web crawling engine for OGC service discovery in cyberinfrastructure

Wenwen Li, Sizhe Wang, Vidit Bhatia
2016 Computers, Environment and Urban Systems  
In this paper, we introduce our solution of a new cyberinfrastructure platform, the PolarHub, that conducts large-scale web crawling to discover distributed geospatial data and service resources and accomplish  ...  Because of OGC's widespread adoption, OGC-compliant web services become the primary search target of PolarHub.  ...  Based on the services found by PolarHub, we will develop advanced geospatial orchestration services which will combine and enable a chain of workflow to support complex scientific analysis.  ... 
doi:10.1016/j.compenvurbsys.2016.07.004 fatcat:q7qdomhrgvamdiovr7j2dccqfi

Back-Office Web Traffic on The Internet

Enric Pujol, Philipp Richter, Balakrishnan Chandrasekaran, Georgios Smaragdakis, Anja Feldmann, Bruce MacDowell Maggs, Keung-Chi Ng
2014 Proceedings of the 2014 Conference on Internet Measurement Conference - IMC '14  
Back-office traffic, which may or may not be triggered by end-user activity, is essential for today's Web as it supports a number of popular but complex Web services including large-scale content delivery  ...  Our measurements show that back-office traffic accounts for a significant fraction not only of core Internet traffic, but also of Web transactions in the terms of requests and responses.  ...  Acknowledgments We would like to thank Oliver Spatscheck (our shepherd), the anonymous reviewers, and Paul Barford for their constructive feedback.  ... 
doi:10.1145/2663716.2663756 dblp:conf/imc/PujolRCSFMN14 fatcat:tj5r6xufivbyzejkjp2b7pxvdm

A Review on Broker Based Cloud Service Model

Nagarajan Rajganesh, Thirunavukarasu Ramkumar
2016 Journal of Computing and Information Technology  
The proposed cloud broker advocates techniques such as reasoning and decision making capabilities for the improved cloud service selection and composition.  ...  Nowadays, cloud offerings are not limited to range of services and anything can be shared as a service through the Internet.  ...  The second service called Platform as a Service (PaaS) is implausible model for offering to the cloud user the computing platform that includes operating systems, program developing environment, web servers  ... 
doi:10.20532/cit.2016.1002778 fatcat:jhp24sxomreslp3rkgwradls2y

EduBD: A Machine Understandable Approach to Integrate Information of Educational Institutions of Bangladesh

Shima Chakraborty, Hasan Hafizur Rahman, Hanif Seddiqui, Sajal Chandra Debnath
2016 International journal of Web & Semantic Technology  
In this regard, our research demonstrates the feasibility of semantic web technologies for converting and integrating these unstructured or semistructured information by introducing machine understandable  ...  Web contents related to educational institutions as well as their geographic data of a country is an emerging field of data sharing and consolidating with suitable data repositories to extract useful information  ...  to compare the quality of education, exploring compatible implicit knowledge on the web, geographically establish e-education services, develop spatial decision support systems and so on that access various  ... 
doi:10.5121/ijwest.2016.7101 fatcat:i645gx7jvrdedophmpj3qchoq4

Ad-hoc data processing in the cloud

Dionysios Logothetis, Kenneth Yocum
2008 Proceedings of the VLDB Endowment  
Ad-hoc data processing has proven to be a critical paradigm for Internet companies processing large volumes of unstructured data.  ...  is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page.  ...  For our demo, multiple crawlers explore disjoint areas of the web while a continuous MapReduce job builds each local index.  ... 
doi:10.14778/1454159.1454204 fatcat:73nyswrxp5e7pd2u76xbnfxekm


Marios D. Dikaiakos, Asterios Katsifodimos, George Pallis
2012 ACM Transactions on Internet Technology  
Experimental results show that Minersoft is a powerful tool for software search and discovery.  ...  In this article we investigate the problem of supporting keyword-based searching for the discovery of software files that are installed on the nodes of large-scale, federated Grid and Cloud computing infrastructures  ...  ACKNOWLEDGMENTS The authors would like to thank EGEE users and administrators, Paris Ionas and Andreas Papadopoulos, who provided characteristic queries and judgements for evaluating Minersoft.  ... 
doi:10.1145/2220352.2220354 fatcat:ncm6syiw5nh33aqiv2xio5bgh4

A TACOMA retrospective

Dag Johansen, Kåre J. Lauvset, Robbert van Renesse, Fred B. Schneider, Nils P. Sudmann, Kjetil Jacobsen
2002 Software, Practice & Experience  
For seven years, the TACOMA project has investigated the design and implementation of software support for mobile agents.  ...  Each version of TACOMA has provided a framework to support the execution of programs, called agents, that migrate from host to host in a computer network.  ...  WEB CRAWLER APPLICATION Web crawlers follow links to Web servers and retrieve the data found there for processing at some other server.  ... 
doi:10.1002/spe.451 fatcat:2lznpoi6s5fdhchos2cs57oj7u

Proposing a Framework for Exploration of Crime Data Using Web Structure and Content Mining

Amin Shahraki Moghaddam, Javad Hosseinkhani, Suriayati Chuprat, Hamed Taherdoost, Hadi Barani Baravati
2013 Research Journal of Applied Sciences Engineering and Technology  
architecture of a scalable universal crawler.  ...  Consequently, the set is ready for implementing various criminal network evaluation tools for testing.  ...  A suggested framework uses a special crawler for crime web mining.  ... 
doi:10.19026/rjaset.6.3568 fatcat:ivqdna3sijezrkmjfvcbpfrwfy

The ARCOMEM Architecture for Social- and Semantic-Driven Web Archiving

Thomas Risse, Elena Demidova, Stefan Dietze, Wim Peters, Nikolaos Papailiou, Katerina Doka, Yannis Stavrakas, Vassilis Plachouras, Pierre Senellart, Florent Carpentier, Amin Mantrach, Bogdan Cautis (+2 others)
2014 Future Internet  
to guide a novel Web crawler.  ...  The constantly growing amount of Web content and the success of the Social Web lead to increasing needs for Web archiving. These needs go beyond the pure preservation of Web pages.  ...  Conflicts of Interest Thomas Risse and Wim Peters are co-editors of the Special Issue on Archiving Community Memories.  ... 
doi:10.3390/fi6040688 fatcat:jm7aicz6trfadnamhqnfsltcvy

Utilizing Web2.0 in Web Service Ranking

Ioan Toma, Ying Ding, Krissada Chalermsook, Elena Simperl, Dieter Fensel
2009 2009 Third International Conference on Digital Society  
In this paper we explore the idea of using social annotations technology for ranking web services.  ...  This introduces a set of new challenges such as how to organize, search, rank and select services.  ...  Further, we especially thank seekda OG for providing the web service data set used in our experiments.  ... 
doi:10.1109/icds.2009.14 dblp:conf/icds/TomaDCSF09 fatcat:lml6uob53vdkvofh7k7su45anq

Intelligent Peer Networks for Collaborative Web Search

Filippo Menczer, Le-Shin Wu, Ruj Akavipat
2008 The AI Magazine  
Collaborative query routing is a new paradigm for Web search that treats both established search engines and other publicly available indices as intelligent peer agents in a search network.  ...  The approach makes it transparent for anyone to build their own (micro) search engine, by integrating established Web search services, desktop search, and topical crawling techniques.  ...  Because collaborative peer search represents a new paradigm for web search, the interface between the 6S network and its users is critical.  ... 
doi:10.1609/aimag.v29i3.2155 fatcat:3z6d422nb5dlxismc5gedmml5i


2008 Parallel Processing Letters  
Thus, the main contribution of this paper is a proposal for an importance-based monitoring architecture for large-scale grid information services, which is based on an adaptation of the web crawling paradigm  ...  In its abstract form, this can be seen as an adaptation of the web crawling paradigm [3] for the purpose of large-scale grid monitoring [10, 11, 27].  ... 
doi:10.1142/s0129626408003442 fatcat:22kd5gkyfrblxhbldizisqtbrm

ARCOMEM Crawling Architecture

Vassilis Plachouras, Florent Carpentier, Muhammad Faheem, Julien Masanès, Thomas Risse, Pierre Senellart, Patrick Siehndel, Yannis Stavrakas
2014 Future Internet  
takes into account the type of Web sites and applications to extract structure from crawled content. We also describe a large-scale distributed crawler that has been developed, as well as the  ...  We introduce the overall architecture and we describe its modules, such as the online analysis module which computes a priority for the Web pages to be crawled, and the Application-Aware Helper which  ...  Conflicts of Interest Thomas Risse is co-editor of the special issue on Archiving Community Memories.  ... 
doi:10.3390/fi6030518 fatcat:ng3b7vqbv5dadowfowj2edkfae

Semantically-enhanced information extraction

Hisham Assal, John Seng, Franz Kurfess, Emily Schwarz, Kym Pohl
2011 2011 Aerospace Conference  
within a more contextually complete picture.  ...  Our demonstration system infers the possibility of a terrorist plot by extracting key events and relationships from a collection of news articles and intelligence reports.  ...  One set of services is responsible for document access, which includes Web crawlers, internal search services, and possibly specialized services to access proprietary repositories.  ... 
doi:10.1109/aero.2011.5747547 fatcat:bgmnkonxgnaj3awh4avm5md5du