Automatic Extraction of Meaning from the Web

Rudi Cilibrasi, Paul Vitanyi
2006 2006 IEEE International Symposium on Information Theory  
We consider similarity distances for two types of objects: literal objects that as such contain all of their meaning, like genomes or books, and names for objects.  ...  For the second type we consider similarity distances generated by web users corresponding to particular semantic relations between the (names for) the designated objects.  ...  For example, from genomic data one can extract letter- or block frequencies (the blocks are over the four-letter alphabet); from music files one can extract various specific numerical features, related  ... 
doi:10.1109/isit.2006.261979 dblp:conf/isit/CilibrasiV06 fatcat:kz67ir5ihbeevfvneggokymcze

A Practical Agent-Based Method to Extract Semantic Information from the Web [chapter]

J. L. Arjona, R. Corchuelo, A. Ruiz, M. Toro
2002 Lecture Notes in Computer Science  
In this article, we present a framework for automatic extraction of semantically-meaningful information from the current web.  ...  However, current trends seem to suggest that it is not likely to be adopted in the forthcoming years. In this sense, meaningful information extraction from the web becomes a handicap for web agents.  ...  Conclusions and Future Work In this article, we have sketched some ideas that show that the process of extracting information from the web can be separated from the business logic of a web agent by means  ... 
doi:10.1007/3-540-47961-9_48 fatcat:aterovpj2bhg5dqkfjgarr2ukm

A Survey on Data Annotation for the Web Databases

Miss.Priyanka P.Boraste
2014 IOSR Journal of Computer Engineering  
Every SRR contains multiple data units, each of which describes one aspect of a real-world entity. The SRRs are then extracted and assigned meaningful labels.  ...  After successful extraction, the data units are aligned into groups such that data inside the same group have the same semantics (meaning). An annotation wrapper can then be generated automatically and used  ...  Acknowledgement I feel great pleasure in submitting this paper "A SURVEY ON DATA ANNOTATION FOR THE WEB DATABASES". I wish to thank IOSR Journals for giving us such a wonderful opportunity.  ... 
doi:10.9790/0661-162116870 fatcat:3zyoclrc2batzfnda6lesqp4tu

Term-Based Clustering and Summarization of Web Page Collections [chapter]

Yongzheng Zhang, Nur Zincir-Heywood, Evangelos Milios
2004 Lecture Notes in Computer Science  
This research aims towards clustering of Web page collections using automatically extracted topical terms, and automatic summarization of the resulting clusters.  ...  A concise and meaningful summary of a Web page collection, which is generated automatically, can help Web users understand the essential topics and main contents covered in the collection quickly without  ...  The research has been supported by grants from the Natural Sciences and Engineering Research Council of Canada.  ... 
doi:10.1007/978-3-540-24840-8_5 fatcat:dk3icxjjebh4ljejlm22zxkyhe

Lexical semantic based Bayesian model for adaptive wrapper generation

R. Nandhi kesavan, K. Latha
2012 2012 International Conference on Data Science & Engineering (ICDSE)  
This paper focuses on an unsupervised information extraction system. Two kinds of features related to the text fragments from the Web documents are investigated.  ...  false positive in the real time web sites.  ...  The limitation is that it cannot distinguish the existing attribute from the semantic meaning of the new attribute from unseen website.  ... 
doi:10.1109/icdse.2012.6281907 fatcat:5re747avf5anbmbxgpzzzk2vcy

Automatic Query Formulation for Extracting Hidden Web: A Review

Manvi Siwach
2016 International Journal Of Engineering And Computer Science  
This can be done by extracting attributes from HTML pages and comparing them with the user's query; the resulting attributes are then used to fill those forms automatically.  ...  There is a lot of data on the internet that is not indexed by conventional search engines. This web content is what we call the Hidden Web or Deep Web.  ...  Deep web forms can be filled automatically using an ontology by extracting attributes from HTML pages.  ... 
doi:10.18535/ijecs/v5i6.56 fatcat:5ir2zcny5vdnbamraouzgwpme4

The Web-OEM approach to Web information extraction

Luca Iocchi
1999 Journal of Network and Computer Applications  
The enormous amount of information available through the World Wide Web requires the development of effective tools for extracting and summarizing relevant data from Web sources.  ...  Our framework provides an easy-to-use and well-formalized method for automatic generation of wrappers extracting data from Web documents.  ...  This work has been carried out within the framework of an agreement between the Italian PT administration and the Fondazione Ugo Bordoni.  ... 
doi:10.1006/jnca.1999.0095 fatcat:fhum44zu4zbipcbikl5udo4owu

Resource capability discovery and description management system for bioinformatics Data and service Integration - an experiment with gene regulatory networks

Emdad Ahmed
2008 2008 11th International Conference on Computer and Information Technology  
information from the resulting pages by means of an API.  ...  The ability of agents and services to automatically locate and interact with unknown partners is a goal for Web based Data Integration system.  ...  ACKNOWLEDGMENT The work is partially supported by Wayne State University, Computer Science Department conference, travel fund.  ... 
doi:10.1109/iccitechn.2008.4802991 fatcat:h3ycejeftfhvfflkmentn2ryzq

Web Data Extraction System [chapter]

Robert Baumgartner, Wolfgang Gatterbauer, Georg Gottlob
2016 Encyclopedia of Database Systems  
Web process integration. In markets such as the automotive industry, business processes are largely carried out by means of web portals.  ...  Figure 1 depicts a high-level view of a typical fully-fledged semi-automatic interactive web data extraction system.  ... 
doi:10.1007/978-1-4899-7993-3_1154-2 fatcat:6ghtb2rgjzgfvmm5kpah7lcm5e

A Survey of Web Information Extraction Tools

Noha Negm, Passent ElKafrawy, Abdel Badea Salem
2012 International Journal of Computer Applications  
This has resulted in the need for automated Web Information Extraction (IE) tools that analyze the Web pages and harvest useful information from noisy content for any further analysis.  ...  This paper compares them in three dimensions: (1) the source of content extraction, (2) the techniques used, and (3) the features of the tools, moreover the advantages and disadvantages for each tool.  ...  After parsing a page into content blocks, features of each block are extracted. The features mean the meaningful keywords.  ... 
doi:10.5120/6115-8296 fatcat:2ijvncas7zbv5nwsonovfeodc4

Narrative text classification for automatic key phrase extraction in web document corpora

Yongzheng Zhang, Nur Zincir-Heywood, Evangelos Milios
2005 Proceedings of the seventh ACM international workshop on Web information and data management - WIDM '05  
State-of-the-art methods are aimed towards extracting key phrases from traditional text such as technical papers.  ...  Automatic key phrase extraction is a useful tool in many text related applications such as clustering and summarization.  ...  Acknowledgements This research has been supported by grants from the Natural Sciences and Engineering Research Council of Canada, GINIus Inc., and IT Interactive Services Ltd.  ... 
doi:10.1145/1097047.1097059 dblp:conf/widm/ZhangZM05 fatcat:zu24tcww6vevdhkalw6hiunr7q

Clustering Visually Similar Web Page Elements for Structured Web Data Extraction [chapter]

Tomas Grigalis, Lukas Radvilavičius, Antanas Čenys, Juozas Gordevičius
2012 Lecture Notes in Computer Science  
The experimental evaluation results of ClustVX system on three publicly available benchmark data sets outperform state-of-the-art structured data extraction systems.  ...  We propose a novel approach for extraction of structured web data called ClustVX. It clusters visually similar web page elements by exploiting their visual formatting and structural features.  ...  Introduction Automatic extraction of structured data from web pages is one of the key challenges for the Web search engines to advance into a more expressive semantic level.  ... 
doi:10.1007/978-3-642-31753-8_38 fatcat:v5inxjfvyfe6xghcwfkcouxaxa

Exploring the Potentialities of Automatic Extraction of University Webometric Information

Gianpiero Bianchi, Renato Bruni, Cinzia Daraio, Antonio Laureti Palma, Giulio Perani, Francesco Scalfati
2020 Journal of Data and Information Science  
in digitalization of universities, in particular by presenting techniques for the automatic extraction of information from the web to build indicators of quality and impact of universities' websites.  ...  Abstract Purpose: The main objective of this work is to show the potentialities of recently developed approaches for automatic knowledge extraction directly from the universities' websites.  ...  Acknowledgments This work is developed with the support of the H2020 RISIS 2 Project (No. 824091) and of the "Sapienza" Research Awards No. RM1161550376E40E of 2016 and RM11916B8853C925 of 2019.  ... 
doi:10.2478/jdis-2020-0040 fatcat:6upstqfqtvbxjodavzf4f3tikq

Automatically editing book reviews on the web

Koji Eguchi, Shigeki Sugita
2001 Proceeding of the third international workshop on Web information and data management - WIDM '01  
Using this, it retrieves book reviews on the Web, which are then automatically edited using some heuristic rules for segment extraction, filtering and sorting according to a semantic likelihood of their  ...  We propose an automatic editing method that assists users to retrieve book information, especially book reviews scattered on the Web.  ...  Acknowledgments A part of this research is supported by Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science (JSPS), and by NII Seminar from the National Institute  ... 
doi:10.1145/502932.502947 dblp:conf/widm/EguchiS01 fatcat:c6srzjyapbejrpdguhdqmkcodm

Automatically editing book reviews on the web

Koji Eguchi, Shigeki Sugita
2001 Proceeding of the third international workshop on Web information and data management - WIDM '01  
Using this, it retrieves book reviews on the Web, which are then automatically edited using some heuristic rules for segment extraction, filtering and sorting according to a semantic likelihood of their  ...  We propose an automatic editing method that assists users to retrieve book information, especially book reviews scattered on the Web.  ...  Acknowledgments A part of this research is supported by Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science (JSPS), and by NII Seminar from the National Institute  ... 
doi:10.1145/502944.502947 fatcat:zknpe5erebcj5h4cakxhp5xkyq