43,077 Hits in 6.3 sec

Conceptual-model-based data extraction from multiple-record Web pages

D.W. Embley, D.M. Campbell, Y.S. Jiang, S.W. Liddle, D.W. Lonsdale, Y.-K. Ng, R.D. Smith
1999 Data & Knowledge Engineering  
can apply a conceptual-modeling approach to extract and structure data automatically.  ...  The approach is based on an ontology (a conceptual model instance) that describes the data of interest, including relationships, lexical appearance, and context keywords.  ...  Our data extraction method is based on conceptual modeling, and, as such, this approach also represents a new direction for research in conceptual modeling.  ... 
doi:10.1016/s0169-023x(99)00027-0 fatcat:v6imf5uvmvfrzlejks5wkstinq

Automatic Location and Separation of Records: A Case Study in the Genealogical Domain [chapter]

Troy Walker, David W. Embley
2004 Lecture Notes in Computer Science  
Our solution is a hybrid of two well-established techniques: (1) ontology-based extraction [ECJ+99] and (2) vector space modeling [SM83].  ...  Locating specific chunks (records) of information within documents on the web is an interesting and nontrivial problem.  ...  Acknowledgements: This material is based upon work supported by the National Science Foundation under grant No. IIS-0083127.  ... 
doi:10.1007/978-3-540-30466-1_28 fatcat:thif4iuxoffa5h2cwqdw6qsqhy

Collaborative Wrapping: A Turbo Framework for Web Data Extraction

Shui-Lung Chuang, Kevin Chen-Chuan Chang, ChengXiang Zhai
2007 2007 IEEE 23rd International Conference on Data Engineering  
To access data sources on the Web, a crucial step is wrapping, which translates query responses, rendered in textual HTML, back into their relational form.  ...  Observing that sources in the same domain usually share common fields, we propose a novel wrapping concept, collaborative wrapping, where multiple sources are extracted concurrently with content-based synchronization  ...  The paper makes three contributions: (a) we propose collaborative wrapping to fundamentally support multi-source wrapping for domain-based integration.  ... 
doi:10.1109/icde.2007.368988 dblp:conf/icde/ChuangCZ07 fatcat:mc2ozx3i55bohiot2jindp2ecu

How the Minotaur Turned into Ariadne: Ontologies in Web Data Extraction [chapter]

Tim Furche, Georg Gottlob, Xiaonan Guo, Christian Schallhart, Andrew Sellers, Cheng Wang
2011 Lecture Notes in Computer Science  
Humans require automated support to profit from the wealth of data nowadays available on the web.  ...  First results from the DIADEM project illustrate that high quality, fully automated data extraction at a web scale is possible, if we combine domain ontologies with a phenomenology describing the representation  ...  Every web page is processed in a single sequential pipeline. First we extract the page model from a live rendering of the web page.  ... 
doi:10.1007/978-3-642-22233-7_2 fatcat:f2ssgiahkjhqvpvxcchhilmxhm

Top K List Extraction from Web Pages

Priyanka Deshmane, Pramod Patil, Abha Pathak
2016 International Journal of Computer Applications  
The paper provides a solution to the problem by extracting information from top-k websites, which contain the top k instances of a subject, for example "top 5 football teams in the world".  ...  to extract.  ...  computation records from web pages that time for single are interlink HyliEn is 4.2 Web pages. seconds on average. complexity bounded on the O(M× Computation structural L)+O(𝑀 3 ), time is much complexity  ... 
doi:10.5120/ijca2016911394 fatcat:2ss2gtrqlrh43gfkiogankvrhm

Big Data—Conceptual Modeling to the Rescue [chapter]

David W. Embley, Stephen W. Liddle
2013 Lecture Notes in Computer Science  
We do not envision any silver bullets that will slay the "werewolf" of Big Data, but conceptual modeling can help, as we illustrate with an example from our project that seeks to superimpose a web of knowledge  ...  Every day humans generate several petabytes of data [ZEd+11] from a variety of sources such as orbital weather satellites, ground-based sensor networks, mobile computing devices, digital cameras, and  ...  Because conceptual models are or can be formally based on predicate calculus, we can use inference rules that map from one conceptual model to another to organize our knowledge base.  ... 
doi:10.1007/978-3-642-41924-9_1 fatcat:iy6yk743szgaxegyy46hscppdi

Intelligent and adaptive web data extraction system using convolutional and long short-term memory deep learning networks

Sudhir Kumar Patnaik, C. Narendra Babu, Mukul Bhave
2021 Big Data Mining and Analytics  
once (Yolo) algorithm and Tesseract LSTM to extract product details, which are detected as images from web pages.  ...  This study investigates an intelligent and adaptive web data extraction system with convolutional and Long Short-Term Memory (LSTM) networks to enable automated web page detection using the You only look  ...  proposed web data extraction framework to extract data from one product page (e.g., Fig.13 Object detection with bounding boxes around multiple product detail and data extracted without changes in website  ... 
doi:10.26599/bdma.2021.9020012 fatcat:eve4l3wr6vcrzbnhtvunnptl6a

A Conceptual-Modeling Approach to Extracting Data from the Web [chapter]

D. W. Embley, D. M. Campbell, Y. S. Jiang, S. W. Liddle, Y. -K. Ng, D. W. Quass, R. D. Smith
1998 Lecture Notes in Computer Science  
The approach is based on an ontology (a conceptual model instance) that describes the data of interest, including relationships, lexical appearance, and context keywords.  ...  approach to extract and structure data.  ...  Conclusions We described a conceptual-modeling approach to extracting and structuring data from the Web.  ... 
doi:10.1007/978-3-540-49524-6_7 fatcat:ne6lvzbcdrfctgrdv73vixjg7a

Databases and the World Wide Web [chapter]

Paolo Atzeni
1999 Lecture Notes in Computer Science  
in order to map physical HTML sources to database objects • Third Step: Extracting Data from the Site by Queries and Navigation 24 Modeling Web Sites: The ARANEUS Data Model • ADM -ODMG-like  ...  modeling background Automatic Wrapper Generators • Different inspirations, different techniques, some common features • Roughly speaking, in all systems: -one HTML page at a time (multiple-record document  ... 
doi:10.1007/3-540-47849-3_9 fatcat:lus2bjydmbdcznsbpswlp4mspy

A Hybrid Web Recommender System Based on Cellular Learning Automata

Mojdeh Talabeigi, Rana Forsati, Mohammad Reza Meybodi
2010 2010 IEEE International Conference on Granular Computing  
We propose a hybrid web page recommender system based on asynchronous cellular learning automata with multiple learning automata in each cell which try to identify user's multiple information needs and  ...  Our experiments show that incorporating conceptual relationship of pages with usage data can significantly enhance the quality of recommendations.  ...  We showed the restrictions that a usage-based system inherently suffers from and demonstrated how combining conceptual information regarding the web pages can improve the system.  ... 
doi:10.1109/grc.2010.153 dblp:conf/grc/TalabeigiFM10 fatcat:t77c5grc4zc2dnfy7nkmg3p43i

INFORMATION SERVICES FOR THE WEB: BUILDING AND MAINTAINING DOMAIN MODELS

AVIGDOR GAL, SCOTT KERR, JOHN MYLOPOULOS
1999 International Journal of Cooperative Information Systems  
These mechanisms are based on conceptual modeling techniques, where concepts are being defined and refined within a metadata repository through the use of instantiation, generalization and attribution.  ...  An Infrastructure for DHIS assumes the existence of communication and data source tools and provides tools for handling information from heterogeneous distributed data sources.  ...  The material is based upon work supported (in part) by the Rutgers University Faculty of Management.  ... 
doi:10.1142/s0218843099000125 fatcat:nosj67j54nhllpbpy3xjtjqtdy

Information services for the Web: building and maintaining domain models

S. Kerr, A. Gal, J. Mylopoulos
1998 Proceedings 3rd IFCIS International Conference on Cooperative Information Systems (Cat No 98EX122) COOPIS-98  
These mechanisms are based on conceptual modeling techniques, where concepts are being defined and refined within a metadata repository through the use of instantiation, generalization and attribution.  ...  An Infrastructure for DHIS assumes the existence of communication and data source tools and provides tools for handling information from heterogeneous distributed data sources.  ...  The material is based upon work supported (in part) by the Rutgers University Faculty of Management.  ... 
doi:10.1109/coopis.1998.706179 dblp:conf/coopis/KerrGM98 fatcat:mljs2ie6s5hgrgafryho6xaemu

A Semantic Based Approach for Knowledge Discovery and Acquisition from Multiple Web Pages Using Ontologies

Abirami A.M, Askarunisa A
2013 International journal of Web & Semantic Technology  
The data from internet are dispersed in multiple documents or web pages. Most of them are not properly structured and organized.  ...  The semantic web technologies and ontologies play a vital role in information extraction and new knowledge discovery from the web documents.  ...  Model for Information Extraction from HTML pages (list tags) Once the pre-processing phase gets completed, the clean html pages are fed into structured data extraction phase.  ... 
doi:10.5121/ijwest.2013.4306 fatcat:zrhtbbvcmvh2viz3l2pdix3qpi

A Semantic Based Approach For Knowledge Discovery And Acquisition From Multiple Web Pages Using Ontologies

A.M. Abirami
2013 Zenodo  
The data from internet are dispersed in multiple documents or web pages. Most of them are not properly structured and organized.  ...  The semantic web technologies and ontologies play a vital role in information extraction and new knowledge discovery from the web documents.  ...  Table 2. Knowledge Acquisition from different web pages Records Manual Effort (in mts) Time taken (ms) Single XML/RDF file Time Taken (ms) Multiple XML/RDF files XSLT XPath SPARQL XPath  ... 
doi:10.5281/zenodo.1473881 fatcat:eku4yo4edbaedmkdirq4yk2no4

Conceptual Modeling Foundations for a Web of Knowledge [chapter]

David W. Embley, Stephen W. Liddle, Deryle W. Lonsdale
2011 Handbook of Conceptual Modeling  
Knowledge bundles are conceptual model instances augmented with facilities that provide for both extensional and intensional facts, for linking between knowledge bundles yielding a web of data, and for  ...  Here we propose an answer with conceptual modeling as its foundation. We define a web of knowledge as a collection of interconnected knowledge bundles superimposed over a web of documents.  ...  WoK Formalization We base our foundational conceptualization for a web of knowledge on the conceptual modeling language OSM (Object-oriented Systems Modeling) [EKW92].  ... 
doi:10.1007/978-3-642-15865-0_15 fatcat:55yzcga2w5b4dm6a3lmdu2usbu