Semi-automatic wrapper generation for Internet information sources

N. Ashish, C.A. Knoblock
Proceedings of CoopIS 97: 2nd IFCIS Conference on Cooperative Information Systems  
We present an approach for semi-automatically generating wrappers for structured internet sources.  ...  We demonstrate the ease with which we are able to build wrappers for a number of Web sources using our implemented wrapper generation toolkit.  ...  We also wish to thank Vipul Kashyap of the InfoSleuth project at MCC for suggestions on future enhancements.  ... 
doi:10.1109/coopis.1997.613813 dblp:conf/coopis/AshishK97 fatcat:jk2wifqe3rcndf2jqqzxpm5md4

Wrapper generation for semi-structured Internet sources

Naveen Ashish, Craig A. Knoblock
1997 SIGMOD record  
However, we can provide database-like querying for semi-structured WWW sources by building wrappers around these sources. We present an approach for semi-automatically generating such wrappers.  ...  We demonstrate the ease with which we are able to build wrappers for a number of internet sources in different domains using our implemented wrapper generation toolkit.  ...  ful to have the capability of issuing  ...  Acknowledgements: We would like to thank Steve Minton and the other members of the SIMS and Ariadne projects for their contributions to this work.  ... 
doi:10.1145/271074.271078 fatcat:3fek26paxfd4fbe6kph4ih2lhu

Supporting unified interface to wrapper generator in Integrated Information Retrieval

Yue-Shan Chang, Min-Huang Ho, Wen-Chen Sun, Shyan-Ming Yuan
2002 Computer Standards & Interfaces  
In this paper, we present a design for an automatic eXtensible Markup Language (XML)-based framework with which to generate wrappers rapidly.  ...  Retrieved information is either semi-structured or unstructured in format and its sources are extremely heterogeneous.  ...  Acknowledgements: We are grateful for the many excellent comments and suggestions made by the anonymous referees.  ... 
doi:10.1016/s0920-5489(02)00016-8 fatcat:tglyakgol5cvppatnycg7zzkh4

The Denodo Data Integration Platform [chapter]

Alberto Pan, Juan Raposo, Manuel Álvarez, Paula Montoto, Vicente Orjales, Justo Hidalgo, Lucía Ardao, Anastasio Molano, Ángel Viña
2002 VLDB '02: Proceedings of the 28th International Conference on Very Large Databases  
The world today is characterised by the proliferation of information sources available through media such as the WWW, databases, semi-structured files (e.g. XML documents), etc.  ...  Nevertheless, this information is usually scattered, heterogeneous and weakly structured, so it is difficult to process it automatically.  ...  The wrapper generation process for Web sources, JDBC, XML databases and structured or semi-structured text files is performed with the assistance of a semi-automatic generation tool which enables wrappers  ... 
doi:10.1016/b978-155860869-6/50097-4 dblp:conf/vldb/PanRAMOHAMV02 fatcat:xdpqmf6o6jfbhkst5jg63a3yem

How to make web sites talk together

Hoang Pham Huy, Takahiro Kawamura, Tetsuo Hasegawa
2005 Special interest tracks and posters of the 14th international conference on World Wide Web - WWW '05  
This proposal was developed in Toshiba with Web Service Gateway and Wrapper Generator System.  ...  From resource view point, current web sites in the Internet already provide quite enough information.  ...  Semi-automatic wrapper generation ([2], [7], [8]) can be considered as the most advanced wrapper generation mechanism currently.  ... 
doi:10.1145/1062745.1062766 dblp:conf/www/HuyKH05 fatcat:x2bm2gecxvbsdpjfpu2eaul3jy

Domain-dependent information gathering agent

Aleksander Pivk, Matjaz Gams
2002 Expert systems with applications  
The major advantage of the agent is a semi-automatic creation of a wrapper around a particular site with few human interventions.  ...  the Internet.  ...  Acknowledgments: Financial support was provided by an international project INCO-Copernicus 960154, Cooperative Research in Information Infrastructure, CRII, by the Ministry of education, science and sports  ... 
doi:10.1016/s0957-4174(02)00040-4 fatcat:uwgitol6lzdd5p7k6vi7vbrdja

CREAM: A Mediator Based Environment for Modeling and Accessing Distributed Information on the Web [chapter]

Sudha Ram, Jinsoo Park, Yousub Hwang
2002 Lecture Notes in Computer Science  
data model and ontology to provide an automatic way of identifying and resolving semantic conflicts at both data and schema levels. • The ability to semi-automatically generate wrappers for semi-structured  ...  The wrapper generator assists (via a graphical user interface) the information integrator in easily constructing wrappers for Web sources.  ... 
doi:10.1007/3-540-45495-0_8 fatcat:a5qtw3cy3jfjza3qrzqhcaraui

Supporting Information Integration With Autonomous Agents [chapter]

S. Bergamaschi, G. Cabri, F. Guerra, L. Leonardi, M. Vincini, F. Zambonelli
2001 Lecture Notes in Computer Science  
MOMIS (Mediator envirOnment for Multiple Information Sources) is an infrastructure for semi-automatic information integration that deals with the integration and query of multiple, heterogeneous information  ...  sources (relational, object, XML and semi-structured sources).  ...  Access" and "D2I: Integration, Warehousing, and Mining of Heterogeneous Data Sources", and by the University of Modena and Reggio Emilia with a fund for young researchers.  ... 
doi:10.1007/3-540-44799-7_10 fatcat:xy6hsomp2zc57adlu3motnohd4

An XML-enabled data extraction toolkit for web sources

Ling Liu, Calton Pu, Wei Han
2001 Information Systems  
In this paper, we describe the methodology and the software development of an XML-enabled wrapper construction system, XWRAP, for semi-automatic generation of wrapper programs.  ...  The second phase combines the information extraction rules generated at the first phase with the XWRAP component library to construct an executable wrapper program for the given web source.  ...  The authors are partially supported by NSF and DARPA/ITO under the Information Technology Expeditions, Ubiquitous Computing, Quorum, and PCES programs.  ... 
doi:10.1016/s0306-4379(01)00040-0 fatcat:5dxghttqgzgnjlu2dcy2bkprxa


2002 International Journal of Cooperative Information Systems  
enables semi-automatic information integration to deal with the integration and query of multiple, heterogeneous information sources (relational, object, XML and semi-structured sources).  ...  The dynamism and the uncertainty of the Internet, along with the heterogeneity of the sources of information, are the two main challenges for today's technologies related to information management.  ...  with Heterogeneous Access" and "D2I: Integration, Warehousing, and Mining of Heterogeneous Data Sources" and AgentLink II - Europe's Network of Excellence for Agent-based Computing.  ... 
doi:10.1142/s0218843002000601 fatcat:5vi57ik7mbgi3bmgikaowdlwsq

Multilingual and multimedia information retrieval from Web documents

M. Gatius, M. Bertran, H. Rodriguez
2004 Proceedings. 15th International Workshop on Database and Expert Systems Applications, 2004.  
Techniques to extract information from Web documents, Wrapper Generation (WG) techniques, are used to access a finer information granularity than the whole Web page.  ...  In this shell Cross-Language IR (CLIR) and query expansion are performed using EuroWordNet (EWN), the best developed and most widely used lexical resource for several languages.  ...  We would like to thank Victoria Arranz for her helpful comments.  ... 
doi:10.1109/dexa.2004.1333443 dblp:conf/dexaw/GatiusBR04 fatcat:fyij46nuxfaepp7pum6tna3eki

XML-based Web information extraction system design and implementation

Ma Jun, Li Tihong
2010 2010 3rd International Conference on Computer Science and Information Technology  
The system has a degree of versatility and flexibility; users can quickly customize the Web information extraction wrapper for use in different fields.  ...  based on the research of the existing Web information extraction techniques, this paper proposes an XML-based Web information extraction system design.  ...  can facilitate the data to be easily stored in a relational database; for the same class of pages with similar structure, we use a semi-automatic wrapper induction method based on sample learning to generate  ... 
doi:10.1109/iccsit.2010.5564746 fatcat:wubpwl2k45f75dt5sov3hnkohi

Semi-Automatic Wrapper Generation for Commercial Web Sources [chapter]

Alberto Pan, Juan Raposo, Manuel Álvarez, Justo Hidalgo, Ángel Viña
2002 IFIP Advances in Information and Communication Technology  
Semi-automatic wrapper generation tools aim to ease the task of building structured views over semi-structured web sources.  ...  We describe our approach for wrapper generation and show the difficulties found with other systems for wrapping this kind of sources.  ...  Architecture at Wrapper-Generation Time  ...  The input for the process is the name of the expected input parameters for the wrapper at runtime  ... 
doi:10.1007/978-0-387-35614-3_16 fatcat:qonmhkskubd3tajudk5clnst2q


2014 International Journal of Electronics and Electrical Engineering  
The proposed architecture extracts unstructured and ungrammatical data using wrapper induction and shows the result in structured format.  ...  Information extraction from unstructured, ungrammatical data such as classified listings is difficult because traditional structural and grammatical extraction methods do not apply.  ...  The wrapper generation process includes two phases: structure analysis and source-specific XML generation.  ... 
doi:10.47893/ijeee.2014.1121 fatcat:jh5qa2w3offqrcnkonkke7mwcm

Information Aggregation Using the Caméléon# Web Wrapper [chapter]

Aykut Firat, Stuart Madnick, Nor Adnan Yahaya, Choo Wai Kuan, Stéphane Bressan
2005 Lecture Notes in Computer Science  
Caméléon# is a web data extraction and management tool that provides information aggregation with advanced capabilities that are useful for developing value-added applications and services for electronic  ...  Conclusion: We described Caméléon#, a tool for extraction and aggregation of data from various sources.  ...  Wrappers can be manual, semi-automatic, or automatic based on how their mapping specifications are generated.  ... 
doi:10.1007/11545163_8 fatcat:spvixh5b7bg2tmfso5i633kj34