2,657 Hits in 3.9 sec

Semi-Automatic Wrapper Generation for Commercial Web Sources [chapter]

Alberto Pan, Juan Raposo, Manuel Álvarez, Justo Hidalgo, Ángel Viña
2002 IFIP Advances in Information and Communication Technology  
Semi-automatic wrapper generation tools aim to ease the task of building structured views over semi-structured web sources.  ...  In this paper, we present WARGO, a semi-automatic wrapper generation tool, which has been used by non-programmer staff to successfully wrap more than 700 commercial web sources in several industrial applications  ...  Architecture at Wrapper-Generation Time  ...  The input for the process is the name of the expected input parameters for the wrapper at runtime  ... 
doi:10.1007/978-0-387-35614-3_16 fatcat:qonmhkskubd3tajudk5clnst2q

The Denodo Data Integration Platform [chapter]

Alberto Pan, Juan Raposo, Manuel Álvarez, Paula Montoto, Vicente Orjales, Justo Hidalgo, Lucía Ardao, Anastasio Molano, Ángel Viña
2002 VLDB '02: Proceedings of the 28th International Conference on Very Large Databases  
DENODO Corporation has developed a mediator system for the construction of semi-structured and structured data integration applications.  ...  The world today is characterised by the proliferation of information sources available through media such as the WWW, databases, semi-structured files (e.g. XML documents), etc.  ...  The wrapper generation process for Web sources, JDBC, XML databases and structured or semi-structured text files is performed with the assistance of a semi-automatic generation tool which enables wrappers  ... 
doi:10.1016/b978-155860869-6/50097-4 dblp:conf/vldb/PanRAMOHAMV02 fatcat:xdpqmf6o6jfbhkst5jg63a3yem

Web Data Extraction System [chapter]

Robert Baumgartner, Wolfgang Gatterbauer, Georg Gottlob
2016 Encyclopedia of Database Systems  
the system induces a suitable wrapper using machine learning techniques; and (3) semi-automatic interactive wrapper generation, where the wrapper designer not only provides example data for the system  ...  "semi-automatic" wrapper generation, providing a wrapper designer with visual and interactive support for declaring extraction and formatting patterns, other projects were based on machine learning techniques  ... 
doi:10.1007/978-1-4899-7993-3_1154-2 fatcat:6ghtb2rgjzgfvmm5kpah7lcm5e

Web Data Extraction System [chapter]

Serguei Mankovskii, Maarten van Steen, Minos Garofalakis, Alan Fekete, Christian S. Jensen, Richard T. Snodgrass, Alex Wun, Vanja Josifovski, Andrei Broder, Dennis Fetterly, Marc Najork, Robert Baumgartner (+55 others)
2009 Encyclopedia of Database Systems  
the system induces a suitable wrapper using machine learning techniques; and (3) semi-automatic interactive wrapper generation, where the wrapper designer not only provides example data for the system  ...  "semi-automatic" wrapper generation, providing a wrapper designer with visual and interactive support for declaring extraction and formatting patterns, other projects were based on machine learning techniques  ... 
doi:10.1007/978-0-387-39940-9_1154 fatcat:zamqe55tt5aupa2vvgdba7wy3u

Information Aggregation Using the Caméléon# Web Wrapper [chapter]

Aykut Firat, Stuart Madnick, Nor Adnan Yahaya, Choo Wai Kuan, Stéphane Bressan
2005 Lecture Notes in Computer Science  
Caméléon# is a web data extraction and management tool that provides information aggregation with advanced capabilities that are useful for developing value-added applications and services for electronic  ...  This paper covers the integration of Caméléon# with commercial database management systems, such as MS SQL Server, and XML query languages, such as XQuery.  ...  Conclusion We described Caméléon#, a tool for extraction and aggregation of data from various sources.  ... 
doi:10.1007/11545163_8 fatcat:spvixh5b7bg2tmfso5i633kj34

Information Aggregation using the Cameleon# Web Wrapper

Aykut Firat, Stuart E. Madnick, Nor Adnan Yahaya, Choo Wai Kuan, Stéphane Bressan
2005 Social Science Research Network  
Caméléon# is a web data extraction and management tool that provides information aggregation with advanced capabilities that are useful for developing value-added applications and services for electronic  ...  This paper covers the integration of Caméléon# with commercial database management systems, such as MS SQL Server, and XML query languages, such as XQuery.  ...  Conclusion We described Caméléon#, a tool for extraction and aggregation of data from various sources.  ... 
doi:10.2139/ssrn.771492 fatcat:4zxeuzcbjze25inrg7bm6dx72u

Web Data Extraction for Business Intelligence: The Lixto Approach

Georg Gottlob
2005 Datenbanksysteme für Business, Technologie und Web  
The extraction from semi-structured information sources is mostly done manually and is therefore very time consuming.  ...  This paper describes how public information can be extracted automatically from Web sites, transformed into structured data formats, and used for data analysis in Business Intelligence systems.  ...  Acknowledgements The authors would like to thank Giacomo del Felice from Pirelli Pneumatici S.p.A. for his continuous and reliable project support.  ... 
dblp:conf/btw/Gottlob05 fatcat:vu42jphwwbbg5psbvkjalpev3e

Documentum ECI self-repairing wrappers

Boris Chidlovskii, Bruno Roustant, Marc Brette
2006 Proceedings of the 2006 ACM SIGMOD international conference on Management of data - SIGMOD '06  
The ECI Adapter technology offers an interface to any application for data and metadata extraction from unstructured Web pages.  ...  We benefit from accessing reports on daily tests for all ECI commercially deployed adapters collected from June 2003 to September 2005.  ...  STATE OF ART All wrapper techniques generate wrapper instances in manual, semi-automatic or automatic manner.  ... 
doi:10.1145/1142473.1142555 dblp:conf/sigmod/ChidlovskiiRB06 fatcat:znnwgbjcqbaxjh3m7fpckjrej4

Multilingual and multimedia information retrieval from Web documents

M. Gatius, M. Bertran, H. Rodriguez
2004 Proceedings. 15th International Workshop on Database and Expert Systems Applications, 2004.  
Techniques to extract information from Web documents, Wrapper Generation (WG) techniques, are used to access a finer information granularity than the whole Web page.  ...  In this shell Cross-Language IR (CLIR) and query expansion are performed using EuroWordNet (EWN), the best developed and most widely used lexical resource for several languages.  ...  We would like to thank Victoria Arranz for her helpful comments. References  ... 
doi:10.1109/dexa.2004.1333443 dblp:conf/dexaw/GatiusBR04 fatcat:fyij46nuxfaepp7pum6tna3eki

An XML-enabled data extraction toolkit for web sources

Ling Liu, Calton Pu, Wei Han
2001 Information Systems  
In this paper, we describe the methodology and the software development of an XML-enabled wrapper construction system, XWRAP, for semi-automatic generation of wrapper programs.  ...  The second phase combines the information extraction rules generated at the first phase with the XWRAP component library to construct an executable wrapper program for the given web source.  ...  Acknowledgements We would like to thank the XWRAP team at Georgia Tech for their implementation effort.  ... 
doi:10.1016/s0306-4379(01)00040-0 fatcat:5dxghttqgzgnjlu2dcy2bkprxa

Automatic information extraction from web pages

Budi Rahardjo, Roland H. C. Yap
2001 Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '01  
This can be applied to generate automatic wrappers or to notify/display web page differences, web page change monitoring, etc.  ...  Many web pages have implicit structure. In this paper, we show the feasibility of automatically extracting data from web pages by using approximate matching techniques.  ...  We focus on the data extraction aspect rather than the general semi-structured database context for wrappers.  ... 
doi:10.1145/383952.384071 dblp:conf/sigir/YapR01 fatcat:62jtag6phjaupc3t37a7a6uaoq

Pollock

Yi-Hsuan Lu, Yoojin Hong, Jinesh Varia, Dongwon Lee
2005 Proceedings of the 2005 ACM symposium on Applied computing - SAC '05  
Toward this goal, we adopt the Wrapper technology successfully developed and deployed in the Database community, and demonstrate how to generate Web Services components (e.g., WSDL, UDDI, SOAP) automatically  ...  In this paper, we propose a methodology that helps to automatically generate Web Services from the FORM-based query interfaces of a web site.  ...  The core technique to such integration tools is the capability to (semi-)automatically generate a programmable interface to the FORM-based query interface of web sites.  ... 
doi:10.1145/1066677.1067052 dblp:conf/sac/LuHVL05 fatcat:sk7vkrm5zjaxnp3dq2qukg4o3m

Multilevel Wrapper Verification System with Maintenance Model Enhancement

2017 International Journal of Science and Research (IJSR)  
The growth of online data sources has led to wider use of wrappers to extract data from Web sources.  ...  The wrapper verification system identifies when a wrapper is not extracting the right information, mostly because the Web source has changed its organization.  ...  The expansion of online data sources has led to an expanded use of wrappers for extracting information from Web sources.  ... 
doi:10.21275/art20174847 fatcat:qojf5tl44rcsnb6aqqfmgealti

Web data extraction, applications and techniques: A survey

Emilio Ferrara, Pasquale De Meo, Giacomo Fiumara, Robert Baumgartner
2014 Knowledge-Based Systems  
At the Social Web level, Web Data Extraction techniques allow to gather a large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users  ...  At the Enterprise level, Web Data Extraction techniques emerge as a key tool to perform data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering.  ...  Among these systems we cite Virtuoso Sponger, which generates Linked Data starting from different data sources and supports a wide range of data representation formats and Semantic Fire, which  ... 
doi:10.1016/j.knosys.2014.07.007 fatcat:cb6zazpx7nfgxkmkiuoxqx5zyq

Towards Federated Search Based on Web Services

Jens Graupmann, Michael Biwer, Patrick Zimmer
2003 Datenbanksysteme für Business, Technologie und Web  
One component of the proposed architecture is the service mediator, which generates wrapper classes and additional files to make portals accessible as Web Services.  ...  A major challenge in this context is to cope with portals and the data sources behind the portals, the so-called "Deep Web".  ...  Its architecture is based on data source wrappers, which are generated semi-automatically. The wrappers, called Translators, convert the data sources into a common data format.  ... 
dblp:conf/btw/GraupmannBZ03 fatcat:co4fgfoww5hebcwyd43fjuz4kq
Showing results 1-15 of 2,657.