Toolkits for Generating Wrappers [chapter]

Stefan Kuhlins, Ross Tredwell
<span title="">2003</span> <i title="Springer Berlin Heidelberg"> <a target="_blank" rel="noopener" href="" style="color: black;">Lecture Notes in Computer Science</a> </i> &nbsp;
Various web applications in e-business, such as online price comparisons, competition monitoring and personalised newsletters require retrieval of distributed information from the Internet. This paper examines the suitability of software toolkits for the extraction of data from web sites. The term wrapper is defined and an overview of presently available toolkits for generating wrappers is provided. In order to give a better insight into the workings of such toolkits, a detailed analysis of the
more &raquo; ... non-commercial software program LAPIS is presented. An example application using this toolkit demonstrates how acceptable results can be achieved with relative ease. The functionality of the program is compared with the functionality of the commercial toolkit RoboMaker and the differences are highlighted. With the aim of providing improved ease-of-use and faster wrapper generation in mind, possible areas for further development of toolkits for automated web data extraction are discussed.
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="">doi:10.1007/3-540-36557-5_15</a> <a target="_blank" rel="external noopener" href="">fatcat:pibsq32qkzc43dvhma6d3agyge</a> </span>
<a target="_blank" rel="noopener" href="" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href=""> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> </button> </a>