User-Friendly and Extensible Web Data Extraction [chapter]

T. Novella, I. Holubová
2018 Lecture Notes in Information Systems and Organisation  
Creation of web wrappers is a subject of study in the field of web data extraction. Designing a domain-specific language for a web wrapper is a challenging task, because it introduces tradeoffs between expressiveness of a wrapper's language and safety. In addition, little attention has been paid to execution of a wrapper in a restricted environment. In this paper we present a new wrapping language, Serrano, that has three goals: (1) ability to run in a restricted environment, such as a browser extension, (2) extensibility to balance the tradeoffs between expressiveness of a command set and safety, and (3) processing capabilities to eliminate the need for additional programs to clean the extracted data. Serrano has been successfully deployed in a number of projects and provided competitive results.
doi:10.1007/978-3-319-74817-7_14 fatcat:idadb6mmjrg33mr2qy47qcelqm