A Shopping Agent That Automatically Constructs Wrappers for Semi-Structured Online Vendors [chapter]

Jaeyoung Yang, Eunseok Lee, Joongmin Choi
2000 Lecture Notes in Computer Science  
This paper proposes a shopping agent with a robust inductive learning method that automatically constructs wrappers for semistructured online stores. Strong biases assumed in many existing systems are weakened so that real stores with reasonably complex document structures can be handled. Our method treats a logical line as a basic unit, and recognizes the position and the structure of product descriptions by finding the most frequent pattern from the sequence of logical line information in ... tput HTML pages. This method is capable of analyzing product descriptions that comprise multiple logical lines, and even those with extra or missing attributes. Experimental tests on over 60 sites show that it successfully constructs correct wrappers for most real stores.
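The idea of finding the most frequent pattern in a sequence of logical lines can be illustrated with a minimal sketch. This is not the authors' implementation. The line categories (IMAGE, TEXT, PRICE, BLANK), the function names, and the n-gram counting strategy are all assumptions chosen for illustration.

```python
import re
from collections import Counter


def line_type(line: str) -> str:
    """Map one logical line to a coarse category (hypothetical categories)."""
    if re.search(r"\$\s?\d", line):
        return "PRICE"
    if line.lower().startswith("<img"):
        return "IMAGE"
    if line.strip():
        return "TEXT"
    return "BLANK"


def most_frequent_pattern(lines, min_len=2, max_len=4):
    """Return the most frequent run of line categories as a tuple.

    Counts every contiguous window of min_len..max_len categories
    (overlapping windows included). Ties on count are broken in
    favour of the longer window. Returns () if no window fits.
    """
    types = [line_type(line) for line in lines]
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(types) - n + 1):
            counts[tuple(types[i:i + n])] += 1
    if not counts:
        return ()
    return max(counts, key=lambda p: (counts[p], len(p)))


# Hypothetical logical lines from a product listing page.
sample = [
    "<img src='a.gif'>", "Widget A", "$10.00",
    "<img src='b.gif'>", "Widget B", "$12.50",
    "<img src='c.gif'>", "Widget C", "$8.00",
]
pattern = most_frequent_pattern(sample)
```

In this sketch the repeated category run stands in for one product description spanning multiple logical lines. Handling extra or missing attributes, as the abstract describes, would need a more tolerant matching step than exact window counting.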
doi:10.1007/3-540-44491-2_53 fatcat:2ysrepvugjakxgocitc3dqconi