A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2017; you can also visit the original URL.
The file type is application/pdf.
Using the structure of Web sites for automatic segmentation of tables
2004
Proceedings of the 2004 ACM SIGMOD international conference on Management of data - SIGMOD '04
Many Web sites, especially those that dynamically generate HTML pages to display the results of a user's query, present information in the form of lists or tables. Current tools that allow applications to programmatically extract this information rely heavily on user input, often in the form of labeled extracted records. The sheer size and rate of growth of the Web make any solution that relies primarily on user input infeasible in the long term. Fortunately, many Web sites contain much
doi:10.1145/1007568.1007584
dblp:conf/sigmod/LermanGMK04
fatcat:zjmt44q5wfhffpnt57hlosphpm