Extracting Contextualized Quantity Facts from Web Tables

Vinh Thinh Ho, Koninika Pal, Simon Razniewski, Klaus Berberich, Gerhard Weikum
2021 Proceedings of the Web Conference 2021  
Quantity queries, with filter conditions on quantitative measures of entities, are beyond the functionality of search engines and QA assistants. To enable such queries over web contents, this paper develops a novel method for automatically extracting quantity facts from ad-hoc web tables. This involves recognizing quantities, with normalized values and units, aligning them with the proper entities, and contextualizing these pairs with informative cues to match sophisticated queries with ... s. Our method includes a new approach to aligning quantity columns to entity columns. Prior works assumed a single subject-column per table, whereas our approach is geared for complex tables and leverages external corpora as evidence. For contextualization, we identify informative cues from text and structural markup that surrounds a table. For query-time fact ranking, we devise a new scoring technique that exploits both context similarity and inter-fact consistency. Comparisons of our building blocks against state-of-the-art baselines and extrinsic experiments with two query benchmarks demonstrate the benefits of our method.
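To make the notion of a quantity fact concrete, the following is a minimal illustrative sketch, not the paper's method. It assumes the entity column and quantity column are already known, which is exactly the alignment step the paper addresses with external evidence, and it uses a hypothetical unit table (`UNIT_FACTORS`), hypothetical helper names, and made-up table rows. It shows only the recognition of a quantity with a normalized value and unit, the attachment of a context cue, and a filter-style quantity query.

```python
import re

# Hypothetical unit table: maps a surface unit to (base unit, conversion factor).
UNIT_FACTORS = {
    "km": ("km", 1.0),
    "m": ("km", 0.001),
    "mi": ("km", 1.609344),
}

# A number (optionally with thousands separators and decimals) followed by a unit.
QUANTITY_RE = re.compile(r"^\s*([0-9][0-9,]*(?:\.[0-9]+)?)\s*([A-Za-z]+)\s*$")


def parse_quantity(cell):
    """Return (normalized value, base unit) for a cell like '1,200 m', else None."""
    match = QUANTITY_RE.match(cell)
    if not match:
        return None
    number, unit = match.groups()
    if unit not in UNIT_FACTORS:
        return None
    base_unit, factor = UNIT_FACTORS[unit]
    return float(number.replace(",", "")) * factor, base_unit


def extract_facts(rows, entity_col, quantity_col, context):
    """Pair each entity with its normalized quantity and a context cue.

    The column indices are supplied by hand here (an assumption); the paper
    infers this alignment automatically.
    """
    facts = []
    for row in rows:
        quantity = parse_quantity(row[quantity_col])
        if quantity is not None:
            value, unit = quantity
            facts.append(
                {"entity": row[entity_col], "value": value, "unit": unit, "context": context}
            )
    return facts


def answer(facts, unit, min_value):
    """Quantity query: entities whose value in `unit` is at least `min_value`."""
    return [f["entity"] for f in facts if f["unit"] == unit and f["value"] >= min_value]


# Made-up table rows for illustration only.
rows = [
    ["Nile", "6,650 km"],
    ["Thames", "215 mi"],
    ["Local creek", "800 m"],
    ["Unknown", "n/a"],
]
facts = extract_facts(rows, entity_col=0, quantity_col=1, context="river length")
long_rivers = answer(facts, unit="km", min_value=300)
```

Normalizing every value to a base unit before filtering is what lets a single threshold compare rows originally given in kilometres, miles, and metres; rows without a recognizable quantity are skipped rather than guessed.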
doi:10.1145/3442381.3450072 fatcat:p7l4as6amre33kqdlkua7kwj64