Supporting the automatic construction of entity aware search engines

Lorenzo Blanco, Valter Crescenzi, Paolo Merialdo, Paolo Papotti
2008 Proceeding of the 10th ACM workshop on Web information and data management - WIDM '08  
Several web sites deliver a large number of pages, each publishing data about one instance of some real world entity, such as an athlete, a stock quote, a book. Although it is easy for a human reader to recognize these instances, current search engines are unaware of them. Technologies for the Semantic Web aim at achieving this goal; however, so far they have been of little help in this respect, as semantic publishing is very limited. We have developed a method to automatically search on the web for pages that publish data representing an instance of a certain conceptual entity. Our method takes as input a small set of sample pages: it automatically infers a description of the underlying conceptual entity and then searches the web for other pages containing data representing the same entity. We have implemented our method in a system prototype, which has been used to conduct several experiments that have produced interesting results.
doi:10.1145/1458502.1458526 dblp:conf/widm/BlancoCMP08 fatcat:kxvczk73wfbitj6kux74wmwrva