Enhancing the Selection of Web Sources: A Reputation Based Approach [chapter]

Donato Barbagallo, Cinzia Cappiello, Chiara Francalanci, Maristella Matera
2011 Lecture Notes in Business Information Processing  
The large number of available Web data sources is an important opportunity for Web users and for various data-intensive Web applications. Nevertheless, selecting the most relevant data sources, and thus high-quality information, remains a challenging issue. This paper proposes an approach for data source selection that is based on the notion of reputation of the data sources. The data quality literature defines reputation as a multidimensional quality attribute that measures the trustworthiness and importance of an information source. This paper introduces a set of metrics able to measure the reputation of a Web source by considering its authority, its relevance in a given context, and the quality of its content. These variables have been empirically assessed for the top 20 sources identified by Google in response to 100 queries in the tourism domain. In particular, Google's ranking has been compared with the ranking obtained by means of a multi-dimensional source reputation index. Results show that the assessment of reputation represents a tangible aid to the selection of information sources and to the identification of reliable data.
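The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not state how the three dimensions are combined or how the two rankings are compared, so the equal-weight average, the assumption that each score is already normalized to [0, 1], and the use of Spearman's rank correlation are all illustrative choices. The names `Source`, `reputation_index`, `rerank`, and `spearman_vs_google` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Source:
    url: str
    google_rank: int   # position in the search results, 1 = top (ranks assumed to be 1..n)
    authority: float   # assumption: each dimension is pre-normalized to [0, 1]
    relevance: float
    quality: float


def reputation_index(source, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Combine the three dimensions into one score (equal weights are an assumption)."""
    w_authority, w_relevance, w_quality = weights
    return (w_authority * source.authority
            + w_relevance * source.relevance
            + w_quality * source.quality)


def rerank(sources):
    """Order sources from highest to lowest reputation index."""
    return sorted(sources, key=reputation_index, reverse=True)


def spearman_vs_google(sources):
    """Spearman rank correlation between the search-engine order and the reputation order.

    Uses the no-ties formula; equal reputation scores are broken by input order.
    """
    ordered = rerank(sources)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two sources to compare rankings")
    squared_diffs = sum(
        (s.google_rank - reputation_rank) ** 2
        for reputation_rank, s in enumerate(ordered, start=1)
    )
    return 1 - 6 * squared_diffs / (n * (n ** 2 - 1))


# Made-up scores for three placeholder sources, for illustration only.
results = [
    Source("https://example.org/a", 1, 0.2, 0.2, 0.2),
    Source("https://example.org/b", 2, 0.9, 0.9, 0.9),
    Source("https://example.org/c", 3, 0.5, 0.5, 0.5),
]
print([s.url for s in rerank(results)])
print(spearman_vs_google(results))
```

A correlation near 1 would mean the reputation index largely reproduces the search-engine order, while lower values indicate that reputation reorders the results.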
doi:10.1007/978-3-642-19802-1_32 fatcat:6rxgyvxv6rdmhbwl5zbb3ndwpq