Themenspezifische Informationssuche im Internet mit Hilfe mobiler Programme (Topic-Specific Information Search on the Internet Using Mobile Programs) [article]

Wolfgang Theilmann, Universität Stuttgart
Research on information retrieval has been challenged in a new way by the increasing importance and usage of the Internet, and especially of the World Wide Web. In contrast to traditional databases, the Internet as an information space is characterized by its enormous size, the dynamics of its content, the heterogeneity of its documents and formats, and the distribution of the available documents among hosts all over the world. This thesis presents a new approach for search engines that are specialized to single domains, thus enabling precise, efficient and comprehensive retrieval of the information sought. Such search engines use a domain-specific filter function to recognize relevant documents and can offer a user interface and retrieval functionality adapted to the specific domain. To locate relevant documents with reasonable efficiency, we rely on mobile program technology. Instead of downloading the documents to be analysed to the search engine, we send mobile filter programs out to the relevant data sources. The mobile programs analyse the data locally and return only the relevant part of it. We present algorithms that coordinate the dissemination of these programs so that the resulting communication costs are minimized. Since the dissemination algorithms need to know the network distance between the hosts involved, we also present a scalable and efficient approach for estimating the network distance between arbitrary hosts in the Internet. We evaluate the methods for estimating network distances and for disseminating mobile filter programs on the basis of extensive measurements in the Internet. In addition, we present a case study that analyses the benefit of specialized search engines, and in particular of employing mobile filter programs in such search engines.
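The idea of filtering at the data source can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the keyword-based `domain_filter`, the `Document` type, the example URLs and the `FILTER_PROGRAM_SIZE` value are all hypothetical, and communication cost is simply approximated by the number of bytes transferred.

```python
from dataclasses import dataclass


@dataclass
class Document:
    url: str
    text: str


# Hypothetical size in bytes of the mobile filter program that is shipped to the host.
FILTER_PROGRAM_SIZE = 10


def domain_filter(doc: Document, keywords: list[str]) -> bool:
    """Hypothetical relevance test: keep a document that mentions any keyword."""
    lowered = doc.text.lower()
    return any(keyword in lowered for keyword in keywords)


def run_filter_at_host(documents: list[Document], keywords: list[str]) -> list[Document]:
    """Simulate the filter program running locally at the data source."""
    return [doc for doc in documents if domain_filter(doc, keywords)]


def payload_size(documents: list[Document]) -> int:
    """Approximate transfer cost as the total UTF-8 size of the document texts."""
    return sum(len(doc.text.encode("utf-8")) for doc in documents)


# Made-up documents stored on one remote host.
host_documents = [
    Document("http://example.org/a", "Mobile agents for information retrieval"),
    Document("http://example.org/b", "Recipe for apple cake"),
    Document("http://example.org/c", "Estimating network distance between hosts"),
]
keywords = ["retrieval", "network distance"]

# Strategy 1: download every document and filter at the search engine.
download_all_cost = payload_size(host_documents)

# Strategy 2: ship the filter program and receive only the relevant documents.
relevant = run_filter_at_host(host_documents, keywords)
mobile_filter_cost = FILTER_PROGRAM_SIZE + payload_size(relevant)

print("relevant:", [doc.url for doc in relevant])
print("download-all cost:", download_all_cost)
print("mobile-filter cost:", mobile_filter_cost)
```

In this toy setting the mobile filter pays off only when the program is smaller than the irrelevant data it keeps off the network. That trade-off is the reason the abstract treats program dissemination as a cost-minimization problem.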
doi:10.18419/opus-2456 fatcat:yzg6hky4dvhvdeiezmsrvystxu