Search Engine Technology and Digital Libraries

Friedrich Summann, Norbert Lossau
2004 D-Lib Magazine  
This article describes the journey from the conception of and vision for a modern search-engine-based search environment to its technological realisation. In doing so, it takes up the thread of an earlier article on this subject, this time from a technical viewpoint. As well as presenting the conceptual considerations of the initial stages, this article will principally elucidate the technological aspects of this journey.

The conception of an academic search engine

The starting point for the deliberations about development of an academic search engine was the experience we gained through the generally successful project "Digital Library NRW", in which from 1998 to 2000 (with Bielefeld University Library in overall charge) we designed a system model for an Internet-based library portal with an improved academic search environment at its core. At the heart of this system was a metasearch with an availability function, to which we added a user interface integrating all relevant source material for study and research. The deficiencies of this approach were felt soon after the system was launched in June 2001. There were problems with the stability and performance of the database retrieval system, with the integration of full-text documents and Internet pages, and with acceptance by users, who increasingly perform searches themselves using search engines rather than going to the library for help. Since a long list of problems is also encountered when using commercial search engines for academic purposes (in particular with the retrieval of academic information and with long-term availability), the idea was born for a search engine configured specifically for academic use. We also hoped that a single access point founded on improved search engine technology would give us access to the heterogeneous academic resources of subject-based bibliographic databases, catalogues, electronic newspapers, document servers and academic web pages.

Software evaluation and technical realisation

Following on from our fundamental deliberations about an academic search engine, we searched the market for suitable software products. Our discussions with Google in 2002 broke down at an early stage, as we were only able to speak to sales personnel and, at that time at least, received no indication that we could install Google software locally for testing.
We found the situation different with the search engine Convera: we were able to install it on a machine in Bielefeld and test it for a limited period. After two weeks of intensive observation, we concluded that the Convera software was better suited to an intranet installation than to our intended use as an Internet search engine. We also tested the Russian open source search engine MnoGo and found many positive aspects to that software, but in our tests we also encountered performance problems when processing large amounts of data. Finally, we contacted the Norwegian software company Fast, which in 2002 was one of the market leaders alongside Google with its search engine Alltheweb. A test installation was agreed quickly and flexibly, and its technical realisation went smoothly and without any problems. Our experience with the Fast software was so positive that by the end of the test period, it was clear that we should carry out a proof-of-concept with this search ...
doi:10.1045/september2004-lossau fatcat:zfkf5xzdwbd3beqpet7sqemsbi