Prototyping an integrated information gathering system on CORBA

Yue-Shan Chang, Kai-Chih Liang, Ming-Chun Cheng, Shyan-Ming Yuan
2004 Journal of Systems and Software  
The sheer volume of information on the Web, and the variety of sources from which it may be retrieved, make searching those sources a difficult task. Meta-search engines can usually search only Web pages or documents; other major sources such as databases, library corpora, and the so-called Web databases are not covered. Given these restrictions, an effective retrieval technology for a much wider range of sources becomes increasingly important. In our previous work, we proposed an ... Integrated Retrieval (IIR), which is based on the Common Object Request Broker Architecture (CORBA), to spare clients the trouble of complicated semantics when federating multiple sources. In this paper, we present an IIR-based prototype of an integrated information gathering system. It offers a unified interface for querying sources with heterogeneous interfaces or protocols and uses an SQL-compatible query language for heterogeneous backend targets. We use it to link two general search engines (Yahoo and AltaVista), a science paper explorer (IEEE), and two library corpus explorers. We also perform preliminary measurements to assess the potential of the system. The results show that the overhead spent on each source as the system queries it is within reason; that is, using IIR to construct an integrated gathering system incurs low overhead.
doi:10.1016/s0164-1212(03)00086-4 fatcat:dsj6tnuixjdyvoeo7i6wz3kvte