Cooperative information-gathering: a distributed problem-solving approach

T. Oates, M.V. Nagendra Prasad, V.R. Lesser
1997 IEE Proceedings - Software Engineering  
We contrast two approaches to the problem of information gathering that may be characterized as distributed processing and distributed problem solving. The former is characteristic of most existing information gathering systems, while the latter is central to research in multi-agent systems. We examine features of complex information carrying environments and the information gathering task that demonstrate both the utility of viewing information gathering as distributed problem
... and difficulties with viewing it as distributed processing. We propose a new approach to information gathering based on the distributed problem solving paradigm and its attendant body of research in multi-agent systems and distributed artificial intelligence. This approach, called Cooperative Information Gathering, involves concurrent, asynchronous discovery and composition of information spread across a network of information servers. Top level queries drive the creation of partially elaborated information gathering plans, resulting in the employment of multiple semi-autonomous, cooperative agents for the purpose of achieving goals and subgoals within those plans. The system as a whole satisfices, trading off solution quality and search cost while respecting user-imposed deadlines. We also survey current work on distributed and agent-based approaches to information gathering.
When a system is dealing with enormous quantities of data, distributed computation at the sites where the data resides may often be more efficient than migrating data to a centralized processing location. Instead of gathering data dispersed across networked information servers at a centralized site and then evolving a coherent response to a query, agents can reside at the data sources and perform distributed coordinated retrieval to prune their data space and send substantially less data to the centralized query system for further processing.
Agent-based architectures offer modularity, robustness and other advantages of distributed systems. For example, information agents can be constructed and maintained separately to accommodate heterogeneity in access methods, data representations and communication protocols that make it necessary to construct agents with specialized knowledge. Agents can use other agents to provide abstractions of heterogeneous information sources.
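The idea of pruning data at the source rather than shipping it all to a central site can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `SourceAgent` class, the keyword-overlap score, the `limit` parameter and the sample records are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SourceAgent:
    """Hypothetical agent co-located with one information server."""
    name: str
    records: list[str]

    def retrieve(self, terms: set[str], limit: int) -> list[tuple[int, str]]:
        """Score records locally and return at most `limit` matching ones."""
        scored = []
        for record in self.records:
            # Toy relevance score: number of query terms present in the record.
            score = len(terms & set(record.lower().split()))
            if score > 0:
                scored.append((score, record))
        # Best score first; ties broken alphabetically for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return scored[:limit]


def gather(agents: list[SourceAgent], terms: set[str], limit: int) -> list[tuple[int, str]]:
    """Central query site: merge the already-pruned results from every agent."""
    merged = []
    for agent in agents:
        merged.extend(agent.retrieve(terms, limit))
    merged.sort(key=lambda pair: (-pair[0], pair[1]))
    return merged


agents = [
    SourceAgent("server_a", ["multi-agent planning survey",
                             "weather report",
                             "agent negotiation protocols"]),
    SourceAgent("server_b", ["cooking recipes",
                             "distributed planning for agent teams"]),
]
results = gather(agents, {"agent", "planning"}, limit=2)
total_records = sum(len(a.records) for a in agents)
print(f"transferred {len(results)} of {total_records} records")
```

In this toy setting only the locally matching records cross the network, whereas a centralized design would first move every record to the query site. The agents here do not yet coordinate with one another; they only filter independently.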
In addition, passive data sources like databases can be transformed into information providing agents by wrapping them with intelligent interfaces [31], making possible negotiation processes between retrieval agents and intelligent search engines. Cooperation between agents implies management of interdependencies between their activities so as to integrate and evolve consistent clusters of high quality information from distributed heterogeneous sources. Rather than simply retrieve sets of documents from disparate sources that are relevant to a query, such agents perform a parallel search for information to compose a coherent answer to a user's question. Cooperation is especially important because:
Users often provide vaguely worded or sparse queries, leading to an explosion in the amount of information that is deemed potentially relevant. Agents that can dynamically exploit relevant information unearthed by other agents can better focus their search processes. Viewing partial results as information relevant to a query opens up a rich set of possible subproblem interrelationships that may be beneficially exploited.
The amount of data that is relevant to even a precisely worded query may itself be too vast. The agents can exploit cues and hints based on information discovered by other agents at non-local sites to further narrow the set of relevant local data.
Although the need to efficiently search through networks of information servers is real, the issues involved in using a team of cooperating semi-autonomous agents to search for the desired information are yet to be explored. Large scale networks of distributed information servers with complex interdependent data not only necessitate increased parallelism in search, but also motivate the need for cooperative retrieval and dynamic construction of responses to queries.
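As a rough illustration of how partial results found at one site might focus the search at another, consider the sketch below. It is not the mechanism described in the paper: the record layout, the `local_search` and `extract_hints` functions, and the rule of treating non-query keywords as hints are all assumptions chosen to keep the example small.

```python
def local_search(records, query_terms, hints=frozenset()):
    """Return local records matching the query.

    If hints from other agents are supplied, prefer the matches that also
    share a keyword with the hints; fall back to all matches otherwise.
    """
    matches = [r for r in records if query_terms & r["keywords"]]
    narrowed = [r for r in matches if hints & r["keywords"]]
    return narrowed or matches


def extract_hints(results, query_terms):
    """Hypothetical hint rule: keywords of partial results beyond the query itself."""
    hints = set()
    for record in results:
        hints |= record["keywords"] - query_terms
    return hints


# A sparse, ambiguous query, as in the first case discussed above.
query = {"jaguar"}

site_a = [
    {"title": "Jaguar XJ road test", "keywords": {"jaguar", "car", "engine"}},
]
site_b = [
    {"title": "Jaguar habitat", "keywords": {"jaguar", "rainforest"}},
    {"title": "Jaguar engine service", "keywords": {"jaguar", "engine", "repair"}},
]

# Agent A searches first and publishes hints drawn from its partial result.
hints = extract_hints(local_search(site_a, query), query)

# Agent B without and with the non-local hints.
unfocused = local_search(site_b, query)
focused = local_search(site_b, query, hints)
print(len(unfocused), [r["title"] for r in focused])
```

The fallback to the unnarrowed matches is a deliberate choice in this sketch: a hint that eliminates every local candidate is treated as uninformative rather than as evidence that the site holds nothing relevant.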
The domain of such a search consists of multiple wide-area networks that are composed of, among other things, information servers (see Figure 1). In response to a query at a node, following some query planning, agents are dispersed to various regions in the network where they plan their local actions, which may include spawning additional agents to perform certain subtasks. This results in the formation of a search organization for the purpose of satisfying a query [16]. Intelligent servers that receive queries and act as regional planning sites, either further decomposing the search into subregions or sending
doi:10.1049/ip-sen:19971025 fatcat:xmlna6ldjzfhbglz34jw4hinpq