Interactive Query Formulation in Semistructured Databases [chapter]

Agathoniki Trigoni
2002 Lecture Notes in Computer Science  
The use of large amounts of distributed and heterogeneous information has become extremely cumbersome; the difficulty lies mainly in exploring the data rather than in storing or exchanging it. A user interested in small pieces of information becomes increasingly confused when having to dig through a large volume of diverse and, more importantly, semistructured data. In this paper, we propose an interactive and adaptive framework that guides the user in the search for data by disclosing only a part of the underlying information at a time. It first provides the user with a high-level view of the raw data and gradually adapts to his/her needs in order to offer a refined answer. The proposed model makes it possible to query a semistructured database based on general schema-related constraints imposed by the user or identified by the system, without requiring specific knowledge of the underlying metadata. This is achieved by initially receiving an amorphic query, which may consist of one or more basic paths, and helping the user to refine it gradually into a specific semistructured query, expressed in a language like XQuery or Lorel.
- Users can usually contribute to the query input in a more intelligent way than by providing a few keywords.
- A system is not user-friendly when it requires detailed knowledge of the database structure, especially in the presence of large amounts of heterogeneous data.
- The user would not like to miss information because of his/her lack of knowledge about the schema. Query languages in semistructured systems currently allow the user to avoid type errors when not complying with the explicit or implicit schema. However, the user ends up receiving only a part of the information requested.
- Users usually have a high-level knowledge of the database structure, based on their natural perception of data correlations.
- The lack of knowledge about the database structure does not preclude the need to enforce specific selection criteria on the results.
Structured queries require knowledge of schema information, or else fail to deliver complete and accurate results. IR-style searches exploit only a small part of the user's knowledge about the data (mainly keywords), and result in sets of documents that cannot be filtered using detailed criteria.
The idea behind this paper is that the tradeoff between allowing natural queries and receiving high-quality (complete and accurate) results can be eased by adopting an interactive and adaptive query model.
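To make the notion of refining a loosely specified path more concrete, the following is a minimal Python sketch under stated assumptions, not the paper's algorithm. The abstract does not define how a basic path is matched, so the sketch assumes one possible reading: a basic path is a sequence of labels that must occur in order along a concrete path, with any number of intermediate labels allowed. The data in `DB` and the helper names `label_paths`, `matches`, and `candidates` are all hypothetical.

```python
from typing import Iterator

# Hypothetical semistructured data: nested dicts, with plain values as leaves.
DB = {
    "library": {
        "book": {"title": "Data on the Web", "author": {"name": "Abiteboul"}},
        "journal": {"article": {"title": "Lorel", "writer": {"name": "Widom"}}},
    }
}


def label_paths(node, prefix=()) -> Iterator[tuple]:
    """Yield the label path from the root to every labelled node."""
    if isinstance(node, dict):
        for label, child in node.items():
            path = prefix + (label,)
            yield path
            yield from label_paths(child, path)


def matches(basic_path, concrete_path) -> bool:
    """Assumed matching rule: the labels of basic_path occur in order within
    concrete_path, and both paths end on the same label."""
    if not basic_path or concrete_path[-1] != basic_path[-1]:
        return False
    remaining = iter(concrete_path)
    # 'label in remaining' consumes the iterator, so order is enforced.
    return all(label in remaining for label in basic_path)


def candidates(basic_path, db):
    """List the concrete paths a user could be offered for refinement."""
    return sorted("/".join(p) for p in label_paths(db) if matches(basic_path, p))


if __name__ == "__main__":
    for path in candidates(("library", "name"), DB):
        print(path)
```

In an interactive setting, the list returned by `candidates` would be shown to the user, who picks or constrains one alternative; the chosen concrete path could then be turned into a specific query in a language such as XQuery or Lorel.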
doi:10.1007/3-540-36109-x_28 fatcat:p5mf2zeasvdptdgojwyzoxbwey