Efficient Processing Regular Queries In Shared-Nothing Parallel Database Systems Using Tree- And Structural Indexes
Symposium on Advances in Databases and Information Systems
In this paper, we introduce and study an efficient algorithm for processing regular queries on a very large XML data set that is fragmented and stored on different machines. The machines are connected by a high-speed interconnect. In such a system, the efficiency of a query processing algorithm depends on two main factors: the waiting time for the answer, and the total query processing and communication cost over all machines in the system. In the partial processing approach, the query is sent to
... and partially evaluated at each server in parallel. The parallelism reduces the waiting time, but it incurs redundant operations, because every possible case must be computed for each fragment. In the stream processing approach, the query processing cost is minimized by parsing the data graph with the query. A fragment is visited only if necessary, but there is no parallelism and the communication cost is high. To take advantage of the shared-nothing parallel system, our algorithm is based on partial evaluation. We describe two types of redundant operations, which are eliminated by pre-computing the query on our tree- and structural indexes. The sizes of the indexes and the processing costs over them are treated as constants. Our algorithm outperforms the two approaches above on both criteria, the waiting time and the total query processing and communication cost, both in theory and in experiment.
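To make the redundancy of plain partial evaluation concrete, the following is a minimal sketch of the generic approach, not of the algorithm proposed in this paper. It assumes a regular path query compiled to a deterministic automaton and a tree-shaped document split into fragments. The query `a.b*.c`, the names `DELTA`, `FRAGMENTS`, `partial_eval`, and `assemble`, and the `@F2` link notation are all hypothetical, and the per-fragment step runs sequentially here although each server would execute it in parallel.

```python
# Hypothetical DFA for the path query a.b*.c: (state, label) -> next state.
DELTA = {(0, "a"): 1, (1, "b"): 1, (1, "c"): 2}
START, ACCEPT, STATES = 0, {2}, [0, 1, 2]

# Hypothetical fragments of one XML tree: node id -> (label, children).
# A child written "@F2" is a link to the root of fragment F2 on another machine.
FRAGMENTS = {
    "F1": {"root": "n1", "nodes": {"n1": ("a", ["n2", "@F2"]), "n2": ("c", [])}},
    "F2": {"root": "n3", "nodes": {"n3": ("b", ["n4", "n5"]),
                                   "n4": ("c", []), "n5": ("a", [])}},
}

def partial_eval(frag):
    """Evaluate the query on one fragment for EVERY possible entry state.

    Returns entry state -> (matched nodes, [(linked fragment, state at link)]).
    """
    table = {}
    for q in STATES:
        matches, exits = set(), []
        stack = [(frag["root"], q)]
        while stack:
            node, state = stack.pop()
            label, children = frag["nodes"][node]
            nxt = DELTA.get((state, label))
            if nxt is None:          # automaton is stuck: prune this subtree
                continue
            if nxt in ACCEPT:
                matches.add(node)
            for child in children:
                if child.startswith("@"):
                    exits.append((child[1:], nxt))   # crosses to another fragment
                else:
                    stack.append((child, nxt))
        table[q] = (matches, exits)
    return table

def assemble(tables, root_frag):
    """Combine the partial results, following only the cases actually reached."""
    result, todo = set(), [(root_frag, START)]
    while todo:
        fid, q = todo.pop()
        matches, exits = tables[fid][q]
        result |= matches
        todo.extend(exits)
    return result

# Each fragment would be processed on its own server; shown sequentially here.
tables = {fid: partial_eval(frag) for fid, frag in FRAGMENTS.items()}
answer = assemble(tables, "F1")
```

In this toy instance, fragment `F2` is only ever entered in state 1, so the table entries computed for its entry states 0 and 2 are never used. That is the kind of redundant work the abstract attributes to the partial processing approach.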