Multiversion-based view maintenance over distributed data sources
ACM Transactions on Database Systems
Materialized views can be maintained by submitting maintenance queries to the data sources. However, the query results may be erroneous due to concurrent source updates. State-of-the-art maintenance strategies typically apply compensations to resolve such conflicts and assume all source schemata remain stable over time. In a loosely-coupled dynamic environment, the sources may autonomously change not only their data but also their schema or semantics. Consequently, either the maintenance or the
... maintenance or the compensation queries may be broken. Unlike compensation-based approaches found in the literature, we instead model the complete materialized view maintenance process as a view maintenance transaction (VM Transaction). This way, the anomaly problem can be rephrased as the serializability of VM Transactions. To achieve VM Transaction serializability, we propose a multiversion concurrency control algorithm, called TxnWrap, which is shown to be the appropriate design for loosely-coupled environments with autonomous data sources. TxnWrap is complementary to the maintenance algorithms proposed in the literature, since it removes concurrency issues from consideration allowing the designer to focus on the maintenance logic. We show several optimizations of TxnWrap, in particular, (1) space optimizations on versioned data materialization and (2) parallel maintenance scheduling. With these optimizations, TxnWrap even outperforms state-of-the-art view maintenance solutions in terms of refresh time. Further, several design choices of TxnWrap are studied each having its respective advantages for certain environmental settings. A correctness proof based on transaction theory for TxnWrap is also provided. Lastly, we have implemented TxnWrap. The experimental results confirm that TxnWrap achieves predictable performance under a varying rate of concurrency.