An overview of Concurrency Control Techniques in Distributed Database

Rajdeep Singh Solanki
2018 International Journal for Research in Applied Science and Engineering Technology  
Today's business environment has an increasing need for distributed database and client/server applications, as the demand for reliable, scalable and accessible information is steadily rising. Distributed database systems improve communication and data processing because their data is distributed across different network sites. Not only is data access faster, but a single point of failure is also less likely to occur, and users retain local control of their data. However, there is some complexity when attempting to manage and control distributed database systems. Concurrency control is an integral part of a database system. Devising a concurrency control technique that has a low lost opportunity cost and a low restart cost is a hard problem. The interconnection network in a distributed database system can act as a powerful coordination mechanism by providing certain useful properties. We identify several such useful network properties, and present a new family of concurrency control techniques that are built on top of these properties. These techniques use network properties to keep the lost opportunity cost and restart cost low. Our thesis is that network properties can be exploited to achieve efficient concurrency control of transactions.

I. INTRODUCTION

In the past, implementation of distributed database systems was deemed impractical because network technology was too unreliable or immature, and because computers were too expensive to deploy in large numbers. However, as networks have become more reliable and computers have become much cheaper, there has been great interest in using distributed database systems. There are five main reasons for using a distributed database system:

1. Many organizations are distributed in nature.
2. Multiple databases can be accessed transparently.
3. The database can be expanded incrementally: as needs arise, additional computers can be connected to the distributed database system.
4. Reliability and availability are increased: a distributed database can replicate data among several sites, so even if one site fails, the redundancy in the data increases the availability and reliability of the data as a whole.
5. Performance increases: query processing can be performed at multiple sites, so distributed database systems can mimic parallel database systems in a high-performance network.
Even with these benefits, distributed database systems have not been widely used because of the many problems in designing a distributed database management system (DDBMS). Many of the problems that arise in designing a traditional DBMS also arise in designing a distributed DBMS: a distributed DBMS must also handle query optimization and concurrency control, but it requires different solutions because of its distributed nature.

Distributed database systems (DDBS) are systems that have their data distributed and replicated over several locations, unlike a centralized database system (CDBS), where a single copy of the data is stored. Data may be distributed over a network using horizontal and vertical fragmentation, which are similar to the selection and projection operations, respectively, in Structured Query Language (SQL). Both types of database share the same problems of access control and transaction management, such as concurrent user access control and deadlock detection and resolution.

A. Advantages of Distributed DBS

Since organizations tend to be geographically dispersed, a DDBS fits the organizational structure better than a traditional centralized DBS. Each location has its local data as well as the ability to get needed data from other locations via a communication network. Moreover, the failure of a server at one site does not render the distributed database system inaccessible; only the site with the failed server is directly affected. In addition, if data is required from a site that has failed, it may be retrieved from other locations that hold replicas of that data. The performance of the system improves, since the CPU and I/O load is distributed over several machines. Expansion of the distributed system is also relatively easy, since adding a new location does not affect the existing ones.
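
The fragmentation and replication ideas described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the employee table, its column names, the `Site` class, and the `read_with_failover` helper are all hypothetical, chosen only to show how horizontal fragmentation resembles selection, how vertical fragmentation resembles projection, and how a read might fall back to a replica when one site is down.

```python
# Illustrative sketch only: the table, column names, Site class, and helper
# functions are assumptions made for this example, not part of the paper.

employees = [
    {"id": 1, "name": "Asha", "city": "Delhi", "salary": 50000},
    {"id": 2, "name": "Ben", "city": "Mumbai", "salary": 62000},
    {"id": 3, "name": "Chen", "city": "Delhi", "salary": 58000},
]


def horizontal_fragment(rows, predicate):
    """Selection-like: keep whole rows that satisfy the predicate."""
    return [dict(r) for r in rows if predicate(r)]


def vertical_fragment(rows, columns, key="id"):
    """Projection-like: keep chosen columns plus the key, so that
    fragments can later be rejoined on the key."""
    keep = [key] + [c for c in columns if c != key]
    return [{c: r[c] for c in keep} for r in rows]


class Site:
    """A hypothetical site holding one fragment (or a replica of it)."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = rows
        self.up = True

    def read(self, key):
        if not self.up:
            raise ConnectionError(f"site {self.name} is down")
        for r in self.rows:
            if r["id"] == key:
                return r
        raise KeyError(key)


def read_with_failover(sites, key):
    """Try each replica in turn and return the first successful read."""
    for site in sites:
        try:
            return site.read(key)
        except ConnectionError:
            continue  # this replica is unavailable, try the next one
    raise ConnectionError("no replica available")


# Horizontal fragment: all Delhi employees, stored at two sites as replicas.
delhi = horizontal_fragment(employees, lambda r: r["city"] == "Delhi")
# Vertical fragment: only the key and salary columns.
payroll = vertical_fragment(employees, ["salary"])

site_a = Site("A", delhi)
site_b = Site("B", [dict(r) for r in delhi])
site_a.up = False  # simulate a failure at site A
row = read_with_failover([site_a, site_b], 3)
```

In this sketch the read for employee 3 is served by site B after site A is marked as failed, which mirrors the availability argument above. A real DDBMS would also have to keep the replicas consistent under concurrent updates, which is the concurrency control problem the paper addresses.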
doi:10.22214/ijraset.2018.1194 fatcat:l73rfpnexrfedaewu5coooiepi