SQMD: Architecture for Scalable, Distributed Database System Built on Virtual Private Servers

Kangseok Kim, Marlon E. Pierce, Rajarshi Guha
2008 IEEE Fourth International Conference on eScience
Many scientific fields routinely generate huge datasets. In many cases, these datasets are not static but grow rapidly in size. Handling such datasets, as well as allowing sophisticated queries over them, necessitates efficient distributed database systems. In this paper we present the architecture, implementation, and performance analysis of a scalable, distributed database system built on software-based virtualization environments. The system architecture makes use of a software partitioning of the database based on data clustering, the SQMD (Single Query Multiple Database) mechanism, a web service interface, and virtualization software technologies. The system allows uniform access to concurrently distributed databases, using the SQMD mechanism based on the publish/subscribe paradigm. We highlight the scalability of our architecture by applying it to a database of 17 million chemical structures. In addition to simple identifier-based retrieval, we present performance results for shape similarity queries, which are extremely time-intensive with traditional architectures.
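
The single-query, multiple-database idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes in-memory SQLite databases as stand-ins for the database partitions, a toy in-process broker in place of a real publish/subscribe messaging system, and a hand-made split of rows rather than clustering-based partitioning. The names QueryBroker, make_partition, make_handler, and the structures table are hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_partition(rows):
    """Create one in-memory SQLite database holding a slice of the data.

    Assumption: a real deployment would host each partition on its own server.
    """
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE structures (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO structures VALUES (?, ?)", rows)
    conn.commit()
    return conn


def make_handler(conn):
    """Return a subscriber callback that runs a published query on one partition."""
    def handler(sql, params):
        return conn.execute(sql, params).fetchall()
    return handler


class QueryBroker:
    """Toy publish/subscribe broker: one published query reaches every subscriber."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def publish(self, sql, params=()):
        # Fan the same query out to all partitions concurrently, then merge results.
        workers = max(1, len(self.subscribers))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [pool.submit(h, sql, params) for h in self.subscribers]
            merged = []
            for future in futures:
                merged.extend(future.result())
        return merged


# Usage: two hypothetical partitions, one query, one merged result set.
broker = QueryBroker()
for part in (make_partition([(1, "a"), (2, "b")]), make_partition([(3, "c")])):
    broker.subscribe(make_handler(part))

rows = broker.publish("SELECT id, name FROM structures WHERE id >= ?", (2,))
print(sorted(rows))
```

The client issues the query once and never addresses an individual partition; the broker decides which databases receive it, which is the property the sketch is meant to show.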
doi:10.1109/escience.2008.35 dblp:conf/eScience/KimPG08 fatcat:ez2sdcji7nc5phk27yoyvqxun4