A flexible content repository to enable a peer-to-peer-based wiki

Udo Bartlang, Jörg P. Müller
Concurrency and Computation: Practice and Experience, 2009
Wikis, being major applications of the Web 2.0, are used for a large number of purposes, such as encyclopedias, project documentation, and coordination, both in open communities and in enterprises. At the application level, users are targeted as both consumers and producers of dynamic content. Yet this peer-to-peer (P2P) principle is not applied at the technical level, which is still dominated by traditional client-server architectures. What is lacking is a generic platform that combines the
flexibility of the P2P approach with, for example, a wiki's requirements for consistent content management in a highly concurrent environment. This paper presents a flexible content repository system that is intended to close the gap by using a hybrid P2P overlay to support scalable, fault-tolerant, consistent, and efficient data operations for the dynamic content of wikis. On the one hand, this paper introduces the generic, overall architecture of the content repository. On the other hand, it describes the major building blocks to enable P2P data management at the system's persistent storage layer, and how these may be used to implement a P2P-based wiki application: (i) a P2P back-end administers a wiki's actual content resources; (ii) on top, P2P service groups act as indexing groups to implement a wiki's search index.

Collaboration within enterprises requires the sharing of produced data, but it is especially the explosion of unstructured content data that complicates filtering, administration, and controlled exchange. For example, intra-enterprise knowledge management aims to facilitate and optimize the retrieval, transfer, and storage of knowledge content. However, the mere exchange of such content is difficult: inconsistencies between redundant content may lead to problems and additional effort [2]. The common enterprise practice of employing various storage locations, for instance, an employee's local workstation, group storage devices, or intranet servers, demands knowledge content consolidation. The latest developments [3] recommend the usage of specialized content repositories to enable the management of both structured and unstructured content. Typically, these systems act as a meta layer on top of traditional persistent data stores, such as database management systems, providing additional capabilities.
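As an informal illustration of the two building blocks named in the abstract, the following Python sketch mimics a back-end that administers a wiki's content resources and an indexing group that implements the search index. All class and method names, and the in-memory data structures, are invented for illustration; they do not reflect the paper's actual implementation.

```python
import hashlib

class ContentBackEnd:
    """Toy stand-in for the P2P back-end administering content resources."""
    def __init__(self):
        self._store = {}  # resource id -> content

    def put(self, path, content):
        # Derive a resource id from the wiki page path, as a DHT-style
        # back-end might hash keys onto the overlay.
        key = hashlib.sha1(path.encode()).hexdigest()
        self._store[key] = content
        return key

    def get(self, path):
        key = hashlib.sha1(path.encode()).hexdigest()
        return self._store.get(key)

class IndexingGroup:
    """Toy stand-in for a P2P service group maintaining the search index."""
    def __init__(self):
        self._index = {}  # term -> set of wiki page paths

    def index_page(self, path, content):
        for term in set(content.lower().split()):
            self._index.setdefault(term, set()).add(path)

    def search(self, term):
        return sorted(self._index.get(term.lower(), set()))

backend = ContentBackEnd()
index = IndexingGroup()
page = "Wikis are collaborative web applications."
backend.put("/wiki/Wiki", page)
index.index_page("/wiki/Wiki", page)
print(index.search("collaborative"))   # ['/wiki/Wiki']
print(backend.get("/wiki/Wiki") == page)  # True
```

The point of the split is that storage and search can scale independently: the back-end only resolves resource ids, while the indexing group answers term queries.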
Regarding design and implementation, however, state-of-the-art content repositories are primarily based on a centralized architecture. For instance, distributed database systems, as an example of hierarchical client-server systems, may split large content data sets across physically distributed network nodes to establish more efficient data querying through parallelism [4]. However, if replication strategies are applied in distributed systems, the consistency of data needs to be ensured; therefore, these techniques usually employ a point of central coordination. Such flat client-server architectures are well suited for static networks and computing infrastructures, where the need for hardware resources can be predetermined quite well. Considering the availability of crucial content, however, if the single server fails, the whole system service is no longer available; this is known as a single point of failure. In contrast, the P2P paradigm offers a more flexible communication pattern and is migrating to more and more application domains. For instance, there has been a significant increase in the popularity of P2P-based systems and in their employment for content distribution on the Internet [5]. The increase in storage capacities and processor power of commodity hardware, together with technological improvements to network bandwidth, accompanied by the reduction of its costs, foster decentralized solutions by pushing computing power to the edge of networks. For instance, today even commodity desktop machines are able to store huge amounts of content data and to act as the basis for building sophisticated computing infrastructures [6]. Employing dedicated content repositories is a change in the perspective of content life cycle management [7].
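To make the consistency problem concrete, the following sketch shows generic majority-quorum replication: with N replicas, requiring W acknowledged writes and R reads such that R + W > N guarantees that every read quorum overlaps the latest write quorum, so no central coordinator is needed to observe the newest version. This is textbook machinery, not the paper's specific protocol, and all names are invented.

```python
class Replica:
    """One replica of a content item, with a version counter."""
    def __init__(self):
        self.value = None
        self.version = 0

def quorum_write(replicas, value, w):
    # Stamp the write with a version newer than anything seen so far,
    # then apply it to w replicas (in practice: the first w to acknowledge).
    version = max(r.version for r in replicas) + 1
    for r in replicas[:w]:
        r.value, r.version = value, version
    return version

def quorum_read(replicas, r_count):
    # Among any r_count replicas, return the value with the highest
    # version; the R + W > N overlap guarantees the latest write is seen.
    sample = replicas[:r_count]
    newest = max(sample, key=lambda r: r.version)
    return newest.value

N, W, R = 5, 3, 3  # R + W > N, so read and write quorums overlap
replicas = [Replica() for _ in range(N)]
quorum_write(replicas, "rev-1 of a wiki page", W)
print(quorum_read(replicas, R))  # rev-1 of a wiki page
```

The trade-off is latency for consistency: each operation must contact a quorum of peers, which is why a dynamic P2P environment makes coordinating concurrent writes genuinely hard.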
Even with evolving efforts to facilitate this shift of content management perspective, however, today's content repositories are less flexible regarding the support of different content models, offered functionality such as dynamic runtime reconfiguration, or distributed system models. For example, despite the recognition that different types of content should be distinguished, explicitly known semantics of content data (such as its degree of importance) are neglected. Yet such knowledge about certain content types may be exploited, for instance, to optimize overall system performance by supporting a policy-based approach. This paper presents the approach of using a flexible content repository system to implement a P2P-based wiki engine, working towards a more decentralized vision of a dynamic environment for the future Web 3.0. Wikis [8] are popular applications of the so-called Web 2.0 [9]; for example, the WWW-based collaborative encyclopedia Wikipedia [10] is based on such an application. The P2P-based content repository system enables building the vision of an enterprise-wide wiki as a shared knowledge space with a shared structure of content organization. It is both scalable and shows good performance; its major functions are reconfigurable to enable a policy-based approach to content management. The most important feature of the system, however, is that it supports fault-tolerant and consistent content management: once content is stored in the system, it shall not be lost. This raises the challenge of coordinating concurrent activity in a dynamic P2P environment and protecting the consistency of created artefacts to keep content up to date across geographically distributed locations.

The remainder of this paper is structured as follows. First, the scenario of a P2P-based wiki is described. Next, the background and related work of the approach are given.
Subsequently, the system's overall architecture is shown. Thereafter, the major P2P building blocks are introduced. Finally, the approach is evaluated to conclude the paper.
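The policy-based exploitation of content semantics mentioned above can be sketched as follows: a declared degree of importance is mapped to storage parameters such as the replication factor. The policy table, its fields, and the fallback rule are hypothetical, invented purely to illustrate the idea.

```python
# Hypothetical policy table: content-type semantics (degree of importance)
# mapped to storage parameters. Field names are illustrative only.
POLICIES = {
    "critical": {"replicas": 5, "synchronous": True},
    "normal":   {"replicas": 3, "synchronous": False},
    "volatile": {"replicas": 1, "synchronous": False},
}

def storage_policy(content_type):
    """Resolve storage parameters from the declared importance of content;
    unknown types fall back to the 'normal' policy."""
    return POLICIES.get(content_type, POLICIES["normal"])

print(storage_policy("critical")["replicas"])  # 5
print(storage_policy("draft") == POLICIES["normal"])  # True
```

Under such a scheme, reconfiguring the system at runtime reduces to editing the policy table rather than changing the storage layer itself.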
doi:10.1002/cpe.1465