PSciLab: An Unified Distributed and Parallel Software Framework for Data Analysis, Simulation and Machine Learning—Design Practice, Software Architecture, and User Experience

Stefan Bosse
2022 Applied Sciences  
In this paper, a hybrid distributed-parallel cluster software framework for heterogeneous computer networks is introduced that supports simulation, data analysis, and machine learning (ML), using widely available JavaScript virtual machines (VM) and web browsers to accommodate the working load. This work addresses parallelism, primarily on a control-path level and partially on a data-path level, targeting different classes of numerical problems that can be either data-partitioned or replicated.
more » ... These are composed of a set of interacting worker processes that can be easily parallelized or distributed, e.g., for large-scale multi-element simulation or ML. Their suitability and scalability for static and dynamic problems are experimentally investigated regarding the proposed multi-process and communication architecture, as well as data management using customized SQL databases with network access. The framework consists of a set of tools and libraries, mainly the WorkBook (processed by a web browser) and the WorkShell (processed by node.js). It can be seen that the proposed distributed-parallel multi-process approach, with a dedicated set of inter-process communication methods (message- and shared-memory-based), scales up efficiently according to problem size and the number of processes. Finally, it is demonstrated that this JavaScript-based approach for exploiting parallelism can be used easily by any typical numerical programmer or data analyst and does not require any special knowledge about parallel and distributed systems and their interaction. The study is also focused on VM processing.
doi:10.3390/app12062887 fatcat:2prgkzfnvfdxbgwrwg2iq55um4