Spark-Based Large-Scale Matrix Inversion for Big Data Processing

Jun Liu, Yang Liang, Nirwan Ansari
2016 IEEE Access  
Matrix inversion is a fundamental operation for solving linear equations in many computational applications, especially in emerging big data applications. However, it is challenging to invert large-scale matrices of extremely high order (several thousands or millions), which are common in most web-scale systems such as social networks and recommendation systems. In this paper, we present an LU decomposition-based block-recursive algorithm for large-scale matrix inversion. We ... ent its well-designed implementation with an optimized data structure, reduced space complexity, and effective matrix multiplication on the Spark parallel computing platform. The experimental evaluation results show that the proposed algorithm inverts large-scale matrices efficiently on a cluster composed of commodity servers and scales to even larger matrices. The proposed algorithm and implementation will become a solid foundation for building a high-performance linear algebra library on Spark for big data processing and applications.
doi:10.1109/access.2016.2546544 fatcat:npwy2xc4y5dqlbji2o4g2wniye
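
The abstract names an LU decomposition-based block-recursive inversion scheme without giving its details. As a rough illustration of that general idea only, the sketch below splits a matrix into four blocks, factors the top-left block recursively, forms the Schur complement, and assembles the inverses of the triangular factors block by block. It is a single-machine, pure-Python sketch and not the authors' Spark implementation: it assumes no pivoting is needed (every leading block is invertible), stores matrices as lists of rows, and all function names (`matmul`, `split`, `join`, `lu_factor_inverses`, `invert`) are hypothetical.

```python
def matmul(X, Y):
    """Dense product of two matrices stored as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]


def sub(X, Y):
    """Elementwise difference X - Y."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def neg(X):
    """Elementwise negation."""
    return [[-x for x in row] for row in X]


def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


def split(A, k):
    """Split A into four blocks at row and column k."""
    A11 = [row[:k] for row in A[:k]]
    A12 = [row[k:] for row in A[:k]]
    A21 = [row[:k] for row in A[k:]]
    A22 = [row[k:] for row in A[k:]]
    return A11, A12, A21, A22


def join(B11, B12, B21, B22):
    """Reassemble a matrix from four blocks."""
    top = [r1 + r2 for r1, r2 in zip(B11, B12)]
    bottom = [r1 + r2 for r1, r2 in zip(B21, B22)]
    return top + bottom


def lu_factor_inverses(A):
    """Return (L^-1, U^-1) for A = L U, with L unit lower triangular.

    Assumption: no pivoting, so every pivot met in the recursion is nonzero.
    """
    n = len(A)
    if n == 1:
        if A[0][0] == 0:
            raise ZeroDivisionError("zero pivot; this sketch does not pivot")
        return [[1.0]], [[1.0 / A[0][0]]]

    k = n // 2
    A11, A12, A21, A22 = split(A, k)

    # Factor the top-left block: A11 = L1 U1.
    L1_inv, U1_inv = lu_factor_inverses(A11)

    # Off-diagonal blocks of the factors: U2 = L1^-1 A12, L2 = A21 U1^-1.
    U2 = matmul(L1_inv, A12)
    L2 = matmul(A21, U1_inv)

    # Factor the Schur complement: A22 - L2 U2 = L3 U3.
    L3_inv, U3_inv = lu_factor_inverses(sub(A22, matmul(L2, U2)))

    # Block formulas for the inverses of the triangular factors.
    L_inv = join(L1_inv, zeros(k, n - k),
                 neg(matmul(L3_inv, matmul(L2, L1_inv))), L3_inv)
    U_inv = join(U1_inv, neg(matmul(U1_inv, matmul(U2, U3_inv))),
                 zeros(n - k, k), U3_inv)
    return L_inv, U_inv


def invert(A):
    """A^-1 = U^-1 L^-1 for a square matrix A that needs no pivoting."""
    L_inv, U_inv = lu_factor_inverses(A)
    return matmul(U_inv, L_inv)
```

Calling `invert([[4.0, 3.0], [6.0, 3.0]])` returns the inverse as a list of rows. In a distributed setting the block products would dominate the cost, which is presumably why the abstract singles out matrix multiplication; here they are plain triple loops for readability.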