Highly Scalable Parallel Algorithms for Sparse Matrix Factorization

Anshul Gupta, George Karypis, and Vipin Kumar
IEEE Trans. Parallel Distrib. Syst. 8(5): 502-520, 1997
Download Paper
In this paper, we describe scalable parallel algorithms for sparse matrix factorization, analyze their
performance and scalability, and present experimental results for up to 1024 processors on a Cray T3D
parallel computer. Through our analysis and experimental results, we demonstrate that our algorithms
substantially improve the state of the art in parallel direct solution of sparse linear systems - both in terms
of scalability and overall performance. It is a well known fact that dense matrix factorization scales well
and can be implemented efficiently on parallel computers. In this paper, we present the first algorithms
to factor a wide class of sparse matrices (including those arising from two- and three-dimensional finite
element problems) that are asymptotically as scalable as dense matrix factorization algorithms on a variety
of parallel architectures. Our algorithms incur less communication overhead and are more scalable than any
previously known parallel formulation of sparse matrix factorization. Although, in this paper, we discuss
Cholesky factorization of symmetric positive definite matrices, the algorithms can be adapted for solving
sparse linear least squares problems and for Gaussian elimination of diagonally dominant matrices that are
almost symmetric in structure. An implementation of one of our sparse Cholesky factorization algorithms
delivers up to 20 GFlops on a Cray T3D for medium-size structural engineering and linear programming
problems. To the best of our knowledge, this is the highest performance ever obtained for sparse Cholesky
factorization on any supercomputer.
Research topics: Parallel processing | PSPASES | Scientific computing