A High Performance Sparse Cholesky Factorization Algorithm For Scalable Parallel Computers

George Karypis and Vipin Kumar
8th Symposium on the Frontiers of Massively Parallel Computation, 1995
Download Paper
This paper presents a new parallel algorithm for sparse matrix factorization. This algorithm
uses subforest-to-subcube mapping instead of the subtree-to-subcube mapping of another
recently introduced scheme by Gupta and Kumar.
Asymptotically, both formulations are equally scalable on a wide range of architectures and a
wide variety of problems. But the subtree-to-subcube mapping of the earlier formulation causes
significant load imbalance among processors, limiting overall efficiency and speedup.
The new mapping largely eliminates the load imbalance among processors. Furthermore, the
algorithm has a number of enhancements to improve the overall performance substantially.
This new algorithm achieves up to 6GFlops on a 256-processor Cray T3D for moderately large
To our knowledge, this is the highest performance ever obtained on an MPP for sparse Cholesky
Research topics: Parallel processing | PSPASES | Scientific computing