Genome alignments using MPI-LAGAN

Ruinan Zhang, Huzefa Rangwala, and George Karypis
International Conference on Bioinformatics and Biomedicine, 2008
We develop a parallel algorithm for LAGAN, a widely used whole-genome alignment method. Using MPI, we parallelize two phases of the algorithm that account for a significant portion of the total runtime and also have high memory requirements. The serial LAGAN program uses CHAOS to quickly determine initial anchors, or seeds, which are then extended using a longest increasing subsequence (LIS) method based on sparse dynamic programming. We parallelize the CHAOS and LIS phases using a one-dimensional block-cyclic partitioning of the computation. The resulting algorithm is efficient and uses the processors in a balanced way, while keeping the time spent communicating or transferring information across processors to a minimum.
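
As an illustration of the one-dimensional block-cyclic partitioning mentioned above, the following minimal sketch maps work items to processor ranks. The function names, the block size, and the choice of what counts as a "work item" are assumptions made for this example; the abstract does not specify how the paper assigns blocks.

```python
def block_cyclic_owner(index, block_size, num_procs):
    """Return the rank owning item `index` under 1-D block-cyclic partitioning.

    Items are grouped into consecutive blocks of `block_size`, and the blocks
    are dealt out to ranks 0..num_procs-1 in round-robin order.
    (Hypothetical helper; not taken from the MPI-LAGAN code.)
    """
    return (index // block_size) % num_procs


def local_indices(rank, n_items, block_size, num_procs):
    """List the item indices assigned to `rank` (hypothetical helper)."""
    return [
        i
        for i in range(n_items)
        if block_cyclic_owner(i, block_size, num_procs) == rank
    ]


if __name__ == "__main__":
    # Assumed toy sizes: 20 items, blocks of 2, 4 processors.
    n_items, block_size, num_procs = 20, 2, 4
    for rank in range(num_procs):
        print(rank, local_indices(rank, n_items, block_size, num_procs))
```

Because each rank receives blocks drawn from across the whole index range rather than one contiguous chunk, regions with uneven amounts of work tend to be spread over all processors, which is the usual motivation for choosing a block-cyclic layout.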

We also report an experimental evaluation of our parallel implementation on pairs of human contigs of varying lengths. We discuss and illustrate the challenges of parallelizing a sparse dynamic programming formulation such as the one in this work, and show that the parallelized phases of the LAGAN algorithm achieve speedups equivalent to the theoretical ones.
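
To make the sparse dynamic programming step concrete, here is a simplified serial sketch of a longest increasing subsequence over anchors. It assumes each anchor is reduced to a pair of start positions `(x, y)` in the two sequences and ignores anchor scores and gap penalties, which an actual chaining step may take into account; `chain_anchors` is a hypothetical name and not a function from LAGAN.

```python
from bisect import bisect_left


def chain_anchors(anchors):
    """Return a longest chain of anchors strictly increasing in both x and y.

    `anchors` is a list of (x, y) position pairs (an assumed representation).
    Runs in O(n log n) time using binary search over chain tails.
    """
    # Sort by x ascending; for equal x, sort y descending so that two
    # anchors sharing the same x can never both appear in one chain.
    order = sorted(anchors, key=lambda a: (a[0], -a[1]))

    tails_y = []    # tails_y[k]: smallest final y among chains of length k+1
    tails_idx = []  # tails_idx[k]: index in `order` of that chain's last anchor
    prev = [-1] * len(order)  # back-pointers for reconstructing the chain

    for i, (_, y) in enumerate(order):
        k = bisect_left(tails_y, y)  # first tail with y-value >= current y
        if k == len(tails_y):
            tails_y.append(y)
            tails_idx.append(i)
        else:
            tails_y[k] = y
            tails_idx[k] = i
        prev[i] = tails_idx[k - 1] if k > 0 else -1

    # Walk the back-pointers from the end of the longest chain.
    chain = []
    i = tails_idx[-1] if tails_idx else -1
    while i != -1:
        chain.append(order[i])
        i = prev[i]
    return chain[::-1]


if __name__ == "__main__":
    toy = [(1, 5), (2, 1), (3, 2), (4, 6), (5, 3), (6, 7)]
    print(chain_anchors(toy))
```

Each anchor's result depends on the best chains ending at earlier anchors, and this kind of dependency is one reason such a formulation is harder to distribute across processors than an embarrassingly parallel scan.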

Research topics: Bioinformatics | Parallel processing