Research

My research interests are concentrated in bioinformatics, cheminformatics, data mining, and high-performance computing. From time to time, I also look at problems in information retrieval, collaborative filtering, and electronic design automation for VLSI CAD.

Within these areas, my research focuses on developing novel algorithms for important existing and emerging problems, and on developing practical software tools that implement some of these algorithms. The results of this research have been presented at various conferences and published in leading peer-reviewed journals and highly selective conference proceedings.

In my research, I strive to develop algorithms that are practical (easy to implement on commercially available platforms), efficient (requiring as little time as possible), effective (doing a good job of solving the underlying problem), and scalable (remaining efficient and effective as the size of the problem and/or the number of processors increases). I often think of my research as algorithm engineering.

Projects
Over the years, I have developed many algorithms for a variety of problems, including dynamic load balancing of unstructured parallel computations, graph and circuit partitioning, protein remote homology prediction and fold recognition, protein structure prediction, recommender systems, data clustering, document classification and clustering, frequent pattern discovery in diverse datasets (transactions, sequences, graphs), parallel Cholesky factorization, and parallel preconditioners.
Software
My group's research has resulted in software libraries for serial and parallel graph partitioning (METIS and ParMETIS), hypergraph partitioning (hMETIS), parallel Cholesky factorization (PSPASES), collaborative filtering-based recommendation algorithms (SUGGEST), clustering high-dimensional datasets (CLUTO), and finding frequent patterns in diverse datasets (PAFI). In addition, my group has developed two web-based servers: one for clustering gene expression data (gCLUTO) and one for predicting the secondary structure of proteins (YASSPP).
Publications
I have coauthored over one hundred journal and conference papers on these topics, as well as the book "Introduction to Parallel Computing" (Addison-Wesley, 2nd edition, 2003).