My research interests are concentrated in the areas of data mining, recommender systems, learning analytics, high-performance computing, and chemical informatics, and from time to time I look at various problems in the areas of health informatics, information retrieval, and scientific computing.
Within these areas, my research focuses on developing novel algorithms for solving important existing and/or emerging problems, and on developing practical software tools implementing some of these algorithms. The results from this research have been presented at various conferences and published in leading peer-reviewed journals and highly selective conference proceedings.
In my research, I strive to develop algorithms that are practical (they can be easily implemented on commercially available platforms), efficient (they require as little time as possible), effective (they do a good job of solving the underlying problem), and scalable (they remain efficient and effective as we increase the size of the problem and/or the number of processors). I often think of my research as algorithm engineering.
Over the years, I have developed many algorithms for a variety of problems, including dynamic load balancing of unstructured parallel computations, graph and circuit partitioning, protein remote homology prediction and fold recognition, protein structure prediction, recommender systems, data clustering, document classification and clustering, frequent pattern discovery in diverse datasets (transactions, sequences, graphs), parallel Cholesky factorization, and parallel preconditioners.
My group's research has resulted in the development of software libraries for serial and parallel graph partitioning (METIS and ParMETIS), hypergraph partitioning (hMETIS), parallel Cholesky factorization (PSPASES), collaborative filtering-based recommendation algorithms (SUGGEST), clustering high-dimensional datasets (CLUTO), and finding frequent patterns in diverse datasets (PAFI). In addition, my group has developed two web-based servers, one for clustering gene expression data (gCLUTO) and one for predicting the secondary structure of proteins (YASSPP).