Parallel Graph & Mesh Partitioning
Hi
I have just unpacked the package, finished editing makefile.inc, and started running the examples from the /Graphs directory. They work fine, but when I try to use a 4960 x 4960 Harwell-Boeing matrix as the input, none of the examples seems able to process it; most of them fail with broken pipes.
Any idea whether Harwell-Boeing matrices are supported here and, if so, how to use them?
PS: I am using a Sun Blade 150 cluster running Solaris 10.
Any help will be greatly appreciated.
Regards
Pinaki R Das
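As far as I know, the example drivers read graph files in the METIS graph format rather than Harwell-Boeing files, so one possible route is to convert the matrix pattern into adjacency arrays first. Below is a minimal sketch of that conversion only. It assumes the Harwell-Boeing column pointers and row indices have already been read into memory (the file parsing is not shown), that indices fit in an int, and that only the sparsity pattern matters. The name csc_pattern_to_graph is a hypothetical helper, not part of METIS or ParMETIS.

```c
#include <assert.h>
#include <stdlib.h>

/* Three-way comparison for qsort on ints. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper (not part of METIS/ParMETIS).
 * Input:  pattern of an n x n matrix in compressed-column form with 1-based
 *         colptr[n+1] and rowind[], as stored in a Harwell-Boeing file.
 * Output: 0-based xadj[n+1] and adjncy[] describing the undirected graph of
 *         the pattern (diagonal dropped, each edge stored in both directions).
 * Returns 0 on success, -1 if allocation fails. The caller frees the outputs. */
int csc_pattern_to_graph(int n, const int *colptr, const int *rowind,
                         int **xadj_out, int **adjncy_out)
{
    assert(n > 0 && colptr[0] == 1);
    int nnz = colptr[n] - 1;
    int *xadj = calloc((size_t)n + 1, sizeof *xadj);
    int *pos = malloc((size_t)n * sizeof *pos);
    int *adj = malloc((size_t)(nnz > 0 ? 2 * nnz : 1) * sizeof *adj);
    if (!xadj || !pos || !adj) {
        free(xadj); free(pos); free(adj);
        return -1;
    }

    /* Pass 1: count every off-diagonal entry (i,j) for both i and j. */
    for (int j = 0; j < n; j++)
        for (int k = colptr[j] - 1; k < colptr[j + 1] - 1; k++) {
            int i = rowind[k] - 1;
            if (i != j) { xadj[i + 1]++; xadj[j + 1]++; }
        }
    for (int v = 0; v < n; v++) {
        xadj[v + 1] += xadj[v];   /* prefix sum gives list start offsets */
        pos[v] = xadj[v];
    }

    /* Pass 2: scatter both directions of each edge. */
    for (int j = 0; j < n; j++)
        for (int k = colptr[j] - 1; k < colptr[j + 1] - 1; k++) {
            int i = rowind[k] - 1;
            if (i != j) { adj[pos[i]++] = j; adj[pos[j]++] = i; }
        }
    free(pos);

    /* Pass 3: sort each list and squeeze out duplicates in place, so a
     * file that stores both triangles still yields each neighbour once. */
    int out = 0, begin = 0;
    for (int v = 0; v < n; v++) {
        int end = xadj[v + 1], last = -1;
        qsort(adj + begin, (size_t)(end - begin), sizeof *adj, cmp_int);
        xadj[v] = out;
        for (int k = begin; k < end; k++)
            if (adj[k] != last) { last = adj[k]; adj[out++] = last; }
        begin = end;
    }
    xadj[n] = out;

    *xadj_out = xadj;
    *adjncy_out = adj;
    return 0;
}
```

The resulting xadj/adjncy pair has the same layout as the arrays ParMETIS takes for a single process. To feed the example programs, the arrays would still have to be written out as a graph file; check the format description in the manual for the exact header line.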
Hi
I've used ParMetis to partition and distribute a graph. I'm then modifying the graph along the way and want to rebalance it across the processes.
So, I rebuild the vtxdist, xadj, and adjncy arrays, and things work fine when I call ParMETIS_V3_PartKway(). I don't care about weights, so I'm keeping the weight arrays/parameters as NULL and 0 where needed.
If I try using ParMETIS_V3_AdaptiveRepart() instead (with the same inputs, NULL for vsize, and 1000 for itr), the function dies with a core dump and no error message. I tried setting vsize to contain 1's, random numbers, and ids; it makes no difference. What confounds me is that things work properly for ParMETIS_V3_PartKway() with the same input!
I must be doing something weird with the inputs, but I can't figure out what. Please help.
Thanks,
Lee.
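Since the crash gives no message, a quick sanity check of the rebuilt arrays on every process may help narrow things down before the call. This is only a sketch: it assumes idxtype is int, C-style numbering (numflag = 0), and that check_local_graph is a hypothetical helper, not a ParMETIS routine. It cannot verify that every edge (u,v) also appears as (v,u) on the owning process, since that needs communication.

```c
#include <stdio.h>

/* Hypothetical helper (not part of ParMETIS). Checks the local piece of a
 * distributed CSR graph on process `rank` of `npes`, assuming 0-based
 * numbering. Returns the number of problems found and reports each one on
 * stderr. */
int check_local_graph(const int *vtxdist, const int *xadj, const int *adjncy,
                      int rank, int npes)
{
    int errors = 0;

    /* vtxdist must start at 0 and never decrease. */
    if (vtxdist[0] != 0) {
        fprintf(stderr, "[%d] vtxdist[0] = %d, expected 0\n", rank, vtxdist[0]);
        errors++;
    }
    for (int p = 0; p < npes; p++)
        if (vtxdist[p + 1] < vtxdist[p]) {
            fprintf(stderr, "[%d] vtxdist decreases at %d\n", rank, p);
            errors++;
        }
    if (errors)
        return errors;            /* the ranges below cannot be trusted */

    int gnvtxs = vtxdist[npes];   /* global number of vertices */
    int first = vtxdist[rank];    /* global id of first local vertex */
    int nlocal = vtxdist[rank + 1] - first;

    /* xadj must start at 0 and never decrease. */
    if (xadj[0] != 0) {
        fprintf(stderr, "[%d] xadj[0] = %d, expected 0\n", rank, xadj[0]);
        errors++;
    }
    for (int v = 0; v < nlocal; v++)
        if (xadj[v + 1] < xadj[v]) {
            fprintf(stderr, "[%d] xadj decreases at local vertex %d\n", rank, v);
            errors++;
        }
    if (errors)
        return errors;            /* do not index adjncy with bad offsets */

    /* Every neighbour must be a valid global id and not the vertex itself. */
    for (int v = 0; v < nlocal; v++)
        for (int k = xadj[v]; k < xadj[v + 1]; k++) {
            int u = adjncy[k];
            if (u < 0 || u >= gnvtxs) {
                fprintf(stderr, "[%d] vertex %d: neighbour %d out of range\n",
                        rank, first + v, u);
                errors++;
            } else if (u == first + v) {
                fprintf(stderr, "[%d] vertex %d: self-loop\n", rank, first + v);
                errors++;
            }
        }
    return errors;
}
```

If the arrays pass, the arguments that exist only in the repartitioning call are the next thing to look at. If I read the 3.1 manual correctly, itr is passed as a pointer to a floating-point value rather than as an integer, so it is worth checking how the 1000 is being handed over.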
Dear Sir or Madam,
The problem is as follows: when I set nparts to the number of processors, everything works. But when I set it to a different number, I get the following error:
p0_15533: p4_error: interrupt SIGSEGV: 11
I thought it was probably my own mistake, but when I changed the mtest.c file like this:
nparts = npes+3;
I got the same error.
Could you please tell me whether this is a bug in the ParMetis library, or whether I should look for a fault of my own?
Sincerely,
Igor
P.S.: I'm using ParMetis-3.1 and mpich-1.2.7p1 for Linux.
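I can't tell from the post whether this is what goes wrong in mtest.c, but one thing worth ruling out when nparts differs from the number of processes is an array that is still sized by npes. According to the manual, tpwgts holds ncon x nparts entries, and the fractions for each constraint should sum to 1. A minimal sketch of allocating it for uniform targets follows; make_uniform_tpwgts is a hypothetical helper, not something from the ParMetis sources.

```c
#include <stdlib.h>

/* Hypothetical helper (not part of ParMETIS).
 * Allocates ncon * nparts target partition weights and gives every
 * partition the same share, 1/nparts, for every constraint.
 * Returns NULL on bad arguments or allocation failure; the caller frees. */
float *make_uniform_tpwgts(int ncon, int nparts)
{
    if (ncon <= 0 || nparts <= 0)
        return NULL;

    float *tpwgts = malloc((size_t)ncon * (size_t)nparts * sizeof *tpwgts);
    if (!tpwgts)
        return NULL;

    /* Sized by nparts, not by the number of MPI processes. */
    for (int i = 0; i < ncon * nparts; i++)
        tpwgts[i] = 1.0f / (float)nparts;

    return tpwgts;
}
```

With nparts = npes + 3, the array would be built as make_uniform_tpwgts(ncon, nparts) right after nparts is set, so that no later loop over partitions runs past its end.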
I sometimes see a deadlock in ParMetis when using MPICH 1.2.5. I wonder whether anyone else has experienced this?
I can't seem to get consistent failures, so it's difficult to track down. From what I see, ParMetis "exchanges" information between processes that share a decomposed graph boundary. These exchanges are of the form:
for number of neighbours
    post asynchronous recvs
fill send buffers
for number of neighbours
    post asynchronous sends
wait for sends/recvs to complete
In theory this should be deadlock-free, so I suspect the issue is with MPICH and not ParMetis.