Parallel Graph & Mesh Partitioning

Discussions about the routines in ParMETIS

Dear Sir/Madam,

Is there any auxiliary ParMETIS routine which would redistribute the "Distributed CSR graph" according to a newly computed partitioning?

Thanks a lot,
Agnonchik.
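As far as I know there is no documented ParMETIS routine that performs this redistribution for you; the sketch below shows one way to apply the part[] array returned by ParMETIS_V3_PartKway to the distributed CSR arrays using plain MPI. All function and variable names are invented for the example, and idx_t-style types are replaced by plain int for simplicity.

/* Hedged sketch: redistribute a distributed CSR graph according to a newly
 * computed partition vector, using plain MPI.  This is only an illustration
 * of the idea, not ParMETIS library code.                                  */
#include <mpi.h>
#include <stdlib.h>

/* vtxdist : global vertex ranges, size npes+1 (as in the ParMETIS manual)
 * xadj, adjncy : local CSR arrays
 * part    : part[i] = destination rank of local vertex i
 * On return, *new_gids/*new_xadj/*new_adjncy hold the vertices this rank
 * now owns (ordered by the rank they came from).                          */
static void redistribute_csr(const int *vtxdist, const int *xadj,
                             const int *adjncy, const int *part,
                             int **new_gids, int **new_xadj,
                             int **new_adjncy, int *new_nvtxs,
                             MPI_Comm comm)
{
  int rank, npes, i, j, p;
  MPI_Comm_rank(comm, &rank);
  MPI_Comm_size(comm, &npes);
  int nvtxs = vtxdist[rank + 1] - vtxdist[rank];

  /* 1. Count vertex records and adjacency entries headed to each rank. */
  int *vsend = calloc(npes, sizeof(int));
  int *asend = calloc(npes, sizeof(int));
  for (i = 0; i < nvtxs; i++) {
    vsend[part[i]] += 2;                       /* (global id, degree) pair */
    asend[part[i]] += xadj[i + 1] - xadj[i];
  }

  /* 2. Exchange the counts so every rank knows what it will receive. */
  int *vrecv = malloc(npes * sizeof(int));
  int *arecv = malloc(npes * sizeof(int));
  MPI_Alltoall(vsend, 1, MPI_INT, vrecv, 1, MPI_INT, comm);
  MPI_Alltoall(asend, 1, MPI_INT, arecv, 1, MPI_INT, comm);

  /* 3. Build displacements and pack the send buffers. */
  int *vsdsp = calloc(npes + 1, sizeof(int)), *vrdsp = calloc(npes + 1, sizeof(int));
  int *asdsp = calloc(npes + 1, sizeof(int)), *ardsp = calloc(npes + 1, sizeof(int));
  for (p = 0; p < npes; p++) {
    vsdsp[p + 1] = vsdsp[p] + vsend[p];  vrdsp[p + 1] = vrdsp[p] + vrecv[p];
    asdsp[p + 1] = asdsp[p] + asend[p];  ardsp[p + 1] = ardsp[p] + arecv[p];
  }
  int *vbuf = malloc(vsdsp[npes] * sizeof(int));
  int *abuf = malloc(asdsp[npes] * sizeof(int));
  int *vpos = malloc(npes * sizeof(int)), *apos = malloc(npes * sizeof(int));
  for (p = 0; p < npes; p++) { vpos[p] = vsdsp[p]; apos[p] = asdsp[p]; }
  for (i = 0; i < nvtxs; i++) {
    p = part[i];
    vbuf[vpos[p]++] = vtxdist[rank] + i;          /* global vertex id   */
    vbuf[vpos[p]++] = xadj[i + 1] - xadj[i];      /* degree             */
    for (j = xadj[i]; j < xadj[i + 1]; j++)
      abuf[apos[p]++] = adjncy[j];                /* neighbour (global) */
  }

  /* 4. Move everything to its new owner. */
  int *vin = malloc(vrdsp[npes] * sizeof(int));
  int *ain = malloc(ardsp[npes] * sizeof(int));
  MPI_Alltoallv(vbuf, vsend, vsdsp, MPI_INT, vin, vrecv, vrdsp, MPI_INT, comm);
  MPI_Alltoallv(abuf, asend, asdsp, MPI_INT, ain, arecv, ardsp, MPI_INT, comm);

  /* 5. Rebuild a local CSR structure from the received (id, degree) pairs;
   *    the adjacency entries arrive in the same vertex order.             */
  *new_nvtxs = vrdsp[npes] / 2;
  *new_gids  = malloc(*new_nvtxs * sizeof(int));
  *new_xadj  = malloc((*new_nvtxs + 1) * sizeof(int));
  (*new_xadj)[0] = 0;
  for (i = 0; i < *new_nvtxs; i++) {
    (*new_gids)[i]     = vin[2 * i];
    (*new_xadj)[i + 1] = (*new_xadj)[i] + vin[2 * i + 1];
  }
  *new_adjncy = ain;

  free(vsend); free(asend); free(vrecv); free(arecv);
  free(vsdsp); free(vrdsp); free(asdsp); free(ardsp);
  free(vbuf); free(abuf); free(vpos); free(apos); free(vin);
}

A new vtxdist would then be rebuilt from the new local vertex counts (for example with MPI_Allgather), and the received global ids renumbered if a contiguous numbering is required.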

Hi

I recently introduced a platform-independent random number generator (quite important for us) into my local copy of ParMetis, using Boost for the implementation. Since then I have had a few models assert in some parallel configurations; before this they were fine. I have checked the random number generator itself and it seems fine too.
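For context, the kind of replacement meant here is sketched below in plain C; this is purely illustrative and not the Boost-based code actually used. A tiny xorshift-style generator produces an identical sequence on every platform:

/* Illustrative only: a minimal platform-independent generator.           */
#include <stdint.h>

static uint32_t prng_state = 2463534242u;    /* arbitrary non-zero seed   */

static void my_srand(uint32_t seed) {
  prng_state = seed ? seed : 1u;             /* state must never be zero  */
}

/* Marsaglia's xorshift32: same sequence regardless of platform or libc.  */
static uint32_t my_rand(void) {
  uint32_t x = prng_state;
  x ^= x << 13;
  x ^= x >> 17;
  x ^= x << 5;
  return prng_state = x;
}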

I am getting an assert in kwayfm.c @ line 309.

ASSERTP(ctrl, ognpwgts[j*ncon+h] <= badmaxpwgt[j*ncon+h] ||
        pgnpwgts[j*ncon+h] <= ognpwgts[j*ncon+h],
        (ctrl, "%.4f %.4f %.4f\n", ognpwgts[j*ncon+h],
         badmaxpwgt[j*ncon+h], pgnpwgts[j*ncon+h]));

It appears that the proposed node moves are violating the imbalance tolerance? The numbers involved look reasonable, but I admit I haven't done a Valgrind run yet.

I tried looking at the corresponding code in Metis, but it seemed significantly different. I am probably being a bit slow, but at line 183

if (my_edegrees[j].ewgt > my_edegrees[k].ewgt ||

Hi all,
Here's some background before the question/crash description:
I'm trying to partition a graph distributed among processors; however, the number of vertices is smaller than the number of processors.
What do I do with the arrays passed into ParMETIS_V3_PartKway that would be zero-sized on the processors that contain no vertices (adjncy,vwgt,adjwgt)?
I'm using Fortran, where a zero-sized array is not well defined. However, even an a(0:0) sized allocation does not work. I get the following traceback from ParMetis:

Error! ***Memory allocation failed for AllocateWSpace: wspace->core. Requested size: -28 bytes
forrtl: error (76): Abort trap signal
Image              PC                Routine   Line      Source
libpthread.so.0    0000003552C0DE80  Unknown   Unknown   Unknown
libc.so.6          0000003552430155  Unknown   Unknown   Unknown
libc.so.6          0000003552431BF0  Unknown   Unknown   Unknown
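For what it's worth, below is a sketch of how the inputs could be laid out when there are fewer vertices than processors. It is only an illustration of the vtxdist convention from the manual, with made-up values; whether ParMETIS itself accepts empty local graphs, with or without the length-1 dummy buffers shown, is exactly the open question here.

/* Hedged sketch: 2 vertices distributed over 4 ranks.                     */
#include <mpi.h>

void example_setup(int rank, MPI_Comm comm)
{
  /* Ranks 0 and 1 own one vertex each, ranks 2 and 3 own none.            */
  int vtxdist[5] = {0, 1, 2, 2, 2};

  int nvtxs = vtxdist[rank + 1] - vtxdist[rank];

  /* Local CSR arrays: a single edge 0 <-> 1 on the owning ranks,
   * length-1 dummy buffers on the empty ranks so no unallocated
   * array is ever passed from the caller's side.                          */
  int xadj_owner[2]   = {0, 1};
  int adjncy_owner[1] = {0};
  int xadj_empty[1]   = {0};
  int adjncy_empty[1] = {0};

  int *xadj   = (nvtxs > 0) ? xadj_owner   : xadj_empty;
  int *adjncy = (nvtxs > 0) ? adjncy_owner : adjncy_empty;
  if (nvtxs > 0)
    adjncy[0] = (rank == 0) ? 1 : 0;   /* global id of the one neighbour   */

  /* ... vwgt/adjwgt, tpwgts, ubvec, options and the actual call to
   * ParMETIS_V3_PartKway are omitted here ...                             */
  (void)comm; (void)xadj; (void)adjncy;
}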

Hi,

I have been having trouble with an error message from ParMetis-3.1.1, used in a physics application (built on PETSc) running on Cray XT systems.

The application program splits all MPI PEs into 32 disjoint MPI groups (so 32 MPI communicators are created) and calls an iterative linear solver for each MPI group. During the solution, ParMetis is called by each group in order to improve the load balance within it. The code terminates with the assertion message below from each group when using more than 128 processors (more than 4 PEs per MPI group). Interestingly, the message is thrown by the last PE of each group; if 256 PEs are used, the message is thrown by the 8th PE of each group.
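For reference, the communicator layout described above might be set up roughly as follows; the function name and the contiguous-block grouping are my own assumptions, not the application's actual code.

/* Sketch: split MPI_COMM_WORLD into 32 disjoint group communicators.      */
#include <mpi.h>

MPI_Comm make_group_comm(void)
{
  int world_rank, world_size;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);

  /* Assumes world_size is a multiple of 32, as in the setup described.    */
  int ngroups = 32;
  int color = world_rank / (world_size / ngroups);  /* contiguous blocks   */

  MPI_Comm group_comm;
  MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &group_comm);
  return group_comm;   /* e.g. with 256 PEs, each group_comm has 8 PEs     */
}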

Do you have any idea about the message below? It might be a PETSc bug, but I want to understand when this problem occurs and what I should investigate.

Thank you,

Keita

[ 7] ***ASSERTION failed on line 34 of file match.c: wspace->nlarge > graph->xadj[graph->nvtxs]
:
: (repeats 32 times)

Hello,

I am partitioning a mesh using ParMETIS_V3_PartMeshKway inside a C++ program and inside a Fortran 90 program. To have exactly the same initialization in C++ and Fortran 90, I am using the C convention even in Fortran (my arrays start at 0). With C++ everything works nicely, but this is not the case with Fortran 90, and to trace the problem I added some printf statements to the ParMETIS_V3_PartMeshKway procedure located in the file mmetis.c.

More precisely, in Fortran it fails at the line:

SetUpCtrl(&ctrl, *nparts, (options[0] == 1 ? options[PMV3_OPTION_DBGLVL] : 0), *comm);

*comm was initialized in the Fortran program as

integer :: comm(0)

and the error message I get is:

[santafe.onera:31060] *** An error occurred in MPI_Comm_dup
[santafe.onera:31060] *** on communicator MPI_COMM_WORLD
[santafe.onera:31060] *** MPI_ERR_COMM: invalid communicator
[santafe.onera:31060] *** MPI_ERRORS_ARE_FATAL (goodbye)
[santafe.onera:31061] *** An error occurred in MPI_Comm_dup
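A general MPI mixed-language note, not specific to ParMetis: a Fortran MPI communicator is a single default INTEGER handle, so integer :: comm(0) declares a zero-sized array rather than a usable handle, and it is never assigned a valid communicator. When C code needs to use a communicator handle received from Fortran, the standard conversion is MPI_Comm_f2c, roughly as sketched below; whether ParMetis performs this conversion internally I cannot say.

/* Sketch of standard MPI Fortran-to-C communicator conversion.            */
#include <mpi.h>

void use_fortran_comm(MPI_Fint fortran_comm_handle)
{
  /* Convert the Fortran handle (e.g. MPI_COMM_WORLD set on the Fortran
   * side) into a C MPI_Comm before calling MPI from C.                    */
  MPI_Comm ccomm = MPI_Comm_f2c(fortran_comm_handle);

  MPI_Comm dup;
  MPI_Comm_dup(ccomm, &dup);   /* this is the call failing in the report   */
  MPI_Comm_free(&dup);
}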

Hello

Due to the idiosyncrasies of my initial grid load, I may end up with one processor having very few nodes. This has led to the odd crash where the coarsened graph contained no nodes. In this case a number of processors had only 1 node. Apologies for not having the stack trace to hand.

However, I did notice the following line in coarsen.c @ 114

ctrl.CoarsenTo = amin(vtxdist[npes]+1, 25*incon*amax(npes, inparts));

Is there a minimum number of nodes a processor must start with, e.g. more than 1?

thanks

Dominic

Hi,
I'm trying to use ParMETIS_V3_PartMeshKway to compute a repartitioning for a distributed finite element mesh that has been adaptively refined locally on each processor. However, whatever I do, this routine reports that most of the elements should belong to a different processor than the one they currently reside on. Below is a small 3-processor case that illustrates the problem.

Dear friends:

This routine is used to compute a k-way partitioning of a mesh on p processors. The mesh can contain elements of different types. When the mesh contains elements of different types, how should the parameter "ncommonnodes" be set? And how can a partition of a mesh with different element types be realized?

redstone
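Here is a hedged illustration of the mesh arrays for a small mixed-element example, based on my reading of the ParMETIS manual: eptr/eind list each element's nodes in CSR fashion, and ncommonnodes is, as I understand it, the number of nodes two elements must share for an edge to be placed between them in the dual graph. All numbers below are invented.

/* One tetrahedron (nodes 0 1 2 3) and one hexahedron (nodes 2 3 4 5 6 7 8 9)
 * on a single rank; they share the two nodes 2 and 3.                       */
int elmdist[2] = {0, 2};        /* both elements on rank 0 (1-process run)   */
int eptr[3]    = {0, 4, 12};    /* element i uses eind[eptr[i] .. eptr[i+1])  */
int eind[12]   = {0, 1, 2, 3,                      /* tetrahedron             */
                  2, 3, 4, 5, 6, 7, 8, 9};         /* hexahedron              */

/* With ncommonnodes = 2 these two elements would count as adjacent in the
 * dual graph (they share 2 nodes); with ncommonnodes = 3 or 4 they would
 * not.  For a mixed tet/hex mesh one therefore has to decide which shared
 * entity (edge, triangular face, quad face) should define adjacency and
 * set the parameter accordingly.                                            */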

As we all know, METIS provides the mesh2nodal and mesh2dual functions, and METIS can then partition the nodal graph. But ParMETIS only provides parallel partitioning of the dual graph; how can a parallel nodal-graph partitioning be realized?

There is a problem with mtest.c. I want to output the partition information to a file using the function WritePVector(), but the output file is empty.

WritePVector(argv[2], &vtxdist, &part, comm);