Issue with ordering large-size graph with #define IDXTYPEWIDTH 64
Dear ParMETIS developers,
we are currently trying to provide support for ParMETIS (ParMETIS_V3_NodeND) with 64-bit integers in the MUMPS solver. Unfortunately, we have come across the following issue:
when ordering a very large graph (~12M nodes and ~5G edges), the execution of ParMETIS_V3_NodeND on 2 MPI processes is interrupted with the following message:
Fatal error in MPI_Irecv: Invalid count, error stack:
MPI_Irecv(165): MPI_Irecv(buf=0x7f631704b010, count=-1763562296, MPI_LONG_LONG_INT, src=0, tag=1, comm=0x84000004, request=0x135ba00) failed
MPI_Irecv(107): Negative count, value is -1763562296
Fatal error in MPI_Irecv: Invalid count, error stack:
MPI_Irecv(165): MPI_Irecv(buf=0x7f397c625250, count=-1781792296, MPI_LONG_LONG_INT, src=1, tag=1, comm=0x84000004, request=0x15585c4) failed
MPI_Irecv(107): Negative count, value is -1781792296
which makes us think that a message whose count exceeds the capacity of a 32-bit integer is being exchanged within ParMETIS. The same code works fine on the same graph when more processes are used (likely because the message sizes become smaller). We have also run the pt-scotch package (through the ParMETIS-compatible interface provided by pt-scotch) on this graph, and it works fine.
We managed to reproduce the issue in a standalone program, which can be downloaded together with the associated data at this link:
https://cloud.irit.fr/index.php/s/Es7KhbEXU9yzY0j
The code needs a ParMETIS version built with 64-bit integers and must be run on 2 processes as follows:
mpirun -np 2 ./main fort.2
Could you please give us some insight on this issue?
Kind regards,
the MUMPS team