In all dimensions, mesh generation with create_interval, create_rectangle and create_box fails with an MPI error when too many processes are used for too few mesh entities.
This should at least produce a user-readable error; better yet, it should return a parallelized mesh that does not use all processes, possibly warning the caller about the questionable number of entities per process.
How to reproduce the bug
The following test case succeeds sequentially in all dimensions but stops working above some process count.
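The reporter's script is not attached (see "Minimal Example" below), but a minimal sketch of this kind of reproduction, assuming the standard dolfinx.mesh creation API, could look like:

```python
from mpi4py import MPI
from dolfinx import mesh

comm = MPI.COMM_WORLD

# Deliberately tiny meshes: with many more MPI ranks than cells, the
# constructors reportedly abort with an MPI error instead of raising a
# readable exception.
interval = mesh.create_interval(comm, 2, [0.0, 1.0])
rectangle = mesh.create_rectangle(comm, [[0.0, 0.0], [1.0, 1.0]], [2, 2],
                                  mesh.CellType.triangle)
box = mesh.create_box(comm, [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]], [2, 2, 2],
                      mesh.CellType.tetrahedron)
```

Run sequentially this succeeds; run with enough ranks (e.g. `mpirun -n 16 python repro.py`) it reportedly fails with the MPI error shown under "Output" below.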
I think it should be possible to make, say, an interval with N cells over M processes with N <= M.
We should probably add a warning saying that it is not recommended, but nothing in the internals of DOLFINx should have an issue with it (except, currently, the constructor?).
I think it should be possible to make, say, an interval with N cells over M processes with N <= M.
Agree.
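Until the constructors handle this case gracefully, one caller-side workaround is to build the mesh on a sub-communicator so that no more ranks than cells participate. This is a sketch, not from the thread; n_cells is a placeholder:

```python
from mpi4py import MPI
from dolfinx import mesh

comm = MPI.COMM_WORLD
n_cells = 2  # placeholder: fewer cells than ranks

# Put the first n_cells ranks in a sub-communicator and create the mesh
# only there; the remaining ranks get MPI.COMM_NULL and skip creation.
color = 0 if comm.rank < n_cells else MPI.UNDEFINED
subcomm = comm.Split(color, key=comm.rank)

domain = None
if subcomm != MPI.COMM_NULL:
    domain = mesh.create_interval(subcomm, n_cells, [0.0, 1.0])
# Note: any later collective operation on the mesh must likewise be
# restricted to the ranks that actually hold it.
```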
Is there a good guide or rule of thumb for picking a useful level of parallelization for a given problem?
It depends on the geometry (how the cells are connected) as well as on the kind of problem you are solving (type of function spaces, number of degrees of freedom, etc.). What I've seen is that people recommend between 10,000 and 100,000 dofs per process (sometimes up to 500,000 per process).
In my experience, DOLFINx seems to scale pretty well in memory, so you can push up the number of dofs per process until you run out of memory. The numbers @jorgensd suggests are good.
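To see where a given setup lands relative to those numbers, one can query the dof count per rank; a minimal sketch (the mesh size and element are placeholders):

```python
from mpi4py import MPI
from dolfinx import mesh, fem

comm = MPI.COMM_WORLD
domain = mesh.create_unit_cube(comm, 32, 32, 32)  # placeholder mesh
V = fem.functionspace(domain, ("Lagrange", 1))    # placeholder element

# Owned and global dof counts, accounting for the dofmap block size
imap = V.dofmap.index_map
bs = V.dofmap.index_map_bs
local_dofs = imap.size_local * bs
global_dofs = imap.size_global * bs

print(f"rank {comm.rank}: {local_dofs} owned dofs "
      f"({global_dofs} global over {comm.size} ranks)", flush=True)
```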
Minimal Example (Python)
No response
Output (Python)
An error occurred in MPI_Neighbor_alltoallv
Version
main branch
DOLFINx git commit
No response
Installation
No response
Additional information
Observed in #3584