Issue with parallel I/O (netcdf) #14
Using the test Docker (on OSX) from #13, i.e. with …
Eurgh, I give up with this rubbish lol: https://docs.abinit.org/INSTALL_Ubuntu/ says you simply install … Including this in the … Does this mean that we also have to build netcdf from source?
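One way to find out what the apt-provided stack actually supports is to query the build-configuration tools that ship with netcdf-c and hdf5 (a diagnostic sketch, assuming `nc-config` and `h5cc` are on the PATH):

```shell
# Diagnostic sketch: does the installed netcdf-c support parallel I/O,
# and was the underlying HDF5 built against MPI?
nc-config --has-parallel            # prints "yes" only for a parallel-enabled build
h5cc -showconfig | grep -i parallel # shows whether HDF5 itself has MPI support
```

If the first command prints "no" (as one would expect when netcdf-c was linked against the serial hdf5 flavour), a source build against a parallel hdf5 would indeed be needed for parallel I/O.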
Let's try to simplify things a bit and mainly focus on the hard requirements, i.e. the libs that allow users to run standard GS/DFPT calculations in parallel with MPI and produce (small) netcdf files that can be used by python tools such as ….

The first question is: what happens if you try to run the input file that, in the previous build, was aborting with a stack-smashing error when calling MPI_FILE_OPEN? Do you still have the same error?

If this first test completes successfully, I would say that the fact that your netcdf library does not support parallel-IO (…) is not a critical problem.

If the error persists, we have a serious problem.
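The suggested re-run would look roughly like this (the input file name is a placeholder, since the failing input is not named in the thread; recent abinit versions take the input file directly on the command line, while older ones read a files file from stdin):

```shell
# Hypothetical re-run of the case that previously hit the stack-smashing
# error in MPI_FILE_OPEN; "run.abi" stands in for the actual input file.
mpirun -np 2 abinit run.abi > run.log 2> run.err
```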
so I assume that the MPI library was also compiled with similar options. From this man page: …
In this case, "the program" should be understood as the MPI/netcdf/hdf5 libraries provided by apt, so the stack-smashing issue should be reported to the maintainers of these packages: Abinit is just a client of these libs, and there's no way to disable these checks on our side. As mentioned here …
If the GS calculation seems to work in parallel, I would say we are on the right track and we only need to check whether other basic capabilities work as expected.
If the tests are OK, I would say that the basic stuff works as expected.

PS: Note that having an hdf5 library that supports MPI-IO (…) … We (optionally) require an hdf5 library compiled with MPI-IO support, but in this case the compilation/linking … The reason is that MPI is not just an API but also an implementation-dependent ABI, so it's not possible to mix libs compiled with different compilers/MPI implementations.
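A minimal sketch of what "same compiler, same MPI implementation" means in practice when building the I/O stack from source (directory names, prefix, and job counts are illustrative, not taken from the thread):

```shell
# Build HDF5 and netcdf-c with the SAME MPI compiler wrapper so that the
# resulting libraries share one MPI ABI (paths are placeholders).
PREFIX="$HOME/local"
export CC=mpicc

(cd hdf5-src && \
  ./configure --enable-parallel --prefix="$PREFIX" && make -j4 install)

(cd netcdf-c-src && \
  CPPFLAGS="-I$PREFIX/include" LDFLAGS="-L$PREFIX/lib" \
  ./configure --prefix="$PREFIX" && make -j4 install)
```

netcdf-c enables parallel I/O when it detects a parallel-enabled HDF5 at configure time; mixing, say, an MPICH-built hdf5 with an Open MPI-built abinit is exactly the ABI mismatch described above.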
Thanks for the reply @gmatteo
I'm unclear why you think this will have changed compared to the previous build, given that the only difference is the install of …
This is already run in https://github.com/marvel-nccr/ansible-role-abinit/blob/master/tasks/tests.yml, and does not surface the stack-smashing error.
If you think there is an issue with the apt libraries, fair enough; you are certainly more knowledgeable in this area than me.
Again I would note here that this is not an issue for any of the other simulation codes with exactly the same MPI libraries.
Anyhow, I don't see a way forward on this install route into Ubuntu, so I will pivot to look at the Conda install route.
Just a quick comment/question to avoid possible misunderstandings. @chrisjsewell: after installing …, I think (@gmatteo, correct me if I'm wrong) that installing that package makes it possible for the configure system to detect the library, and therefore compile abinit with the right support. However, just installing the library without recompiling abinit should not change the behaviour of the code (I think).
I didn't just install …
I haven't read through the thread, just wanted to provide a link to the …
Thanks, although I'd say that's not actually the salient point of the recipe (the make command is basically the same here); it's actually that the netcdf packages linked against are ones that have been compiled against the MPI library: https://github.com/conda-forge/abinit-feedstock/blob/master/recipe/meta.yaml#L57 Basically, I believe that to get parallel I/O here we also have to compile the netcdf libraries directly, rather than just installing them from apt.
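On the conda route, the feedstock pins MPI-variant builds of the I/O libraries, so a user-level install would look roughly like this (the build-string pattern is an assumption based on the usual conda-forge convention; check the feedstock for the exact pins):

```shell
# Sketch: request an MPI-enabled variant from conda-forge rather than a
# serial build; "mpi_openmpi_*" is a common conda-forge build-string
# convention and may differ for this particular package.
conda install -c conda-forge "abinit=*=mpi_openmpi_*"
```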
Needless to say, this introduces yet more complexity and build time to Quantum Mobile (for which abinit is already one of the longest-running components), so if we are planning to move to Conda anyway, I would rather spend my time on that than on adding the netcdf compilation.
To also link to the conda effort: marvel-nccr/ansible-role-conda-codes#1
Taken from email chain:
@giovannipizzi:
samuel ponce:
@chrisjsewell:
Matteo Giantomassi
cc also @sphuber