Abinit: common relax workflow excepts for simple Silicon structure #159
Comments
Not sure if this is actually a problem with the plugin or with the installation/compilation of Abinit on QM. |
Can you share the input file? |
Hello Sebastiaan, it does seem like an Abinit compilation issue with MPI indeed. Thanks, |
I tried with 20.11.2a but that made things worse. That actually still had |
I've run a few tests:
Quantum Mobile in Docker
I saw the same failure as above. I then copied the repository directory to my workstation and ran the input file using my installation of Abinit v9.2.1, which worked as expected.
Quantum Mobile v20.11.2a VM image in VirtualBox
This completed successfully as on my workstation. It seems like this is an issue with the compilation in Docker. |
It may be good to check if this also happens with the VirtualBox image of Quantum Mobile. I recall there were initially some issues with the source abinit tests failing (marvel-nccr/ansible-role-abinit#9) but now they do pass, including one for parallel execution: https://github.com/marvel-nccr/ansible-role-abinit/blob/88d8c1380fdac30941a32b567bf6c5c8f2810bee/defaults/main.yml#L49 |
but yeh I will look into this more when finalising the QM for common workflows within the next week |
@chrisjsewell I did run in both the Docker container and in the VirtualBox image (see above). |
Ok cheers, and do we know if any of the other simulation codes have issues running with MPI inside the Docker container, or is this only an issue with abinit? |
The |
Siesta doesn't use Edit: just reran QE and can confirm that it does run with MPI and without issues. This is using the Docker container. @zooks97 yes this has already been fixed in the develop branch of |
ta, yeh if we can "gather" any quantum-mobile specific issues here, then I can hopefully set aside a day and hit them all in one go 😄 |
@sphuber this is the traceback from Siesta; looks like they want
|
@zooks97 thanks for testing. All these problems are known and are either fixed or have their own open issue on this repo. To run Siesta, you need to run the instructions that are in the SI of the paper. In the case of Siesta specifically:
@chrisjsewell cheers. To summarize, the only problem with the QM container seems to be that Abinit fails to run using MPI. All the other problems mentioned here are either already fixed or are problems with the plugins that are dealt with in other open issues. |
@sphuber Thanks, sorry for the confusion and duplication. |
The illegal instruction may be due to the use of the optimization options: Have you tried to recompile Abinit with less aggressive optimization level e.g.:
|
To narrow down the issue, I created a new Docker image in which abinit is the only simulation code installed: Can you guys confirm whether you can run correctly on this? The only (non-fatal) issue I noted from the output below is that netcdf is not compiled for MPI
|
@chrisjsewell thanks for looking into this. Did you use the exact same compilation as the current QM or did you apply the changes suggested by @gmatteo? |
yes, no changes to the abinit compilation |
Tried to run manually with the same input files you described in QM 20.11.2a and still failed:
|
Actually though, thinking about it now @gmatteo is probably right, in that for Docker |
what OS are you running on @sphuber? |
Ubuntu 20.04 |
ok well I'll have a look at how I can get abinit to build without these flags now |
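A quick way to test the "illegal instruction" hypothesis is to compare the instruction-set extensions the host CPU reports with the ones a binary was built to use. The following is a minimal sketch, assuming a Linux x86 host that exposes `/proc/cpuinfo`. The helper names and the `required` set are hypothetical, since the actual optimization flags used in the Quantum Mobile build are not shown in this thread.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # On x86 Linux the feature list lives on a line starting with "flags".
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text, required):
    """Return the required features that the CPU does not advertise, sorted."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))


if __name__ == "__main__":
    text = Path("/proc/cpuinfo").read_text()
    # Hypothetical example set: replace with the extensions implied by the
    # optimization flags actually used when compiling the binary.
    required = {"avx2", "fma"}
    print("missing CPU features:", missing_features(text, required))
```

If the list printed inside the container (or on the host running it) is non-empty, a binary compiled for those extensions on a different build machine can be expected to abort with an illegal instruction there.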
Ok do you want to check If so, I will then do another full release.
|
Thanks a lot @chrisjsewell . I have good news and bad news: the good news is that with the new QM release I can run the abinit example for Si fast without issues 👍 ! However, there was an unrelated problem that had to be fixed first. When trying to submit I was faced with the following exception:
We have seen this before but I never figured out an actual fix that is reproducible. It seems that there is an incompatibility between the Then @chrisjsewell , do you have any idea how we can prevent this |
Hi all, |
No, this will install anything |
Indeed, although since |
Sounds possible, although perhaps not according to https://github.com/aiidateam/AEP/tree/master/003_adopt_nep_29; I would defer to @csadorf the official @aiidateam/dependency-manager 😉 |
@sphuber @chrisjsewell Thanks for pinging me. No, requiring |
I link here the most complete thread I found on the problem reported by @sphuber |
Will split this discussion off into a separate issue. |
More good and bad news: good news, the new QM seems to run Si with Abinit with all three protocols. Great success 👍 ! However, running
I will attach the entire input and output file: abinit_input.txt Any ideas what could be causing this? It seems again to be related to a compilation error with MPI, but it's weird that the previous fix didn't fix this as well. EDIT: just confirmed that the same holds for the |
To confirm, can |
I cannot reproduce the problem on my mac with mpich-3.3 compiled from source.
indicates that execution aborts with
when we call for the first time the MPI primitive
to create a file for parallel access. MPI_FILE_OPEN receives scalars and a Fortran string as input so Can you provide additional details on the MPI library used for the docker image? |
All the details are in the ansible role: https://github.com/marvel-nccr/ansible-role-abinit/blob/master/tasks/main.yml |
Thanks for the link. The problem is that we are installing a precompiled MPI library, so we cannot disable the stack protector; hence Abinit will continue to fail when creating the file in MPI-IO mode. Furthermore, I've noticed the following warning in the log file:
This clearly indicates that there's some problem in the MPI-IO layer since Netcdf4 relies on MPI-IO to perform parallel IO. In a nutshell, the Abinit build system has detected a netcdf4 lib that in principle supports MPI-IO (at configure time, we just try to link a Fortran program that uses the netcdf4 API for parallel-IO and if the linker is happy we "define" the CPP macro HAVE_NETCDF_FORTRAN_MPI). |
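The configure-time logic described above can be illustrated with a small link-only probe. This is a sketch under stated assumptions, not Abinit's actual build-system test: the probe source, the `try_link` and `configure_macros` helpers, the `mpif90` wrapper name and the `-lnetcdff -lnetcdf` link flags are all illustrative choices.

```python
import subprocess
import tempfile
from pathlib import Path

# Hypothetical probe: a Fortran program that references the parallel-IO
# variant of the netcdf4 API. It is only compiled and linked, never executed.
PROBE_SOURCE = """\
program probe
  use mpi
  use netcdf
  implicit none
  integer :: ierr, ncid, stat
  call MPI_Init(ierr)
  stat = nf90_create("probe.nc", ior(NF90_NETCDF4, NF90_MPIIO), ncid, &
                     comm=MPI_COMM_WORLD, info=MPI_INFO_NULL)
  if (stat == NF90_NOERR) stat = nf90_close(ncid)
  call MPI_Finalize(ierr)
end program probe
"""


def try_link(compiler, source, libs=()):
    """Return True if `compiler` can compile and link `source`, else False."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "probe.f90"
        src.write_text(source)
        cmd = [compiler, str(src), "-o", str(Path(tmp) / "probe"), *libs]
        try:
            result = subprocess.run(cmd, capture_output=True, cwd=tmp)
        except OSError:
            # Compiler executable not found or not runnable.
            return False
        return result.returncode == 0


def configure_macros(link_ok):
    """Map the link result to the CPP macro that would be defined."""
    return {"HAVE_NETCDF_FORTRAN_MPI": 1} if link_ok else {}


if __name__ == "__main__":
    # Assumed compiler wrapper and library flags; adjust for the local setup.
    ok = try_link("mpif90", PROBE_SOURCE, libs=("-lnetcdff", "-lnetcdf"))
    print(configure_macros(ok))
```

Because the probe is never run, a successful link only shows that the parallel-IO symbols exist. It says nothing about whether MPI-IO works at runtime, which is consistent with the macro being defined while the file creation still fails.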
The netcdf issue was already noted above: #159 (comment) and I opened a PR: marvel-nccr/ansible-role-abinit#13. However, abinit still does not pick up the correct netcdf library, so perhaps there is additional configuration required? |
I am running cfe9ba69d87629fb89b39c0b86dd4e9233d3a6fe of this repo with aiida-abinit==0.2.0a1 on Quantum Mobile. A simple relax workchain has the AbinitCalculation fail with a 304, for all protocols. The stdout contains: