Segmentation fault #592
What kind of GPUs are you using? Also, can you tell us the size of the system you are trying to run? Did you make sure to run without domain_decomposition?
Hi there,
I am using GPUs with the KEPLER37 architecture. Actually, I am only initializing my simulation; my input file looks like this:
units metal
atom_style atomic
atom_modify map yes
newton on
and the segmentation fault occurs when invoking the lmp executable. I guess it is indeed a memory issue with the cluster I used, since I tried a different cluster and everything works just fine.
All the best,
Alfonso
…________________________________
From: Ilyes Batatia ***@***.***>
Sent: Wednesday, September 18, 2024 2:27 AM
To: ACEsuit/mace ***@***.***>
Cc: Alfonso Castillo Juarez ***@***.***>; Author ***@***.***>
Subject: Re: [ACEsuit/mace] Segmentation fault (Issue #592)
Hi Alfonso, can we close this now? Or is there still a problem to investigate?
Hi,
Yes, we can close this now. Thank you!
All the best,
Alfonso
…________________________________
From: wcwitt ***@***.***>
Sent: Monday, September 23, 2024 7:13 AM
Hi everyone,
I am able to build LAMMPS and MACE for GPU use smoothly with the instructions on the website, but when invoking the lmp executable with the following example:
units metal
atom_style atomic
atom_modify map yes
newton on
lmp -k on g 1 -sf kk -in in.lammps
the segmentation error shows up:
[midway3-0298:723348:0:723348] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x440000e8)
==== backtrace (tid: 723348) ====
0 0x0000000000012b20 .annobin_sigaction.c() sigaction.c:0
1 0x000000000006c1f7 MPI_Comm_rank() ???:0
2 0x00000000008b116b LAMMPS_NS::Universe::Universe() ???:0
3 0x0000000000747e17 LAMMPS_NS::LAMMPS::LAMMPS() ???:0
4 0x000000000040438f main() ???:0
5 0x0000000000023493 __libc_start_main() ???:0
6 0x000000000040453e _start() ???:0
[midway3-0298:723348] *** Process received signal ***
[midway3-0298:723348] Signal: Segmentation fault (11)
[midway3-0298:723348] Signal code: (-6)
[midway3-0298:723348] Failing at address: 0x69421179000b0994
[midway3-0298:723348] [ 0] /lib64/libpthread.so.0(+0x12b20)[0x7f5e04e5eb20]
[midway3-0298:723348] [ 1] /software/openmpi-4.1.0-el8-x86_64/lib/libmpi.so.40(MPI_Comm_rank+0x37)[0x7f5e0987c1f7]
[midway3-0298:723348] [ 2] /project/gagalli/alfonso/Software/myMACE/lammps/mybuild/liblammps.so.0(_ZN9LAMMPS_NS8UniverseC1EPNS_6LAMMPSEi+0xfb)[0x7f5e05b2516b]
[midway3-0298:723348] [ 3] /project/gagalli/alfonso/Software/myMACE/lammps/mybuild/liblammps.so.0(_ZN9LAMMPS_NS6LAMMPSC2EiPPci+0xa7)[0x7f5e059bbe17]
[midway3-0298:723348] [ 4] /project/gagalli/alfonso/Software/myMACE/lammps/mybuild/lmp[0x40438f]
[midway3-0298:723348] [ 5] /lib64/libc.so.6(__libc_start_main+0xf3)[0x7f5e042bb493]
[midway3-0298:723348] [ 6] /project/gagalli/alfonso/Software/myMACE/lammps/mybuild/lmp[0x40453e]
[midway3-0298:723348] *** End of error message ***
/var/spool/slurm/d/job23524083/slurm_script: line 38: 723348 Segmentation fault (core dumped) /project/gagalli/alfonso/Software/myMACE/lammps/mybuild/lmp -k on g 1 -sf kk -in in.lammps
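For readers following along, the frames printed by the signal handler above can be reduced to "which library, which symbol" with a few lines of Python. This is only a reading aid, not part of the original report: the regular expression and the `parse_frames` helper are hypothetical names introduced here, and the frame text is copied from the output above with the hostname/PID prefix removed.

```python
import re

# Frames 0-4 from the error message above, without the "[midway3-...]" prefix.
BACKTRACE = """\
[ 0] /lib64/libpthread.so.0(+0x12b20)[0x7f5e04e5eb20]
[ 1] /software/openmpi-4.1.0-el8-x86_64/lib/libmpi.so.40(MPI_Comm_rank+0x37)[0x7f5e0987c1f7]
[ 2] /project/gagalli/alfonso/Software/myMACE/lammps/mybuild/liblammps.so.0(_ZN9LAMMPS_NS8UniverseC1EPNS_6LAMMPSEi+0xfb)[0x7f5e05b2516b]
[ 3] /project/gagalli/alfonso/Software/myMACE/lammps/mybuild/liblammps.so.0(_ZN9LAMMPS_NS6LAMMPSC2EiPPci+0xa7)[0x7f5e059bbe17]
[ 4] /project/gagalli/alfonso/Software/myMACE/lammps/mybuild/lmp[0x40438f]
"""

# "[ N] /path/to/object(symbol+0xoffset)[0xaddress]"; the "(symbol+0x...)"
# part is optional and the symbol itself may be empty.
FRAME_RE = re.compile(
    r"\[\s*(?P<index>\d+)\]\s+(?P<path>[^(\[\s]+)"
    r"(?:\((?P<symbol>[^+)]*)\+0x[0-9a-f]+\))?"
    r"\[0x[0-9a-f]+\]"
)


def parse_frames(text):
    """Return a list of (frame index, object file name, symbol) tuples."""
    frames = []
    for match in FRAME_RE.finditer(text):
        library = match["path"].rsplit("/", 1)[-1]
        frames.append((int(match["index"]), library, match["symbol"] or ""))
    return frames


frames = parse_frames(BACKTRACE)
for index, library, symbol in frames:
    print(index, library, symbol or "(no symbol)")
```

Frame 0 is the signal handler, so frame 1 is where the fault occurred: `MPI_Comm_rank` inside Open MPI's `libmpi.so.40`, called from the `LAMMPS_NS::Universe` constructor in `liblammps.so.0` (frame 2). That suggests the crash happens while LAMMPS is starting up, before `in.lammps` is read.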
I have read that this type of error might be related to memory issues, but even after installing everything in my research group folder, with plenty of memory available, I get the same error. These are the modules I used:
####LOAD MODULES
module load intel/19.1.1
module load mkl/2023.1
module load cuda/12.2
module load cudnn/9.4.0
module load openmpi/4.1.0
module load gcc/10.2.0
module load python/3.11.9
source ~/.bashrc
conda activate /project/gagalli/alfonso/Software/ENVS/myenvX
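Since the module list loads an Intel toolchain, GCC, and Open MPI together, and the backtrace resolves `MPI_Comm_rank` to Open MPI's `libmpi.so.40`, one thing that may be worth checking is whether `lmp` and `liblammps.so.0` resolve at run time to the same MPI library they were compiled against. This is an assumption, not something confirmed in the thread. The faulting address 0x440000e8 lies just above 0x44000000, the integer handle that MPICH-derived MPI libraries use for `MPI_COMM_WORLD`, which would be consistent with such a mix-up but does not prove it. The sketch below wraps the standard Linux `ldd` tool; `mpi_libraries` and `check_binary` are hypothetical helper names.

```python
import re
import subprocess
import sys

# Matches ldd lines such as "libmpi.so.40 => /path/libmpi.so.40 (0x...)"
# and "libfoo.so => not found".
LDD_LINE = re.compile(
    r"^\s*(?P<name>\S+)\s+=>\s+(?P<path>.*?)(?:\s+\(0x[0-9a-fA-F]+\))?\s*$"
)


def mpi_libraries(ldd_output):
    """Return {library name: resolved path} for names containing 'mpi'."""
    found = {}
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if match and "mpi" in match["name"].lower():
            found[match["name"]] = match["path"]
    return found


def check_binary(path):
    """Run ldd on an executable or shared library and filter its output."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return mpi_libraries(result.stdout)


if __name__ == "__main__":
    # Pass the paths to inspect on the command line, e.g. lmp and liblammps.so.0.
    for target in sys.argv[1:]:
        print(target)
        for name, resolved in check_binary(target).items():
            print(f"  {name} -> {resolved}")
```

Running the script (saved under any name) inside the same job environment, after the `module load` lines, with the paths to `lmp` and `liblammps.so.0` as arguments would list the MPI libraries each one picks up. If they differ from the MPI whose compiler wrappers were used for the build, that would be a lead to follow.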
P.S. I did not have any architecture-related issues when compiling.
Any recommendation would be greatly appreciated.