FCI csf_solver memory usage #48
Labels
scalability
Code works for small problems but not for large problems
Comments
MatthewRHermes added the scalability label on Nov 13, 2023
Curiously, @valay1 had a segfault here, which suggests that it is here (lines 284 to 294 in 739a255) that a segfault is occurring (both timer_debug lines should flush the buffer). That is a call to a standard PySCF library function which performs no allocations, so I am somewhat confused. A misallocated array could also cause a segfault here, but I can't for the life of me see how any of the input arrays could be misallocated.
ETA: it's worth pointing out in this scenario that we are demanding a far larger matrix (50,000 vs 200) than that PySCF library function was likely ever tested on building.
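For a sense of scale, here is a quick back-of-the-envelope estimate. It assumes the matrix is dense, square, and stored as float64, none of which is stated above; `dense_matrix_gib` is a hypothetical helper, not part of PySCF or mrh.

```python
def dense_matrix_gib(n, itemsize=8):
    """Memory in GiB of a dense n-by-n matrix (assumption: float64, 8 bytes per element)."""
    return n * n * itemsize / 2**30


# Compare the two dimensions mentioned above.
for n in (200, 50_000):
    print(f"n = {n:>6}: {dense_matrix_gib(n):.4f} GiB")
```

Under those assumptions the 50,000 case needs roughly 18.6 GiB for a single copy of the matrix, against well under a megabyte for the 200 case.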
In two csf_solver functions (mrh.my_pyscf.fci.csf.pspace and mrh.my_pyscf.fci.csf.make_hdiag_csf), the memory usage is problematic due to the two-step evaluation of Hamiltonian matrix elements in the CSF basis, which requires the (temporary) construction of arrays of size quadratic with respect to the number of corresponding determinants. Massively open-shell low-spin wave functions become impossible in the current implementation around (16e,16o) because the corresponding determinants are too numerous to store the block Hamiltonian in memory. The relevant arrays should be split into blocks handled sequentially, as is done for DFT quadrature. In the meantime, segfaults and calculations abruptly killed by cluster daemon processes are symptoms of unmanaged memory usage, and indicate a region of the code where a memory-checking step should be added.
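A minimal NumPy sketch of the blocking idea, not the actual mrh implementation: it computes the diagonal of the CSF-basis Hamiltonian, diag(Uᵀ H U), one row block of the determinant-basis matrix at a time, with a memory check up front. All names here (`rows_per_block`, `hdiag_csf_blocked`, `get_hdet_rows`, `max_memory_mb`) are hypothetical, and the sketch assumes a dense real determinant-to-CSF transformation matrix `umat` of shape (ndet, ncsf).

```python
import numpy as np


def rows_per_block(ndet, max_memory_mb, itemsize=8):
    """Memory check: how many rows of a (nrows, ndet) float64 block fit in the budget."""
    nrows = int(max_memory_mb * 1e6 // (ndet * itemsize))
    if nrows < 1:
        raise MemoryError(
            f"One row of {ndet} elements does not fit in {max_memory_mb} MB"
        )
    return min(nrows, ndet)


def hdiag_csf_blocked(get_hdet_rows, umat, max_memory_mb):
    """Diagonal of U^T H U without ever holding the full (ndet, ndet) matrix H.

    get_hdet_rows(p0, p1) is a hypothetical callback returning rows p0:p1 of the
    determinant-basis Hamiltonian as an array of shape (p1 - p0, ndet).
    """
    ndet, ncsf = umat.shape
    blk = rows_per_block(ndet, max_memory_mb)
    hdiag = np.zeros(ncsf)
    for p0 in range(0, ndet, blk):
        p1 = min(p0 + blk, ndet)
        h_rows = get_hdet_rows(p0, p1)   # temporary is only (blk, ndet)
        hu = h_rows @ umat               # (blk, ncsf)
        # Accumulate sum_a U[a, i] * (H U)[a, i] over the rows in this block.
        hdiag += np.einsum('ai,ai->i', umat[p0:p1], hu)
    return hdiag


# Usage on a small random problem, checked against the unblocked two-step result.
rng = np.random.default_rng(0)
ndet, ncsf = 40, 7
h = rng.standard_normal((ndet, ndet))
h = h + h.T
u = rng.standard_normal((ndet, ncsf))
blocked = hdiag_csf_blocked(lambda p0, p1: h[p0:p1], u, max_memory_mb=0.001)
reference = np.diag(u.T @ h @ u)
print(np.allclose(blocked, reference))
```

The largest temporary scales as block size times the number of determinants rather than quadratically, and the up-front check raises a Python exception where an unmanaged allocation might otherwise end in a segfault or a killed job.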