I'm working on refactoring the code related to numerical atomic orbitals and two-center integrals. In many scenarios I have to store a series of objects, each labeled by multiple internal indices, in a 1-d array. These internal indices are somewhat "sparse", for example:
internal indices --> array index
(0,0) --> 0
(0,1) --> 1
// there is no (0,2) or (0,3)
(1,0) --> 2
(1,1) --> 3
(1,2) --> 4
(1,3) --> 5
(2,0) --> 6
Moreover, the usage of such objects is usually based on internal indices instead of their array index. In the above example, one might look for an object with specific internal indices (l,n).
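To make the mapping above concrete, here is a minimal CPU-only sketch. It assumes the objects are enumerated in (l,n) order and that the number of n values per l is known in advance; the names `IndexMap`, `build_index_map`, `find_index` and the input `nzeta` are hypothetical and do not come from the existing code.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical alias: maps internal indices (l, n) to a position in the 1-d array.
using IndexMap = std::map<std::pair<int, int>, int>;

// Build the map by enumerating (l, n) in order.
// nzeta[l] is assumed to hold the number of n values available for that l.
IndexMap build_index_map(const std::vector<int>& nzeta) {
    IndexMap index_map;
    int array_index = 0;
    for (int l = 0; l < static_cast<int>(nzeta.size()); ++l) {
        assert(nzeta[l] >= 0);
        for (int n = 0; n < nzeta[l]; ++n) {
            index_map[std::make_pair(l, n)] = array_index++;
        }
    }
    return index_map;
}

// Look up the array index for (l, n); returns -1 if the pair is absent.
int find_index(const IndexMap& index_map, int l, int n) {
    auto it = index_map.find(std::make_pair(l, n));
    return it == index_map.end() ? -1 : it->second;
}
```

With `nzeta = {2, 4, 1}` this reproduces the table above: (1,2) maps to 4, and the absent pair (0,2) yields -1.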
Some scenarios are more complicated. The two-center overlap integral table S[t1, l1, n1, t2, l2, n2, l3](R[iR]) is a parametrized overlap integral between two orbitals and has 8 indices. The seven in the square brackets define which integral it is, and the last one (iR) indexes the parameter R. Assuming that iR always has the same range for all [t1, l1, n1, t2, l2, n2, l3], the whole table can be viewed as a matrix whose row index is compressed from a tuple of 7 integers: (t1,l1,n1,t2,l2,n2,l3). In practice one loops over these 7 indices and computes the overlap matrix element. The whole table might be placed on GPU.
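The "matrix" view of the table can be sketched as a single contiguous buffer in row-major order, where the row is the compressed 7-index tuple and the column is iR. The struct `RadialTable` and its members are hypothetical names used only for illustration; how the row index is obtained from the tuple is left open here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: nrow rows (one per index tuple), nR values of R per row.
struct RadialTable {
    std::size_t nrow;
    std::size_t nR;
    std::vector<double> data;  // contiguous, row-major: element (row, iR) is data[row * nR + iR]

    RadialTable(std::size_t nrow_, std::size_t nR_)
        : nrow(nrow_), nR(nR_), data(nrow_ * nR_, 0.0) {}

    // Bounds-checked access to a single element.
    double& at(std::size_t row, std::size_t iR) {
        assert(row < nrow && iR < nR);
        return data[row * nR + iR];
    }

    // Pointer to the start of one row, i.e. the integral tabulated on the whole R grid.
    const double* row_ptr(std::size_t row) const {
        assert(row < nrow);
        return data.data() + row * nR;
    }
};
```

Because the storage is one flat array of doubles, it could be copied to device memory as a single block; only the tuple-to-row mapping needs separate treatment.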
My question is, what is a good practice to handle an index mapping (int,int,int,...) -> int that works for both CPU & GPU? Here are some of my thoughts:
Some current code uses ModuleBase::IntArray to build a multi-dimensional integer array whose number of dimensions equals the number of internal indices and whose values are the array indices. Values for absent internal indices are set to -1. However, this class does not support GPU so far, and it allows at most 6 dimensions.
Denghui's new container has GPU support and might be a good replacement for ModuleBase::IntArray, but is multi-dimensional array indexing supported?
Use std::map with the key being std::tuple<int,int,...>. This seems to be a good CPU solution, but I'm not familiar with GPU coding, and I wonder whether this solution is compatible with GPU.
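For comparison, the dense lookup-table idea (absent tuples marked by -1) can be sketched without any dimension limit by storing the table flat and computing a row-major offset. This is only an illustration under stated assumptions: `flat_offset` and `DenseIndexMap` are hypothetical names, not the API of ModuleBase::IntArray or of the new container. The offset function uses only raw pointers and integer arithmetic, so in principle the same code could be compiled for device use, whereas std::map, as far as I know, cannot be used inside CUDA device code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major offset of an index tuple, written with raw pointers only.
// (Hypothetical helper; under CUDA it could presumably be marked __host__ __device__.)
inline std::size_t flat_offset(const int* extents, int ndim, const int* idx) {
    std::size_t off = 0;
    for (int d = 0; d < ndim; ++d) {
        assert(idx[d] >= 0 && idx[d] < extents[d]);
        off = off * static_cast<std::size_t>(extents[d]) + static_cast<std::size_t>(idx[d]);
    }
    return off;
}

// Hypothetical dense N-d lookup table stored as a flat array.
struct DenseIndexMap {
    std::vector<int> extents;  // range of each internal index
    std::vector<int> table;    // flat table; -1 marks an absent tuple

    explicit DenseIndexMap(const std::vector<int>& ext) : extents(ext) {
        std::size_t total = 1;
        for (int e : extents) {
            assert(e > 0);
            total *= static_cast<std::size_t>(e);
        }
        table.assign(total, -1);
    }

    void set(const std::vector<int>& idx, int value) {
        assert(idx.size() == extents.size());
        table[flat_offset(extents.data(), static_cast<int>(extents.size()), idx.data())] = value;
    }

    int get(const std::vector<int>& idx) const {
        assert(idx.size() == extents.size());
        return table[flat_offset(extents.data(), static_cast<int>(extents.size()), idx.data())];
    }
};
```

The trade-off is memory: the table has one entry per possible tuple, including absent ones, which may matter for 7 indices. On the other hand, `table` and `extents` are plain integer arrays, so they can be copied to the device as-is.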
Do you have any suggestions/ideas? @denghuilu @baixiaokuang @caic99 @mohanchen