Use 64-bit integers for variables, use SimplexId for data #817
hi pavol,
thanks for your message.
the (advanced) cmake option "TTK_ENABLE_64BIT_IDS" should do the trick (things should get slower indeed).
best,
--
Dr Julien Tierny
CNRS Researcher
Sorbonne Universite
http://lip6.fr/Julien.Tierny
On Wednesday, August 3, 2022 7:18:49 PM CEST Pavol Klacansky wrote:
**Is your feature request related to a problem? Please describe.**
I run into crashes when using datasets that are around 1024x1024x1024 in size because SimplexId is used to store sizes. For example, the discrete gradient stores the number of cells in a SimplexId (https://github.com/topology-tool-kit/ttk/blob/a3b0d63cc997dcc597be691212aba5db46f5ee10/core/base/discreteGradient/DiscreteGradient.cpp#L21), which overflows and causes an allocation failure (https://github.com/topology-tool-kit/ttk/blob/a3b0d63cc997dcc597be691212aba5db46f5ee10/core/base/discreteGradient/DiscreteGradient.cpp#L35). The necessary step is to set SimplexId to be 64 bits, but that doubles the size of arrays, such as the offset array.
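To make the overflow concrete, here is a minimal sketch of the arithmetic. The helper names are hypothetical and not part of TTK, and the figure of 6 tetrahedra per grid cube is an assumption about how the regular grid is triangulated. Under that assumption the vertex count of a 1024x1024x1024 grid still fits in a signed 32-bit integer, but the tetrahedron count does not.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not TTK API): vertices of an nx * ny * nz regular grid.
constexpr std::int64_t gridVertexCount(std::int64_t nx, std::int64_t ny, std::int64_t nz) {
  return nx * ny * nz;
}

// Hypothetical helper (not TTK API): tetrahedra of the same grid,
// ASSUMING every grid cube is split into 6 tetrahedra.
constexpr std::int64_t gridTetCount(std::int64_t nx, std::int64_t ny, std::int64_t nz) {
  return 6 * (nx - 1) * (ny - 1) * (nz - 1);
}

// True if n can be stored in a signed 32-bit integer without overflow.
constexpr bool fitsInInt32(std::int64_t n) {
  return n >= 0 && n <= std::numeric_limits<std::int32_t>::max();
}

// 1024^3 = 1073741824 vertices: still below 2^31 - 1.
static_assert(fitsInInt32(gridVertexCount(1024, 1024, 1024)), "vertices fit");
// 6 * 1023^3 = 6423595002 tetrahedra: too large for 32 bits.
static_assert(!fitsInInt32(gridTetCount(1024, 1024, 1024)), "tetrahedra overflow");
```

All counts are computed in int64_t, so the check itself cannot overflow for grids of this size.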
**Describe the solution you'd like**
I think a good solution is to use 64-bit integers (int64_t) for variables that are not data arrays. For indexing arithmetic, I would still use SimplexId due to idiv latency (https://www.agner.org/optimize/instruction_tables.pdf). Ideally, benchmarks would test if 32-bit index calculations are noticeably slower compared to 64-bit indices.
An additional benefit of this solution, I think, is that TTK could then robustly detect whether an input dataset can be represented using SimplexId and, if not, exit with a clear error message.
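A rough sketch of what such a check could look like, assuming sizes are computed in int64_t first. The `SimplexId` alias below is a stand-in for TTK's default 32-bit configuration, and `toSimplexId` is a hypothetical helper, not an existing TTK function.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>

// Assumption: stand-in for TTK's SimplexId when 64-bit ids are disabled.
using SimplexId = std::int32_t;

// Hypothetical helper (not TTK API): narrow a 64-bit count to SimplexId,
// or fail with a message that tells the user what to do.
inline SimplexId toSimplexId(std::int64_t count, const std::string &what) {
  if(count < 0 || count > std::numeric_limits<SimplexId>::max()) {
    throw std::overflow_error(
      what + " (" + std::to_string(count)
      + ") does not fit in SimplexId; rebuild with TTK_ENABLE_64BIT_IDS");
  }
  return static_cast<SimplexId>(count);
}
```

Calling this at the few places where a mesh size enters the pipeline would turn a silent overflow into an explicit error, while the rest of the code keeps using SimplexId for stored indices.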
**Describe alternatives you've considered**
Compile TTK using 64-bit indices at the cost of increased memory consumption.
Hi Julien,

that's what I used to force SimplexId to be 64 bits. I was thinking it may be better to decouple mesh indices and other variables (such as sizes and loop induction variables) into different data types. Of course, this solution adds more complexity, but I think it would be possible to check at a few spots whether the mesh fits into a SimplexId and otherwise warn the user. All other variables could be 64 bits. For example, I use int64_t by default for all variables, except where a smaller data type offers memory savings, such as 16-bit indices inside a grid tile to represent a segmentation.

I am curious about the reasoning behind using SimplexId for (almost) all variables in TTK. What were the advantages and disadvantages? Was it to support 32-bit processors?

Thank you,
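As an illustration of that convention, here is a minimal sketch with made-up names and a made-up tile size (none of this is TTK code). Stored per-voxel indices are 16 bits because a 32x32x32 tile has only 32768 voxels, while sizes, loop variables and global indices are int64_t.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tile edge length: 32^3 = 32768 voxels per tile.
constexpr std::int64_t kTileDim = 32;
constexpr std::int64_t kTileVoxels = kTileDim * kTileDim * kTileDim;
static_assert(kTileVoxels <= 65536, "tile-local indices must fit in uint16_t");

// Per-voxel segment representative, stored as a 16-bit tile-local index.
// Initially every voxel is its own representative.
inline std::vector<std::uint16_t> identitySegmentation() {
  std::vector<std::uint16_t> rep(static_cast<std::size_t>(kTileVoxels));
  // The loop variable is 64-bit; only the stored value is narrowed.
  for(std::int64_t i = 0; i < kTileVoxels; ++i) {
    rep[static_cast<std::size_t>(i)] = static_cast<std::uint16_t>(i);
  }
  return rep;
}

// Global voxel index of a tile-local index, computed in 64 bits so that
// it cannot overflow even when the whole grid has more than 2^31 voxels.
constexpr std::int64_t globalIndex(std::int64_t tileId, std::uint16_t local) {
  return tileId * kTileVoxels + local;
}
```

The narrow type only appears in the array that dominates memory use; everything that is computed rather than stored stays 64-bit.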