Tabulated-mode T Optimization ideas #212

mabruzzo · 2024-06-19T17:06:59Z

I wrote this up a little while back and wanted to share this for the sake of posterity (in case anybody is feeling ambitious). I think there are 2 (related) avenues for optimizing the Temperature calculation in tabulated mode. This is definitely not worth pursuing unless we know that the temperature calculation is a performance bottleneck!

This entire discussion entirely ignores the impact of metals on mean molecular weight since that is always added after the fact.

Background

In tabulated mode, Grackle provides 2D and 3D grids to compute mmw via bilinear and trilinear interpolation respectively. We might write these functions as mmw = f(log(T), log(nH)) and mmw = f(logT, log(1+z), log(nH)). A relevant detail for later: T has constant spacing in log-space.

To compute T, we take rho (total mass density), eint (specific internal energy), the assumed HydrogenMassFraction, the assumed gamma (adiabatic index), and (if applicable) and we solve iteratively for T.

In more detail:

from rho and the assumed HydrogenMassFraction, we get nH
for a calorically perfect ideal gas we know:
- eint = p / (rho*(gamma-1))
- p = rho * kboltz * T/(mmw * mH)
- in other words: mmw = kboltz * T / (mH * (gamma - 1) * eint)

Essentially, we need to find the root of the following function:
f(log(T), log(1+z), log(nH)) - kboltz *T / (mH * (gamma - 1) * eint) = 0

In practice, this isn't terrible to solve because for a given cell, T is the only unspecified variable.
For simplicity, we rewrite this as:
f(log(T), log(1+z), log(nH)) - c * T = 0

Idea 1: Exploit structure of the interpolation grid

The first idea's premise is simple. Rather than directly solving for the root, break the calculation into 2 parts.

In the first part, we identify the pair of T values that are part of the interpolation grid that bound the root (and compute the mean molecular weight at both locations)
Then separately solve for the root.

Details about Part 1: Solving for the Tgrid values bounding the root

We can exploit some basic tricks here to do this efficiently based on our initial guess for the mean molecular weight fguess(log(T), log(1+z), log(nH)).

We just need to adopt a very simplistic choice:

to start, it's useful to know that on the existing grids, mmw has a minimum value of 0.612 (all Hydrogen and Helium is ionized) and a maximum of 1.281 (there is no free electrons).¹
So let's simply define fguess(log(T), log(1+z), log(nH)) as a function that always returns 0.8854 (the geometric mean of mmw's extrema).

Using this choice for fguess ensures that the true mean molecular weight, mmw_true, always satisfies mmw_guess/1.45 <= mmw_true <= 1.45*mmw_guess. Rephrasing this in terms of temperature:

T_guess/1.45 <= T_true <= 1.45*T_guess.
In other words, abs( log10(T_true) - log10(T_guess) ) is less than log10(1.45) or 0.1604.
This is relevant because the Temperature grid has a constant spacing where log(T) always differs by exactly 0.1.

Putting this together, in the first part of the calculation, we would do the following:

solve for Tguess
Identify the nearest 6 temperature grid point to Tguess in log-space (and compute mmw at each location). These Temperatures are GUARANTEED to bound the true value.
Use these mmw values to identify the pair of these T values that bound the root

Details about Part 2: Perform the rest of the solve

Let's call the pair of the T values as T_left and T_right (and their associated mmw values as mmw_left and mmw_right).

A property of our usage of bilinear and trilinear interpolation is that we can analytically write the interpolation function between T_left and T_right as
mmw = f(logT, ...) = a * log10(T) + b

Thus we end up solving
a * log10(T) + b - c * T = 0

Unfortunately, this doesn't have an analytic solution, so we need to solve this analytically. But, at least we can analytically write out the derivatives.

Performance considerations

It's a little unclear whether this idea alone is adequate to achieve better performance. But it's worth noting that

Part 1 will probably be at least as fast as 6 iterations of the existing solver. It could easily be faster since the access order of the interpolation table is much faster. On CPUs in particular, these 6 evaluations could be vectorized very efficiently since we know ahead of time that we are loading contiguous values from the tables
If we could come up with a slightly better, yet still inexpensive, fguess would reduce the cost of part 1.²
Each iteration in Part 2 will be much faster than an iteration in the existing implementation.

Idea 2: Modifying the interpolation properties

This is only worth considering alongside with Idea 1. Essentially the idea would be to change the interpolation details such that either:

the entire interpolation grid interpolates log(mmw) rather than mmw
mmw is interpolated with respect to T rather than logT (we would not change anything about the grid, it would still have constant spacing in log-space).

If we adopted either of these solutions, then once we know the pair of Tgrid values bounding the root (T_left and T_right), then there is an analytic solution.

Conclusions

By adopting both ideas 1 and 2, the calculation would no longer be iterative. Instead it would be constant in time and involve no branching. I'm fairly confident it would be faster than the existing solution

The main disadvantage is that the calculation might require custom code for different cooling tables. But I don't think that's necessarily deal-breaker, especially if this provides a significant performance boost.

Numbers are slightly rounded here, for demonstration purposes. While we would want to use the actual values in an implementation, they won't change the story ↩
If fguess provides a guess within a factor of 10^0.15, then we would only need to consider the nearest 5 grid values. Likewise a guess within 10^0.1 or 10^0.05 would only consider the nearest 4 or 3 grid values. ↩

The text was updated successfully, but these errors were encountered:

mabruzzo added the enhancement New feature or request label Jul 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tabulated-mode T Optimization ideas #212

Tabulated-mode T Optimization ideas #212

mabruzzo commented Jun 19, 2024

Tabulated-mode T Optimization ideas #212

Tabulated-mode T Optimization ideas #212

Comments

mabruzzo commented Jun 19, 2024

Background

Idea 1: Exploit structure of the interpolation grid

Details about Part 1: Solving for the Tgrid values bounding the root

Details about Part 2: Perform the rest of the solve

Performance considerations

Idea 2: Modifying the interpolation properties

Conclusions

Footnotes