Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tabulated-mode T Optimization ideas #212

Open
mabruzzo opened this issue Jun 19, 2024 · 0 comments
Open

Tabulated-mode T Optimization ideas #212

mabruzzo opened this issue Jun 19, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@mabruzzo
Copy link
Collaborator

I wrote this up a little while back and wanted to share this for the sake of posterity (in case anybody is feeling ambitious). I think there are 2 (related) avenues for optimizing the Temperature calculation in tabulated mode. This is definitely not worth pursuing unless we know that the temperature calculation is a performance bottleneck!

This entire discussion entirely ignores the impact of metals on mean molecular weight since that is always added after the fact.

Background

In tabulated mode, Grackle provides 2D and 3D grids to compute mmw via bilinear and trilinear interpolation respectively. We might write these functions as mmw = f(log(T), log(nH)) and mmw = f(logT, log(1+z), log(nH)). A relevant detail for later: T has constant spacing in log-space.

To compute T, we take rho (total mass density), eint (specific internal energy), the assumed HydrogenMassFraction, the assumed gamma (adiabatic index), and (if applicable) and we solve iteratively for T.

In more detail:

  • from rho and the assumed HydrogenMassFraction, we get nH
  • for a calorically perfect ideal gas we know:
    • eint = p / (rho*(gamma-1))
    • p = rho * kboltz * T/(mmw * mH)
    • in other words: mmw = kboltz * T / (mH * (gamma - 1) * eint)

Essentially, we need to find the root of the following function:
f(log(T), log(1+z), log(nH)) - kboltz *T / (mH * (gamma - 1) * eint) = 0

In practice, this isn't terrible to solve because for a given cell, T is the only unspecified variable.
For simplicity, we rewrite this as:
f(log(T), log(1+z), log(nH)) - c * T = 0

Idea 1: Exploit structure of the interpolation grid

The first idea's premise is simple. Rather than directly solving for the root, break the calculation into 2 parts.

  1. In the first part, we identify the pair of T values that are part of the interpolation grid that bound the root (and compute the mean molecular weight at both locations)
  2. Then separately solve for the root.

Details about Part 1: Solving for the Tgrid values bounding the root

We can exploit some basic tricks here to do this efficiently based on our initial guess for the mean molecular weight fguess(log(T), log(1+z), log(nH)).

We just need to adopt a very simplistic choice:

  • to start, it's useful to know that on the existing grids, mmw has a minimum value of 0.612 (all Hydrogen and Helium is ionized) and a maximum of 1.281 (there is no free electrons).1
  • So let's simply define fguess(log(T), log(1+z), log(nH)) as a function that always returns 0.8854 (the geometric mean of mmw's extrema).

Using this choice for fguess ensures that the true mean molecular weight, mmw_true, always satisfies mmw_guess/1.45 <= mmw_true <= 1.45*mmw_guess. Rephrasing this in terms of temperature:

  • T_guess/1.45 <= T_true <= 1.45*T_guess.
  • In other words, abs( log10(T_true) - log10(T_guess) ) is less than log10(1.45) or 0.1604.
  • This is relevant because the Temperature grid has a constant spacing where log(T) always differs by exactly 0.1.

Putting this together, in the first part of the calculation, we would do the following:

  1. solve for Tguess
  2. Identify the nearest 6 temperature grid point to Tguess in log-space (and compute mmw at each location). These Temperatures are GUARANTEED to bound the true value.
  3. Use these mmw values to identify the pair of these T values that bound the root

Details about Part 2: Perform the rest of the solve

Let's call the pair of the T values as T_left and T_right (and their associated mmw values as mmw_left and mmw_right).

A property of our usage of bilinear and trilinear interpolation is that we can analytically write the interpolation function between T_left and T_right as
mmw = f(logT, ...) = a * log10(T) + b

Thus we end up solving
a * log10(T) + b - c * T = 0

Unfortunately, this doesn't have an analytic solution, so we need to solve this analytically. But, at least we can analytically write out the derivatives.

Performance considerations

It's a little unclear whether this idea alone is adequate to achieve better performance. But it's worth noting that

  • Part 1 will probably be at least as fast as 6 iterations of the existing solver. It could easily be faster since the access order of the interpolation table is much faster. On CPUs in particular, these 6 evaluations could be vectorized very efficiently since we know ahead of time that we are loading contiguous values from the tables
  • If we could come up with a slightly better, yet still inexpensive, fguess would reduce the cost of part 1.2
  • Each iteration in Part 2 will be much faster than an iteration in the existing implementation.

Idea 2: Modifying the interpolation properties

This is only worth considering alongside with Idea 1. Essentially the idea would be to change the interpolation details such that either:

  • the entire interpolation grid interpolates log(mmw) rather than mmw
  • mmw is interpolated with respect to T rather than logT (we would not change anything about the grid, it would still have constant spacing in log-space).

If we adopted either of these solutions, then once we know the pair of Tgrid values bounding the root (T_left and T_right), then there is an analytic solution.

Conclusions

By adopting both ideas 1 and 2, the calculation would no longer be iterative. Instead it would be constant in time and involve no branching. I'm fairly confident it would be faster than the existing solution

The main disadvantage is that the calculation might require custom code for different cooling tables. But I don't think that's necessarily deal-breaker, especially if this provides a significant performance boost.

Footnotes

  1. Numbers are slightly rounded here, for demonstration purposes. While we would want to use the actual values in an implementation, they won't change the story

  2. If fguess provides a guess within a factor of 10^0.15, then we would only need to consider the nearest 5 grid values. Likewise a guess within 10^0.1 or 10^0.05 would only consider the nearest 4 or 3 grid values.

@mabruzzo mabruzzo added the enhancement New feature or request label Jul 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant