Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce gr_initialize_field_data #205

Merged
merged 4 commits into from
Jun 25, 2024

Conversation

mabruzzo
Copy link
Collaborator

@mabruzzo mabruzzo commented Jun 3, 2024

This change was split off from GH PR #197. This should be a lot easier to review in isolation (and I think that it is a generally useful change in its own right).

Overview

The objective of this PR is conceptually very simple. It introduces the following function:

int gr_initialize_field_data(grackle_field_data *my_fields);

The idea is to have users immediately call this function right after they initialize a new grackle_field_data instance

  • When it's called the handful of non data-field members are assigned sensible defaults. It also initializes all data-field members to NULL pointers.

  • this is analogous to the way that local_initialize_chemistry_parameters is used.

Motivation

While this function is not strictly necessary, I think it is a useful addition and I think we should tell people to use it (though old code won't break).

A large motivation is peace of mind. I have often wished for this sort of thing. I'm always a little concerned when using a new configuration of grackle that I didn't initialize all of the required fields. I would definitely feel a little more comfortable knowing that a field that I forgot about is assigned a NULL pointer rather than some garbage data that might let the code chug along.

It could also facillitate more error checking:

  • such as checks in the style of the grid_dx check introduced in PR Add error-check on grackle_field_data.grid_dx #190 (applying similar checks to grid_rank, grid_dimension, grid_start, and grid_end requires that we know their default value).

  • we could explicitly check that the user specified all required data-fields. We can only do this if we know that unset data-fields have a default value of NULL.

    • From a user friendliness perspective, I think this check alone is enough to justify the function's existence.

    • We could also potentially warn if unnecessary fields were specified

    • Plus, we could account for the fact that functions like calculate_pressure require fewer fields than solve_chemistry

  • we could also add checks that none of the fields are aliases (an implicit Fortran requirement). This would be much easier to implement if we knew we could just ignore fields that were set to NULL pointers

We would probably need to make these checks opt-in somehow, both because we can't be absolutely certain whether a user actually invoked gr_initialize_field_data and users might not want the runtime-cost (which could be relatively large for very small grids)

Naming

I choose to name this function based on my suggestion in #189 that we adopt a convention that all new functions in the stable public API share a common prefix (gr_ or grackle_).

For the sake of proposing something concrete, I assumed that we choose the gr_ prefix. (If we decide on a different prefix, this would be easy to change).

Alternatively, if we don't want any prefix, we could name it initialize_grackle_field_data (I do worry a little that could introduce naming conflicts with existing functions in downstream codes)

@mabruzzo mabruzzo changed the title Introduce gr_initialize_field_data Introduce gr_initialize_field_data Jun 3, 2024
This change was split off from GH PR grackle-project#197. This should be a lot easier
to review in isolation (and I think that it is a generally useful
change in its own right).

Overview
--------
The objective of this PR is conceptually very simple. It introduces the
following function:

```c++
int gr_initialize_field_data(grackle_field_data *my_fields);
```

The idea is to have users immediately call this function right after they
initialize a new `grackle_field_data` instance

- When it's called the handful of non data-field members are assigned
  sensible defaults. It also initializes all data-field members to NULL
  pointers.

- this is analogous to the way that `local_initialize_chemistry_parameters`
  is used.

Motivation
----------

While this function is not strictly necessary, I think it is a useful
addition and I think we should tell people to use it (though old code
won't break).

A large motivation is peace of mind. I have often wished for this sort
of thing. I'm always a little concerned when using a new configuration
of grackle that I didn't initialize all of the required fields. I
would definitely feel a little more comfortable knowing that a field
that I forgot about is assigned a ``NULL`` pointer rather than some
garbage data that might let the code chug along.

It could also facillitate more error checking:

- such as checks in the style of the ``grid_dx`` check introduced in
  PR grackle-project#190 (applying similar checks to `grid_rank`, `grid_dimension`,
  `grid_start`, and `grid_end` requires that we know their default
  value).

- we could explicitly check that the user specified all required
  data-fields. We can only do this if we know that unset data-fields
  have a default value of `NULL`.

  - **From a user friendliness perspective, I think this check alone is
    enough to justify the function's existence.**

  - We could also potentially warn if unnecessary fields were specified

  - Plus, we could account for the fact that functions like
    ``calculate_pressure`` require fewer fields that ``solve_chemistry``

- we could also add checks that none of the fields are aliases (an
  implicit Fortran requirement). This would be much easier to implement
  if we knew we could just ignore fields that were set to ``NULL``
  pointers

We would probably need to make these checks opt-in somehow, both because
we can't be absolutely certain whether a user actually invoked
``gr_initialize_field_data`` and users might not want the runtime-cost
(which could be relatively large for very small grids)

Naming
------

I choose to name this function based on my suggestion in grackle-project#189 that we
adopt a convention that all new functions in the stable public API share
a common prefix (`gr_` or `grackle_`).

For the sake of proposing something concrete, I assumed that we choose
the `gr_` prefix. (If we decide on a different prefix, this would be
easy to change).

Alternatively, if we don't want any prefix, we could name it
`initialize_grackle_field_data` (I do worry a little that could
introduce naming conflicts with existing functions in downstream codes)
@mabruzzo mabruzzo added the enhancement New feature or request label Jun 4, 2024
@brittonsmith
Copy link
Contributor

I think it's a good idea that could, at bare minimum, help with debugging. I just had the one question about the meaning of storing whether a field is a passive scalar. If we wanted to make use of being able to expect NULL or not NULL values, we could add an extra int value (for example, "initialized") to the field struct that gets set in gr_initialize_field_data.

@mabruzzo
Copy link
Collaborator Author

I just had the one question about the meaning of storing whether a field is a passive scalar.

This is poor word-choice on my part. Basically I was looking at a word to describe any fields that could written as $C \rho$, where $C$ is some kind of concentration parameter and $\rho$ is the local density.

  • in Enzo-Classic, I think these are called "colour" fields and "species" fields.
  • from a hydrodynamics-perspective, these quantities are sometimes called "passively advected scalars" (since the rate of advection between can be expressed in terms of the primary density field's advection)
  • But this is probably a poor naming choice anyways. For example, under Greg's Dual-Energy Formalism (used in Enzo-Classic, Enzo-E, Cholla) one could argue that the specific internal energy is also a "passively advected scalar" (based on how fluxes are computed)
  • What if we called them density-derived? Alternatively, I could just remove this classification (it exists right now as a placeholder).

If we wanted to make use of being able to expect NULL or not NULL values, we could add an extra int value (for example, "initialized") to the field struct that gets set in gr_initialize_field_data.

The problem is that we can't generally make any assumptions about the value held by grackle_field_data::initialized in the case where grackle_field_data isn't initialized with gr_initialize_field_data. For example, suppose somebody wanted to store the struct on the heap in a C program:

grackle_field_data* ptr = malloc(sizeof(grackle_field_data));

In this scenario, it's possible that ptr->initialized already has the value that indicates that it is initialized.

With that said, if we made initialized an unsigned long long and use some arbitrary 64-bit "magic number" (say 3661638333286146650) to represent that it's initialized, then maybe it's acceptable? The probability of having this value would be 1 in 2**64.1 Even then, we could word the error message such to tell inform the user of the chance of this coincidence.

I also have some other ideas, but maybe we punt that discussion until we start implementing checks?

Footnotes

  1. If the user malloced a pointer to grackle_field_data, called gr_initialize_field_data, freeed the pointer, and then malloced a new grackle_field_data pointer, then I think on some systems there may be a much higher chance that the second pointer would already holds the magic value (without a call to gr_initialize_field_data). This seems a little pathological.

@brittonsmith
Copy link
Contributor

I'm happy to punt on the conversation about initialization. With regard to the passive scalars, the main thing I wanted to know is what you wanted to do with this information. I didn't see the value stored being used anywhere. One thing something like this could be used for is populating a list of fields to be scaled inside scale_fields, but that would be a slightly different list as it wouldn't count things like internal energy.

@mabruzzo
Copy link
Collaborator Author

Yeah, that's fair. (Btw, I assumed you knew what a passive scalar was - I just wanted to be explicit in case we were familiar with different with different nomenclature).

And yes, I was mostly thinking about using in something like scale_fields. But, I'm gonna remove that detail for right now since we aren't currently using it for anything. We can add it in when we feel that the time is right

Copy link
Contributor

@gregbryan gregbryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this seems sensible to me!

Copy link
Contributor

@brittonsmith brittonsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'm happy with this. @mabruzzo I kind of lost the thread on whether there was anything more you want to do here, but this looks good to me. I'll leave it to you to merge when ready.

src/clib/grackle_field_data_fdatamembers.def Outdated Show resolved Hide resolved
@mabruzzo
Copy link
Collaborator Author

Sorry, forgot about this. I removed the unused metadata.

If the checks pass, I think we are good to go. @brittonsmith not sure if you'll want to take another look.

Copy link
Contributor

@brittonsmith brittonsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, looks great, let's merge when tests pass.

@brittonsmith brittonsmith merged commit 095363c into grackle-project:main Jun 25, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants