Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce new ERROR and ASSERT macros #336

Merged
merged 6 commits into from
Sep 22, 2023

Conversation

mabruzzo
Copy link
Collaborator

This PR primarily introduces new ERROR and ASSERT macros to simplify the process of reporting errors/aborting.

(I kind of forged ahead with this before asking - I'm happy to drop this if the reviewers dislike it)

A brief note about [[noreturn]]

I've annotated chexit (and another function introduced in this PR) with the [[noreturn]]. Just in case you're unaware of what this means, this is the portable way to introduce "attributes" to functions that was introduced in C++ 11. Furthermore, noreturn is the name of an attribute (also introduced in C++11) that informs the compiler that the annotated function never returns.

This annotation helps avoid compiler warnings about functions not-returning a result in a function like the hypothetical one shown in the following spoiler tag:

Contrived example function
std::string get_boundary_name(int type) {
  if (type == 0) {
    return "no boundary";
  } else if (type == 1) {
    return "periodic";
  } else if (type == 2) {
    return "reflective";
  } else if (type == 3) {
    return "transmissive";
  } else if (type == 4) {
    return "custom";
  } else if (type == 5) {
    return "mpi";
  } else {
    chprintf("encountered an unknown boundary type: %d\n", type);
    chexit(-1);
  }
}

About ERROR and ABORT

  • these were inspired by similarly named macros from Enzo-E that I've always found to be incredibly useful! (I implemented these from scratch and actually improved the interface).
  • These macros should be treated as functions. ERROR cause the program to print an error to stderr, with some extra meta-data (e.g. process number, file-name and location where it was called). ASSERT checks a boolean condition, and if that condition evaluates to false it is equivalent to evaluating ERROR.
  • They effectively have the following signatures
    [[noreturn]] void ERROR(const char* msg, ...);
    [[noreturn]] void ASSERT(bool cond, const char* msg, ...);
    where:
    • msg is the error-msg that will be printed.
    • when msg is specified with printf-style formatting, the remaining arguments are used to specify the output.
    • cond is the condition that is to be checked by the assertion statement. ASSERT only produces the program
  • ASSERT and ERROR are defined as macros rather than as functions so that they can automatically print the function-name, file-name, and line number where the error occurred. Under the hood they call a newly-defined function called Abort_With_Err_.

Potential future improvements:

  • In the original Enzo-E version, a backtrace is also printed out when an error is raised in this fashion. This is definitely something we can add in the future (we would need to use an ifdef statement to check whether we are using glibc), but at this moment, it's a little unclear how useful that would be.
  • We can print out the failed assertion statement in ASSERT (like what is done in the standard library's assert macro).

Bugfix: moved value-check of ::gama

To test out the new macros, I replaced a few snippets of existing code to make use of ERROR or ASSERT (and then intentionally passed parameters to make the code fail).

One of the places where I did this was in Check_Configuration, where the value of ::gama got checked.

  • When I did this, I realized that logic in the old version of the check was wrong. It caused the code to abort when gama > 1 - the error message said that code was failing because gama was less than or equal to 1.
  • When I corrected the code, the code started to crash
  • It turns out that this check was happening before ::gama was being initialized from the P.gamma (i.e. it had a value of 0).
  • To fix this, I moved this check of the value of ::gama into Set_Gammas - the function where ::gama is initialized.

… confirm that they actually work)

- moved the check on the value of ``gama``. Previously there was a logical bug in the check and the check was performed before the variable was initialized.
@bcaddy bcaddy self-requested a review September 15, 2023 19:55
Copy link
Collaborator

@bcaddy bcaddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The addition of [[noreturn]] and the gama bugfix are great.

There's some C-style code in Abort_With_Err_ that should be C++ code.

While I think that ERROR and ASSERT are clever I'm unconvinced that they bring anything that couldn't be handled equally well or more idiomatically with some combination of assert, exceptions, or just a direct call to std::cerr. If we just want a warning then std::cerr is easier, if we don't want the process to close upon finding an error then exceptions are a better since we can handle them, and if we do want Cholla to exit then IMO printing a warning then a direct call to chexit is clearer in what it is doing.

As a semi-related note. Our testing framework, GoogleTest, is full of macros that look like ASSERT_<qualifier> and I think that having another macro named ASSERT could lead to confusion.

src/utils/error_handling.cpp Outdated Show resolved Hide resolved
src/utils/error_handling.cpp Outdated Show resolved Hide resolved
src/utils/error_handling.cpp Outdated Show resolved Hide resolved
src/utils/error_handling.h Outdated Show resolved Hide resolved
@mabruzzo
Copy link
Collaborator Author

Thanks for reviewing this. I'll respond more generally in a few moments.

But first, I realized that I was missing a commit that I had made on Friday (while I was writing up the PR) that I forgot to push. This change directly addressed a comment you already made - after this commit, ERROR and ASSERT automatically infer the name of the function where they are called (the developer no longer specifies that as an argument).

@bcaddy
Copy link
Collaborator

bcaddy commented Sep 18, 2023

There's some C-style code in Abort_With_Err_ that should be C++ code.

To clarify this. C-style code isn't necessarily bad or not allowed in Cholla, since CUDA, MPI, and HDF5 expect it we need to use a lot of C-style code. My intention was that IMO this particular case (managing strings) is an example where the C++ version is more readable and maintainable.

@mabruzzo
Copy link
Collaborator Author

naming

First, I just wanted to briefly mention that I would definitely be in favor of renaming ERROR and ASSERT to something different (like CHOLLA_ERROR/CHOLLA_ASSERT, cherror/chassert, or whatever you suggest). I should have been explicit about that in the original PR. (for convenience, I'll continue referring to them by their original name through the rest of this response).

motivating ERROR & ASSERT

While I think that ERROR and ASSERT are clever I'm unconvinced that they bring anything that couldn't be handled equally well or more idiomatically with some combination of assert, exceptions, or just a direct call to std::cerr. If we just want a warning then std::cerr is easier, if we don't want the process to close upon finding an error then exceptions are a better since we can handle them, and if we do want Cholla to exit then IMO printing a warning then a direct call to chexit is clearer in what it is doing.
So you're absolutely right: other more idiomatic code could definitely be written that handle scenarios where ERROR and ASSERT are used equally as well. Many of the advantages of introducing ERROR and ASSERT can be considered marginal improvements (often times, it may just save a line or 2).

These are some fair points. I'm going to try to motivate why I think these macros are useful. Let me know if you still remain unconvinced. If I can't convince you (I expect this to be the likely outcome), I'll create a new PR with the 2 other tweaks that doesn't define defining ERROR or ASSERT

While I generally prefer more explicit/idiomatic code, I view error messages as something of a special-case. I think that simulation-codes are most pleasant to use when they fail loudly (rather than running to completion with silent errors) and provide lots of informative error messages. Unfortunately, these sorts of error-messages tend to be an afterthought for lots of astronomers/physicists (including myself - depending on my mood). I find that they are usually just added after the fact (when a problem is encountered or if they are pressed to add them in a PR review).

  • For this reason, I think it's useful to make reporting an error message as frictionless as possible. If somebody just has to make 1 function call to report an error or check an invariant, I'd argue they're more likely to do it than if they need to make 2 function calls.

  • Likewise, if we make it easier to dynamically format an assert message (1 function call instead of 2), people are more likely to write a more informative message.

Now, I want to highlight some of the advantages of ERROR and ASSERT over alternatives:

  1. I think that ASSERT(::gama > 1.0, "Gamma must be greater than one."); is somewhat easier to understand than assert(::gama > 1.0 and "Gamma must be greater than one.");. I suspect the same would also be true for new developers coming from python. With that said, this is a very subjective view; I imagine that the alternative will look more natural to me over time.

  2. I think that ASSERT makes producing dynamically formatted assertion-error messages slightly easier. When using assert, I need to define and store the dynamic error message before calling assert. The ASSERT macro lets me format the error message in the same call.

  3. I think that ERROR is a little more convenient than call(s) to std::cerr followed by chexit (if nothing else, it's 1 line instead of 2 lines). 

    • Importantly, it automatically includes the location where the error is occuring (file name, line number, function name).
    • There is a marginal advantage in that ERROR encourages users to concatenate a multiline error-message into a single string (it effectively discourages multiple calls to std::cerr, which may make the message hard to read if the same error occurs in multiple mpi threads). To be fair, I'm not sure this is really worth mentioning.
  4. In addition to automatically specifying where an issue occurs, these macros also automatically specify the MPI rank where it was invoked. It may also be useful to print tracebacks to the function where the error/assertion is invoked.

Now, with all of that said, I guess it's only fair if I highlight a disadvantage. I don't love that printf-formatting requires us to use c-types. In an ideal world I would much rather employ std::format, but that's not really an immediate option (until compiler support improves).

other thoughts

I honestly don't have much experience with exceptions. My impression was that lots of C++ developers dislike their overhead and consequently avoid them. But I definitely could be wrong (I haven't looked into them much). 
I do think it's worth highlighting that an exception doesn't carry as much information about where an error occurred. Plus, in all of these cases, where ERROR  would be used, we probably won't recover from an exception.

Let me know how you feel about this. I expect you probably won't be convinced (in which case I'll create a new PR with just the 2 other tweaks that doesn't involve ERROR or ASSERT). Otherwise, I'll go through and make adjustments based on your other comments.

@bcaddy
Copy link
Collaborator

bcaddy commented Sep 18, 2023

Naming

I think a change to CHOLLA_ASSERT would do the job perfectly. It also make it clearer that this is Cholla, not a library call, which generic names tend to be.

Motivation

I think I'm generally convinced. I'm not sure that I will use them but I think others might. I don't see any way that they harm the code base and I do see the benefits that you highlight. I'm fine with merging them in and I'll experiment with them.

I do think we can, and should, move away from char * and std::vector<char> variables in favor of using std::string with the .data() and '.c_str()methods when they're needed with printf.std::format` appears to be supported by GCC and clang but only for C++20 and we use C++17; might be worth having a discussion on our C++ version again.

I will say that part of my aversion to char arrays/vectors is personal preference. I originally came from Python and deeply dislike how C handles strings (or more accurately, doesn't).

Other thoughts

I know that people complain about exceptions being slow. IMO their performance isn't that important since they should only actually trigger in rare situations

Summary

With the changes to naming, at least investigating using std::string instead, and dealing with my specific comments on specific lines, I'm fine with merging this.

@bcaddy
Copy link
Collaborator

bcaddy commented Sep 18, 2023

Actually, follow-up. If we use the STL at all these cannot be called from within GPU kernels. If they were refactored to be callable from GPU kernels I would 100% be behind it but that's not straightforward to actually make it crash both the kernel and the code.

@mabruzzo
Copy link
Collaborator Author

I will say that part of my aversion to char arrays/vectors is personal preference. I originally came from Python and deeply dislike how C handles strings (or more accurately, doesn't).

I'm very much on the same page. Given the option, I much prefer to use std::string.

Actually, follow-up. If we use the STL at all these cannot be called from within GPU kernels. If they were refactored to be callable from GPU kernels I would 100% be behind it but that's not straightforward to actually make it crash both the kernel and the code.

I'm totally open to going that route. With that said, I did some quick google searches and it's not at all obvious to me how we would make that work (getting consistent behavior for aborting the run doesn't seem straight-forward). But, I don't think it's a huge deal if we just restrict this functionality to hosts.

@evaneschneider evaneschneider merged commit ac06fe6 into cholla-hydro:dev Sep 22, 2023
8 checks passed
@mabruzzo mabruzzo deleted the error-macro branch May 20, 2024 18:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants