Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Carpet: Disable OpenMP parallelization of transport operators #3

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

rhaas80
Copy link
Member

@rhaas80 rhaas80 commented Aug 8, 2024

@eschnett pull request 18 https://bitbucket.org/eschnett/carpet/pull-requests/18

  • CarpetLib: Disable OpenMP parallelization of AMR and sync operators
  • Carpet: Make timer names more legible

@rhaas80
Copy link
Member Author

rhaas80 commented Aug 8, 2024

Most of this has been superseded by 5c036248cbb596ec17398faa4f1a3c26c3dd5b0b which is from August 2017. The non-superseded parts are collected in https://bitbucket.org/eschnett/carpet/branch/rhaas/no-openmp and amount to disabling OpenMP where what is done is basically a memory copy, as well as the change in timer name that I negatively commented on. Timer names are no longer truncated as of as git hash 6efd070 "TimerReport: allow arbirary long timer names" of cactusutils.

Do you want to retire this pull request (or replace eschnett/no-openmp with rhaas/no-openmp) and apply only the leftover openmp parts?

@rhaas80 rhaas80 requested a review from eschnett August 8, 2024 21:48
Copy link
Member Author

@rhaas80 rhaas80 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transferred comments from bitbucket

timers.push_back(new Timers::Timer(name1.str()));
ostringstream name2;
name2 << "comm_state[" << timers.size() << "]"
<< "." << tostring(state) << ".step";
name2 << "comm_state[" << timers.size() << "].step." << tostring(state);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to change the name of timers from "state.user" and "state.step" to "user.state" and "user.step" which is a non-backwards compatible, user visible change for anyone who wants to parse the output files for a particular user. Since the change is only cosmetic, I would rather keep the old names. This affects only the first of the two commits in the branch of the pull request.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eschnett says

This change tries to ensure that the important part of the timer name remains readable when the timer name is truncated, which happens in thorn TimerReport. Moving "user" and "step" (see below) to an earlier position in the string helps a lot.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it then be better to remove the (arbitrary) restriction on timer name length from TimerReport? Eg use something better than (more or less) char [n_timers][100] timerbuffer?

I am aware this is very minor and will only affect code that parses for these timer names which is intrinsically "termporary" since it is performance analysis scripts and the like.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://bitbucket.org/einsteintoolkit/tickets/issues/2268/timerreport-allow-arbirary-long-timer for a patch that removes the limit from TimerReport.

I’ll create version of this pull request that:

  • removes the omp parallel more cleanly

  • does not change the timer names

once that ticket has been reviewed and the pull request acted on.

@@ -116,21 +116,20 @@ void copy_3d(T const *restrict const src, ivect3 const &restrict srcpadext,
ptrdiff_t const dstkoff = dstoff[2];

// Loop over region
if (use_openmp) {
if (false and use_openmp) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole "if" branch should be removed.

@@ -101,13 +101,12 @@ void copy_4d(T const *restrict const src, ivect4 const &restrict srcpadext,
ptrdiff_t const dstloff = dstoff[3];

// Loop over region
if (use_openmp) {
if (false and use_openmp) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole "if" branch should be removed.

// Loop over region
#pragma omp parallel
// Loop over region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

// Loop over region
#pragma omp parallel
// Loop over region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

// Loop over coarse region
#pragma omp parallel
// Loop over coarse region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

// Loop over coarse region
#pragma omp parallel
// Loop over coarse region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

// Loop over coarse region
#pragma omp parallel for collapse(3)
// Loop over coarse region
// #pragma omp parallel for collapse(3)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

// Loop over coarse region
#pragma omp parallel
// Loop over coarse region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

// Loop over coarse region
#pragma omp parallel for collapse(4)
// Loop over coarse region
// #pragma omp parallel for collapse(4)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants