Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Carpet: Disable OpenMP parallelization of transport operators #3

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 2 additions & 4 deletions Carpet/src/Comm.cc
Original file line number Diff line number Diff line change
Expand Up @@ -242,12 +242,10 @@ static void ProlongateGroupBoundaries(const cGH *cctkGH,
for (astate state = static_cast<astate>(0); state != state_done;
state = static_cast<astate>(static_cast<int>(state) + 1)) {
ostringstream name1;
name1 << "comm_state[" << timers.size() << "]"
<< "." << tostring(state) << ".user";
name1 << "comm_state[" << timers.size() << "].user." << tostring(state);
timers.push_back(new Timers::Timer(name1.str()));
ostringstream name2;
name2 << "comm_state[" << timers.size() << "]"
<< "." << tostring(state) << ".step";
name2 << "comm_state[" << timers.size() << "].step." << tostring(state);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to change the name of timers from "state.user" and "state.step" to "user.state" and "user.step" which is a non-backwards compatible, user visible change for anyone who wants to parse the output files for a particular user. Since the change is only cosmetic, I would rather keep the old names. This affects only the first of the two commits in the branch of the pull request.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eschnett says

This change tries to ensure that the important part of the timer name remains readable when the timer name is truncated, which happens in thorn TimerReport. Moving "user" and "step" (see below) to an earlier position in the string helps a lot.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it then be better to remove the (arbitrary) restriction on timer name length from TimerReport? Eg use something better than (more or less) char [n_timers][100] timerbuffer?

I am aware this is very minor and will only affect code that parses for these timer names which is intrinsically "termporary" since it is performance analysis scripts and the like.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://bitbucket.org/einsteintoolkit/tickets/issues/2268/timerreport-allow-arbirary-long-timer for a patch that removes the limit from TimerReport.

I’ll create version of this pull request that:

  • removes the omp parallel more cleanly

  • does not change the timer names

once that ticket has been reviewed and the pull request acted on.

timers.push_back(new Timers::Timer(name2.str()));
}
}
Expand Down
5 changes: 2 additions & 3 deletions CarpetLib/src/copy_3d.cc
Original file line number Diff line number Diff line change
Expand Up @@ -116,21 +116,20 @@ void copy_3d(T const *restrict const src, ivect3 const &restrict srcpadext,
ptrdiff_t const dstkoff = dstoff[2];

// Loop over region
if (use_openmp) {
if (false and use_openmp) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole "if" branch should be removed.

#pragma omp parallel for collapse(3)
for (int k = 0; k < regkext; ++k) {
for (int j = 0; j < regjext; ++j) {
for (int i = 0; i < regiext; ++i) {

dst[DSTIND3(i, j, k)] = src[SRCIND3(i, j, k)];
}
}
}
} else {
for (int k = 0; k < regkext; ++k) {
for (int j = 0; j < regjext; ++j) {
#pragma omp simd
for (int i = 0; i < regiext; ++i) {

dst[DSTIND3(i, j, k)] = src[SRCIND3(i, j, k)];
}
}
Expand Down
5 changes: 2 additions & 3 deletions CarpetLib/src/copy_4d.cc
Original file line number Diff line number Diff line change
Expand Up @@ -101,13 +101,12 @@ void copy_4d(T const *restrict const src, ivect4 const &restrict srcpadext,
ptrdiff_t const dstloff = dstoff[3];

// Loop over region
if (use_openmp) {
if (false and use_openmp) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole "if" branch should be removed.

#pragma omp parallel for collapse(4)
for (int l = 0; l < reglext; ++l) {
for (int k = 0; k < regkext; ++k) {
for (int j = 0; j < regjext; ++j) {
for (int i = 0; i < regiext; ++i) {

dst[DSTIND4(i, j, k, l)] = src[SRCIND4(i, j, k, l)];
}
}
Expand All @@ -117,8 +116,8 @@ void copy_4d(T const *restrict const src, ivect4 const &restrict srcpadext,
for (int l = 0; l < reglext; ++l) {
for (int k = 0; k < regkext; ++k) {
for (int j = 0; j < regjext; ++j) {
#pragma omp simd
for (int i = 0; i < regiext; ++i) {

dst[DSTIND4(i, j, k, l)] = src[SRCIND4(i, j, k, l)];
}
}
Expand Down
5 changes: 2 additions & 3 deletions CarpetLib/src/interpolate_3d_2tl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -104,11 +104,10 @@ void interpolate_3d_2tl(T const *restrict const src1, CCTK_REAL const t1,
RT const s1fac = (t - t2) / (t1 - t2);
RT const s2fac = (t - t1) / (t2 - t1);

// Loop over region
#pragma omp parallel
// Loop over region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(interpolate_3d_2tl, i, j, k, 0, 0, 0, regiext, regjext, regkext,
dstipadext, dstjpadext, dstkpadext) {

dst[DSTIND3(i, j, k)] =
+s1fac * src1[SRCIND3(i, j, k)] + s2fac * src2[SRCIND3(i, j, k)];
}
Expand Down
5 changes: 2 additions & 3 deletions CarpetLib/src/interpolate_3d_3tl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -107,11 +107,10 @@ void interpolate_3d_3tl(T const *restrict const src1, CCTK_REAL const t1,
RT const s2fac = (t - t1) * (t - t3) / ((t2 - t1) * (t2 - t3));
RT const s3fac = (t - t1) * (t - t2) / ((t3 - t1) * (t3 - t2));

// Loop over region
#pragma omp parallel
// Loop over region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(interpolate_3d_3tl, i, j, k, 0, 0, 0, regiext, regjext, regkext,
dstipadext, dstjpadext, dstkpadext) {

dst[DSTIND3(i, j, k)] = +s1fac * src1[SRCIND3(i, j, k)] +
s2fac * src2[SRCIND3(i, j, k)] +
s3fac * src3[SRCIND3(i, j, k)];
Expand Down
5 changes: 2 additions & 3 deletions CarpetLib/src/interpolate_3d_4tl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -115,11 +115,10 @@ void interpolate_3d_4tl(T const *restrict const src1, CCTK_REAL const t1,
RT const s4fac =
(t - t1) * (t - t2) * (t - t3) / ((t4 - t1) * (t4 - t2) * (t4 - t3));

// Loop over region
#pragma omp parallel
// Loop over region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(interpolate_3d_4tl, i, j, k, 0, 0, 0, regiext, regjext, regkext,
dstipadext, dstjpadext, dstkpadext) {

dst[DSTIND3(i, j, k)] =
+s1fac * src1[SRCIND3(i, j, k)] + s2fac * src2[SRCIND3(i, j, k)] +
s3fac * src3[SRCIND3(i, j, k)] + s4fac * src4[SRCIND3(i, j, k)];
Expand Down
5 changes: 2 additions & 3 deletions CarpetLib/src/interpolate_3d_5tl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -120,11 +120,10 @@ void interpolate_3d_5tl(T const *restrict const src1, CCTK_REAL const t1,
RT const s5fac = (t - t1) * (t - t2) * (t - t3) * (t - t4) /
((t5 - t1) * (t5 - t2) * (t5 - t3) * (t5 - t4));

// Loop over region
#pragma omp parallel
// Loop over region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(interpolate_3d_5tl, i, j, k, 0, 0, 0, regiext, regjext, regkext,
dstipadext, dstjpadext, dstkpadext) {

dst[DSTIND3(i, j, k)] =
+s1fac * src1[SRCIND3(i, j, k)] + s2fac * src2[SRCIND3(i, j, k)] +
s3fac * src3[SRCIND3(i, j, k)] + s4fac * src4[SRCIND3(i, j, k)] +
Expand Down
11 changes: 5 additions & 6 deletions CarpetLib/src/interpolate_eno_3d_3tl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -129,15 +129,14 @@ void interpolate_eno_3d_3tl(
bool const use_12 = t >= min(t1, t2) - eps and t <= max(t1, t2) + eps;
bool const use_23 = t >= min(t2, t3) - eps and t <= max(t2, t3) + eps;
assert(use_12 or use_23);
// TODO: Instead of use_12, calculate 3 coefficents that perform
// the desired 2-point interpolation, which would avoid the if
// statement in the loop, simplifying the code.
// TODO: Instead of use_12, calculate 3 coefficents that perform
// the desired 2-point interpolation, which would avoid the if
// statement in the loop, simplifying the code.

// Loop over region
#pragma omp parallel
// Loop over region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(interpolate_end_3d_3tl, i, j, k, 0, 0, 0, regiext, regjext,
regkext, srcipadext, srcjpadext, srckpadext) {

T const s1 = src1[SRCIND3(i, j, k)];
T const s2 = src2[SRCIND3(i, j, k)];
T const s3 = src3[SRCIND3(i, j, k)];
Expand Down
4 changes: 2 additions & 2 deletions CarpetLib/src/prolongate_3d_cc_eno_rf2.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1289,8 +1289,8 @@ void prolongate_3d_cc_eno_rf2(

} else { // use_loopcontrol_in_operators

// Loop over fine region
#pragma omp parallel
// Loop over fine region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(prolongate_3d_cc_eno_rf2, i, j, k, 0, 0, 0, regiext, regjext,
regkext, dstipadext, dstjpadext, dstkpadext) {
const ptrdiff_t is = (srcioff + i) / 2;
Expand Down
10 changes: 5 additions & 5 deletions CarpetLib/src/prolongate_3d_dgfe_rf2.cc
Original file line number Diff line number Diff line change
Expand Up @@ -135,7 +135,7 @@ void prolongate_3d_dgfe_rf2(

// Loop over fine region
ptrdiff_t const i = 0;
#pragma omp parallel for collapse(2)
// #pragma omp parallel for collapse(2)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

// Zwicky's Intel compiler 11.1 ices on ptrdiff_t
for (/*ptrdiff_t*/ int k = 0; k < regkext; k += 2 * (ORDER + 1)) {
for (/*ptrdiff_t*/ int j = 0; j < regjext; j += 2 * (ORDER + 1)) {
Expand Down Expand Up @@ -171,7 +171,7 @@ void prolongate_3d_dgfe_rf2(

// Loop over fine region
ptrdiff_t const j = 0;
#pragma omp parallel for collapse(2)
// #pragma omp parallel for collapse(2)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

// Zwicky's Intel compiler 11.1 ices on ptrdiff_t
for (/*ptrdiff_t*/ int k = 0; k < regkext; k += 2 * (ORDER + 1)) {
for (/*ptrdiff_t*/ int i = 0; i < regiext; i += 2 * (ORDER + 1)) {
Expand Down Expand Up @@ -207,7 +207,7 @@ void prolongate_3d_dgfe_rf2(

// Loop over fine region
ptrdiff_t const k = 0;
#pragma omp parallel for collapse(2)
// #pragma omp parallel for collapse(2)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

// Zwicky's Intel compiler 11.1 ices on ptrdiff_t
for (/*ptrdiff_t*/ int j = 0; j < regjext; j += 2 * (ORDER + 1)) {
for (/*ptrdiff_t*/ int i = 0; i < regiext; i += 2 * (ORDER + 1)) {
Expand Down Expand Up @@ -239,8 +239,8 @@ void prolongate_3d_dgfe_rf2(
int const srcstr[3] = {srcdi, srcdj, srcdk};
int const dststr[3] = {dstdi, dstdj, dstdk};

// Loop over fine region
#pragma omp parallel for collapse(3)
// Loop over fine region
// #pragma omp parallel for collapse(3)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

// Zwicky's Intel compiler 11.1 ices on ptrdiff_t
for (/*ptrdiff_t*/ int k = 0; k < regkext; k += 2 * (ORDER + 1)) {
for (/*ptrdiff_t*/ int j = 0; j < regjext; j += 2 * (ORDER + 1)) {
Expand Down
8 changes: 4 additions & 4 deletions CarpetLib/src/restrict_3d_cc_o3_rf2.cc
Original file line number Diff line number Diff line change
Expand Up @@ -118,8 +118,8 @@ void restrict_3d_cc_o3_rf2(

if (not use_loopcontrol_in_operators) {

// Loop over coarse region
#pragma omp parallel for collapse(3)
// Loop over coarse region
// #pragma omp parallel for collapse(3)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

for (int k = 0; k < regkext; ++k) {
for (int j = 0; j < regjext; ++j) {
for (int i = 0; i < regiext; ++i) {
Expand Down Expand Up @@ -215,8 +215,8 @@ void restrict_3d_cc_o3_rf2(

} else {

// Loop over coarse region
#pragma omp parallel
// Loop over coarse region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(restrict_3d_cc_o3_rf2, i, j, k, 0, 0, 0, regiext, regjext,
regkext, dstipadext, dstjpadext, dstkpadext) {

Expand Down
8 changes: 4 additions & 4 deletions CarpetLib/src/restrict_3d_cc_o5_rf2.cc
Original file line number Diff line number Diff line change
Expand Up @@ -167,8 +167,8 @@ void restrict_3d_cc_o5_rf2(

if (not use_loopcontrol_in_operators) {

// Loop over coarse region
#pragma omp parallel for collapse(3)
// Loop over coarse region
// #pragma omp parallel for collapse(3)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

for (int k = 0; k < regkext; ++k) {
for (int j = 0; j < regjext; ++j) {
for (int i = 0; i < regiext; ++i) {
Expand Down Expand Up @@ -201,8 +201,8 @@ void restrict_3d_cc_o5_rf2(

} else {

// Loop over coarse region
#pragma omp parallel
// Loop over coarse region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(restrict_3d_cc_o5_rf2, i, j, k, 0, 0, 0, regiext, regjext,
regkext, dstipadext, dstjpadext, dstkpadext) {

Expand Down
8 changes: 4 additions & 4 deletions CarpetLib/src/restrict_3d_cc_rf2.cc
Original file line number Diff line number Diff line change
Expand Up @@ -108,8 +108,8 @@ void restrict_3d_cc_rf2(T const *restrict const src,

if (not use_loopcontrol_in_operators) {

// Loop over coarse region
#pragma omp parallel for collapse(3)
// Loop over coarse region
// #pragma omp parallel for collapse(3)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

for (int k = 0; k < regkext; ++k) {
for (int j = 0; j < regjext; ++j) {
for (int i = 0; i < regiext; ++i) {
Expand All @@ -131,8 +131,8 @@ void restrict_3d_cc_rf2(T const *restrict const src,

} else {

// Loop over coarse region
#pragma omp parallel
// Loop over coarse region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(restrict_3d_cc_rf2, i, j, k, 0, 0, 0, regiext, regjext, regkext,
dstipadext, dstjpadext, dstkpadext) {

Expand Down
4 changes: 2 additions & 2 deletions CarpetLib/src/restrict_3d_dgfe_rf2.cc
Original file line number Diff line number Diff line change
Expand Up @@ -118,8 +118,8 @@ void restrict_3d_dgfe_rf2(
// Ensure we traverse an integer number of elements
assert(all(regext % (ORDER + 1) == 0));

// Loop over coarse region
#pragma omp parallel for collapse(3)
// Loop over coarse region
// #pragma omp parallel for collapse(3)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

// Zwicky's Intel compiler 11.1 ices on ptrdiff_t
for (/*ptrdiff_t*/ int k = 0; k < regkext; k += ORDER + 1) {
for (/*ptrdiff_t*/ int j = 0; j < regjext; j += ORDER + 1) {
Expand Down
4 changes: 2 additions & 2 deletions CarpetLib/src/restrict_3d_rf2.cc
Original file line number Diff line number Diff line change
Expand Up @@ -110,8 +110,8 @@ void restrict_3d_rf2(T const *restrict const src,

} else {

// Loop over coarse region
#pragma omp parallel
// Loop over coarse region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(restrict_3d_rf2, i, j, k, 0, 0, 0, regiext, regjext, regkext,
dstipadext, dstjpadext, dstkpadext) {

Expand Down
4 changes: 2 additions & 2 deletions CarpetLib/src/restrict_3d_stagger011.cc
Original file line number Diff line number Diff line change
Expand Up @@ -146,8 +146,8 @@ void restrict_3d_stagger011(

} else {

// Loop over coarse region
#pragma omp parallel
// Loop over coarse region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(restrict_3d_stagger011, i, j, k, 0, 0, 0, regiext, regjext,
regkext, dstipadext, dstjpadext, dstkpadext) {

Expand Down
4 changes: 2 additions & 2 deletions CarpetLib/src/restrict_3d_stagger101.cc
Original file line number Diff line number Diff line change
Expand Up @@ -146,8 +146,8 @@ void restrict_3d_stagger101(

} else {

// Loop over coarse region
#pragma omp parallel
// Loop over coarse region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(restrict_3d_stagger101, i, j, k, 0, 0, 0, regiext, regjext,
regkext, dstipadext, dstjpadext, dstkpadext) {

Expand Down
4 changes: 2 additions & 2 deletions CarpetLib/src/restrict_3d_stagger110.cc
Original file line number Diff line number Diff line change
Expand Up @@ -145,8 +145,8 @@ void restrict_3d_stagger110(

} else {

// Loop over coarse region
#pragma omp parallel
// Loop over coarse region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(restrict_3d_stagger110, i, j, k, 0, 0, 0, regiext, regjext,
regkext, dstipadext, dstjpadext, dstkpadext) {
dst[DSTIND3(i, j, k)] = typeprops<T>::fromreal(0);
Expand Down
4 changes: 2 additions & 2 deletions CarpetLib/src/restrict_3d_stagger111.cc
Original file line number Diff line number Diff line change
Expand Up @@ -137,8 +137,8 @@ void restrict_3d_stagger111(

} else {

// Loop over coarse region
#pragma omp parallel
// Loop over coarse region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(restrict_3d_stagger111, i, j, k, 0, 0, 0, regiext, regjext,
regkext, dstipadext, dstjpadext, dstkpadext) {
dst[DSTIND3(i, j, k)] = typeprops<T>::fromreal(0);
Expand Down
8 changes: 4 additions & 4 deletions CarpetLib/src/restrict_3d_vc_rf2.cc
Original file line number Diff line number Diff line change
Expand Up @@ -211,8 +211,8 @@ void restrict_3d_vc_rf2(T const *restrict const src,

if (not use_loopcontrol_in_operators) {

// Loop over coarse region
#pragma omp parallel for collapse(3)
// Loop over coarse region
// #pragma omp parallel for collapse(3)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

for (int k = 0; k < regkext; ++k) {
for (int j = 0; j < regjext; ++j) {
for (int i = 0; i < regiext; ++i) {
Expand Down Expand Up @@ -240,8 +240,8 @@ void restrict_3d_vc_rf2(T const *restrict const src,

} else {

// Loop over coarse region
#pragma omp parallel
// Loop over coarse region
// #pragma omp parallel
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

CCTK_LOOP3(restrict_3d_vc_rf2, i, j, k, 0, 0, 0, regiext, regjext, regkext,
dstipadext, dstjpadext, dstkpadext) {
#ifdef CARPET_DEBUG
Expand Down
6 changes: 3 additions & 3 deletions CarpetLib/src/restrict_4d_rf2.cc
Original file line number Diff line number Diff line change
Expand Up @@ -101,13 +101,13 @@ void restrict_4d_rf2(T const *restrict const src,
ptrdiff_t const dstkoff = dstoff[2];
ptrdiff_t const dstloff = dstoff[3];

// Loop over coarse region
#pragma omp parallel for collapse(4)
// Loop over coarse region
// #pragma omp parallel for collapse(4)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be committed in this form. Rather the whole pragma should be removed.

for (int l = 0; l < reglext; ++l) {
for (int k = 0; k < regkext; ++k) {
for (int j = 0; j < regjext; ++j) {
#pragma omp simd
for (int i = 0; i < regiext; ++i) {

dst[DSTIND4(i, j, k, l)] = src[SRCIND4(2 * i, 2 * j, 2 * k, 2 * l)];
}
}
Expand Down