This is more of a question than a bug report, so my apologies in advance if this is not the best place for it.
We are using UMT to help processor architects with their next-generation design, and they want to parallelize the hot loop of the code across a number of microthreads. We ran a VTune analysis on the MFEM test case, and the hot spots appear to be lines 270 and 278 of SweepUCBxyz.F90. The architects are now wondering whether they can parallelize the two outermost loops, HyperPlaneLoop (line 103) and ZoneLoop (line 107). Is there a data dependency in either of these loops that would prevent parallelization? Is the value of c0 unique in each iteration? It appears to be, based on the data from the test case, but we cannot say that for sure. We would truly appreciate any guidance on this and, more generally, on OpenMP parallelization of the code. Thank you in advance for your time.
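To make the c0 question concrete, here is a minimal sketch of the property we are asking about. It is not UMT code. It assumes (hypothetically) that each loop iteration writes only to a contiguous range of entries starting at its own c0, and the names `offsets`, `counts`, and `write_ranges_are_disjoint` are made up for illustration. Under that assumption, "c0 is unique in each iteration" amounts to the write ranges never overlapping:

```python
def write_ranges_are_disjoint(offsets, counts):
    """Return True if no two iterations write to the same array entry.

    offsets[i] is the (hypothetical) c0 of iteration i, and counts[i] is
    the number of consecutive entries that iteration writes, so iteration
    i is assumed to touch the half-open range [c0, c0 + count).
    """
    # Sort the non-empty ranges by start so that any overlap must show up
    # between two neighbouring ranges.
    spans = sorted((c0, c0 + n) for c0, n in zip(offsets, counts) if n > 0)
    for (_, prev_end), (start, _) in zip(spans, spans[1:]):
        if start < prev_end:
            return False  # two iterations would write the same entry
    return True


# Example with made-up numbers: three iterations, four entries each.
print(write_ranges_are_disjoint([0, 4, 8], [4, 4, 4]))  # no shared entries
print(write_ranges_are_disjoint([0, 3, 8], [4, 4, 4]))  # entry 3 is shared
```

Such a check would only rule out write conflicts on the data it is given. It would not tell us whether one iteration reads values produced by another, which is the other half of the dependency question above.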