
Converted tests to use cgnsdiff #54

Merged: 18 commits into mdolab:master from refactor-tests on Oct 4, 2021
Conversation

@sseraj (Collaborator) commented Aug 23, 2021

Purpose

This is a second attempt at revamping the pyHyp tests, this time using cgnsdiff (based on the discussion in #52). I generated the reference files by running the examples with 2 procs on the u20-gcc-ompi-latest image.

I verified that the tests fail if any of the following change (a sketch of how the cgnsdiff comparison can be driven follows the list):

  • node coordinates (e.g. using different smoothing parameters)
  • boundary conditions
  • block dimensions
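
For illustration, here is a minimal sketch of how a cgnsdiff-based comparison can be driven from a unit test. This is not the exact pyHyp implementation: the file names, tolerance value, and flags are assumptions, and cgnsdiff is assumed to be on the PATH.

```python
import subprocess
import unittest


class TestMeshOutput(unittest.TestCase):
    def test_cgns_against_ref(self):
        # Compare the generated grid against the stored reference file.
        # "-d" (compare node data) and "-t" (float tolerance) are assumed
        # cgnsdiff options; adjust to the installed version.
        result = subprocess.run(
            ["cgnsdiff", "-d", "-t", "1e-12", "output.cgns", "ref.cgns"],
            capture_output=True,
            text=True,
        )
        # cgnsdiff prints any differences to stdout, so empty output means
        # the node coordinates, BCs, and block dimensions all match.
        self.assertEqual(result.stdout.strip(), "", msg=result.stdout)
```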

Type of change

What type of change is it?
Select the appropriate type(s) that describe this PR:

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (non-backwards-compatible fix or feature)
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no API changes)
  • Documentation update
  • Maintenance update
  • Other (please describe)

Testing

Testing steps are on the installation page in the docs.

Checklist

Put an x in the boxes that apply.

  • I have run flake8 and black to make sure the code adheres to PEP-8 and is consistently formatted
  • I have run unit and regression tests which pass locally with my changes
  • I have added new tests that prove my fix is effective or that my feature works
  • I have added necessary documentation

@sseraj requested a review from a team as a code owner August 23, 2021
@sseraj (Collaborator, Author) commented Aug 23, 2021

Adding the Barrier call fixed the I/O issue with the pyHypMulti tests. The TACC image failure is a separate issue that seems to be memory-related: the two tests that fail are the two large 3D cases.

@sseraj (Collaborator, Author) commented Aug 23, 2021

The TACC image error occurs in the writeCGNS call, but only when running in parallel. Any ideas on why this might be happening, @eirikurj @joanibal?

This did not occur before because all the examples were run in serial. I could use N_PROCS = 1, but that wouldn't solve the underlying problem.

@sseraj (Collaborator, Author) commented Aug 24, 2021

Adding a barrier call seems to have fixed a synchronization issue in writeCGNS that was only occurring on the TACC image. This PR should be good to go now.
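
For context, the fix amounts to synchronizing all ranks before the collective data movement inside the write loop. Below is a minimal mpi4py sketch of that pattern, illustrative only: pyHyp's writeCGNS does this in Fortran with a PETSc generalized scatter, and append_block here is a hypothetical writer helper.

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD


def write_blocks(blocks, append_block):
    # Gather each block of the mesh to the root rank and write it out,
    # one block at a time.
    for k, block in enumerate(blocks):
        # Synchronize before moving data, mirroring the mpi_barrier call
        # added before the scatter in writeCGNS.
        comm.Barrier()
        data = comm.gather(block, root=0)
        if comm.rank == 0:
            append_block(k, data)  # hypothetical: writes block k to the file
```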

@ewu63 (Collaborator) previously approved these changes Aug 25, 2021 and left a comment:

Looks good to me, thanks for putting in the work -- the tests look much nicer now. I would like to hear @eirikurj's or others' opinions on these extra Barrier calls before merging, though.

@marcomangano (Contributor) commented:

@eirikurj is this looking good?

@sseraj (Collaborator, Author) commented Sep 20, 2021

This is still awaiting review @eirikurj @SichengHe

@sseraj requested a review from anilyil September 27, 2021
```diff
@@ -92,6 +92,9 @@ subroutine writeCGNS(fileName)
       ! Copy values and write:
       do k=1,N

+         ! Synchronize before scattering
+         call mpi_barrier(hyp_comm_world, ierr)
```
Contributor:

What is the reasoning behind this barrier?

@sseraj (Collaborator, Author) replied:

The tests were failing on the TACC image without this barrier. I think it is related to the newer Intel MPI version used on that image. We did not see this problem before because all the tests ran on 1 proc.

Does it make sense to call a barrier before a PETSc generalized scatter? I'm not convinced this is a bug in pyHyp, as opposed to something wrong with the TACC image or Intel MPI.

Contributor:

I would have expected it to be fine even without the barrier, but I don't know enough. If it fixes an issue on TACC, then sure.

Collaborator:

Yeah, this is a bit weird. Sabet and I had some conversations about this, but I don't really understand what's going on. Maybe it's an MPI implementation-specific behaviour?

Contributor:

I can try to read and understand the scatter code and its usage, but it will probably take a few more days. To me this is not too performance-critical, since these meshes are generated once and not in a loop.

Collaborator:

Yeah, I thought the scatter was blocking already, but @sseraj tried many combinations and this was the only one that worked consistently.

@anilyil (Contributor) commented Oct 1, 2021

The tests look great, thanks for the work. I am fine with the final barrier before returning on the Python level, but I am curious about why we have that barrier before the scatter. If that also makes sense, this is good to go for me.

@anilyil mentioned this pull request Oct 1, 2021
@anilyil self-requested a review October 4, 2021
@anilyil previously approved these changes Oct 4, 2021
@anilyil requested a review from ewu63 October 4, 2021
@anilyil (Contributor) commented Oct 4, 2021

Can you also review again, @nwu63, so we can merge this?

@ewu63 (Collaborator) left a comment:

Just one comment: we don't actually need examples/__init__.py, right? AFAIK, since Python 3.3 we no longer need an __init__.py file for a package to be importable, and the os.path modifications should be enough.
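
For reference, a minimal sketch of the import setup under discussion, with an illustrative path layout: since Python 3.3, a directory with no __init__.py is still importable as a namespace package once its parent is on sys.path.

```python
import os
import sys

# Make the repo root importable so that "import examples" resolves as a
# namespace package (no examples/__init__.py needed on Python 3.3+).
repoRoot = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))
if repoRoot not in sys.path:
    sys.path.insert(0, repoRoot)

import examples  # noqa: E402
```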

```diff
@@ -5,14 +5,14 @@
 import sys


-def _tmp_pkg(directory):
+def _tmp_pkg(dirName):
```
Collaborator:

Why did you change this? `directory` isn't a built-in keyword, right? I'm fine with renaming, I just wanted to understand.

@sseraj (Collaborator, Author) replied:

I changed `dir` to `dirName` at the same time you changed `dir` to `directory`. I guess I won in the merge. I changed it back in any case.

Collaborator:

I like that better. I'll just remove the last commit and then approve.

```python
N_PROCS = 2

def setUp(self):
    # Insert the repo root in path so that testflo can import examples
```
Collaborator:

FYI this was also a trick I used in order to import examples

@anilyil previously approved these changes Oct 4, 2021
@anilyil self-requested a review October 4, 2021
@ewu63 merged commit 26916e2 into mdolab:master Oct 4, 2021
@sseraj deleted the refactor-tests branch October 4, 2021
@sseraj mentioned this pull request Oct 5, 2021