
Make compatible with non-coupling-enabled versions of grudge #898

Closed
wants to merge 1 commit

Conversation

majosm
Collaborator

@majosm majosm commented May 8, 2023

Makes coupling functionality conditional on having a coupling-enabled branch of grudge.

cc @inducer

Questions for the review:

  • Is the scope and purpose of the PR clear?
    • The PR should have a description.
    • The PR should have a guide if needed (e.g., an ordering).
  • Is every top-level method and class documented? Are things that should be documented actually so?
  • Is the interface understandable? (I.e. can someone figure out what stuff does?) Is it well-defined?
  • Does the implementation do what the docstring claims?
  • Is everything that is implemented covered by tests?
  • Do you see any immediate risks or performance disadvantages with the design? Example: what do interface normals attach to?

@majosm majosm force-pushed the optional-coupling branch 2 times, most recently from acb4596 to 84ce546 on May 9, 2023 15:06
@MTCam
Member

MTCam commented May 9, 2023

I think this change would be moving us in the wrong direction if we intend to run the prediction out of mirgecom@main. In my opinion, if the upstream packages cannot support prediction, then their tests of mirgecom deserve to fail. I don't like the strategy of massaging the tests or massaging what gets tested to cater specially to the upstream testing.

If the goal of mirgecom@main is to keep it in a state that always works with all subs@main, then I can see this change, but I thought one of our major goals was to make mirgecom@main capable of running prediction. I'd like to move the prediction testing to main, but that will definitely cause the upstream packages that cannot support prediction to fail their downstream testing of mirgecom.

@majosm
Collaborator Author

majosm commented May 9, 2023

I think this change would be moving us in the wrong direction if we intend to run the prediction out of mirgecom@main. In my opinion, if the upstream packages cannot support prediction, then their tests of mirgecom deserve to fail. I don't like the strategy of massaging the tests or massaging what gets tested to cater specially to the upstream testing.

If the goal of mirgecom@main is to keep it in a state that always works with all subs@main, then I can see this change, but I thought one of our major goals was to make mirgecom@main capable of running prediction. I'd like to move the prediction testing to main, but that will definitely cause the upstream packages that cannot support prediction to fail their downstream testing of mirgecom.

This PR (short of the requirements.txt change, which is temporary) shouldn't affect whether we can run the prediction out of main. The idea is to make the functionality that depends on inducer/grudge#285 optional until that PR gets merged, so that we can use it here in the meantime without also breaking all of the upstream CI. This does mean that any prediction tests that get added will need to be conditional as well.
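
To make the idea concrete, here is a minimal sketch of the kind of import-time guard being described; the probed module and function names are placeholders for illustration, not the actual contents of this PR or grudge's real API:

```python
# Sketch only: detect at import time whether the installed grudge carries
# the multi-volume coupling support from inducer/grudge#285. The probed
# module name is a placeholder, not grudge's actual layout.
try:
    import grudge.multi_volume_coupling  # noqa: F401  # hypothetical module
    HAVE_GRUDGE_COUPLING = True
except ImportError:
    HAVE_GRUDGE_COUPLING = False


def coupled_operator(*args, **kwargs):
    """Fail with a clear message when coupling support is unavailable."""
    if not HAVE_GRUDGE_COUPLING:
        raise ImportError(
            "This functionality requires a coupling-enabled branch of grudge "
            "(see inducer/grudge#285).")
    ...  # coupled-operator implementation (omitted in this sketch)
```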

In the future I think we should try to avoid changing requirements.txt on mirgecom@main; it tends to make a mess of things.

@MTCam
Member

MTCam commented May 9, 2023

This PR (short of the requirements.txt change, which is temporary) shouldn't affect whether we can run the prediction out of main. The idea is to make the functionality that depends on inducer/grudge#285 optional until that PR gets merged, so that we can use it here in the meantime without also breaking all of the upstream CI.

I get that, but that's what I meant. This change set just disables or avoids the problem that the upstream package deserves to encounter because it is not up to running what mirgecom needs to do the prediction.

This does mean that any prediction tests that get added will need to be conditional as well.

We cannot effectively test prediction in main with this strategy. I think what you are saying is that if our tests break the upstream CI tests, then we should change our tests. Should the prediction run from main? It cannot unless we change requirements@main to support prediction.

In the future I think we should try to avoid changing requirements.txt on mirgecom@main; it tends to make a mess of things.

It depends on our goals for mirgecom@main. If the goal is to keep the upstream packages' CI working on all their main branches, then I agree. If the goal is to do prediction from mirgecom@main, then the tests in mirgecom@main that check whether that is possible deserve to fail when the upstream packages do not support prediction.

At some point we should have a group discussion all together about what our goals are for mirgecom@main, and why the push to merge prediction capabilities to main. I am not opposed to keeping the production branch and moving all subpkg version tweaks up into production, but I think we should agree on a strategy together.

I've been pressured to step on the gas and make mirgecom@main capable of prediction, so that's been my focus, and the motivation behind the concern over the strategy taken by this change set.

@majosm
Collaborator Author

majosm commented May 9, 2023

I get that, but that's what I meant. This change set just disables or avoids the problem that the upstream package deserves to encounter because it is not up to running what mirgecom needs to do the prediction.

I don't think I agree with this. We change requirements.txt in mirgecom at our own risk. That shouldn't affect whether upstream CI deserves to pass or not.

We cannot effectively test prediction in main with this strategy. I think what you are saying is that if our tests break the upstream CI tests, then we should change our tests. Should the prediction run from main? It cannot unless we change requirements@main to support prediction.

We don't change the tests; we make running them conditional on whether we have the needed requirements or not (i.e., they run in our CI, but not upstream).
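
A hypothetical sketch of what "conditional" would look like with pytest (the probed module and test names are made up for the example):

```python
import importlib.util

import pytest

# Skip the coupling tests rather than fail them when the installed grudge
# lacks coupling support; the probed module name is a placeholder.
try:
    coupling_available = (
        importlib.util.find_spec("grudge.multi_volume_coupling") is not None)
except ModuleNotFoundError:  # grudge itself not installed
    coupling_available = False


@pytest.mark.skipif(not coupling_available,
                    reason="requires a coupling-enabled grudge "
                           "(inducer/grudge#285)")
def test_thermally_coupled_fluid_wall():
    ...  # coupling test body (omitted in this sketch)
```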

It depends on our goals for mirgecom@main. If the goal is to keep the upstream packages' CI working on all their main branches, then I agree. If the goal is to do prediction from mirgecom@main, then the tests in mirgecom@main that check whether that is possible deserve to fail when the upstream packages do not support prediction.

At some point we should have a group discussion all together about what our goals are for mirgecom@main, and why the push to merge prediction capabilities to main. I am not opposed to keeping the production branch and moving all subpkg version tweaks up into production, but I think we should agree on a strategy together.

I've been pressured to step on the gas and make mirgecom@main capable of prediction, so that's been my focus, and the motivation behind the concern over the strategy taken by this change set.

I'm suggesting that generally we shouldn't consider PRs ready to merge until their upstream dependencies have been merged. I think it creates too many potential headaches.

@MTCam
Member

MTCam commented May 10, 2023

We don't change the tests; we make running them conditional on whether we have the needed requirements or not (i.e., they run in our CI, but not upstream).

This is exactly what I mean by change the tests, though. We choose to only run the tests that pass by changing them in this way. I think we should indicate the capabilities that are required to do our prediction by the prediction tests, and not choose to run only the ones that pass. The failing ones kind of define what we should do next.

I'm suggesting that generally we shouldn't consider PRs ready to merge until their upstream dependencies have been merged. I think it creates too many potential headaches.

I don't disagree. That does mean that we realistically are not likely to run the prediction from main for a long time to come, and that's really ok with me. Just going by the rate at which the outstanding things are merging, and the rate that people are leaving.

It'd be great if production's only difference from main was in requirements.txt in that case. We could just switch all mirgecom@main back to subpkg development heads and upstream CIs would pass. That still leaves us with a hole in testing prediction. What if the prediction case had its own test script that upstream packages could call to test it? Then instead of making prediction tests part of the mirgecom tests, we could just run the driver-native downstream tests as their own package?

@majosm
Collaborator Author

majosm commented May 10, 2023

This is exactly what I mean by change the tests, though. We choose to only run the tests that pass by changing them in this way. I think we should indicate the capabilities that are required to do our prediction by the prediction tests, and not choose to run only the ones that pass. The failing ones kind of define what we should do next.

I mean, I get why you would want some indication that the upstream package does not presently support the entire feature set required by mirgecom. I don't know if making the CI fail is the right way to do that though, since it prevents other things from being merged in the meantime (unless that CI job is made optional, which is also not great). IMO CI should fail if something is wrong, not if something is TBD. pytest.xfail is maybe along the lines of what you're after (though I suppose that also counts as modifying the tests).
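
For comparison, a sketch of the pytest.xfail-style variant mentioned above: the missing capability still shows up in the test report as an expected failure instead of disappearing entirely, without turning the whole CI run red. Names are again illustrative:

```python
import importlib.util

import pytest

try:
    coupling_available = (
        importlib.util.find_spec("grudge.multi_volume_coupling") is not None)
except ModuleNotFoundError:  # grudge itself not installed
    coupling_available = False


# run=False avoids even attempting the test body when the capability is
# known to be missing; the report still records it as an expected failure.
@pytest.mark.xfail(not coupling_available,
                   reason="grudge lacks coupling support (inducer/grudge#285)",
                   run=False)
def test_coupled_prediction_smoke():
    ...  # prediction-coupling test body (omitted in this sketch)
```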

I don't disagree. That does mean that we realistically are not likely to run the prediction from main for a long time to come, and that's really ok with me. Just going by the rate at which the outstanding things are merging, and the rate that people are leaving.

Yeah, I'm mainly thinking about the wall coupling at the moment, which was the immediate cause of the CI issues. The fusion, etc. changes are more difficult. For long-lived subpackage branches like that, doing what we did may have been the only viable option (otherwise there would have been long-term incompatibilities between main and production, IIRC?). One thing we could maybe do in those cases is start treating the forked subpackage as the canonical one, set up CI there, and temporarily forget about the inducer version (it could disable its mirgecom CI, or limit it to a subset of supported functionality like what this change does). Then any changes from the inducer fork could be periodically merged into the canonical fork via PR (which then must pass the CI that tests mirgecom's full prediction feature set).

It'd be great if production's only difference from main was in requirements.txt in that case. We could just switch all mirgecom@main back to subpkg development heads and upstream CIs would pass. That still leaves us with a hole in testing prediction. What if the prediction case had its own test script that upstream packages could call to test it? Then instead of making prediction tests part of the mirgecom tests, we could just run the driver-native downstream tests as their own package?

(Not sure if I follow this part, I think I'd need to see more details.)

@MTCam
Member

MTCam commented May 10, 2023

I mean, I get why you would want some indication that the upstream package does not presently support the entire feature set required by mirgecom.

I think we need to consider whether the required feature set includes prediction. If so, then imo the upstream packages' CI should show where that falls down. I.e. the upstream CI failures are meaningful.

I don't know if making the CI fail is the right way to do that though, since it prevents other things from being merged in the meantime (unless that CI job is made optional, which is also not great). IMO CI should fail if something is wrong, not if something is TBD. pytest.xfail is maybe along the lines of what you're after (though I suppose that also counts as modifying the tests).

I don't know either, but I do have a very strong feeling that we need to plug the prediction testing holes that allow packages up-and-down the toolchain to be tweaked in such a way that they break, or at least frustrate the prediction effort. I think one of the main questions here is whether that prediction hole should be plugged by mirgecom@main, or mirgecom@production. Both have deep implications about the specifics of how we can test the prediction capability.

Yeah, I'm mainly thinking about the wall coupling at the moment, which was the immediate cause of the CI issues.

Right, but wall coupling is required for prediction, and for effective testing of prediction capabilities. If we intend to run the prediction out of mirgecom@main, then that prediction capability should be tested in mirgecom@main, and the upstream CI failure is a meaningful indication that the tested branch (say of grudge or arraycontext) does not provide required capabilities. This was the point which gave me pause over this current change set.

The fusion, etc. changes are more difficult. For long-lived subpackage branches like that, doing what we did may have been the only viable option (otherwise there would have been long-term incompatibilities between main and production, IIRC?). One thing we could maybe do in those cases is start treating the forked subpackage as the canonical one, set up CI there, and temporarily forget about the inducer version (it could disable its mirgecom CI, or limit it to a subset of supported functionality like what this change does). Then any changes from the inducer fork could be periodically merged into the canonical fork via PR (which then must pass the CI that tests mirgecom's full prediction feature set).

I like this suggestion, and I think I don't fully appreciate the difference between this approach and the one we have taken for mirgecom. It was my understanding that when we change requirements.txt in mirgecom@main, we basically have done what you suggested - made those indicated versions the "canonical" ones. I think that was the intent, at least.

What if the prediction case had its own test script that upstream packages could call to test it? Then instead of making prediction tests part of the mirgecom tests, we could just run the driver-native downstream tests as their own package?

(Not sure if I follow this part, I think I'd need to see more details.)

In our current strategy, prediction drivers are not tested by the upstream pkg CIs. Our current prediction testing is really only effective for changes made to mirgecom itself. The only prediction capabilities that are tested by upstream CI are those that are tested indirectly by running the examples in mirgecom@main - which is (iiuc) what the upstream CI tests do. That summarizes the big prediction testing hole I keep spouting on about. This hole frequently allows changes to be made (to upstream pkgs) which break or otherwise frustrate prediction efforts (and often at critical times).

On the other hand, mirgecom@main native CI actually runs the prediction drivers against proposed changes, and we get immediate feedback about whether those changes jibe with prediction. In a way, the native mirgecom@main CI's testing of prediction is our own "downstream" package test (i.e. if you view prediction as downstream of mirgecom).

So one thing that we could do is call drivers_y3-prediction a downstream package wrt mirgecom, and add that as an additional downstream testing suite that the upstream CIs (i.e. those of grudge, arraycontext, mirgecom, etc.) run. So for example, current grudge@main could pass its downstream testing of mirgecom, but would fail its downstream testing of prediction (and rightly so).

Regardless, I think we should have a discussion about this, and our strategy for mirgecom@main, mirgecom@production, and how to effectively test the prediction capabilities. Ultimately, I do not have a strong preference about the how, and where, but I have a very strong preference for plugging the prediction testing holes.

@majosm
Collaborator Author

majosm commented May 10, 2023

I like this suggestion, and I think I don't fully appreciate the difference between this approach and the one we have taken for mirgecom. It was my understanding that when we change requirements.txt in mirgecom@main, we basically have done what you suggested - made those indicated versions the "canonical" ones. I think that was the intent, at least.

The distinction just comes down to which fork CI gets run on. Currently we run each subpackage's CI on Andreas' fork even when we're not using it (i.e. we're using mine or Kaushik's). If we ran CI on those instead, we wouldn't have to worry about whether the branch has such-and-such capability or not; we know it does because that's what we're using for the prediction. At this point though it may not be worth the effort setting that up, because it sounds like Kaushik's additions to his branches will be winding down (and my branch is presently static apart from pulling things in from inducer).

In our current strategy, prediction drivers are not tested by the upstream pkg CIs. Our current prediction testing is really only effective for changes made to mirgecom itself. The only prediction capabilities that are tested by upstream CI are those that are tested indirectly by running the examples in mirgecom@main - which is (iiuc) what the upstream CI tests do. That summarizes the big prediction testing hole I keep spouting on about. This hole frequently allows changes to be made (to upstream pkgs) which break or otherwise frustrate prediction efforts (and often at critical times).

On the other hand, mirgecom@main native CI actually runs the prediction drivers against proposed changes, and we get immediate feedback about whether those changes jibe with prediction. In a way, the native mirgecom@main CI's testing of prediction is our own "downstream" package test (i.e. if you view prediction as downstream of mirgecom).

So one thing that we could do is call drivers_y3-prediction a downstream package wrt mirgecom, and add that as an additional downstream testing suite that the upstream CIs (i.e. those of grudge, arraycontext, mirgecom, etc.) run. So for example, current grudge@main could pass its downstream testing of mirgecom, but would fail its downstream testing of prediction (and rightly so).

Regardless, I think we should have a discussion about this, and our strategy for mirgecom@main, mirgecom@production, and how to effectively test the prediction capabilities. Ultimately, I do not have a strong preference about the how, and where, but I have a very strong preference for plugging the prediction testing holes.

I see. Yeah, I don't know what would be preferable between adding a separate CI job vs. adding prediction-analogous tests to mirgecom. Maybe we can discuss this together on Friday. (I still don't think I agree that upstream CI should fail if it's missing a feature needed for the prediction, but I guess that's a matter of philosophy.)

@MTCam
Member

MTCam commented May 10, 2023

The distinction just comes down to which fork CI gets run on. Currently we run each subpackage's CI on Andreas' fork even when we're not using it (i.e. we're using mine or Kaushik's). If we ran CI on those instead, we wouldn't have to worry about whether the branch has such-and-such capability or not; we know it does because that's what we're using for the prediction. At this point though it may not be worth the effort setting that up, because it sounds like Kaushik's additions to his branches will be winding down (and my branch is presently static apart from pulling things in from inducer).

I like this even more now. What if the major upstream packages (including at least grudge, meshmode, pytato, and arraycontext) had a "production" or "prediction" branch or fork, and we always pointed at those? Then we could update them through PR. I think that is what you are saying above, and I like it quite a lot. I'm not certain it would solve all of our prediction testing deficits, but I think it would help with some of the current dilemma over upstream pkg CIs.

It could be worth it for us to think more about this option, because it has potential for helping us encounter fewer of these critical prediction conflicts moving forward, even if there will be fewer changes to the actual branches/forks coming up.

The question of whether we should maintain the prediction capability as a collection of outstanding developments merged together into a "production" branch, or run it out of main, is still something we should think about and come together on.

I suppose if we go with your idea of having "production" versions of the sub-packages, then a "production" version of mirgecom is a nice symmetry there and gives us a chance to build on things that aren't ready for merge. The "production" subpkgs could just do their downstream prediction CI tests with the "production" version of mirgecom.

On the other hand, the "production" approach encourages us to build off things that aren't ready for merge. If we have "production" subpackages then we could just point mirgecom@main at those and never change them. A more rigid/strict approach would force us to merge more often and be more mindful of changes to the prediction-supporting toolchain.

Our choices now about this development "flow" and strategy have a pretty heavy effect on our choices for testing the prediction capabilities.

@majosm
Collaborator Author

majosm commented Aug 14, 2024

Superseded by #1048, closing.

@majosm majosm closed this Aug 14, 2024