Test and Step Outputs #23

Open · wants to merge 1 commit into main

Conversation

@jonsmock (Collaborator) commented Dec 4, 2024

This PR builds off of #22 (to maintain the ./test/runexamples counts). Only the last commit is new.

In short, test/step 'outputs' work like GHA's job-level 'outputs'. I chose not to implement something analogous to GHA's GITHUB_OUTPUT. Instead we use the same mechanism for both tests and steps.

The goal of outputs is to decrease the pain around comparing values within a test (start value vs. end value, or value in one container vs. value in another) and around capturing/naming computed values to use later (either from an earlier step or an earlier test).
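
For illustration, capturing a value in one test and consuming it from a dependent test might look roughly like this. This is a sketch, not code from the PR: the test names and file layout are made up, and 'step.stdout' is an assumed accessor for the step's output (the real capture mechanism may differ).

    tests:
      capture-value:
        steps:
          - exec: node1
            run: hostname
            # 'step.stdout' is an assumed accessor, used here for illustration
            outputs:
              HOST: ${{ step.stdout }}

      use-value:
        depends: [capture-value]
        steps:
          - exec: node2
            run: /bin/true
            expect:
            - tests['capture-value'].outputs.HOST != null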

The new expression contexts are called 'tests' (not 'needs' or 'depends') and 'steps'. I chose not to attempt to verify that user-defined expressions referencing the 'tests' context actually refer to a test id in their 'depends' graph (either the immediately named shallow graph or the full resolved graph); I'm open to discussion here. Right now the user will get a null value if they reference something that doesn't exist, and it's up to them to ensure the right test is depended upon and runs first.
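
Concretely, a reference to an unknown test id or output key evaluates to null rather than throwing, so an expectation along these lines (illustrative) would hold:

    - exec: node1
      run: /bin/true
      expect:
      - tests['no-such-test'] == null
      - tests['capture-value'].outputs.NO_SUCH_KEY == null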

The implementation of this feature is not fantastic, but I think it's really just exposing that the previous pending->/related work is still not right. I've been working on refactoring this into a form that, instead of interpolating tests and steps "in place" in execute-step and run-test, creates a new binding for the interpolated result and leaves the user's input test or step untouched. I think I'm getting closer, but it has held up this feature's PR for a couple of weeks now. If it's OK, I propose reviewing the 'outputs' feature on its own, on the grounds that it doesn't make this code significantly worse, and I will continue to explore better ways to interpolate things while handling exceptions etc.

@jonsmock requested review from kanaka and gdw2 on December 4, 2024 at 17:49
@kanaka (Collaborator) left a comment:

Approved, but I added a comment for discussion about what the failure behavior related to outputs should be.

- exec: node1
  run: /bin/true
  expect:
  - tests['fail-with-outputs'].outputs == {}

@kanaka (Collaborator):

What if we have output from a failure and want to use it later (--continue-on-error and if: failure())?

Also, would this be "{}" if the user had an empty output map but the test succeeded? Maybe it should be nil (i.e. something falsey) on failure?

@jonsmock (Collaborator, Author):

Good points. I didn't expect interpolating outputs after a failed test (or step) to be useful, since something "upstream" must have gone wrong (env interpolation threw an error, run didn't produce the expected result, etc.), but maybe I'm wrong about that. Perhaps we should still attempt to interpolate the outputs regardless, and if there's another error, so be it? (Btw, --continue-on-error would relate to tests observing other [failed] tests' outputs, whereas right now we only support if: failure() on steps, but I get your point.)

Quick related question to probe how you'd expect things to work. Right now, if env interpolation throws an error, we fail the test or step and do not run its steps or its run command, respectively. If we switch things so that outputs are always interpolated (except when 'if' is falsy), do you still think we have the correct behavior in that env-stopping-run case?

Or do you think we should swallow env interpolation errors, returning nil or an empty string or something, and then proceed with the rest of the test/step execution? To be clear, I think our current behavior is the right thing (albeit tricky to implement, it seems), and I'm also OK with outputs being a sort of exception that always runs, especially since the user has some control here, e.g. outputs: { FOO: '${{ (failure() && ..) || ... }}' }.
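
Spelled out, that kind of user-controlled guard might look something like the following. The fallback value and the steps['compute-foo'] lookup are illustrative, not from the PR:

    outputs:
      FOO: ${{ (failure() && 'fallback') || steps['compute-foo'].outputs.FOO }}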

I absolutely agree with a failed test returning nil instead of "{}"; thank you, I'll fix that now. And you're right that a successful test will return "{}" regardless of whether it defined an outputs map. Perhaps I should add a test for that as well.

@jonsmock (Collaborator, Author):

OK, updated to dissoc outputs in the failing case, and added a test for a successful test's outputs. Let me know what you think about running outputs on failed tests (which would probably reverse the decision to return nil/falsy outputs on failed tests).

@kanaka (Collaborator):

Maybe I'm misinterpreting the test. It looks to me like the maybe-check-empty-failure-outputs test is not getting output because fail-with-outputs calls false, which fails the step (and therefore the test). My concern is this: if, instead of false, a program is called that emits some stderr output and then fails, it seems like a later test that depends on it should be able to access/check that output somehow. An error during the actual interpolation of env/outputs should return null (but I'm not sure I see that case being tested).

Allow users to define 'outputs' at the step and test levels. Like 'env', 'outputs' is a map, but it is interpolated last, after the 'run' command has executed and been checked by 'expect' conditions. Having 'outputs' facilitates cleaner reuse of computed values and comparison of values across containers without resorting to writing files and/or using ':host'.

Besides the addition of the new 'outputs' key, we allow adding 'id' to
steps and add two new contexts, 'tests' and 'steps', which represent the
previously completed tests and steps respectively.
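
As a sketch, a step 'id' combined with the 'steps' context might be used within a single test like this (the id, commands, and 'step.stdout' accessor are illustrative assumptions, not from this commit):

    steps:
      - exec: node1
        id: get-time
        run: date +%s
        # 'step.stdout' is an assumed accessor for the step's stdout
        outputs:
          START: ${{ step.stdout }}
      - exec: node1
        run: /bin/true
        expect:
        - steps['get-time'].outputs.START != null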

The 'tests' context and the 'depends' key are analogous to 'needs' in
GitHub Actions; however, we currently do not check that a user
expression referencing a test via the 'tests' context also includes
that test in its 'depends' graph. If a user references a non-existent
test or a non-existent output, they get null (not an exception).

Due to our "in place" interpolation of tests and steps, we need to clear
uninterpolated 'outputs' when the test/step failed or was skipped, so
that future tests/steps don't observe uninterpolated strings as
'outputs'. Future refactoring of how tests/steps are interpolated will
hopefully make this unnecessary.

    ;; If skipped/failed, outputs will be present but uninterpolated
    test (if (passed? test)
           test
           (dissoc test :outputs))

@kanaka (Collaborator):

Is the comment on this wrong now? It looks like the outputs are removed if the test fails.
