Test and Step Outputs #23
base: main
Conversation
Approved, but added a comment for discussion about what the failure behavior related to outputs should be.
examples/01-fails.yaml
- exec: node1
  run: /bin/true
  expect:
    - tests['fail-with-outputs'].outputs == {}
What if we have output from a failure and want to use it later (--continue-on-error and if: failure())?
Also, would this also be "{}" if the user had an empty output map but it succeeded? Maybe it should be nil (i.e. something falsey) on failure?
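For concreteness, something like this sketch (the step id, the steps[...] lookup by id, and whether a failed step's outputs remain visible at all are hypothetical here):

- exec: node1
  id: flaky
  run: /bin/false
  outputs:
    MSG: 'something useful'
- exec: node1
  if: failure()
  run: /bin/true
  expect:
    # hypothetical: assumes the failed step's outputs survive for later steps
    - steps['flaky'].outputs.MSG == 'something useful'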
Good points. I didn't expect running outputs after a failed test (or step) to be useful, as something "upstream" must have gone wrong (env interpolation threw an error, run didn't provide the expected result, etc.), but maybe I'm wrong about that. Perhaps we should still attempt to interpolate the outputs regardless, and if there's another error, so be it? (Btw, --continue-on-error would relate to tests observing other [failed] tests' outputs, whereas right now we only support if: failure() on steps, but I get your point.)
Quick related question to probe how you'd expect things to work. Right now, if env interpolation throws an error, we fail the test or step and do not run the steps or the run command, respectively. If we switch things such that outputs are always interpolated (except when if is falsy), do you still think we have the correct behavior in that env-stopping-run case? Do you think we should swallow env interpolation errors, returning nil or an empty string or something, then proceed with the rest of the test/step execution? To be clear, I think our current behavior is the right thing (albeit tricky to implement, it seems), and I'm also ok with outputs being a sort of exception that always runs, especially since the user would have some control here, e.g. outputs: { FOO: '${{ (failure() && ..) || ... }}' }.
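Spelled out, that guard might look like the following sketch (the expression grammar is assumed to be GHA-like, and the literal values are placeholders):

- exec: node1
  run: /bin/false
  outputs:
    # hypothetical guard: choose a fallback value when the step failed
    FOO: "${{ (failure() && 'fallback') || 'computed' }}"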
I absolutely agree with a failed test returning nil instead of "{}" - thank you, I'll fix that now. And you're right that a successful test will return "{}" regardless of whether it defined an outputs map or not. Perhaps I should add a test for that as well.
Ok, updated to dissoc outputs in the failing case and added a test for outputs on a successful test. Let me know what you think about running outputs on failed tests (which would probably reverse the decision to return nil/falsy outputs on failed tests).
Maybe I'm misinterpreting the test. It looks to me like the maybe-check-empty-failure-outputs test is not getting output because fail-with-outputs calls false, which fails the step (and therefore the test). So my concern is: what if, instead of false, a program is called that emits some stderr output and then fails? It seems like a later test that depends on it should be able to access/check that output somehow. An error during the actual interpolation of env/outputs should return null (but I'm not sure I see that case being tested).
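Roughly the scenario I'm describing, as a sketch (the top-level layout, test ids, depends syntax, and the idea that the failing command's stderr could be captured as an output are all assumptions):

tests:
  fail-with-stderr:
    steps:
      - exec: node1
        # hypothetical command: emit diagnostics on stderr, then fail
        run: 'echo "useful diagnostics" 1>&2; exit 1'
  check-failure-outputs:
    depends: [fail-with-stderr]
    steps:
      - exec: node1
        run: /bin/true
        expect:
          # the suggestion: the failed test's output should still be reachable
          - tests['fail-with-stderr'].outputs != null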
Allow users to define 'outputs' at the step and test levels: a map (like 'env') that is interpolated last, after the 'run' command is executed and checked by 'expect' conditions. Having 'outputs' facilitates cleaner reuse of computed values and comparison of values across containers without resorting to writing files and/or using ':host'.

Besides the addition of the new 'outputs' key, we allow adding 'id' to steps and add two new contexts, 'tests' and 'steps', which represent the previously completed tests and steps respectively. The 'tests' context and the 'depends' key are analogous to 'needs' in GitHub Actions; however, we currently do not check that a user expression that references a test via 'tests' also includes that test in its 'depends' graph. If a user references a non-existent test or a non-existent output, they get null (not an exception).

Due to our "in place" interpolation of tests and steps, we need to clear uninterpolated 'outputs' when a test/step fails or is skipped, so that future tests/steps don't observe uninterpolated strings as 'outputs'. Future refactoring of how tests/steps are interpolated will hopefully make this unnecessary.
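A minimal sketch of the shape this describes (test ids, values, the depends list syntax, and the exact placement of test-level 'outputs' are illustrative assumptions, not taken from the PR's examples):

tests:
  producer:
    # assumed placement: test-level outputs, interpolated after the steps run
    outputs:
      RESULT: "${{ steps['setup'].outputs.VALUE }}"
    steps:
      - exec: node1
        id: setup
        run: /bin/true
        outputs:
          VALUE: 'computed-here'
  consumer:
    depends: [producer]
    steps:
      - exec: node1
        run: /bin/true
        expect:
          # 'tests' context: analogous to GHA 'needs'
          - tests['producer'].outputs.RESULT == 'computed-here'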
;; If skipped/failed, outputs will be present but uninterpolated
test (if (passed? test)
       test
       (dissoc test :outputs))
Is the comment on this wrong now? Looks like the outputs are removed if it fails.
This PR builds off of #22 (to maintain the ./test/runexamples counts). Only the last commit is new.

In short, test/step 'outputs' work like GHA's job-level 'outputs'. I chose not to implement something analogous to GHA's GITHUB_OUTPUT; instead we use the same mechanism for both tests and steps.

The goal of outputs is to decrease the pain around comparing values within a test (start value vs. end value, or value in one container vs. value in another container) and around capturing/naming computed values to use later (either from an earlier step or an earlier test).
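For example, a cross-container comparison might look like this sketch (node2 and the captured value are hypothetical):

- exec: node1
  id: on-node1
  run: /bin/true
  outputs:
    ADDR: '10.0.0.1'  # placeholder for a value computed on node1
- exec: node2
  run: /bin/true
  expect:
    # compare against the value captured on node1, no files or :host needed
    - steps['on-node1'].outputs.ADDR == '10.0.0.1'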
The new expression contexts are called 'tests' (not 'needs' or 'depends') and 'steps'. I chose not to attempt to check that user-defined expressions referencing the 'tests' context actually refer to a test id in their 'depends' graph (either the immediately named shallow graph or the full resolved graph). Open to discussion here. Right now the user will get a null value if they reference something that doesn't exist, and it's up to them to ensure the right test is depended upon and runs first.
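For instance (a sketch assuming GHA-like dot access; the test and output names are hypothetical), both of these expect conditions would evaluate against null rather than throwing:

expect:
  - tests['no-such-test'].outputs == null
  - tests['some-test'].outputs.NO_SUCH_KEY == null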
The implementation of this feature is not fantastic, but I think it's really just exposing that the previous pending->/related work is still not right. I've been working on refactoring this into a form that, instead of interpolating tests and steps "in place" in execute-step and run-test, creates a new binding for the interpolated result and leaves the user's input test or step untouched. I think I'm getting closer, but it has held up the PR for this feature for a couple of weeks now. If it's ok, I propose reviewing the 'outputs' feature on its own, judging whether it makes this code significantly worse, and I will continue to explore better ways to interpolate things while handling exceptions etc.