-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor!: rename Container.exec_mocks to Container.execs and extend mocking #124
refactor!: rename Container.exec_mocks to Container.execs and extend mocking #124
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR description seems to suggest this isn't quite ready for review. Gave it a quick look and found a couple of things that need clarification.
handlers = { | ||
tuple(exc.command): (key, exc) for key, exc in self._container.execs.items() | ||
} | ||
for prefix_len in reversed(range(len(command) + 1)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure I understand what's going on.
Suppose you have two execs, one for ls -ll
and one for ls -la
, and the charm does ls --fubar
. Which exec mock will this match?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that neither will match.
However, if the real command is ls -la -ht
, then ls -la
will match.
Did I get that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Harness behaviour (which I intended to match here, although I think I'm too low on tests) is that the longest match wins, but also that you have to match what's there. So in your example, neither would match.
exec 1 | exec 2 | command | winner |
---|---|---|---|
['ls', '-ll'] | ['ls', '-la'] | ['ls', '--fubar'] | none |
[] | ['ls'] | ['ls', '--fubar'] | exec 2 |
[] | ['ls', '-ll'] | ['ls', '--fubar'] | exec 1 |
['ls', '-ll', 'foo.txt'] | ['ls', '-ll'] | ['ls', '-ll', 'foo.txt'] | exec 1 |
I'm not sure what Harness does if there are two that have the exact same command pattern, but it seems to me that should be an error (consistency check?).
What I see charms doing (with Harness) is:
- using
[]
, presumably because they know that only one command will run and they either want to keep the test code clean or can't be bothered copying the exact command over to the test code - using
[first]
when the command is things like[first, arg1, arg2, arg2]
- probably partly the same reasoning as[]
but maybe also because there are multiple ways that the charm could run it and the result should always be the same (I haven't looked deeply enough to check) - using the full command
It does seem like having this is a nice convenience. @benhoyt probably has the full story on how Harness came to have that behaviour (it slightly predates me joining Canonical if I have the timing right).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked over this with @benhoyt on a call also, and he agrees that it's worth keeping the prefix matching in. The spec for adding exec testing in Harness has some more of the background/details.
scenario/mocking.py
Outdated
return _MockExecProcess( | ||
change_id=change_id, | ||
command=command, | ||
out=out, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps we should uniformly rename 'out' to 'exec' while we're at it
Ah indeed, I missed that when moving a bunch to ready. Should be ok for review now, although there are some things to decide before it could actually be mergable. |
README.md
Outdated
execs={ | ||
"list-directory": scenario.Exec( # "list-directory" is an arbitrary tag for the command | ||
('ls', '-ll'): # this is the command we're mocking | ||
return_code=0, | ||
stdout=LS_LL | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was based on the earlier plan that @benhoyt and I came up with for 7.0 where State
components would generally be dicts. However, since we're looking at sets instead now, I think this would be better as:
execs={ | |
"list-directory": scenario.Exec( # "list-directory" is an arbitrary tag for the command | |
('ls', '-ll'): # this is the command we're mocking | |
return_code=0, | |
stdout=LS_LL | |
), | |
execs={ | |
scenario.Exec( | |
('ls', '-ll'): # this is the command we're mocking | |
return_code=0, | |
stdout=LS_LL | |
), | |
}, |
However, if we're going to adopt the "use the first command match" behaviour from Harness (it does seem like a bunch of charms are making use of this) then order matters, so it would actually need to be a list (or tuple):
execs={ | |
"list-directory": scenario.Exec( # "list-directory" is an arbitrary tag for the command | |
('ls', '-ll'): # this is the command we're mocking | |
return_code=0, | |
stdout=LS_LL | |
), | |
execs=[ | |
scenario.Exec( | |
('ls', '-ll'): # this is the command we're mocking | |
return_code=0, | |
stdout=LS_LL | |
), | |
], |
I'm no longer convinced that having mocked command encapsulated into the Exec
object alongside the results makes sense. So it would go back to the original, just with the rename:
execs={ | |
"list-directory": scenario.Exec( # "list-directory" is an arbitrary tag for the command | |
('ls', '-ll'): # this is the command we're mocking | |
return_code=0, | |
stdout=LS_LL | |
), | |
# The command we're mocking. | |
cmd = ('ls', '-ll') | |
# The mocked result to provide | |
result = scenario.Exec(return_code=0, stdout=LS_LL) | |
container = scenario.Container(name='foo', execs={cmd: result}) |
@benhoyt @PietroPasotti what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the last suggestion (back to mapping, rename only) makes sense but has a few downsides:
- all other objects are flat frozensets, why is this one a mapping?
- Dict is mutable. Perhaps we could use a frozendict to prevent further mutation at runtime?
Re the 'people rely on definition order' issue: it's true that dicts are ordered nowadays, but still it's a bit smelly to rely on that behaviour IMHO.
I also miss the context of the harness exec design, but I think we could make our lives easier if we force people to use the full command they want to mock.
We could also allow multiple commands per exec:
execs={
scenario.Exec(
cmds=(
('ls'),
('ls', '-ll'),
),
return_code=0,
stdout=LS_LL
),
},
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tonyandrewmeyer Let's discuss briefly after our daily sync. Given that we've settled on sets for everything, that's where I'd lean. Also, the "prefix matching" has been quite useful for Harness, though if that makes the implementation problematic, I'm open to not doing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@benhoyt and I discussed this in a call and agree that since most of the other components are moving to frozensets, this should as well (which means moving the command back into the object). @benhoyt suggested using command_prefix
rather than command
(like Harness) to make it clearer that it's going to pattern match.
I think if you want multiple commands to have the same return code / stdout / stderr, then it's pretty simple to just put those in variables and use them when constructing multiple Execs (in a set comprehension, perhaps).
handlers = { | ||
tuple(exc.command): (key, exc) for key, exc in self._container.execs.items() | ||
} | ||
for prefix_len in reversed(range(len(command) + 1)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Harness behaviour (which I intended to match here, although I think I'm too low on tests) is that the longest match wins, but also that you have to match what's there. So in your example, neither would match.
exec 1 | exec 2 | command | winner |
---|---|---|---|
['ls', '-ll'] | ['ls', '-la'] | ['ls', '--fubar'] | none |
[] | ['ls'] | ['ls', '--fubar'] | exec 2 |
[] | ['ls', '-ll'] | ['ls', '--fubar'] | exec 1 |
['ls', '-ll', 'foo.txt'] | ['ls', '-ll'] | ['ls', '-ll', 'foo.txt'] | exec 1 |
I'm not sure what Harness does if there are two that have the exact same command pattern, but it seems to me that should be an error (consistency check?).
What I see charms doing (with Harness) is:
- using
[]
, presumably because they know that only one command will run and they either want to keep the test code clean or can't be bothered copying the exact command over to the test code - using
[first]
when the command is things like[first, arg1, arg2, arg2]
- probably partly the same reasoning as[]
but maybe also because there are multiple ways that the charm could run it and the result should always be the same (I haven't looked deeply enough to check) - using the full command
It does seem like having this is a nice convenience. @benhoyt probably has the full story on how Harness came to have that behaviour (it slightly predates me joining Canonical if I have the timing right).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only a partial review today
handlers = { | ||
tuple(exc.command): (key, exc) for key, exc in self._container.execs.items() | ||
} | ||
for prefix_len in reversed(range(len(command) + 1)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that neither will match.
However, if the real command is ls -la -ht
, then ls -la
will match.
Did I get that right?
We (Tony, Dima, me) briefly discussed in our daily sync today and agreed:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good now :)
Co-authored-by: PietroPasotti <[email protected]>
Co-authored-by: PietroPasotti <[email protected]>
d088f85
to
02d59be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, but one minor question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some thoughts about the toplevel API.
scenario/state.py
Outdated
return_code: int = 0 | ||
"""The return code of the process (0 is success).""" | ||
stdout: str = "" | ||
"""Any content written to stdout by the process.""" | ||
stderr: str = "" | ||
"""Any content written to stderr by the process.""" | ||
stdin: Optional[str] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in order to avoid confusions, how about we make this private and add a stdin
property?
Also, conceptually, I feel that stdin
does not quite belong to the State but to Context and is more on the level of juju_log
.
Just spitballing here, but how do we feel about:
ctx = Context(MyCharm)
exe = MyExec(['ls', '-ll'])
ctr = Container("grafana", execs=[exe])
ctx.run('update-status', State(containers={ctr}))
# give me the stdin the charm sent to the grafana container while executing this command
stdin: str = ctx.get_stdin(ctr, exe)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in order to avoid confusions, how about we make this private and add a stdin property?
Yes, that's much better, thanks! Done. (Although it may end up being irrelevant anyway!)
Also, conceptually, I feel that stdin does not quite belong to the State but to Context and is more on the level of juju_log.
I feel like it's much tidier to have everything bundled up together in the one mock process object rather than split off the input to it into the context.
In addition, it does seem reasonably likely that we'd want to extend this in the future to handle the other 'input' elements of exec()
- to verify that the right envars are being set, or that it's being run in the right working directory, or as the right user, and so on. It would be fairly clean to add those as additional private attributes of Exec
and properties to expose them. I don't think we would want all of them as additional context methods, or to have ctx.get_exec_input(container, command)
return some new object (essentially ops.testing.ExecArgs
) that collects them all.
I also don't love how the Context
object is becoming more and more context+side_effects.
I do get that it's not in the output state in the way that changes to secrets or relation data are - once the process has finished, it's gone. It somewhat feels like stdout/stderr/return_code are the same, in that the executable presumably still lives in the container but what happens when you (mock) run it isn't really "state".
I spent a few minutes convincing the rest of charm-tech that it should stay as it is in our daily call today, and then as I've tried to write it out I've found myself disagreeing with my own arguments and trying to come up with alternatives (I will spare you the terrible ones!).
An executable is basically a method that takes some input and returns some output. What if we did something similar to the handle_exec
system from Harness instead?
class MyCharm(ops.CharmBase):
...
def _on_update_status(self, event):
container = self.unit.get_container("foo")
grep = container.exec(["grep", "scenario"])
grep.stdin.write("foo\nsome text the charm wrote in the scenario test\nfoo")
out, err = grep.wait_output()
# do something with out, err
def mock_grep(args: ops.testing.ExecArgs, ) -> ops.testing.ExecResult:
assert args.command == ["grep", "scenario"]
assert args.stdin == "foo\nsome text the charm wrote in the scenario test\nfoo"
return ops.testing.ExecResult(stdout="some text the charm wrote in the scenario test")
ctx = Context(MyCharm)
grep = Exec(["grep"], mock_grep)
container = Container("foo", execs={grep})
state_in = State(containers={container})
state_out = ctx.run(ctx.on.update_status(), state_in)
assert ...
So the mock executable lives on in the state, but the input and output is something that's is re-run each time. The input, output, and exit code all live together in one (temporary) place.
I'm not a huge fan of having some of the asserts end up in the mock function, breaking the very clean typical pattern of arrange/act/assert. However, you could still do that with nonlocal
or like:
command = []
input = []
def mock_grep(args: ops.testing.ExecArgs) -> ops.testing.ExecResult:
command.append(args.command)
input.append(args.stdin)
return ops.testing.ExecResult(stdout="some text the charm wrote in the scenario test")
ctx = Context(MyCharm)
grep = Exec(["grep"], mock_grep)
container = Container("foo", execs={grep})
state_in = State(containers={container})
state_out = ctx.run(ctx.on.update_status(), state_in)
assert command == ["grep", "scenario"]
assert input == ["..."]
I would personally leave out the "results" shortcut that Harness has. It could always be added later, and it's simple to do - result=127
is just lambda _: ExecResult(exit_code=127)
and result="foo"
is just lambda _: ExecResult(stdout="foo")
.
WDYT? Also @benhoyt since we were in agreement on a different result earlier today, sorry! I probably should have written up my draft response before talking to the team and then would have come to a different conclusion earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mmm I don't really like the proposal of doing the asserts in a temporary spot. It's very un-scenario-y.
perhaps:
with container.mock('ps', '-aux') as exec:
ctx.run(event, state)
assert exec.input == foo
assert exec.output == foo
with the risk that if you need to mock lots of commands you end up with a confusing bunch of individual contexts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do also like the flexibility that a method has, for things like returning different output depending on the arguments or the stdin. On the other hand, it can't be included in a JSON dump like the rest of the State
can, which is not great - although arguably it's not that different from Container.mounts
serialising to a map of locations of content, not the content itself.
I played around with contexts as well - they do seem to be a reasonably natural fit in terms of having something that only exists within a specific scope. It can indeed end up with a lot, e.g.:
with (
container.mock(['ps', 'aux'], stdout=PS),
container.mock(['sleep', '10']),
container.mock(['mysql']) as mysql_exec,
container.mock(['/bin/false'], exit_code=1),
container.mock(['grep', 'scenario']) as grep_exec,
):
ctx.run(event, state)
assert grep_exec.stdin == bar
assert mysql_exec.stdin == baz
It's worse in Python <3.10 where you don't get to wrap the contexts in ()
.
What about bundling it into the existing context manager?
exec = Exec(["ls", "-lh"], stdout=LS_LH)
container = Container(execs={exec})
state = State(containers={container})
ctx = Context(MyCharm)
with ctx(event, state) as mgr:
state_out = mgr.run()
assert mgr.exec_history == [ExecArgs(...), ExecArgs(...)]
If you don't care about how the process is executed and just need to mock out the output/return code then you don't need to use the context manager. I do like it more when all the pieces of the process are together, but I also get that it's potentially confusing with some being input and some output (and it's kinda opposite to what you expect, since the input the charm writes is the output of the test, etc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mmm, quite like the idea of enclosing it in the Manager object, but I don't think that's going to be as easy to remember as having it directly on the exec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in terms of API discoverability that's a net loss I'm afraid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so I've taken this back to something much closer to @PietroPasotti's earlier suggestion, as also discussed with the Charm-Tech team last Friday.
The Exec
object only has the results of the process (exit code, stdout, stderr), just like in Scenario 6, and the input to the process (stdin, user, group, environment, etc) are available from the Context in a new exec_history
attribute.
I feel like we're doing a great job of keeping the State
only state but not a great job at keeping Context
only context - it's more and more "context plus side effects". But this way of handling exec is consistent at the moment, so let's go with this way, and I'll open a different ticket for tidying up Context
.
scenario/state.py
Outdated
@property | ||
def stdin(self): | ||
"""The content the charm wrote to the mock process's stdin.""" | ||
return self._stdin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps, stdin is always string-like (str,bytes), and can never be None?
Or is None used as a sentinel?
🤔 I wonder if perhaps this should raise if the given Exec was never called... wdyt? Or... does Scenario only instantiate this object when charm attempts to exec something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps, stdin is always string-like (str,bytes), and can never be None?
I'm completely ignoring bytes at the moment, which is probably wrong (but also unchanged from earlier Scenario). If you consistently use bytes then it should work apart from static checks complaining, I think. It probably ought to be explicit, although it would be nice to avoid all the overloading that ops has (but maybe that's unavoidable given that this is mocking ops).
To your actual point: it's currently None when the object is created and then will be a string in the output state, as long as wait()
or wait_output()
was called on the process. Perhaps ""
would be a better default value, although if it does grow to handle bytes as well then that's more complex, so None
is probably better in that case. But this might end up immaterial, see the long comment above.
🤔 I wonder if perhaps this should raise if the given Exec was never called... wdyt? Or... does Scenario only instantiate this object when charm attempts to exec something?
Exec
is instantiated by charm tests, but _ExecProcess
is only instantiated by Scenario when something gets exec'd.
I don't think raising if an Exec
isn't used is a good idea:
- It doesn't really align with the general Scenario behaviour. The tests don't care if you have a secret or relation or network or whatever that isn't used.
- It's convenient to have a fixture or similar helper that gives state components, and to be able to use that for a bunch of tests, only some of which might be executing the processes.
- It doesn't really match the real world: the executable is (most likely) still in the filesystem regardless of whether you actually execute it or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, I think I'm happy with where this ended up. A few minor questions.
@PietroPasotti are you good with how this has ended up? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tonyandrewmeyer yes, definitely! Thanks for the patience :)
…mocking (#124) Firstly, some simple renaming: * `Container.exec_mocks` becomes `Container.execs` * `Container.service_status` becomes `Container.service_statuses` * `ExecOutput` becomes `Exec` A behaviour change that is a bugfix: * When a process exits non-zero, the `ExecError` should have the stdout/stderr if `wait_output` was used (it's not readable afterwards by the charm, although the Scenario code doesn't enforce that). Some more substantial changes: * Provide the ability to get the exact command, the stdin, and all the other args that the charm used with the process (everything from os.testing's `ExecArgs`), via a new context attribute `exec_history`, which is a (default) dict where the key is the container name and the value is a list of Pebble exec's that have been run. * Support the same "find the closest match to this prefix" system for commands as Harness does We could add more of the functionality that Harness has, but I think this is a solid subset (I've wanted to be able to get to the stdin in writing tests myself, and the simple mock matching seems handy). It should be easy enough to extend in the future without needing API changes, I think, since this now has both input and output. The key parts that are missing are properly supporting binary IO and the execution context. --------- Co-authored-by: PietroPasotti <[email protected]>
Firstly, some simple renaming:
Container.exec_mocks
becomesContainer.execs
Container.service_status
becomesContainer.service_statuses
ExecOutput
becomesExec
A behaviour change that is a bugfix:
ExecError
should have the stdout/stderr ifwait_output
was used (it's not readable afterwards by the charm, although the Scenario code doesn't enforce that).Some more substantial changes:
ExecArgs
), via a new context attributeexec_history
, which is a (default) dict where the key is the container name and the value is a list of Pebble exec's that have been run.We could add more of the functionality that Harness has, but I think this is a solid subset (I've wanted to be able to get to the stdin in writing tests myself, and the simple mock matching seems handy). It should be easy enough to extend in the future without needing API changes, I think, since this now has both input and output. The key parts that are missing are properly supporting binary IO and the execution context.