
refactor!: rename Container.exec_mocks to Container.execs and extend mocking #124

Merged

Conversation


@tonyandrewmeyer commented Apr 18, 2024

Firstly, some simple renaming:

  • Container.exec_mocks becomes Container.execs
  • Container.service_status becomes Container.service_statuses
  • ExecOutput becomes Exec

A behaviour change that is a bugfix:

  • When a process exits non-zero, the ExecError should have the stdout/stderr if wait_output was used (it's not readable afterwards by the charm, although the Scenario code doesn't enforce that).

Some more substantial changes:

  • Provide the ability to get the exact command, the stdin, and all the other args that the charm used with the process (everything from ops.testing's ExecArgs), via a new context attribute exec_history, which is a dict (a defaultdict, in fact) where the key is the container name and the value is a list of the Pebble execs that have been run.
  • Support the same "find the closest match to this prefix" system for commands as Harness does

We could add more of the functionality that Harness has, but I think this is a solid subset (I've wanted to be able to get to the stdin when writing tests myself, and the simple mock matching seems handy). It should be easy enough to extend in the future without needing API changes, I think, since this now covers both input and output. The key parts that are missing are proper support for binary IO and for the execution context.
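
In test code this looks roughly like the following (a sketch, assuming the API shape the thread converges on below; MyCharm and LS_LL are placeholders):

    import scenario

    ctx = scenario.Context(MyCharm)
    container = scenario.Container(
        "workload",
        can_connect=True,
        execs={scenario.Exec(("ls", "-ll"), return_code=0, stdout=LS_LL)},
    )
    ctx.run(ctx.on.update_status(), scenario.State(containers={container}))
    # Everything the charm passed to the process (the full command, stdin, and
    # the rest of ops.testing's ExecArgs) is recorded per container name:
    assert ctx.exec_history["workload"][0].command == ["ls", "-ll"]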

@tonyandrewmeyer marked this pull request as ready for review May 22, 2024 00:08
@PietroPasotti left a comment:

The PR description seems to suggest this isn't quite ready for review. Gave it a quick look and found a couple of things that need clarification.

handlers = {
    tuple(exc.command): (key, exc) for key, exc in self._container.execs.items()
}
for prefix_len in reversed(range(len(command) + 1)):
Collaborator:

not sure I understand what's going on.
Suppose you have two execs, one for ls -ll and one for ls -la, and the charm does ls --fubar. Which exec mock will this match?

Collaborator:

I think that neither will match.

However, if the real command is ls -la -ht, then ls -la will match.

Did I get that right?

tonyandrewmeyer (author):

The Harness behaviour (which I intended to match here, although I think I'm still light on tests for it) is that the longest match wins, but also that you have to match what's there. So in your example, neither would match.

    exec 1                    exec 2         command                   winner
    ['ls', '-ll']             ['ls', '-la']  ['ls', '--fubar']         none
    []                        ['ls']         ['ls', '--fubar']         exec 2
    []                        ['ls', '-ll']  ['ls', '--fubar']         exec 1
    ['ls', '-ll', 'foo.txt']  ['ls', '-ll']  ['ls', '-ll', 'foo.txt']  exec 1

I'm not sure what Harness does if there are two that have the exact same command pattern, but it seems to me that should be an error (consistency check?).

What I see charms doing (with Harness) is:

  • using [], presumably because they know that only one command will run and they either want to keep the test code clean or can't be bothered copying the exact command over to the test code
  • using [first] when the command is something like [first, arg1, arg2, arg3] - probably partly the same reasoning as [], but maybe also because there are multiple ways that the charm could run it and the result should always be the same (I haven't looked deeply enough to check)
  • using the full command

It does seem like having this is a nice convenience. @benhoyt probably has the full story on how Harness came to have that behaviour (it slightly predates me joining Canonical if I have the timing right).
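
A minimal sketch of that longest-prefix-wins lookup (a hypothetical helper mirroring the loop quoted above, not the exact PR code):

    def find_exec(execs, command):
        """Return the mock whose command prefix is the longest match for command."""
        handlers = {tuple(e.command): e for e in execs}
        # Try the full command first, then progressively shorter prefixes,
        # down to the empty prefix (), which matches anything.
        for prefix_len in reversed(range(len(command) + 1)):
            mock = handlers.get(tuple(command[:prefix_len]))
            if mock is not None:
                return mock
        return None  # e.g. neither ['ls', '-ll'] nor ['ls', '-la'] for ['ls', '--fubar']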

tonyandrewmeyer (author):

Talked over this with @benhoyt on a call also, and he agrees that it's worth keeping the prefix matching in. The spec for adding exec testing in Harness has some more of the background/details.

return _MockExecProcess(
    change_id=change_id,
    command=command,
    out=out,
Collaborator:

perhaps we should uniformly rename 'out' to 'exec' while we're at it

tonyandrewmeyer (author):

> The PR description seems to suggest this isn't quite ready for review. Gave it a quick look and found a couple of things that need clarification.

Ah indeed, I missed that when moving a bunch to ready. Should be ok for review now, although there are some things to decide before it could actually be mergeable.

README.md (outdated), comment on lines 701 to 654:
execs={
    "list-directory": scenario.Exec(  # "list-directory" is an arbitrary tag for the command
        ('ls', '-ll'),  # this is the command we're mocking
        return_code=0,
        stdout=LS_LL
    ),
tonyandrewmeyer (author) commented May 28, 2024:

This was based on the earlier plan that @benhoyt and I came up with for 7.0 where State components would generally be dicts. However, since we're looking at sets instead now, I think this would be better as:

Suggested change:

-execs={
-    "list-directory": scenario.Exec(  # "list-directory" is an arbitrary tag for the command
-        ('ls', '-ll'),  # this is the command we're mocking
-        return_code=0,
-        stdout=LS_LL
-    ),
+execs={
+    scenario.Exec(
+        ('ls', '-ll'),  # this is the command we're mocking
+        return_code=0,
+        stdout=LS_LL
+    ),
+},

However, if we're going to adopt the "use the first command match" behaviour from Harness (it does seem like a bunch of charms are making use of this) then order matters, so it would actually need to be a list (or tuple):

Suggested change:

-execs={
-    "list-directory": scenario.Exec(  # "list-directory" is an arbitrary tag for the command
-        ('ls', '-ll'),  # this is the command we're mocking
-        return_code=0,
-        stdout=LS_LL
-    ),
+execs=[
+    scenario.Exec(
+        ('ls', '-ll'),  # this is the command we're mocking
+        return_code=0,
+        stdout=LS_LL
+    ),
+],

I'm no longer convinced that having the mocked command encapsulated in the Exec object alongside the results makes sense. So it would go back to the original, just with the rename:

Suggested change:

-execs={
-    "list-directory": scenario.Exec(  # "list-directory" is an arbitrary tag for the command
-        ('ls', '-ll'),  # this is the command we're mocking
-        return_code=0,
-        stdout=LS_LL
-    ),
+# The command we're mocking.
+cmd = ('ls', '-ll')
+# The mocked result to provide.
+result = scenario.Exec(return_code=0, stdout=LS_LL)
+container = scenario.Container(name='foo', execs={cmd: result})

@benhoyt @PietroPasotti what do you think?

Collaborator:

I think the last suggestion (back to mapping, rename only) makes sense but has a few downsides:

  • all other objects are flat frozensets, why is this one a mapping?
  • Dict is mutable. Perhaps we could use a frozendict to prevent further mutation at runtime?

Re the 'people rely on definition order' issue: it's true that dicts are ordered nowadays, but still it's a bit smelly to rely on that behaviour IMHO.

I also lack the context of the Harness exec design, but I think we could make our lives easier if we force people to use the full command they want to mock.

We could also allow multiple commands per exec:

        execs={
            scenario.Exec(
                cmds=(
                    ('ls',),
                    ('ls', '-ll'),
                ),
                return_code=0,
                stdout=LS_LL
            ),
        },

Collaborator:

@tonyandrewmeyer Let's discuss briefly after our daily sync. Given that we've settled on sets for everything, that's where I'd lean. Also, the "prefix matching" has been quite useful for Harness, though if that makes the implementation problematic, I'm open to not doing it.

tonyandrewmeyer (author):

@benhoyt and I discussed this in a call and agree that since most of the other components are moving to frozensets, this should as well (which means moving the command back into the object). @benhoyt suggested using command_prefix rather than command (like Harness) to make it clearer that it's going to pattern match.

I think if you want multiple commands to have the same return code / stdout / stderr, then it's pretty simple to just put those in variables and use them when constructing multiple Execs (in a set comprehension, perhaps).
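
For example, a sketch of that suggestion (assuming command_prefix is Exec's first argument, per the agreement above; LS_LL is a placeholder):

    shared_result = dict(return_code=0, stdout=LS_LL)
    container = scenario.Container(
        "foo",
        execs={
            scenario.Exec(prefix, **shared_result)
            for prefix in [("ls",), ("ls", "-ll"), ("ls", "-la")]
        },
    )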

@dimaqq left a comment:

only a partial review today


@tonyandrewmeyer marked this pull request as draft May 30, 2024 07:24

benhoyt commented May 31, 2024

We (Tony, Dima, me) briefly discussed in our daily sync today and agreed:

  1. execs should be a set (like other top-level types are now)
  2. The prefix-matching in Harness.handle_exec has been very useful, and we think we should replicate it here. The first argument of scenario.Exec will be command_prefix (not keyword-only). Often just saying execs={scenario.Exec(['ls'])} or similar will be enough.
  3. The stdin field is an output field, so we need a way to get the Exec output instance. We'd add a Container.get_exec() method where the lookup argument was either the command_prefix or the input exec instance -- same as other get_foo methods at the top level.
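
In test code, that agreement sketches out roughly as follows (assuming the Container.get_exec signature described in point 3; note that the design shifts again later in this thread):

    ctx = scenario.Context(MyCharm)
    container = scenario.Container("workload", execs={scenario.Exec(["ls"])})
    state_out = ctx.run(ctx.on.update_status(), scenario.State(containers={container}))
    # Look up the output Exec instance, by command prefix or by the input instance:
    exec_out = state_out.get_container("workload").get_exec(["ls"])
    # stdin is an output field: what the charm wrote to the process.
    assert exec_out.stdin is not None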

@dimaqq left a comment:

Looks pretty good now :)

@tonyandrewmeyer force-pushed the container-exec-enhancements branch from d088f85 to 02d59be August 7, 2024 00:26
@tonyandrewmeyer marked this pull request as ready for review August 7, 2024 01:28
@benhoyt left a comment:

Looks good to me, but one minor question.

@PietroPasotti left a comment:

some thoughts about the top-level API.

return_code: int = 0
"""The return code of the process (0 is success)."""
stdout: str = ""
"""Any content written to stdout by the process."""
stderr: str = ""
"""Any content written to stderr by the process."""
stdin: Optional[str] = None
@PietroPasotti:

In order to avoid confusion, how about we make this private and add a stdin property?

Also, conceptually, I feel that stdin does not quite belong to the State but to Context and is more on the level of juju_log.

Just spitballing here, but how do we feel about:

ctx = Context(MyCharm)
exe = MyExec(['ls', '-ll'])
ctr = Container("grafana", execs=[exe])
ctx.run('update-status', State(containers={ctr}))

# give me the stdin the charm sent to the grafana container while executing this command
stdin: str = ctx.get_stdin(ctr, exe)

tonyandrewmeyer (author):

> In order to avoid confusion, how about we make this private and add a stdin property?

Yes, that's much better, thanks! Done. (Although it may end up being irrelevant anyway!)

> Also, conceptually, I feel that stdin does not quite belong to the State but to Context and is more on the level of juju_log.

I feel like it's much tidier to have everything bundled up together in the one mock process object, rather than splitting the input off into the context.

In addition, it does seem reasonably likely that we'd want to extend this in the future to handle the other 'input' elements of exec() - to verify that the right envars are being set, or that it's being run in the right working directory, or as the right user, and so on. It would be fairly clean to add those as additional private attributes of Exec and properties to expose them. I don't think we would want all of them as additional context methods, or to have ctx.get_exec_input(container, command) return some new object (essentially ops.testing.ExecArgs) that collects them all.

I also don't love how the Context object is becoming more and more context+side_effects.

I do get that it's not in the output state in the way that changes to secrets or relation data are - once the process has finished, it's gone. It somewhat feels like stdout/stderr/return_code are the same, in that the executable presumably still lives in the container but what happens when you (mock) run it isn't really "state".

In our daily call today I spent a few minutes convincing the rest of Charm-Tech that it should stay as it is, and then as I've tried to write this out I've found myself disagreeing with my own arguments and trying to come up with alternatives (I will spare you the terrible ones!).

An executable is basically a method that takes some input and returns some output. What if we did something similar to the handle_exec system from Harness instead?

class MyCharm(ops.CharmBase):
    ...
    def _on_update_status(self, event):
        container = self.unit.get_container("foo")
        grep = container.exec(["grep", "scenario"])
        grep.stdin.write("foo\nsome text the charm wrote in the scenario test\nfoo")
        out, err = grep.wait_output()
        # do something with out, err

def mock_grep(args: ops.testing.ExecArgs) -> ops.testing.ExecResult:
    assert args.command == ["grep", "scenario"]
    assert args.stdin == "foo\nsome text the charm wrote in the scenario test\nfoo"
    return ops.testing.ExecResult(stdout="some text the charm wrote in the scenario test")

ctx = Context(MyCharm)
grep = Exec(["grep"], mock_grep)
container = Container("foo", execs={grep})
state_in = State(containers={container})

state_out = ctx.run(ctx.on.update_status(), state_in)

assert ...

So the mock executable lives on in the state, but the input and output are produced anew each time it's run. The input, output, and exit code all live together in one (temporary) place.

I'm not a huge fan of having some of the asserts end up in the mock function, breaking the very clean typical arrange/act/assert pattern. However, you could still keep them at the end, with nonlocal or something like:

command = []
input = []

def mock_grep(args: ops.testing.ExecArgs) -> ops.testing.ExecResult:
    command.append(args.command)
    input.append(args.stdin)
    return ops.testing.ExecResult(stdout="some text the charm wrote in the scenario test")

ctx = Context(MyCharm)
grep = Exec(["grep"], mock_grep)
container = Container("foo", execs={grep})
state_in = State(containers={container})

state_out = ctx.run(ctx.on.update_status(), state_in)

assert command == [["grep", "scenario"]]
assert input == ["..."]

I would personally leave out the "results" shortcut that Harness has. It could always be added later, and it's simple to do - result=127 is just lambda _: ExecResult(exit_code=127) and result="foo" is just lambda _: ExecResult(stdout="foo").

WDYT? Also @benhoyt since we were in agreement on a different result earlier today, sorry! I probably should have written up my draft response before talking to the team and then would have come to a different conclusion earlier.

@PietroPasotti commented Aug 8, 2024:

mmm I don't really like the proposal of doing the asserts in a temporary spot. It's very un-scenario-y.

perhaps:

with container.mock('ps', '-aux') as exec:
    ctx.run(event, state)
    assert exec.input == foo
    assert exec.output == foo

with the risk that if you need to mock lots of commands you end up with a confusing bunch of individual contexts?

tonyandrewmeyer (author) commented Aug 8, 2024:

I do also like the flexibility that a method has, for things like returning different output depending on the arguments or the stdin. On the other hand, it can't be included in a JSON dump like the rest of the State can, which is not great - although arguably it's not that different from Container.mounts serialising to a map of locations of content, not the content itself.

I played around with contexts as well - they do seem to be a reasonably natural fit in terms of having something that only exists within a specific scope. It can indeed end up with a lot, e.g.:

with (
    container.mock(['ps', 'aux'], stdout=PS),
    container.mock(['sleep', '10']),
    container.mock(['mysql']) as mysql_exec,
    container.mock(['/bin/false'], exit_code=1),
    container.mock(['grep', 'scenario']) as grep_exec,
):
    ctx.run(event, state)
    assert grep_exec.stdin == bar
    assert mysql_exec.stdin == baz

It's worse in Python <3.10 where you don't get to wrap the contexts in ().

What about bundling it into the existing context manager?

exec = Exec(["ls", "-lh"], stdout=LS_LH)
container = Container(execs={exec})
state = State(containers={container})
ctx = Context(MyCharm)
with ctx(event, state) as mgr:
    state_out = mgr.run()
    assert mgr.exec_history == [ExecArgs(...), ExecArgs(...)]

If you don't care about how the process is executed and just need to mock out the output/return code then you don't need to use the context manager. I do like it more when all the pieces of the process are together, but I also get that it's potentially confusing with some being input and some output (and it's kinda opposite to what you expect, since the input the charm writes is the output of the test, etc).

Collaborator:

mmm, quite like the idea of enclosing it in the Manager object, but I don't think that's going to be as easy to remember as having it directly on the exec.

Collaborator:

in terms of API discoverability that's a net loss I'm afraid

tonyandrewmeyer (author):

Ok, so I've taken this back to something much closer to @PietroPasotti's earlier suggestion, as also discussed with the Charm-Tech team last Friday.

The Exec object only has the results of the process (exit code, stdout, stderr), just like in Scenario 6, and the input to the process (stdin, user, group, environment, etc) are available from the Context in a new exec_history attribute.

I feel like we're doing a great job of keeping State state-only, but not such a great job of keeping Context context-only: it's becoming more and more "context plus side effects". But this way of handling exec is consistent for the moment, so let's go with it, and I'll open a separate ticket for tidying up Context.
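
Concretely, the merged shape is roughly (a sketch; MyCharm and LS_LL are placeholders):

    exec = scenario.Exec(("ls", "-ll"), return_code=0, stdout=LS_LL)
    container = scenario.Container("workload", execs={exec})
    ctx = scenario.Context(MyCharm)
    ctx.run(ctx.on.update_status(), scenario.State(containers={container}))
    # Results (return_code, stdout, stderr) live on the Exec; the inputs the
    # charm provided (command, stdin, user, group, environment, ...) are
    # recorded on the context, keyed by container name:
    args = ctx.exec_history["workload"][0]
    assert args.command == ["ls", "-ll"]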

Comment on lines 700 to 703
@property
def stdin(self):
    """The content the charm wrote to the mock process's stdin."""
    return self._stdin
Collaborator:

Perhaps stdin is always string-like (str, bytes) and can never be None?

Or is None used as a sentinel?

🤔 I wonder if perhaps this should raise if the given Exec was never called... wdyt? Or... does Scenario only instantiate this object when the charm attempts to exec something?

tonyandrewmeyer (author):

> Perhaps stdin is always string-like (str, bytes) and can never be None?

I'm completely ignoring bytes at the moment, which is probably wrong (but also unchanged from earlier Scenario). If you consistently use bytes then it should work apart from static checks complaining, I think. It probably ought to be explicit, although it would be nice to avoid all the overloading that ops has (but maybe that's unavoidable given that this is mocking ops).

To your actual point: it's currently None when the object is created and then will be a string in the output state, as long as wait() or wait_output() was called on the process. Perhaps "" would be a better default value, although if it does grow to handle bytes as well then that's more complex, so None is probably better in that case. But this might end up immaterial, see the long comment above.

> 🤔 I wonder if perhaps this should raise if the given Exec was never called... wdyt? Or... does Scenario only instantiate this object when the charm attempts to exec something?

Exec is instantiated by charm tests, but _ExecProcess is only instantiated by Scenario when something gets exec'd.

I don't think raising if an Exec isn't used is a good idea:

  • It doesn't really align with the general Scenario behaviour. The tests don't care if you have a secret or relation or network or whatever that isn't used.
  • It's convenient to have a fixture or similar helper that gives state components, and to be able to use that for a bunch of tests, only some of which might be executing the processes.
  • It doesn't really match the real world: the executable is (most likely) still in the filesystem regardless of whether you actually execute it or not.

@benhoyt left a comment:

Overall looks good, I think I'm happy with where this ended up. A few minor questions.

tonyandrewmeyer (author):

@PietroPasotti are you good with how this has ended up?

@PietroPasotti left a comment:

@tonyandrewmeyer yes, definitely! Thanks for the patience :)

@tonyandrewmeyer merged commit 0e4ed16 into canonical:7.0 Aug 20, 2024
2 checks passed
@tonyandrewmeyer deleted the container-exec-enhancements branch August 20, 2024 06:19
tonyandrewmeyer added a commit that referenced this pull request Sep 2, 2024
refactor!: rename Container.exec_mocks to Container.execs and extend mocking (#124)


Co-authored-by: PietroPasotti <[email protected]>