How to use `--store-durations` in Github Actions?

Let me first thank you for this great tool, which makes it super easy to decrease testing time in Github Actions.

I have some interrogations about how the new feature to combine `--store-durations` with `--groups` is supposed to be used to update test timings while running the splitted test suite during CI. I couldn't find any documentation, but I assume the idea is to do as suggested by @sondrelg in https://github.com/jerry-git/pytest-split/issues/11#issuecomment-850256162 and basically use the Github [actions/cache](https://github.com/actions/cache) to cache the `.test_durations` file.

I see two problems with that approach, both due to the fact that, as far as I understand, loading the cache is done at the beginning of the job, storing the cache at the end. 
1. If there are several concurrent runs of the tests for different groups, they will read the same value of the file at the beginning if available (from the previous run), but they will try to overwrite the cache when they finish. The result is that the slowest group will have its durations persisted, because faster groups will have their durations overwritten almost immediately by slower ones. I believe this will cause all test durations from faster groups to be overestimated (as they will have no durations, so average test duration of the slowest group will be taken), so on average the other groups with unestimated tests will still finish faster, and it's not clear to me that over several runs, the durations will converge to the accurate values, as the slowest group will probably remain the slowest.
2. I don't think there's any guarantee that a job for a given group can't start after the job for a different group in the same run finishes. However, if that happens, the first job will have already updated the durations in the cache before the second one starts, causing a potentially different split into groups, which might cause some tests to run twice or not to run at all.

Unless I misunderstood how this whole thing works, I think the more robust approach would be to store the group durations as artifacts, and then have an additional job in the workflow which depends on the group runs that consolidates all the separate artifacts into one duration file and caches that. That would solve problem 1. as all group durations will be taken into account, and problem 2. as the caching will happen only after all test runs finish. The missing piece to implement such a strategy is a tool that can combine the test durations from the different groups, and a way to annotate in the `.test_durations` file whether a duration has been updated in the current run in order for the tool to know which durations need to be put into the combined file.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

How to use `--store-durations` in Github Actions? #20

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

How to use --store-durations in Github Actions? #20

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

How to use `--store-durations` in Github Actions? #20