Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs/design: More details for distributed execution #3598

Merged
merged 2 commits into from
Feb 19, 2024

Conversation

codebien
Copy link
Contributor

@codebien codebien commented Feb 14, 2024

Added more details to the current distributed execution design doc.

There is still one open point regarding error handling API. But I'm still working on it and it will be addressed in a dedicated PR.

@codebien codebien self-assigned this Feb 14, 2024
@codecov-commenter
Copy link

codecov-commenter commented Feb 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ No coverage uploaded for pull request base (distributed-execution-design-doc@4ff4a8e). Click here to learn what that means.

❗ Current head 96f8f38 differs from pull request most recent head f2a0655. Consider uploading reports for the commit f2a0655 to get more accurate results

Additional details and impacted files
@@                         Coverage Diff                         @@
##             distributed-execution-design-doc    #3598   +/-   ##
===================================================================
  Coverage                                    ?   72.68%           
===================================================================
  Files                                       ?      259           
  Lines                                       ?    19864           
  Branches                                    ?        0           
===================================================================
  Hits                                        ?    14438           
  Misses                                      ?     4522           
  Partials                                    ?      904           
Flag Coverage Δ
ubuntu 72.62% <0.00%> (?)
windows 72.53% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@mstoykov mstoykov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general, but need to review it with the whole other PR, so 👍 for merging

I have a single comment around runCloudLocally

7. **Done status**: Even if the test hasn't finished prematurely, _something_ needs to detect that all instances are done with their part of the test run. Because of executors like [`shared-iterations`](https://k6.io/docs/using-k6/scenarios/executors/shared-iterations/) and [`per-vu-iterations`](https://k6.io/docs/using-k6/scenarios/executors/per-vu-iterations/) and because iteration durations vary, only the maximum test duration is predictable and bounded, but the test might finish a lot sooner than that max possible duration and good UX would be to not force the user to wait needlessly.
8. **Run teardown once**: Regardless of whether the test has finished nominally or prematurely, _something_ needs to detect that it _has_ finished and must run `teardown()` on only one of the available instances, even if there were errors during the test. This is important because `setup()` might have potentially allocated costly resources. That said, any errors during `teardown()` execution must also be handled nicely.
9. **End-of-test and handleSummary**: After `teardown()` has been executed, we _somehow_ need to produce the [end-of-test summary](https://k6.io/docs/results-output/end-of-test/) by executing the [`handleSummary()` function](https://k6.io/docs/results-output/end-of-test/custom-summary/) on a k6 instance _somewhere_. For the best UX, the differences between local and distributed k6 runs should be as minimal as possible, so the user should be able to see the end-of-test summary in their terminal or CI system, regardless of whether the k6 test was local or distributed.
10. **Cloud metrics output**: We need to support `k6 run -o cloud script.js` case, and unfortunately, it requires a [creation phase](https://github.com/grafana/k6/blob/b5a6febd56385326ea849bde25ba09ed6324c046/output/cloud/output.go#L184-L188) for the test for registering the test on the k6 Cloud platform. It means that in case of distributed test, we may end with registering the same test multiple times. Then, moving out from the Cloud metrics output the test creation phase sounds more or less a pre-requiste for being able to deliver and use a distributed execution integrated with k6 Cloud. In concrete, it means to address [#3282](https://github.com/grafana/k6/issues/3282). If we need to narrow down the scope then we may decide, for a fist experimental phase, to not support this use-case. It would error if the distributed execution runs with the Cloud metrics output (`-o cloud`) set.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need #3282 for this.

The call that creates the test can be ran only by the testcoordinator and then the whole configuration ca be given as pat of 1. here (distributing the archive). I expect other options will also need to be distributed or changed between instances so that seems plasusable as well.

I don't see how runLocally will help here, but it is possibility- just not really one we need.

see

k6/output/cloud/output.go

Lines 163 to 166 in b5a6feb

if out.config.PushRefID.Valid {
out.testRunID = out.config.PushRefID.String
out.logger.WithField("testRunId", out.testRunID).Debug("Directly pushing metrics without init")
return out.startVersionedOutput()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving logic to the coordinator means we are going to create a different execution path between the distributed and non-distributed versions.

Where instead, having k6 cloud --local-run would be the same execution path. It should allow to generate the cloud test run outside of the metrics output and only after inject the variable.

func cmdCloud() {
   if runLocal {
     cloudTest := createCloudTest()
     output := cloud.NewOutput(parms)
     output.SetTestRunID(cloudTest.ID)
     output.Start()
   }
}

@codebien
Copy link
Contributor Author

I've removed the changes regarding the Cloud output and error handling. We will discuss them maybe in a dedicated pull request. Or directly in the main one after we had a first consensus on the current design.

@codebien codebien merged commit 015ab96 into distributed-execution-design-doc Feb 19, 2024
11 of 19 checks passed
@codebien codebien deleted the codebien-nde branch February 19, 2024 11:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants