[spike] Gather info for Vai Testing #1276
UI Benchmark Considerations:
Steve API Benchmark Considerations:
Ideally we can automate these sooner rather than later; if push comes to shove, we can do the first run "manually".
We will need to rely on Cypress or a similar tool in order to perform browser-based frontend tests. We hope to leverage the rancher/dashboard test framework wherever we can for this effort. Loading up the cluster with ConfigMaps can be done via shepherd or a relatively simple bash script; this process will need to be batched.
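As a rough illustration of the batched loading step, here is a client-go sketch (an alternative to the bash script mentioned above; the resource count, namespace, name prefix, batch size and kubeconfig path are all assumptions):

```go
package main

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a clientset from the local kubeconfig (path is an assumption).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Hypothetical numbers: tens of thousands of ConfigMaps, created in batches
	// with a short pause so the API server is not overwhelmed.
	const total, batchSize = 10000, 500

	for i := 0; i < total; i++ {
		cm := &corev1.ConfigMap{
			ObjectMeta: metav1.ObjectMeta{Name: fmt.Sprintf("vai-test-cm-%05d", i)},
			Data:       map[string]string{"index": fmt.Sprintf("%d", i)},
		}
		if _, err := client.CoreV1().ConfigMaps("default").Create(context.TODO(), cm, metav1.CreateOptions{}); err != nil {
			panic(err)
		}
		if (i+1)%batchSize == 0 {
			fmt.Printf("created %d ConfigMaps\n", i+1)
			time.Sleep(2 * time.Second)
		}
	}
}
```

The same loop works for Secrets or any other resource type under test; only the object type and name prefix change.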
We can utilize …
Found an old Golang library for ingesting JUnit XML reports: https://github.com/joshdk/go-junit
Cypress can output a JUnit XML report: https://docs.cypress.io/guides/tooling/reporters
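For the ingestion side, a minimal sketch using the go-junit library linked above (the report path is an assumption; Cypress would be configured with its junit reporter, eg. --reporter junit --reporter-options "mochaFile=results/cypress-junit.xml"):

```go
package main

import (
	"fmt"
	"log"

	junit "github.com/joshdk/go-junit"
)

func main() {
	// Ingest the JUnit XML report produced by the Cypress run.
	suites, err := junit.IngestFile("results/cypress-junit.xml")
	if err != nil {
		log.Fatalf("failed to ingest JUnit XML: %v", err)
	}
	for _, suite := range suites {
		for _, test := range suite.Tests {
			// Status is one of passed, skipped, failed or error; Duration is the test runtime.
			fmt.Printf("%s / %s: %s (%s)\n", suite.Name, test.Name, test.Status, test.Duration)
		}
	}
}
```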
Organizational notes:
I suggest you create two separate issues and discuss frameworks, test setup, metrics and criteria separately.
Backend test notes

Cluster setup
That is a monster setup; I expect this benchmark to actually pass with way less hardware, and in any case 95% of the development should be carried out on a smaller setup, with hardware maxing happening as a last step, if needed. A starting point could be: AWS, 3 nodes, 4 vCPUs, 16 GiB of RAM each (eg. …).

(I have no problem with doing development with k3d on your laptop if that is more convenient, then re-running on the above "light" setup and leaving the "heavy monster" setup as a last option only if all else fails.)

Repeating the test on k3s is relatively unimportant and can be left to a later point, eg. after the browser tests are complete. If you need any other details about the setup, please ask.

Benchmarking criteria notes
What you really care about is that asking for a page (100 resources) consistently stays below half a second in HTTP duration (see below) in 95% of cases, no matter the sorting, filtering, resource type and size. You should see how well that number scales as the number of virtual users grows, the minimum being 20 users making 1 request every 5 seconds. As a second objective, add virtual users who concurrently change the ConfigMaps and see how performance degrades as more virtual users change them (this is more exploratory; we can set a pass/fail limit when we see the first results).

Metrics tracking: as a first step, make sure relevant stats are recorded in Qase (eg. p(95) expected: under 500 ms, actual: 234 ms, test PASS). Full k6 output is nice to have. Grafana tracking can be added later.
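To make the target load shape concrete, here is a minimal standalone Go sketch (illustrative only; the actual load is expected to be generated by k6 or a similar framework, the endpoint URL is a placeholder, and authentication and TLS setup are omitted). It runs 20 virtual users, each making 1 request every 5 seconds, and reports the observed p(95):

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
	"sync"
	"time"
)

func main() {
	// Placeholder Steve URL for one page of 100 ConfigMaps; the real URL and auth are assumptions.
	const url = "https://rancher.example.com/v1/configmaps?pagesize=100&page=1"
	const users, requestsPerUser = 20, 12 // 20 VUs, 1 request every 5s each, roughly a 1-minute run

	var (
		mu        sync.Mutex
		durations []time.Duration
		wg        sync.WaitGroup
	)

	for u := 0; u < users; u++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < requestsPerUser; i++ {
				// Time each request; request errors are ignored for brevity.
				start := time.Now()
				if resp, err := http.Get(url); err == nil {
					resp.Body.Close()
				}
				mu.Lock()
				durations = append(durations, time.Since(start))
				mu.Unlock()
				time.Sleep(5 * time.Second) // one request every 5 seconds per virtual user
			}
		}()
	}
	wg.Wait()

	// p(95): sort all observed durations and take the value at the 95th percentile.
	sort.Slice(durations, func(i, j int) bool { return durations[i] < durations[j] })
	p95 := durations[len(durations)*95/100]
	fmt.Printf("p(95) = %v (target: under 500ms)\n", p95)
}
```

In practice this is the job of the load-testing framework; the sketch only pins down what "20 users, 1 request every 5 seconds, p(95) under 500 ms" means as a pass/fail gate.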
Framework choice notes

Implementation notes
type Metrics struct {
    HTTPReqDuration struct {
        Values struct {
            P95 float64 `json:"p(95)"`
        } `json:"values"`
    } `json:"http_req_duration"`
}

type Result struct {
    Metrics Metrics `json:"metrics"`
}
...
// Read the k6 JSON summary and pull out the http_req_duration p(95).
bytes, err := io.ReadAll(jsonFile)
if err != nil {
    return err
}
var result Result
if err := json.Unmarshal(bytes, &result); err != nil {
    return err
}
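A possible follow-up, assuming the Result type and imports from the snippet above (evaluateP95 is a hypothetical helper, and the 500 ms threshold is the p(95) target from the benchmarking criteria; the output mirrors the Qase example given earlier):

```go
// evaluateP95 turns the parsed p(95) (reported by k6 in milliseconds) into
// the pass/fail line to be recorded in Qase.
func evaluateP95(result Result) string {
	const p95TargetMs = 500.0
	p95 := result.Metrics.HTTPReqDuration.Values.P95
	verdict := "PASS"
	if p95 >= p95TargetMs {
		verdict = "FAIL"
	}
	return fmt.Sprintf("p(95) expected: under %.0f ms, actual: %.0f ms, test %s", p95TargetMs, p95, verdict)
}
```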
On the UI side, it's important to note that the new Vai-backed API and its features will be used. This effort is tracked in rancher/dashboard#8527 and will be partially complete in 2.9.0 (as described …).
@moio Regarding the upstream cluster setup for Vai testing, should we model the same config? Example: 20 projects, 1000 Secrets, 5 users, 10 roles, 50 workload pods, etc.
@git-ival FMPOV not necessarily. To me, they could just as well be empty or almost empty (as empty as a default installation is). What you will need is tens of thousands of the specific resource under test (eg. ConfigMaps if you are testing the ConfigMaps page, Secrets if it is Secrets, and so on) on the cluster under test (upstream and at least one downstream should be tested, because the affected Steve code is in both). But in principle, other resources should not matter.