[RFE] Tolerate failures of HC installs in HyperShift rosa-cli wrapper #126

smalleni · 2023-03-14T14:36:29Z

Currently, the workload is kicked off only once all hosted clusters (HCs) have been created. However, we have seen that it is highly likely that one or more HC installs fail, which causes the entire test to fail and forces us to clean up and rerun.

It would be good to have a flag that sets a failure tolerance (for example, 10%). As long as the number of hosted clusters that have not been created after a timeout is within that tolerance, the script should proceed to create the workload and clean up afterwards.
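A minimal sketch of the check being requested, assuming the wrapper already knows how many clusters it asked for and how many are ready when the timeout expires. The function name `should_start_workload` and the parameter `failure_tolerance_pct` are hypothetical and not part of the existing wrapper. The sketch also assumes that hitting the tolerance exactly still counts as acceptable, so that a tolerance of 0 keeps today's behaviour.

```python
def should_start_workload(total_clusters, ready_clusters, failure_tolerance_pct):
    """Decide, after the install timeout, whether enough clusters are ready.

    All names here are illustrative; they are not taken from the wrapper.
    """
    if total_clusters <= 0:
        raise ValueError("total_clusters must be positive")
    if not 0 <= ready_clusters <= total_clusters:
        raise ValueError("ready_clusters must be between 0 and total_clusters")
    if not 0 <= failure_tolerance_pct <= 100:
        raise ValueError("failure_tolerance_pct must be between 0 and 100")

    not_ready = total_clusters - ready_clusters
    # Compare as not_ready / total <= pct / 100, rearranged to avoid division.
    # Assumption: reaching the tolerance exactly is still acceptable.
    return not_ready * 100 <= total_clusters * failure_tolerance_pct
```

With 80 requested clusters and a 10% tolerance, up to 8 clusters may be missing: `should_start_workload(80, 72, 10)` proceeds, while `should_start_workload(80, 71, 10)` does not.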

smalleni · 2023-03-14T14:36:37Z

@morenod

venkataanil · 2023-03-14T14:45:33Z

We can enhance the wrapper to support two ways of starting e2e:

1. The current way: start when all the clusters have moved to the ready state.
2. Additionally, make the wrapper listen on a socket (or watch a file) and wait for a user command such as "start_e2e", which starts e2e before the above condition is met (i.e. before all clusters are ready). This way the user can tell the wrapper to start e2e whenever they want, for example when they see that 50 clusters are ready during 80 HC testing.
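The two start conditions above could be combined in one polling loop. The following is only a sketch of the file-based variant, under the assumption that the wrapper can report the number of ready clusters through a callable; `wait_for_start`, `get_ready_count`, and `trigger_file` are hypothetical names, and only the "start_e2e" command string comes from the proposal.

```python
import time
from pathlib import Path


def wait_for_start(get_ready_count, total_clusters, trigger_file,
                   timeout_s, poll_interval_s=5.0):
    """Block until e2e should start and return the reason as a string.

    get_ready_count: hypothetical callable returning the number of ready HCs.
    trigger_file: hypothetical path the user writes "start_e2e" into.
    """
    deadline = time.monotonic() + timeout_s
    trigger = Path(trigger_file)
    while True:
        # Condition 1 (current behaviour): every cluster is ready.
        if get_ready_count() >= total_clusters:
            return "all_ready"
        # Condition 2 (proposed): the user asked to start early.
        if trigger.is_file() and trigger.read_text().strip() == "start_e2e":
            return "user_command"
        if time.monotonic() >= deadline:
            return "timeout"
        time.sleep(poll_interval_s)
```

A user watching an 80 HC run could then run something like `echo start_e2e > /tmp/e2e_trigger` (the path is illustrative) once 50 clusters are ready. A file is used here instead of a socket only because it keeps the sketch short; the same loop would work with a non-blocking socket read in place of the file check.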

morenod self-assigned this Mar 14, 2023

venkataanil mentioned this issue Mar 23, 2023

start e2e before all clusters are ready #127

Merged

