
[WIP] Add CAPI kubetest2 deployer #4041

Conversation

@randomvariable (Member):

What this PR does / why we need it:
Follow up from #3855 to bring this to completion.

Which issue(s) this PR fixes:
Fixes #4040

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jan 4, 2021
@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign cecilerobertmichon after the PR has been reviewed.
You can assign the PR to them by writing /assign @cecilerobertmichon in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@elmiko (Contributor) commented Jan 4, 2021:

happy to see this pr continuing. i am going to keep testing but i do still have a few open questions. see #3855 (review)

@jsturtevant (Contributor):

This should address some of #2826

As an example, to deploy with CAPD:

```sh
export KIND_EXPERIMENTAL_DOCKER_NETWORK=bridge
```
Review comment (Member), on the `KIND_EXPERIMENTAL_DOCKER_NETWORK` line:

I think this can go away now that CAPD works on the kind network (it requires kind >= v0.8)

Reply (Contributor):

+1, i found i wasn't needing this env variable while testing

Review comment (Member), on:

```go
if err := process.ExecJUnitContext(ctx, "kubectl", args, os.Environ()); err != nil {
	return err
}
args = []string{"--kubeconfig", kubeconfig, "wait", "--for=condition=Available", "--all", "--all-namespaces", "deployment", "--timeout=-1m"}
```
This seems a little bit fragile because it assumes the CNI is using deployments. What about waiting for the nodes to be ready instead?
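
A minimal sketch of that suggestion, reusing the helpers from the snippet above (`process.ExecJUnitContext` and the `kubeconfig` variable):

```go
// Hypothetical alternative: wait for node readiness instead of CNI
// deployments, so the check does not assume how the CNI is packaged.
args = []string{"--kubeconfig", kubeconfig, "wait", "--for=condition=Ready", "nodes", "--all", "--timeout=-1m"}
if err := process.ExecJUnitContext(ctx, "kubectl", args, os.Environ()); err != nil {
	return err
}
```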

Review comment (Member), on:

```go
println("Up: waiting for cluster to become ready\n")
args = []string{"wait", "--for=condition=Ready", "cluster/" + d.workloadClusterName, "--timeout=-1m"}
```
If the intention here is to check whether the cluster is fully provisioned, we should also check that the machines are provisioned. This is tricky because it usually means waiting for the control plane provider to be ready and for the higher-level abstractions (MachineDeployments/MachinePools) to be ready.
A possible way forward might be to wait for the expected number of machines to be ready...
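
One hedged sketch of that idea, keeping the deployer's kubectl-args pattern (it assumes the v1alpha3-era resource names and that these objects expose a `Ready` condition):

```go
// Hypothetical extra waits, as a proxy for "all expected machines are
// provisioned": the control plane object plus all MachineDeployments.
for _, resource := range []string{"kubeadmcontrolplane", "machinedeployment"} {
	args = []string{"wait", "--for=condition=Ready", resource, "--all", "--timeout=-1m"}
	if err := process.ExecJUnitContext(ctx, "kubectl", args, os.Environ()); err != nil {
		return err
	}
}
```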

Review comment (Member), on lines +161 to +169:

```go
args = []string{
	"config",
	"cluster", d.workloadClusterName,
	"--infrastructure", d.provider,
	"--kubernetes-version", d.kubernetesVersion,
	"--worker-machine-count", d.workerCount,
	"--control-plane-machine-count", d.controlPlaneCount,
	"--flavor", d.flavor,
}
```
Just for confirmation: this syntax assumes that all the flavors required for E2E should be part of the CAP* release artifact. Is this the desired behavior? (I'm wondering if this is an excessive burden for the CAP* maintainers, and also a limitation for the end users)

"--flavor", d.flavor,
}

clusterctl := exec.CommandContext(ctx, "clusterctl", args...)
Wondering if, instead of shelling out, we can use clusterctl as a library (directly or via the Cluster API E2E test framework) so we have more control over this integration.
It would also remove the dependency on the clusterctl binary, which simplifies the test setup.
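
A rough sketch of the library route (the `client` package path is real, but the exact option fields have shifted between clusterctl releases, so treat them as assumptions):

```go
import (
	capiclient "sigs.k8s.io/cluster-api/cmd/clusterctl/client"
)

// Hypothetical in-process equivalent of `clusterctl config cluster ...`.
func generateTemplate(d *deployer) ([]byte, error) {
	c, err := capiclient.New("") // "" = default clusterctl config location
	if err != nil {
		return nil, err
	}
	controlPlanes, workers := int64(1), int64(1)
	template, err := c.GetClusterTemplate(capiclient.GetClusterTemplateOptions{
		ClusterName:              d.workloadClusterName,
		KubernetesVersion:        d.kubernetesVersion,
		ControlPlaneMachineCount: &controlPlanes,
		WorkerMachineCount:       &workers,
	})
	if err != nil {
		return nil, err
	}
	return template.Yaml()
}
```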

Review comment (Member), on:

```go
func (d *deployer) DumpClusterLogs() error {
	panic("Not implemented")
```
It would be great if we could leverage the Cluster API E2E test framework to solve this problem...
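
In the meantime, a placeholder sketch so the deployer collects something instead of panicking (the `d.kubeconfigPath` field is an assumption, not from this PR; `capi-system/capi-controller-manager` is the default location after `clusterctl init`):

```go
// Hypothetical stopgap: dump the core CAPI controller logs from the
// management cluster until the E2E framework helpers are wired in.
func (d *deployer) DumpClusterLogs() error {
	args := []string{
		"--kubeconfig", d.kubeconfigPath, // hypothetical field: management cluster kubeconfig
		"logs",
		"--namespace", "capi-system",
		"deployment/capi-controller-manager",
		"--all-containers",
	}
	return process.ExecJUnitContext(context.Background(), "kubectl", args, os.Environ())
}
```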

Review comment (Member), on:

```go
// helper used to create & bind a flagset to the deployer
func bindFlags(d *deployer, flags *pflag.FlagSet) {
	flags.StringVar(
		&d.provider, "provider", "", "--provider flag for clusterctl",
```
Suggested change:

```diff
-		&d.provider, "provider", "", "--provider flag for clusterctl",
+		&d.provider, "provider", "", "--infrastructure flag for clusterctl",
```

What about renaming the flag to `infrastructure-provider`, `cloud-provider`, or something more intuitive for people not familiar with the Cluster API internals/terminology?
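
If that rename were adopted, the binding might look like this (the flag name is just one of the suggested candidates, nothing settled in this PR):

```go
flags.StringVar(
	&d.provider, "infrastructure-provider", "", "--infrastructure flag for clusterctl",
)
```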

Review comment (Member), on:

```go
	&d.workerCount, "worker-machine-count", "1", "--worker-machine-count flag for clusterctl",
)
flags.StringVar(
	&d.flavor, "flavor", "", "--flavor flag for clusterctl",
```
What about renaming the flag to `cluster-template`, `cluster-spec`, or something more intuitive for people not familiar with the Cluster API internals/terminology?

Review comment (Member), on:

```go
	&d.cniManifest, "cni-manifest", "", "automatically install this CNI manifest when the cluster becomes available",
)
flags.StringVar(
	&d.workloadClusterName, "workload-cluster-name", "capi-workload-cluster", "the workload cluster name",
```
Should we drop the workload- prefix, which is part of the CAPI jargon?
Also, the default could be adapted to reflect the purpose of the cluster rather than the provisioning method (test-something).

Review comment (Member), on lines +43 to +51:

```sh
  --image-name "kindest/node:$KUBE_GIT_VERSION" \
  --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /var/run/docker.sock
    containerPath: /var/run/docker.sock
```
I'm wondering if we can use the Cluster API E2E test framework instead of "sigs.k8s.io/kubetest2/kubetest2-kind/deployer", so we can hide all those details from end users.
The tricky part will be how to support --build with the different infrastructure providers...
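
For reference, a rough sketch of what bootstrapping through the test framework could look like (the package path and input fields follow the v0.3.x-era framework; treat the exact names as assumptions):

```go
import (
	"context"

	"sigs.k8s.io/cluster-api/test/framework/bootstrap"
)

// Hypothetical replacement for the kubetest2-kind deployer: let the CAPI
// test framework create the kind management cluster.
func createManagementCluster(ctx context.Context) bootstrap.ClusterProvider {
	return bootstrap.CreateKindBootstrapClusterAndLoadImages(ctx, bootstrap.CreateKindBootstrapClusterAndLoadImagesInput{
		Name:               "capi-test-management", // hypothetical name
		RequiresDockerSock: true,                   // stands in for the extraMounts above (needed by CAPD)
	})
}
```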

@fabriziopandini (Member):

/milestone v0.4.0

@k8s-ci-robot k8s-ci-robot added this to the v0.4.0 milestone Jan 19, 2021
Review comment (Member), on lines +5 to +9:

```yaml
- cluster-api-admins
- cluster-api-maintainers
- benmoss
- elmiko
- randomvariable
```
Is this list up to date?

Reply (Contributor):
i feel like @benmoss may not want to stay on that list. i'm happy to be an approver, but i wonder if we don't have some other folks who would like to get more involved as well?

@vincepri (Member) commented Feb 4, 2021:

@randomvariable What's the status of this PR?

Review comment (Contributor), on a new file's license header:

```go
/*
Copyright 2019 The Kubernetes Authors.
```

Suggested change:

```diff
-Copyright 2019 The Kubernetes Authors.
+Copyright 2021 The Kubernetes Authors.
```

@fejta-bot:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 23, 2021
@fejta-bot:

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 22, 2021
@fejta-bot:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

@k8s-ci-robot (Contributor):

@fejta-bot: Closed this PR.

In response to this:

> Rotten issues close after 30d of inactivity.
> Reopen the issue with /reopen.
> Mark the issue as fresh with /remove-lifecycle rotten.
>
> Send feedback to sig-contributor-experience at kubernetes/community.
> /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Successfully merging this pull request may close: #4040 (Build Kubetest2 Deployer for testing across Kubernetes projects).