-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Start building out shim layer for Tinkerbell #8
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Hardware type still needs to be built out.
- Scaffolding for workflow and hardware controllers needs to be built out
- Tinkerbell informers need to be setup for all types and plumbed through to trigger reconciliations of appropriate resources.
- Add support/rbac permissions for creating k8s resources to mirror Tinkerbell resources when the k8s resources do not yet exist.
limitations under the License. | ||
*/ | ||
|
||
package v1alpha3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this mainly for completeness
go.mod
Outdated
k8s.io/component-base v0.17.14 | ||
k8s.io/klog v1.0.0 | ||
k8s.io/utils v0.0.0-20201110183641-67b214c5f920 | ||
k8s.io/utils v0.0.0-20201110183641-67b214c5f920 // indirect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes came from go mod tidy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be committed separately I think, together with go.sum
. And there should be a CI check for that.
Edit: #10
tink/api/v1alpha1/hardware_types.go
Outdated
type HardwareSpec struct { | ||
} | ||
|
||
// HardwareStatus defines the observed state of Hardware. | ||
type HardwareStatus struct { | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't get far enough today to finish out the Hardware types
const ( | ||
// TemplateIDAnnotation is used by the controller to store the | ||
// ID assigned to the workflow by Tinkerbell. | ||
TemplateIDAnnotation = "template.tinkerbell.org/id" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally I was planning on adding a Spec
field for the ID that is stored in Tinkerbell, but then I thought it might be confusing for users, so I figured an annotation might work. We can also add it to the Status to make it more easily visible to the user, but I also wanted to make sure that it was stored in a persistent way for idempotency.
client.Client | ||
Log logr.Logger | ||
Recorder record.EventRecorder | ||
Scheme *runtime.Scheme |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: need to plumb through a tinkerbell client
err := ErrNotImplemented | ||
|
||
logger.Error(err, "Not yet implemented") | ||
|
||
return ctrl.Result{}, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: use the tinkerbell client to reconcile resource
logger := r.Log.WithValues("template", template.Name) | ||
err := ErrNotImplemented | ||
|
||
logger.Error(err, "Not yet implemented") | ||
|
||
// controllerutil.RemoveFinalizer(template, tinkv1alpha1.TemplateFinalizer) | ||
|
||
return ctrl.Result{}, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: use the tinkerbell client to delete resource and remove finalizer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to get this through checks added in #9.
Left some thoughts and questions.
go.mod
Outdated
k8s.io/component-base v0.17.14 | ||
k8s.io/klog v1.0.0 | ||
k8s.io/utils v0.0.0-20201110183641-67b214c5f920 | ||
k8s.io/utils v0.0.0-20201110183641-67b214c5f920 // indirect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be committed separately I think, together with go.sum
. And there should be a CI check for that.
Edit: #10
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
*/ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need doc.go
file if it contains no doc? 😉 Perhaps we should move the docs from groupversion_info.go
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't necessarily, but we've had issues with web-based tools like godoc in the past with the lack of doc.go files when someone tries to access docs specific to the api types and the project wasn't already indexed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Godoc require at least one file in the package to have documentation, it doesn't necessarily be doc.go
file. Though when you have many files in the package, IMO it's good to have a separate file for docs, so no other files becomes "special".
TL;DR I suggest moving the Package v1alpha1 ..
comment from groupversion_info.go
to doc.go
file.
tink/api/v1alpha1/hardware_types.go
Outdated
// HardwareIDAnnotation is used by the controller to store the | ||
// ID assigned to the workflow by Tinkerbell. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// HardwareIDAnnotation is used by the controller to store the | |
// ID assigned to the workflow by Tinkerbell. | |
// HardwareIDAnnotation is used by the controller to store the | |
// hardware ID assigned to the workflow by Tinkerbell. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, copypasta issue. Should have been 'ID assigned to the hardware by Tinkerbell'
before := template.DeepCopy() | ||
controllerutil.AddFinalizer(template, tinkv1alpha1.TemplateFinalizer) | ||
|
||
if err := r.Client.Patch(ctx, template, client.MergeFrom(before)); err != nil { | ||
logger.Error(err, "Failed to add finalizer to template") | ||
|
||
return ctrl.Result{}, fmt.Errorf("failed to add finalizer: %w", err) | ||
} | ||
|
||
return ctrl.Result{}, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to me that this all could go to separate function to keep Reconcile
shorter.
// +kubebuilder:resource:path=hardware,scope=Cluster,categories=tinkerbell | ||
// +kubebuilder:storageversion | ||
|
||
// Hardware is the Schema for the Hardware API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, those types will be exposed by Kubernetes API, so cluster API controller can create those instead of Tinkerbell structs directly.
I would like to know:
- What is the benefit of this indirection? Other than visibility (ability to see Tinkerbell records using Kubernetes API). I guess also separation of concerns as creating Tinkerbell records is now independent from Cluster API.
- If I understand correctly, Tinkerbell controllers could live as a separate project, to allow interacting with Tinkerbell via Kubernetes API. But for now they will live here? What if we have a separate controller in the future and we would like create entries for multiple Tinkerbell instances. Is this solution extensible for such use case? Perhaps having a separate controller per namespace would allow that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, those types will be exposed by Kubernetes API, so cluster API controller can create those instead of Tinkerbell structs directly.
I would like to know:
- What is the benefit of this indirection? Other than visibility (ability to see Tinkerbell records using Kubernetes API). I guess also separation of concerns as creating Tinkerbell records is now independent from Cluster API.
It is indeed mostly a separation of concerns. This also aligns with some of the discussions that have taken place upstream about trying to leverage similar tooling for other providers, such as the AWS Controllers for Kubernetes as an implementation detail of cluster-api-provider-aws.
- If I understand correctly, Tinkerbell controllers could live as a separate project, to allow interacting with Tinkerbell via Kubernetes API. But for now they will live here?
Yes, longer term I would expect this to move out to a separate project. There will need to be some thought put into how to reconcile with similar work for the tinkerbell crossplane provider to avoid too much unnecessary duplication.
What if we have a separate controller in the future and we would like create entries for multiple Tinkerbell instances. Is this solution extensible for such use case? Perhaps having a separate controller per namespace would allow that?
That is definitely something that we need to consider. I created the resources as cluster scoped with the idea of a 1-1 mapping between Cluster API and a Tinkerbell instance, but I think we have a few options available if we want to remove that assumption:
- Make the resources namespaced, when the controller reconciles a resource it pulls the connection information for the related tinkerbell instance from a secret stored in the namespace
- this would simplify mapping of names between tinkerbell and the controller, but would complicate the interaction between CAPT and the controllers
- Have a cluster scoped resource that can be used for storing connection information for multiple tinkerbell instances
- Each resource can optionally specify which connection information to use
- If not specified, have some method of falling back to a default
- Will need to figure out how to handle conflicting naming between k8s resources and tinkerbell resources across multiple tinkerbell instances
The latter approach is what is done by https://github.com/aws/aws-controllers-k8s and likely the cleaner approach, however they are not doing a sync back the other way where naming could conflict. We could potentially avoid the two way sync and instead opt for having a way to import specific pre-existing resources from tinkerbell instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is indeed mostly a separation of concerns. This also aligns with some of the discussions that have taken place upstream about trying to leverage similar tooling for other providers, such as the AWS Controllers for Kubernetes as an implementation detail of cluster-api-provider-aws.
Make sense, thanks for confirmation and insights.
Yes, longer term I would expect this to move out to a separate project. There will need to be some thought put into how to reconcile with similar work for the tinkerbell crossplane provider to avoid too much unnecessary duplication.
I don't know about the crossplane provider, so I trust your judgement here then!
Make the resources namespaced, when the controller reconciles a resource it pulls the connection information for the related tinkerbell instance from a secret stored in the namespace
this would simplify mapping of names between tinkerbell and the controller, but would complicate the interaction between CAPT and the controllers
Yeah, namespacing seemed like an easy solution to me.
It seems to me that Tinkerbell is sort of special, as with cloud providers you usually have just one set of credentials and single API endpoint, while with Tinkerbell, you could have multiple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are similar issues that are being tracked for various cloud providers to support the idea of multiple accounts, which isn't too far removed from what we'd need for tinkerbell, for example here's the tracking issue for cluster-api-provider-aws: kubernetes-sigs/cluster-api-provider-aws#1552
Signed-off-by: Jason DeTiberus <[email protected]>
// HardwareSpec defines the desired state of Hardware. | ||
type HardwareSpec struct { | ||
// ID is the ID of the hardware in Tinkerbell | ||
// +kubebuilder:validation:MinLength=1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should actually have a fixed length I think as it's an UUID.
k8s.io/api v0.17.14 | ||
github.com/google/uuid v1.1.2 | ||
github.com/onsi/gomega v1.10.1 | ||
github.com/tinkerbell/tink v0.0.0-20201210163923-6d9159b63857 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should use the same version as is used in sandbox repo right now: https://github.com/tinkerbell/sandbox/blob/master/current_versions.sh#L8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I suppose this version does not have informers implemented yet 😞
tink/internal/client/common.go
Outdated
// ErrNotFound is returned if a requested resource is not found. | ||
var ErrNotFound = errors.New("resource not found") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we could follow https://travix.io/errors-derived-from-constants-in-go-fda6748b4072, so we avoid having exported global variable and we can use errors.Is()
.
f.Objs[in.Id] = proto.Clone(in).(*hardware.Hardware) | ||
|
||
return nil | ||
} | ||
|
||
return errors.New("nobody home") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd swap the logic to keep the happy path on the left side instead of indented.
} | ||
|
||
// Get gets a Hardware from Tinkerbell. | ||
func (f *Hardware) Get(ctx context.Context, id, ip, mac string) (*hardware.Hardware, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a struct here for selecting the hardware? Otherwise you get this ambigous calls:
f.Get(ctx, "", "", mac)
f.Get(ctx, "", ip, "")
It seems to me that it's easy to mess up the order.
With struct, you would have something like:
f.Get(ctx, &fake.HardwareSelector{Mac: mac})
Also, is it desired to return first met hardware if selectors are contradicting?
} | ||
|
||
// Deletion is a noop. | ||
if !hardware.DeletionTimestamp.IsZero() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should log or emit event here?
logger.Error(err, "Failed to get hardware from Tinkerbell") | ||
|
||
return ctrl.Result{}, fmt.Errorf("failed to get hardware from Tinkerbell: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto. I think controller manager will log the returned error anyway.
client.Client | ||
TemplateClient templateClient | ||
Log logr.Logger | ||
Scheme *runtime.Scheme |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is Scheme
field needed here?
Scheme: scheme, | ||
} | ||
|
||
got, err := r.reconcileDelete(context.Background(), tt.in) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we could just call Reconcile
here and avoid having this test as internal? As we set deletion timestamp anyway.
*/ | ||
|
||
// Package controllers contains controllers for Tinkerbell. | ||
package controllers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this package be called workflow
?
"github.com/tinkerbell/cluster-api-provider-tinkerbell/tink/internal/client" | ||
) | ||
|
||
func TestHardwareLifecycle(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good to break down those super large tests, so we can give them a better name and make them more focused on what we actually test here.
func TestHardwareLifecycle(t *testing.T) { | ||
g := NewWithT(t) | ||
ctx := context.Background() | ||
hardwareClient := client.NewHardwareClient(realHardwareClient(t)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, we have a fake client for Tinkerbell, I guess we can use it here so we don't require real tinkerbell server?
BTW, if we want tests with Tinkerbell server, we can probably borrow the setup from https://github.com/tinkerbell/terraform-provider-tinkerbell/tree/master/test.
@@ -0,0 +1,123 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder, why tink/internal
for directory structure and not internal/tink
?
Also, maybe tinkerbell
instead of tink
?
) | ||
|
||
type hardwareClient interface { | ||
// Create(ctx context.Context, h *hardware.Hardware) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Being able to push hardware to Tinkerbell via K8s API would be very useful for testing the CAPT controllers, as then you can push some test hardware via fake client-go.
Without that, one will need to push the hardware to Tinkerbell fake client directly I think.
templateClient := client.NewTemplateClient(tinkclient.TemplateClient) | ||
workflowClient := client.NewWorkflowClient(tinkclient.WorkflowClient, hwClient) | ||
|
||
stopCh := ctrl.SetupSignalHandler() | ||
|
||
if webhookPort == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should flip the logic here and remove the else
, as this no longer make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually addressed in #24. Hope it gets merged before this PR.
templateChan := make(chan event.GenericEvent) | ||
workflowChan := make(chan event.GenericEvent) | ||
|
||
if err := mgr.Add(&sources.TinkEventWatcher{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can add those in the loop to deduplicate the exit logic? I'd also move it to separate function to avoid further extension of main()
function, as it should be as minimal as possible.
k8s.io/api v0.17.14 | ||
github.com/google/uuid v1.1.2 | ||
github.com/onsi/gomega v1.10.1 | ||
github.com/tinkerbell/tink v0.0.0-20201210163923-6d9159b63857 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I suppose this version does not have informers implemented yet 😞
@@ -104,7 +116,7 @@ func main() { | |||
|
|||
flag.Parse() | |||
|
|||
ctrl.SetLogger(zap.New(zap.UseDevMode(true))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Included committed explicitly in #24. I guess klog is standard logger we should use.
} | ||
}() | ||
|
||
w.Status.Data = tinkWorkflow.GetData() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I guess this is content of rendered workflow, right? If so, it may contain secrets. And it will contain secrets when we use it for CAPT.
Should we then create a secret for each workflow and store the data there? This actually also applies to Templates and hardware selectors :/
"testing" | ||
|
||
. "github.com/onsi/gomega" | ||
tinkv1alpha1 "github.com/tinkerbell/cluster-api-provider-tinkerbell/tink/api/v1alpha1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Module-scoped imports should be separate from 3rd party modules.
Closed in favor of #32 |
Description