
Odiglet to use criwrapper to create ic #2015

Open

tamirdavid1 wants to merge 35 commits into main from odiglet-to-use-criwrapper-to-create-ic
Conversation

tamirdavid1
Collaborator

No description provided.

@tamirdavid1 tamirdavid1 force-pushed the odiglet-to-use-criwrapper-to-create-ic branch from 298dd4b to 613ebf4 Compare December 16, 2024 15:27
@tamirdavid1 tamirdavid1 force-pushed the odiglet-to-use-criwrapper-to-create-ic branch from 7311ce9 to b2978f7 Compare December 17, 2024 07:22
runtimeDetailsByContainer:
  description: Capture Runtime Details for the workloads that this CR
    applies to.
  items:
    properties:
      containerName:
        type: string
      criErrorMessage:
Contributor

Isn't status a better place for error messages than spec?

for _, endpoint := range defaultRuntimeEndpoints {
	rc.Logger.Info("Attempting to connect to CRI runtime", "endpoint", endpoint)

	rc.conn, err = grpc.NewClient(
Contributor

Are you sure this is actually trying to connect and not just creating a client?
A few thoughts:

  • Since only one of the default endpoints will respond, it may be worth putting a timeout on each connection attempt, so the ones that would otherwise hang time out quickly and we can bail early.
  • I think the CRI implementation is part of the container ID in Kubernetes. Maybe we can use that (for example, by reading the container ID of the current Odiglet Pod) and connect only to the right runtime instead of trying all the options (see the sketch below).
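For reference, a minimal sketch of the container-ID idea. Kubernetes reports status.containerStatuses[].containerID as "<runtime>://<id>" (e.g. "containerd://3f5a..."), so the runtime name can be parsed off the prefix; the helper below is hypothetical, not code from this PR:

package cri

import (
	"fmt"
	"strings"
)

// runtimeFromContainerID derives the CRI runtime name ("containerd",
// "cri-o", "docker", ...) from a Kubernetes container ID string.
func runtimeFromContainerID(containerID string) (string, error) {
	parts := strings.SplitN(containerID, "://", 2)
	if len(parts) != 2 || parts[0] == "" {
		return "", fmt.Errorf("unexpected container ID format: %q", containerID)
	}
	return parts[0], nil
}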

Collaborator Author

  1. I wrapped it up as Connect(): the function's primary purpose is to connect to the CRI runtime, which it achieves by establishing and validating a gRPC connection.

  2. I changed the logic so it first looks for the Unix domain socket on the filesystem, so we don't need to iterate over all the endpoints and try to connect to each one.
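A minimal sketch of that filesystem-based discovery (the endpoint list and function name here are assumptions, not the exact code in this PR):

package cri

import (
	"errors"
	"os"
	"strings"
)

// Assumed default CRI socket locations; the real list may differ.
var defaultRuntimeEndpoints = []string{
	"unix:///run/containerd/containerd.sock",
	"unix:///var/run/crio/crio.sock",
	"unix:///var/run/cri-dockerd.sock",
}

// detectEndpoint returns the first default CRI endpoint whose Unix socket
// exists on the filesystem, so we dial one runtime instead of trying all.
func detectEndpoint() (string, error) {
	for _, endpoint := range defaultRuntimeEndpoints {
		socketPath := strings.TrimPrefix(endpoint, "unix://")
		if _, err := os.Stat(socketPath); err == nil {
			return endpoint, nil
		}
	}
	return "", errors.New("no CRI runtime socket found on the filesystem")
}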

}

// Call the ContainerStatus API
response, err := rc.client.ContainerStatus(ctx, &criapi.ContainerStatusRequest{
Contributor

Maybe we want to put a timeout on that, in case the remote CRI does not respond in time or is unavailable?
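A minimal sketch of that suggestion (the helper name, the 5s budget, and the CriClient fields are illustrative assumptions):

package cri

import (
	"context"
	"fmt"
	"time"

	criapi "k8s.io/cri-api/pkg/apis/runtime/v1"
)

// Assumed shape of the PR's CRI wrapper.
type CriClient struct {
	client criapi.RuntimeServiceClient
}

// containerStatus bounds the ContainerStatus call so an unresponsive or
// unavailable CRI runtime cannot block the caller indefinitely.
func (rc *CriClient) containerStatus(ctx context.Context, containerID string) (*criapi.ContainerStatusResponse, error) {
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()

	response, err := rc.client.ContainerStatus(ctx, &criapi.ContainerStatusRequest{
		ContainerId: containerID,
		Verbose:     true,
	})
	if err != nil {
		// err reports a DeadlineExceeded status if the runtime did not answer in time.
		return nil, fmt.Errorf("CRI ContainerStatus failed: %w", err)
	}
	return response, nil
}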

return
}

// Verify if environment variables already exist in the container manifest.
Contributor

Why not always collect the relevant env vars from the Dockerfile? I don't have a good scenario in my head, but my feeling is that the more stateless we can make everything, the simpler it will be.

}

// Check if status is already set
if len(currentConfig.Status.RuntimeDetailsByContainer) != 0 {
Contributor

Why not always update? As written, only the first Odiglet that runs this wins, so we won't update the value if the user later changes the Docker image.

@tamirdavid1 tamirdavid1 force-pushed the odiglet-to-use-criwrapper-to-create-ic branch from 71831ff to 1c14d62 Compare December 22, 2024 09:51
Comment on lines 54 to 63
CriErrorMessage *string `json:"criErrorMessage,omitempty"`
EnvVarsFromDockerFile []EnvVar `json:"envVarsFromDockerFile,omitempty"`
Collaborator

nit: add a comment here describing what these values record and why we need them
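Something along these lines would cover it (suggested wording only):

// CriErrorMessage records the error returned by the CRI runtime while
// inspecting this container, so the failure is surfaced on the CR rather
// than only in Odiglet logs.
CriErrorMessage *string `json:"criErrorMessage,omitempty"`
// EnvVarsFromDockerFile records env vars baked into the container image
// (e.g. via the Dockerfile) as read back from the container runtime, so
// we can append to them instead of overwriting.
EnvVarsFromDockerFile []EnvVar `json:"envVarsFromDockerFile,omitempty"`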

api/odigos/v1alpha1/instrumentedapplication_types.go (outdated, resolved)
k8sutils/pkg/cri/criwrapper.go (outdated, resolved)

// Validate the connection by invoking a lightweight method
ctx, cancel := context.WithTimeout(context.Background(), time.Second*2)
defer cancel()
Collaborator

This will run after rc.client.Version has returned, right? At that point cancel is useless.

Is it needed?

Collaborator Author

According to what I've read, you need defer cancel() to properly release the resources associated with the context. It's a best practice whenever you create a context with WithTimeout or WithCancel, even if the timeout expires or the function exits early; it helps prevent subtle resource leaks.

In addition, we already do this everywhere we create a context with a timeout (config_provider, kubelet communication, and more).
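For reference, a minimal sketch of the pattern under discussion (the CriClient fields and method name are illustrative assumptions):

package cri

import (
	"context"
	"fmt"
	"time"

	criapi "k8s.io/cri-api/pkg/apis/runtime/v1"
)

// Assumed shape of the PR's CRI wrapper.
type CriClient struct {
	client criapi.RuntimeServiceClient
}

// validateConnection invokes a lightweight CRI method to confirm the
// connection is usable. defer cancel() runs only after Version returns,
// so it cannot abort the call; it just releases the context's timer
// immediately instead of holding it until the 2s deadline fires.
func (rc *CriClient) validateConnection() error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second*2)
	defer cancel()

	if _, err := rc.client.Version(ctx, &criapi.VersionRequest{}); err != nil {
		return fmt.Errorf("failed to validate CRI connection: %w", err)
	}
	return nil
}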

k8sutils/pkg/cri/criwrapper.go (outdated, resolved)
return ""
}

func checkEnvVarsInContainerManifest(container corev1.Container, envVarNames []string) bool {
Collaborator

We should do this per environment variable, not per container, no?

LibCType *common.LibCType `json:"libCType,omitempty"`
CriErrorMessage *string `json:"criErrorMessage,omitempty"`
EnvFromContainerRuntime []EnvVar `json:"envFromContainerRuntime,omitempty"`
RuntimeUpdateState *ProcessingState `json:"runtimeUpdateState,omitempty"` // Tracks whether the new runtime detection process has been executed. If empty, the process has not been executed.
Collaborator

Does it make sense to save this per workload and not per container?

odigosConfig, err := k8sutils.GetCurrentOdigosConfig(ctx, r.Client)
if err != nil {
	return reconcile.Result{}, err
}

- runtimeResults, err := runtimeInspection([]corev1.Pod{*selectedPodForInspection}, odigosConfig.IgnoredContainers)
+ runtimeResults, err := runtimeInspection(ctx, pods, odigosConfig.IgnoredContainers, r.CriClient)
Collaborator

consider passing just one pod for efficiency
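For example, keeping the pre-refactor single-pod call shape (illustrative):

runtimeResults, err := runtimeInspection(ctx, []corev1.Pod{*selectedPodForInspection}, odigosConfig.IgnoredContainers, r.CriClient)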

LibCType *common.LibCType `json:"libCType,omitempty"`
CriErrorMessage *string `json:"criErrorMessage,omitempty"`
EnvFromContainerRuntime []EnvVar `json:"envFromContainerRuntime,omitempty"`
RuntimeUpdateState *ProcessingState `json:"runtimeUpdateState,omitempty"` // Tracks whether the new runtime detection process has been executed. If empty, the process has not been executed.
Collaborator

Add a comment that this property is temporary, is only used for the migration, and should be removed after some time.
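For example (suggested wording only):

// RuntimeUpdateState tracks whether the new runtime detection process has
// been executed. If empty, the process has not been executed.
// TEMPORARY: only needed for the migration to the new detection flow;
// remove once the migration period is over.
RuntimeUpdateState *ProcessingState `json:"runtimeUpdateState,omitempty"`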

@@ -72,7 +74,7 @@ func inspectRuntimesOfRunningPods(ctx context.Context, logger *logr.Logger, labe
return nil
}

-func runtimeInspection(pods []corev1.Pod, ignoredContainers []string) ([]odigosv1.RuntimeDetailsByContainer, error) {
+func runtimeInspection(ctx context.Context, pods []corev1.Pod, ignoredContainers []string, criClient *criwrapper.CriClient) ([]odigosv1.RuntimeDetailsByContainer, error) {
Collaborator

ignoredContainers: make sure it still works after the refactor.
