Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

helm: fix invalid nri-plugin-index that causes plugins to crash #247

Closed
wants to merge 1 commit into from

Conversation

askervin
Copy link
Collaborator

Original printf format "%02d" argument for --nri-plugin-index produced argument value "%!d(float64=90)", if nri.pluginIndex was 90. As helm handles the value 90 as float64, use format "%02.0f" instead.

Original printf format "%02d" argument for --nri-plugin-index produced
argument value "%!d(float64=90)", if nri.pluginIndex was 90. As helm
handles the value 90 as float64, use format "%02.0f" instead.

Signed-off-by: Antti Kervinen <[email protected]>
@klihub
Copy link
Collaborator

klihub commented Jan 26, 2024

That's odd. I really thought I tested this properly with various valid and invalid indices, but maybe I did not after all.

BTW, is it so that now you can use --set nri.index=55.666 and it will "work" by using the index 55 ?

@klihub
Copy link
Collaborator

klihub commented Jan 26, 2024

Original printf format "%02d" argument for --nri-plugin-index produced argument value "%!d(float64=90)", if nri.pluginIndex was 90. As helm handles the value 90 as float64, use format "%02.0f" instead.

Hmm... what Helm version are you using ?

[root@crio-devel xfer]# grep -Hn -B1 -A2 pluginIndex ./helm/topology-aware/templates/daemonset.yaml
./helm/topology-aware/templates/daemonset.yaml-57-            - --nri-plugin-index
./helm/topology-aware/templates/daemonset.yaml:58:            - "{{ printf "%02d" .Values.nri.pluginIndex }}"
./helm/topology-aware/templates/daemonset.yaml-59-            {{- if .Values.configGroupLabel }}
./helm/topology-aware/templates/daemonset.yaml-60-            - --config-group-label
[root@crio-devel xfer]# helm install policy ./helm/topology-aware -n kube-system --set image.name=localhost/policy-test --set image.tag=latest --set nri.pluginIndex=1
NAME: policy
LAST DEPLOYED: Fri Jan 26 09:24:33 2024
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
[root@crio-devel xfer]# kubectl get pods -A | grep nri-resource-policy
kube-system   nri-resource-policy-topology-aware-pcjxh   1/1     Running   0          13s
[root@crio-devel xfer]# ps axuw | grep /bin/nri-resource-policy-topology-aware | grep -v grep
root        8009  0.3  0.4 1271032 37376 ?       Ssl  09:24   0:00 /bin/nri-resource-policy-topology-aware --host-root /host --config-namespace kube-system --pid-file /tmp/nri-resource-policy.pid -metrics-interval 5s --nri-plugin-index 01
[root@crio-devel xfer]# helm uninstall -n kube-system policy
release "policy" uninstalled
[root@crio-devel xfer]# helm install policy ./helm/topology-aware -n kube-system --set image.name=localhost/policy-test --set image.tag=latest --set nri.pluginIndex=50
NAME: policy
LAST DEPLOYED: Fri Jan 26 09:25:34 2024
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
[root@crio-devel xfer]# kubectl get pods -A | grep nri-resource-policy
kube-system   nri-resource-policy-topology-aware-jrrp8   1/1     Running   0          4s
[root@crio-devel xfer]# ps axuw | grep /bin/nri-resource-policy-topology-aware | grep -v grep
root        8346  1.0  0.4 1270776 37888 ?       Ssl  09:25   0:00 /bin/nri-resource-policy-topology-aware --host-root /host --config-namespace kube-system --pid-file /tmp/nri-resource-policy.pid -metrics-interval 5s --nri-plugin-index 50
[root@crio-devel xfer]# helm uninstall -n kube-system policy
release "policy" uninstalled
[root@crio-devel xfer]# helm install policy ./helm/topology-aware -n kube-system --set image.name=localhost/policy-test --set image.tag=latest --set nri.pluginIndex=90
NAME: policy
LAST DEPLOYED: Fri Jan 26 09:26:16 2024
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
[root@crio-devel xfer]# kubectl get pods -A | grep nri-resource-policy
kube-system   nri-resource-policy-topology-aware-bstdj   1/1     Running   0          3s
[root@crio-devel xfer]# ps axuw | grep /bin/nri-resource-policy-topology-aware | grep -v grep
root        8679  1.6  0.4 1271032 37888 ?       Ssl  09:26   0:00 /bin/nri-resource-policy-topology-aware --host-root /host --config-namespace kube-system --pid-file /tmp/nri-resource-policy.pid -metrics-interval 5s --nri-plugin-index 90
[root@crio-devel xfer]# helm version
version.BuildInfo{Version:"v3.11", GitCommit:"", GitTreeState:"", GoVersion:"go1.21rc3"}
[root@crio-devel xfer]# 

@klihub
Copy link
Collaborator

klihub commented Jan 26, 2024

Also, with balloons:

[root@crio-devel xfer]# helm install policy ./helm/balloons -n kube-system --set nri.pluginIndex=90
NAME: policy
LAST DEPLOYED: Fri Jan 26 09:33:35 2024
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
[root@crio-devel xfer]# kubectl get pods -A | grep balloons | grep -v grep
kube-system   nri-resource-policy-balloons-rwmjz   1/1     Running   0          94s
[root@crio-devel xfer]# ps axuw | grep /bin/nri-resource-policy- | grep -v grep 
root        9129  0.3  0.1 1271104 12800 ?       Ssl  09:33   0:00 /bin/nri-resource-policy-balloons --host-root /host --config-namespace kube-system --pid-file /tmp/nri-resource-policy.pid -metrics-interval 5s --nri-plugin-index 90

@klihub
Copy link
Collaborator

klihub commented Jan 26, 2024

Can you try if a self-compiled helm (from head or the latest release tag) in the same test environment then works as expected ? And if not, if the golang toolchain has any significance ? If it does, then maybe an alternative fix of self-compiling/go-installing helm in the e2e tests could also work.

@askervin
Copy link
Collaborator Author

askervin commented Jan 26, 2024

I tried with both the helm release installed by e2e tests, and most resent prebuilt helm release:

  • version.BuildInfo{Version:"v3.11", GitCommit:"", GitTreeState:"", GoVersion:"go1.19.6"}
  • version.BuildInfo{Version:"v3.14.0", GitCommit:"3fc9f4b2638e76f26739cd77c7017139be81d0ea", GitTreeState:"clean", GoVersion:"go1.21.5"}

Both of these versions fail with %02d and work with %02.0f. And I suppose your version will give an error with %02.0f.

It seems likely that both versions of helm exist in the wild, so maybe we should reject this PR and give up trying to do zero padding with printf in our templates. We could as well use string as index, like "90" or "08" for zero-padding.

@klihub
Copy link
Collaborator

klihub commented Jan 29, 2024

I tried with both the helm release installed by e2e tests, and most resent prebuilt helm release:

  • version.BuildInfo{Version:"v3.11", GitCommit:"", GitTreeState:"", GoVersion:"go1.19.6"}
  • version.BuildInfo{Version:"v3.14.0", GitCommit:"3fc9f4b2638e76f26739cd77c7017139be81d0ea", GitTreeState:"clean", GoVersion:"go1.21.5"}

Both of these versions fail with %02d and work with %02.0f. And I suppose your version will give an error with %02.0f.

It seems likely that both versions of helm exist in the wild, so maybe we should reject this PR and give up trying to do zero padding with printf in our templates. We could as well use string as index, like "90" or "08" for zero-padding.

Looks like this is a known problem that people are actively trying to fix, but haven't managed to fully do so yet:

commit f94bac0643ad5d39c740c57c6c8ea6a4569a1db0
Author: Oleg Sidorov <[email protected]>
Date:   Wed Jul 17 10:21:17 2019 +0200

    chartutil.ReadValues is forced to unmarshal numbers into json.Number refs #1
707 [dev-v3]
    
    Backport of https://github.com/helm/helm/pull/6010 to dev-v3 (the
    description below is a copy-paste from the original v2 branch PR).
    As https://github.com/helm/helm/pull/6016 is now merged to dev-v3, the
    change is reasonably trivial.
    
    This change is an attempt to address the common problem of json number
    unmarshalling where any number is converted into a float64 and
    represented in a scientific notation on a marshall call. This behavior
    breaks things like: chart versions and image tags if not converted to
    yaml strings explicitly.
    
    An example of this behavior: k8s failure to fetch an image tagged with a
    big number like: $IMAGE:20190612073634 after a few steps of yaml
    re-rendering turns into: $IMAGE:2.0190612073634e+13.
    
    Example issue: #1707
    
    This commit forces yaml parser to use JSON modifiers and explicitly
    enables interface{} unmarshalling instead of float64. The change
    introduced might be breaking so should be processed with an extra care.
    
    Due to the fact helm mostly dals with human-produced data (charts), we
    have a decent level of confidence this change looses no functionality
    helm users rely upon (the scientific notation).
    
    Relevant doc: https://golang.org/pkg/encoding/json/#Decoder.UseNumber
    
    Signed-off-by: Oleg Sidorov <[email protected]>
    Signed-off-by: Oleg Sidorov <[email protected]>

@klihub
Copy link
Collaborator

klihub commented Jan 29, 2024

Therefore I suggest we add this workaround until that problem is gotten under control in Helm:

diff --git a/deployment/helm/balloons/templates/daemonset.yaml b/deployment/helm/balloons/templates/daemonset.yaml
index e9998613..d978aeb5 100644
--- a/deployment/helm/balloons/templates/daemonset.yaml
+++ b/deployment/helm/balloons/templates/daemonset.yaml
@@ -55,7 +55,7 @@ spec:
             - -metrics-interval
             - 5s
             - --nri-plugin-index
-            - "{{ printf "%02d" .Values.nri.pluginIndex }}"
+            - "{{ .Values.nri.pluginIndex | int | printf "%02d"  }}"
             {{- if .Values.configGroupLabel }}
             - --config-group-label
             - {{ .Values.configGroupLabel }}

I'll file a PR.

@klihub
Copy link
Collaborator

klihub commented Jan 29, 2024

Filed #250 to fix this until Helm gets it under control. @askervin PTAL.

@askervin
Copy link
Collaborator Author

Closing this PR, replaced by #250

@askervin askervin closed this Jan 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants