Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CNV-52722: Pass through extra VDDK configuration options to importer pod. #3572

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

mrnold
Copy link
Contributor

@mrnold mrnold commented Dec 16, 2024

What this PR does / why we need it:
This pull request adds a new annotation "cdi.kubevirt.io/storage.pod.vddk.extraargs", referencing a ConfigMap that contains extra parameters to pass directly to the VDDK library. The use case is to allow tuning of asynchronous buffer counts for MTV as requested in CNV-52722. Testing has shown good results for cold migrations with:

VixDiskLib.nfcAio.Session.BufSizeIn64KB=16
VixDiskLib.nfcAio.Session.BufCount=4

These parameters are stored in a file whose path is passed to the VDDK via the nbdkit "config=" option. The file contents come from the referenced ConfigMap, and the ConfigMap is mounted to the importer pod as a volume.

Which issue(s) this PR fixes:
Fixes CNV-52722

Special notes for your reviewer:
As far as I can tell, a ConfigMap volume mount must be in the same namespace as the importer pod. So MTV will need to create or duplicate the ConfigMap to the same namespace as the DataVolume it creates.

Release note:

Allow extra VDDK configuration parameters to be passed to VDDK importer pods.

The VDDK library itself accepts infrequently-used arguments in a
configuration file, and some of these arguments have been tested to show
a significant transfer speedup in some environments. This adds an
annotation that references a ConfigMap holding the contents of this VDDK
configuration file, and mounts it to the importer pod. The first file in
the mounted directory is passed to the VDDK.

Signed-off-by: Matthew Arnold <[email protected]>
@kubevirt-bot kubevirt-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Dec 16, 2024
@kubevirt-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign awels for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mrnold
Copy link
Contributor Author

mrnold commented Dec 16, 2024

@mnecas Any concerns about the need to copy the ConfigMap to the importer namespace? I wasn't sure if that would make things awkward to use from the forklift side.

@coveralls
Copy link

coveralls commented Dec 16, 2024

Coverage Status

coverage: 59.406% (+0.03%) from 59.373%
when pulling aacc61a on mrnold:cnv-52722
into 791bbd6 on kubevirt:main.

Copy link
Contributor

@mnecas mnecas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM, I have added some a note and NP but nothing on my side

withHidden, err := os.ReadDir(common.VddkArgsDir)
if err != nil {
if os.IsNotExist(err) {
return "", nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NP: Please add a comment that the user did not specify the vddk additional config

@@ -228,6 +236,30 @@ func getVddkPluginPath() NbdkitPlugin {
return NbdkitVddkPlugin
}

// Extra VDDK configuration options are stored in a ConfigMap mounted to the
// importer pod. Just look for the first file in the mounted directory, and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just look for the first file just note we need to be sure to document this so user does not chain the configs to separate configmaps.

@kubevirt-bot
Copy link
Contributor

@mnecas: changing LGTM is restricted to collaborators

In response to this:

Generally LGTM, I have added some a note and NP but nothing on my side

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@mnecas
Copy link
Contributor

mnecas commented Dec 16, 2024

@mnecas Any concerns about the need to copy the ConfigMap to the importer namespace? I wasn't sure if that would make things awkward to use from the forklift side.

I think it should be okay, but also think we should document/verify.
Verify that the Forklift service account has enough permissions on the local cluster.
And check if the documentation specifies the permissions on a remote cluster.

@mrnold
Copy link
Contributor Author

mrnold commented Dec 18, 2024

/retest

Copy link
Collaborator

@akalenyu akalenyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@@ -1254,6 +1276,30 @@ var _ = Describe("[vendor:[email protected]][level:component]DataVolume tests",
Message: "Import Complete",
Reason: "Completed",
}}),
Entry("[test_id:5083]succeed importing VDDK data volume with extra arguments ConfigMap set", dataVolumeTestArguments{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to assert something in the final step of this test? like checking the config was applied?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this test I wanted to make sure that the contents of the ConfigMap are present in the file, so the assertion happens in the vddk-test-plugin (the fgets/strcmp). Is there a better way to check the result from the importer? Like can this test read the pod logs or termination message?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh okay I see

if (strcmp(extras, "VixDiskLib.nfcAio.Session.BufSizeIn64KB=16") != 0) { // Must match datavolume_test
Yeah I think we can come up with a cleaner way,
You can either read pod logs or make this a part of the termination message struct
// GetTerminationMessage returns data to be serialized and used as the termination message of the importer.
func (vs *VDDKDataSource) GetTerminationMessage() *common.TerminationMessage {
return &common.TerminationMessage{
VddkInfo: &common.VddkInfo{
Version: vddkVersion,
Host: vddkHost,
},
}
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an in-progress implementation to check the pod logs, but before I commit to that I am also going to try getting the result into the termination message to see if it's any cleaner. Either way it's definitely better to check the result from the test code than from the VDDK test plugin itself, that should get rid of a source of silent failures in CI.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The termination message approach should be cleaner, you would be asserting on a common.TerminationMessage struct that you got from the importer's API termination message

tools/vddk-test/vddk-test-plugin.c Show resolved Hide resolved
pkg/image/nbdkit.go Outdated Show resolved Hide resolved
tests/datavolume_test.go Outdated Show resolved Hide resolved
@@ -349,6 +349,13 @@ spec:
[Get VDDK ConfigMap example](../manifests/example/vddk-configmap.yaml)
[Ways to find thumbprint](https://libguestfs.org/nbdkit-vddk-plugin.1.html#THUMBPRINTS)

#### Extra VDDK Configuration Options

The VDDK library itself looks in a configuration file (such as `/etc/vmware/config`) for extra options to fine tune data transfers. To pass these options through to the VDDK, store the configuration file contents in a ConfigMap and add a `cdi.kubevirt.io/storage.pod.vddk.extraargs` annotation to the DataVolume specification. The ConfigMap will be mounted to the importer pod as a volume, and the first file in the mounted directory will be passed to the VDDK. This means that the ConfigMap must be placed in the same namespace as the DataVolume, and the ConfigMap should only have one file entry.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you considered making this an API on the DataVolume, but, since you need a backport, you prefer the annotation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it didn't seem worth changing the CRDs and all the generated stuff for just for this uncommon fine-tuning configuration option. I can certainly change the API if that would be better.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mhenriks wdyt? since this is backporting I am leaning to the annotation as well, but I am not sure.. usually any annotation becomes an API that we forget about

mnecas added a commit to mnecas/forklift that referenced this pull request Dec 18, 2024
Issue:
The scale and perf team found a way how to improve the transfer speeds.
Right now the only way to enable this feature is to set the v2v extra
vars. The v2v extra vars pass the configuration to the virt-v2v and
virt-v2v-in-place. The v2v extra vars configuration is general and not
specific for VDDK. This causes the warm migration which uses the
virt-v2v-in-place to fail as it does not use any VDDK parameters.
Those parameters should be passed to the CNV CDI instead.

Fix:
Add a way to easily enable and configure the AIO.
This feature is VDDK and provider-specific as it requires to have
specific vSphere and VDDK versions. So we can't enable this feature
globally nor by default. So this PR adds the configuration to the
Provider spec settings and create a configmap with the necessary
configuration and either mounts the configmap to the guest conversion
pod for cold migration or passes the configmap name to the CDI DV
annotation.

Example:
```
apiVersion: forklift.konveyor.io/v1beta1
kind: Provider
metadata:
  name: vsphere
  namespace: forklift
spec:
  settings:
    sdkEndpoint: vcenter
    useVddkAioOptimization: 'true'
    vddkAioBufSize: 16 // optional defaults to 16
    vddkAioBufCount: 4 // optional defaults to 4
    vddkInitImage: 'quay.io/xiaodwan/vddk:8'
  type: vsphere
```

Ref:
- https://issues.redhat.com/browse/MTV-1804
- kubevirt/containerized-data-importer#3572
- https://docs.redhat.com/en/documentation/migration_toolkit_for_virtualization/2.7/html-single/installing_and_using_the_migration_toolkit_for_virtualization/index#mtv-aio-buffer_mtv

Signed-off-by: Martin Necas <[email protected]>
mnecas added a commit to mnecas/forklift that referenced this pull request Dec 18, 2024
Issue:
The scale and perf team found a way how to improve the transfer speeds.
Right now the only way to enable this feature is to set the v2v extra
vars. The v2v extra vars pass the configuration to the virt-v2v and
virt-v2v-in-place. The v2v extra vars configuration is general and not
specific for VDDK. This causes the warm migration which uses the
virt-v2v-in-place to fail as it does not use any VDDK parameters.
Those parameters should be passed to the CNV CDI instead.

Fix:
Add a way to easily enable and configure the AIO.
This feature is VDDK and provider-specific as it requires to have
specific vSphere and VDDK versions. So we can't enable this feature
globally nor by default. So this PR adds the configuration to the
Provider spec settings and create a configmap with the necessary
configuration and either mounts the configmap to the guest conversion
pod for cold migration or passes the configmap name to the CDI DV
annotation.

Example:
```
apiVersion: forklift.konveyor.io/v1beta1
kind: Provider
metadata:
  name: vsphere
  namespace: forklift
spec:
  settings:
    sdkEndpoint: vcenter
    useVddkAioOptimization: 'true'
    vddkAioBufSize: 16 // optional defaults to 16
    vddkAioBufCount: 4 // optional defaults to 4
    vddkInitImage: 'quay.io/xiaodwan/vddk:8'
  type: vsphere
```

Ref:
- https://issues.redhat.com/browse/MTV-1804
- kubevirt/containerized-data-importer#3572
- https://docs.redhat.com/en/documentation/migration_toolkit_for_virtualization/2.7/html-single/installing_and_using_the_migration_toolkit_for_virtualization/index#mtv-aio-buffer_mtv

Signed-off-by: Martin Necas <[email protected]>
@mrnold
Copy link
Contributor Author

mrnold commented Dec 18, 2024

/retest

mnecas added a commit to mnecas/forklift that referenced this pull request Dec 19, 2024
Issue:
The scale and perf team found a way how to improve the transfer speeds.
Right now the only way to enable this feature is to set the v2v extra
vars. The v2v extra vars pass the configuration to the virt-v2v and
virt-v2v-in-place. The v2v extra vars configuration is general and not
specific for VDDK. This causes the warm migration which uses the
virt-v2v-in-place to fail as it does not use any VDDK parameters.
Those parameters should be passed to the CNV CDI instead.

Fix:
Add a way to easily enable and configure the AIO.
This feature is VDDK and provider-specific as it requires to have
specific vSphere and VDDK versions. So we can't enable this feature
globally nor by default. So this PR adds the configuration to the
Provider spec settings and create a configmap with the necessary
configuration and either mounts the configmap to the guest conversion
pod for cold migration or passes the configmap name to the CDI DV
annotation.

Example:
```
apiVersion: forklift.konveyor.io/v1beta1
kind: Provider
metadata:
  name: vsphere
  namespace: forklift
spec:
  settings:
    sdkEndpoint: vcenter
    useVddkAioOptimization: 'true'
    vddkAioBufSize: 16 // optional defaults to 16
    vddkAioBufCount: 4 // optional defaults to 4
    vddkInitImage: 'quay.io/xiaodwan/vddk:8'
  type: vsphere
```

Ref:
- https://issues.redhat.com/browse/MTV-1804
- kubevirt/containerized-data-importer#3572
- https://docs.redhat.com/en/documentation/migration_toolkit_for_virtualization/2.7/html-single/installing_and_using_the_migration_toolkit_for_virtualization/index#mtv-aio-buffer_mtv

Signed-off-by: Martin Necas <[email protected]>
mnecas added a commit to mnecas/forklift that referenced this pull request Dec 19, 2024
Issue:
The scale and perf team found a way how to improve the transfer speeds.
Right now the only way to enable this feature is to set the v2v extra
vars. The v2v extra vars pass the configuration to the virt-v2v and
virt-v2v-in-place. The v2v extra vars configuration is general and not
specific for VDDK. This causes the warm migration which uses the
virt-v2v-in-place to fail as it does not use any VDDK parameters.
Those parameters should be passed to the CNV CDI instead.

Fix:
Add a way to easily enable and configure the AIO.
This feature is VDDK and provider-specific as it requires to have
specific vSphere and VDDK versions. So we can't enable this feature
globally nor by default. So this PR adds the configuration to the
Provider spec settings and create a configmap with the necessary
configuration and either mounts the configmap to the guest conversion
pod for cold migration or passes the configmap name to the CDI DV
annotation.

Example:
```
apiVersion: forklift.konveyor.io/v1beta1
kind: Provider
metadata:
  name: vsphere
  namespace: forklift
spec:
  settings:
    sdkEndpoint: vcenter
    useVddkAioOptimization: 'true'
    vddkAioBufSize: 16 // optional defaults to 16
    vddkAioBufCount: 4 // optional defaults to 4
    vddkInitImage: 'quay.io/xiaodwan/vddk:8'
  type: vsphere
```

Ref:
- https://issues.redhat.com/browse/MTV-1804
- kubevirt/containerized-data-importer#3572
- https://docs.redhat.com/en/documentation/migration_toolkit_for_virtualization/2.7/html-single/installing_and_using_the_migration_toolkit_for_virtualization/index#mtv-aio-buffer_mtv

Signed-off-by: Martin Necas <[email protected]>
mnecas added a commit to mnecas/forklift that referenced this pull request Dec 19, 2024
Issue:
The scale and perf team found a way how to improve the transfer speeds.
Right now the only way to enable this feature is to set the v2v extra
vars. The v2v extra vars pass the configuration to the virt-v2v and
virt-v2v-in-place. The v2v extra vars configuration is general and not
specific for VDDK. This causes the warm migration which uses the
virt-v2v-in-place to fail as it does not use any VDDK parameters.
Those parameters should be passed to the CNV CDI instead.

Fix:
Add a way to easily enable and configure the AIO.
This feature is VDDK and provider-specific as it requires to have
specific vSphere and VDDK versions. So we can't enable this feature
globally nor by default. So this PR adds the configuration to the
Provider spec settings and create a configmap with the necessary
configuration and either mounts the configmap to the guest conversion
pod for cold migration or passes the configmap name to the CDI DV
annotation.

Example:
```
apiVersion: forklift.konveyor.io/v1beta1
kind: Provider
metadata:
  name: vsphere
  namespace: forklift
spec:
  settings:
    sdkEndpoint: vcenter
    useVddkAioOptimization: 'true'
    vddkAioBufSize: 16 // optional defaults to 16
    vddkAioBufCount: 4 // optional defaults to 4
    vddkInitImage: 'quay.io/xiaodwan/vddk:8'
  type: vsphere
```

Ref:
- https://issues.redhat.com/browse/MTV-1804
- kubevirt/containerized-data-importer#3572
- https://docs.redhat.com/en/documentation/migration_toolkit_for_virtualization/2.7/html-single/installing_and_using_the_migration_toolkit_for_virtualization/index#mtv-aio-buffer_mtv

Signed-off-by: Martin Necas <[email protected]>
Instead of listing the mounted VDDK arguments directory and filtering
out hidden files, just hard-code the expected file name and ConfigMap
key.

Signed-off-by: Matthew Arnold <[email protected]>
@kubevirt-bot
Copy link
Contributor

@mrnold: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cdi-unit-test-s390x aacc61a link false /test pull-cdi-unit-test-s390x

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@rwmjones
Copy link
Contributor

rwmjones commented Jan 2, 2025

For interest only, here's an alternative to maintaining the fake VDDK plugin:

In nbdkit itself we are able to test the real nbdkit-vddk-plugin without VDDK, using a fake VDDK:

https://gitlab.com/nbdkit/nbdkit/-/blob/master/tests/dummy-vddk.c?ref_type=heads

which is compiled to libvixDiskLib.so.6:

https://gitlab.com/nbdkit/nbdkit/-/blob/9ed65418c57128d4bf372f39b9f98bf6ecbe470a/tests/Makefile.am

then you just have to point libdir= to the directory containing this file:

https://gitlab.com/nbdkit/nbdkit/-/blob/9ed65418c57128d4bf372f39b9f98bf6ecbe470a/tests/test-vddk.c#L61

@mrnold
Copy link
Contributor Author

mrnold commented Jan 3, 2025

For interest only, here's an alternative to maintaining the fake VDDK plugin:

In nbdkit itself we are able to test the real nbdkit-vddk-plugin without VDDK, using a fake VDDK:

https://gitlab.com/nbdkit/nbdkit/-/blob/master/tests/dummy-vddk.c?ref_type=heads

which is compiled to libvixDiskLib.so.6:

https://gitlab.com/nbdkit/nbdkit/-/blob/9ed65418c57128d4bf372f39b9f98bf6ecbe470a/tests/Makefile.am

then you just have to point libdir= to the directory containing this file:

https://gitlab.com/nbdkit/nbdkit/-/blob/9ed65418c57128d4bf372f39b9f98bf6ecbe470a/tests/test-vddk.c#L61

This is cool, I will look into this. I think it would still require maintaining and updating an equivalent image to hold libvixDiskLib.so.6, but it would let me get rid of this test plugin check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dco-signoff: yes Indicates the PR's author has DCO signed all their commits. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants