Add remote-test option for E2E #876

ArangoGutierrez · 2025-01-23T09:29:01Z

No description provided.

ArangoGutierrez · 2025-01-23T09:53:34Z

The next step is to have this run on PRs, merge events, and tag cuts.

ArangoGutierrez · 2025-01-23T09:54:56Z

test/e2e/e2e_test.go

+	return localRunScript(script)
+}
+
+func localRunScript(script string) (string, error) {


I wonder if I should move everything from this line forward to the utils.go file

I think we could move all the script specifics to a separate file, but not utils.go.

Sorry, I should have been more clear. I think we should separate logic according to functionality / domain and name the files accordingly. Using tools.go as a catch-all does not achieve this.

ArangoGutierrez · 2025-01-23T09:55:53Z

test/e2e/utils.go

+set -xe
+`
+
+var dockerInstallTemplate = `


The idea here is that in the future, we will have different methods of installation to test, for example, CDI vs non-CDI.

elezar · 2025-01-23T10:56:34Z

test/e2e/utils.go

+
+: ${IMAGE:={{.Image}}}
+
+sudo ln -s /var/run/nvidia-container-toolkit/toolkit/nvidia-container-runtime-hook /usr/bin/nvidia-container-runtime-hook


Do we want to add a comment as to why this is required?

sure, more documentation never hurts

elezar · 2025-01-23T11:00:40Z

test/e2e/Makefile

-.PHONY: test
-test:
+E2E_IMAGE_NAME ?= ghcr.io/nvidia/container-toolkit
+E2E_IMAGE_TAG ?= latest


latest is never a valid tag in this repo. Can we rather error out if this isn't set for the remote case?
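
For the Go side of the test harness, the same guard could look roughly like the sketch below. `validateImageTag` is a hypothetical helper, not something in the PR; it only illustrates failing early instead of defaulting to latest.

```go
package main

import (
	"errors"
	"fmt"
)

// validateImageTag is a hypothetical pre-flight check (not part of the PR).
// It rejects an unset tag and the "latest" tag, which the review says is
// never valid for this repo.
func validateImageTag(tag string) error {
	switch tag {
	case "":
		return errors.New("image tag is not set")
	case "latest":
		return errors.New(`image tag "latest" is not accepted; pass an explicit version tag`)
	default:
		return nil
	}
}

func main() {
	for _, tag := range []string{"", "latest", "1.17.3-ubuntu20.04"} {
		fmt.Printf("%q: %v\n", tag, validateImageTag(tag))
	}
}
```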

elezar · 2025-01-23T11:01:49Z

test/e2e/Makefile

+E2E_SSH_HOST ?= 
+
+.PHONY: local-test remote-test
+# Local test assumes that the container-toolkit is already installed on the host


Do we need the distinction? Even when running locally, I may want to install the toolkit from an image that I've just built.

elezar · 2025-01-23T11:02:58Z

test/e2e/e2e_test.go

+	imageRepo string
+	imageTag  string


Does this warrant a type? (Same for the SSH info below.)
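
A minimal sketch of what such types might look like. The names `imageRef` and `sshTarget` are made up for illustration; the fields mirror the imageRepo/imageTag and sshKey/sshUser/remoteHost variables in the diff.

```go
package main

import "fmt"

// imageRef groups the image repository and tag (hypothetical type name).
type imageRef struct {
	Repo string
	Tag  string
}

// String renders the reference as repo:tag.
func (i imageRef) String() string {
	return fmt.Sprintf("%s:%s", i.Repo, i.Tag)
}

// sshTarget groups the SSH connection settings (hypothetical type name).
type sshTarget struct {
	Key  string
	User string
	Host string
}

// IsRemote reports whether an SSH key was supplied, mirroring the
// sshKey != "" check used in the test setup.
func (s sshTarget) IsRemote() bool { return s.Key != "" }

func main() {
	img := imageRef{Repo: "ghcr.io/nvidia/container-toolkit", Tag: "1.17.3-ubuntu20.04"}
	fmt.Println(img)
	fmt.Println(sshTarget{}.IsRemote())
}
```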

elezar · 2025-01-23T11:05:45Z

test/e2e/nvidia-container-toolkit_test.go

+var _ = Describe("docker", Ordered, func() {
+	// Install the NVIDIA Container Toolkit
+	BeforeAll(func(ctx context.Context) {
+		if sshKey != "" {


As I mentioned, we should probably allow the installation logic to be run locally too.

elezar · 2025-01-23T11:06:21Z

test/e2e/nvidia-container-toolkit_test.go

+	// Install the NVIDIA Container Toolkit
+	BeforeAll(func(ctx context.Context) {
+		if sshKey != "" {
+			installScript, err := getInstallScript(dockerInstallTemplate, fmt.Sprintf("%s:%s", imageRepo, imageTag))


Does it make sense to return an Installer instead, which implements Install() error?
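
One possible reading of this suggestion, as a sketch only. `scriptInstaller` and its `run` field are hypothetical; the point is that the test calls Install() and never handles the script text itself.

```go
package main

import (
	"errors"
	"fmt"
)

// Installer is the interface suggested in the review.
type Installer interface {
	Install() error
}

// scriptInstaller is a hypothetical implementation that hands a prepared
// script to a run function; where the script runs is decided elsewhere.
type scriptInstaller struct {
	script string
	run    func(script string) (string, error)
}

// Install runs the script and wraps any failure together with its output.
func (s scriptInstaller) Install() error {
	if s.run == nil {
		return errors.New("no run function configured")
	}
	out, err := s.run(s.script)
	if err != nil {
		return fmt.Errorf("install failed: %w (output: %q)", err, out)
	}
	return nil
}

func main() {
	var i Installer = scriptInstaller{
		script: "echo install",
		run:    func(string) (string, error) { return "ok", nil },
	}
	fmt.Println(i.Install() == nil)
}
```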

elezar · 2025-01-23T11:06:52Z

test/e2e/utils.go

+	--restart-mode=systemd
+`
+
+type DockerInstall struct {


This type isn't used.

elezar · 2025-01-23T11:07:52Z

test/e2e/utils.go

+	"text/template"
+)
+
+const Shebang = `#! /usr/bin/env bash


Is it not more confusing to have to remember to add the Shebang each time? Should we not include this in the template / script?
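
A sketch of the alternative, assuming the shebang simply becomes the first line of the template. `installTemplate` and `renderScript` are illustrative names, and the script body is a placeholder rather than the PR's actual install script.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// installTemplate is an illustrative template, not the PR's script.
// The shebang and shell options are its first lines, so callers cannot
// forget to prepend them.
const installTemplate = `#! /usr/bin/env bash
set -xe

: ${IMAGE:={{.Image}}}
echo "installing from ${IMAGE}"
`

// renderScript fills the template with the given image reference.
func renderScript(image string) (string, error) {
	tmpl, err := template.New("install").Parse(installTemplate)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, struct{ Image string }{Image: image}); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	script, err := renderScript("example/image:tag")
	if err != nil {
		panic(err)
	}
	fmt.Print(script)
}
```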

elezar · 2025-01-23T11:08:48Z

test/e2e/utils.go

+docker run --pid=host --rm -i --privileged	\
+	-v /:/host	\
+	-v /var/run/docker.sock:/var/run/docker.sock	\
+	-v /var/run/nvidia-container-toolkit:/var/run/nvidia-container-toolkit	\


I think we should use a temporary path here since we ideally want to remove the config again after the tests.

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>

elezar · 2025-01-29T10:01:52Z

tests/e2e/Makefile

+E2E_IMAGE_TAG ?=
+ifeq ($(E2E_IMAGE_TAG),)
+$(error E2E_IMAGE_TAG is not set)
+endif


Would it not make sense to default to the tag that is generated when running:

make -f deployments/container/Makefile build-ubuntu20.04

in the root?

i.e.:

--tag nvidia/container-toolkit:1.17.3-ubuntu20.04 \

In this context, tag refers to the image tag, which in your example is 1.17.3-ubuntu20.04.
Do we want tag to refer to the image name+tag here? (I know some projects do this, so I am OK with either.)

I just copied a snippet from the shell output when running the make command. Do we think the following is a reasonable mode of operation?

1. Run ./scripts/build-packages.sh ubuntu18.04-amd64 to build the packages.
2. Run make -f deployments/container/Makefile build to build the required image.
3. Run make -f tests/e2e/Makefile test to run the tests for the built version locally.

elezar · 2025-01-29T10:16:12Z

tests/e2e/tools.go

+	--restart-mode=systemd
+`
+
+type Installer struct {


Move installer to installer.go.

elezar · 2025-01-29T10:17:06Z

tests/e2e/tools.go

+	SshKey     string
+	SshUser    string
+	RemoteHost string


These should not be installer members. The installer doesn't care if we're running remotely or locally. It should have a reference to a Runner that it uses to run the required commands.
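
Roughly, the shape being asked for could look like the sketch below. `recordingRunner` is a stand-in used only to keep the example self-contained; the `Installer` fields are assumptions, not the PR's final layout.

```go
package main

import "fmt"

// Runner runs a script somewhere and returns its output.
type Runner interface {
	Run(script string) (string, error)
}

// recordingRunner is a stand-in Runner for this sketch; it records the
// script instead of executing it.
type recordingRunner struct{ scripts []string }

func (r *recordingRunner) Run(script string) (string, error) {
	r.scripts = append(r.scripts, script)
	return "", nil
}

// Installer holds only what it needs: the script and a Runner.
// It has no SSH key, user, or host fields.
type Installer struct {
	script string
	runner Runner
}

// Install delegates execution to the Runner, local or remote.
func (i *Installer) Install() error {
	if _, err := i.runner.Run(i.script); err != nil {
		return fmt.Errorf("failed to run install script: %w", err)
	}
	return nil
}

func main() {
	r := &recordingRunner{}
	inst := &Installer{script: "echo install", runner: r}
	fmt.Println(inst.Install(), len(r.scripts))
}
```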

elezar · 2025-01-29T10:17:44Z

tests/e2e/tools.go

+
+func (i *Installer) Install() error {
+	// Parse the combined template
+	tmpl, err := template.New("dockerScript").Parse(i.Template)


nit: The template name need not be dockerScript.

elezar · 2025-01-29T10:20:20Z

tests/e2e/nvidia-container-toolkit_test.go

@@ -24,7 +24,27 @@ import (
 )

 // Integration tests for Docker runtime
-var _ = Describe("docker", func() {
+var _ = Describe("docker", Ordered, func() {
+	var s Script


This should probably be r Runner and not s Script.

elezar · 2025-01-29T10:23:21Z

tests/e2e/nvidia-container-toolkit_test.go

+		switch {
+		case sshKey == "":
+			s = localRun{}
+		default:
+			s = remoteRun{sshKey, sshUser, remoteHost}
+		}
+
+		if installCTK {
+			installer, err := newInstaller(dockerInstallTemplate, imageRepo+":"+imageTag, sshKey, sshUser, remoteHost)
+			Expect(err).ToNot(HaveOccurred())
+
+			err = installer.Install()
+			Expect(err).ToNot(HaveOccurred())
+		}


The logic on which Script (or Runner) should be used should be pulled into a constructor with functional arguments. This instance should also be reused in the Installer so as to not duplicate the logic there.

Suggested change

-		switch {
-		case sshKey == "":
-			s = localRun{}
-		default:
-			s = remoteRun{sshKey, sshUser, remoteHost}
-		}
-
-		if installCTK {
-			installer, err := newInstaller(dockerInstallTemplate, imageRepo+":"+imageTag, sshKey, sshUser, remoteHost)
-			Expect(err).ToNot(HaveOccurred())
-
-			err = installer.Install()
-			Expect(err).ToNot(HaveOccurred())
-		}
+		s := NewRunner(
+			WithHost(remoteHost),
+			WithSshKey(sshKey),
+			WithSshUser(sshUser),
+		)
+
+		if installCTK {
+			installer, err := NewToolkitInstaller(
+				WithRunner(s),
+				WithVersion(version),
+			)
+			Expect(err).ToNot(HaveOccurred())
+
+			err = installer.Install()
+			Expect(err).ToNot(HaveOccurred())
+		}
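
A sketch of how a NewRunner constructor with functional arguments might be wired up. The option names follow the suggestion above; `runnerOptions`, the `Option` type, and the placeholder Run bodies are assumptions made to keep the example self-contained (neither runner executes anything here).

```go
package main

import "fmt"

// Runner runs a script and returns its output.
type Runner interface {
	Run(script string) (string, error)
}

// localRunner and remoteRunner are placeholders: Run only echoes what
// it would have executed.
type localRunner struct{}

func (localRunner) Run(script string) (string, error) {
	return "local: " + script, nil
}

type remoteRunner struct{ sshKey, sshUser, host string }

func (r remoteRunner) Run(script string) (string, error) {
	return "remote(" + r.sshUser + "@" + r.host + "): " + script, nil
}

// runnerOptions collects the settings applied by the options (hypothetical).
type runnerOptions struct{ sshKey, sshUser, host string }

// Option is a functional argument to NewRunner.
type Option func(*runnerOptions)

func WithSshKey(key string) Option   { return func(o *runnerOptions) { o.sshKey = key } }
func WithSshUser(user string) Option { return func(o *runnerOptions) { o.sshUser = user } }
func WithHost(host string) Option    { return func(o *runnerOptions) { o.host = host } }

// NewRunner returns a local runner unless an SSH key is configured,
// keeping the local/remote decision in one place.
func NewRunner(opts ...Option) Runner {
	o := &runnerOptions{}
	for _, opt := range opts {
		opt(o)
	}
	if o.sshKey == "" {
		return localRunner{}
	}
	return remoteRunner{sshKey: o.sshKey, sshUser: o.sshUser, host: o.host}
}

func main() {
	out, _ := NewRunner().Run("true")
	fmt.Println(out)
	out, _ = NewRunner(WithHost("example.test"), WithSshKey("key"), WithSshUser("user")).Run("true")
	fmt.Println(out)
}
```

The same Runner value can then be passed to the installer, so the SSH settings are read in exactly one place.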

elezar · 2025-01-29T10:24:32Z

tests/e2e/tools.go

+	return err
+}
+
+type Script interface {


Since this interface only has a single function, Run, the idiomatic Go name for it would be Runner. Let's also move it to a separate file.

elezar · 2025-01-29T10:25:01Z

tests/e2e/tools.go

+	Run(script string) (string, error)
+}
+
+type localRun struct{}


Suggested change

-type localRun struct{}
+// A localRunner runs scripts locally.
+type localRunner struct{}

elezar · 2025-01-29T10:26:14Z

tests/e2e/tools.go

+}
+
+type localRun struct{}
+type remoteRun struct {


Suggested change

-type remoteRun struct {
+// A remoteRunner runs scripts over SSH.
+type remoteRunner struct {

(Also wondering whether sshRunner isn't more accurate.)

elezar · 2025-01-29T10:27:22Z

tests/e2e/tools.go

+}
+
+type Script interface {
+	Run(script string) (string, error)


Just wondering about this. Would we expect this to return (string, string, error) instead so that we can always access STDOUT and STDERR?
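
For the local case, a three-value Run could be sketched as follows. This uses os/exec with sh so the example runs on minimal systems; the PR's scripts target bash, so treat the interpreter choice as an assumption.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// localRunner runs scripts on the local machine.
type localRunner struct{}

// Run executes the script with sh and returns stdout and stderr
// separately, along with any execution error.
func (localRunner) Run(script string) (string, string, error) {
	cmd := exec.Command("sh", "-c", script)
	var stdout, stderr bytes.Buffer
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	err := cmd.Run()
	return stdout.String(), stderr.String(), err
}

func main() {
	out, errOut, err := localRunner{}.Run("echo out; echo err >&2")
	fmt.Printf("stdout=%q stderr=%q err=%v\n", out, errOut, err)
}
```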

elezar · 2025-01-29T10:29:50Z

tests/e2e/tools.go

+
+	connectionFailed := false
+	for i := 0; i < 20; i++ {
+		client, err = ssh.Dial("tcp", remoteHost+":22", sshConfig)


Nit: Does it make sense to make the port configurable?
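
If the port does become configurable, something like this sketch would keep 22 as the default. `sshAddress` and `defaultSSHPort` are hypothetical names; net.JoinHostPort also handles IPv6 hosts, which plain string concatenation does not.

```go
package main

import (
	"fmt"
	"net"
)

// defaultSSHPort matches the port hard-coded in the diff.
const defaultSSHPort = "22"

// sshAddress builds the host:port address to pass to a dial call.
// An empty port falls back to the default.
func sshAddress(host, port string) string {
	if port == "" {
		port = defaultSSHPort
	}
	return net.JoinHostPort(host, port)
}

func main() {
	fmt.Println(sshAddress("example.test", ""))
	fmt.Println(sshAddress("::1", "2222"))
}
```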

elezar · 2025-01-29T10:31:32Z

tests/e2e/tools.go

+
+func (r remoteRun) Run(script string) (string, error) {
+	// Create a new SSH connection
+	client, err := connectOrDie(r.SshKey, r.SshUser, r.RemoteHost)


Should we create a new SSH session every time we Run a script?

elezar · 2025-01-29T10:32:00Z

tests/e2e/tools.go

+	}
+
+	connectionFailed := false
+	for i := 0; i < 20; i++ {


I don't think we need a retry at this stage.

ArangoGutierrez requested review from elezar and tariq1890 January 23, 2025 09:31

ArangoGutierrez self-assigned this Jan 23, 2025

ArangoGutierrez added the testing issue/PR to fix/edit/create/enhance a project unit/e2e test label Jan 23, 2025

Add remote-test option for E2E

f2a0b50

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>

ArangoGutierrez force-pushed the reg_test02 branch from 58379b2 to f2a0b50 Compare January 24, 2025 13:47

ArangoGutierrez requested a review from elezar January 24, 2025 13:47
