Ensure proper gofail package version in robustness tests #18397

Merged
merged 1 commit into from
Aug 2, 2024

serathius
Member

@serathius serathius commented Aug 2, 2024

Adding the tools/mod package to older branches moved the source of truth for the gofail version, so the existing Makefile code stopped overriding it. As a result, robustness tests started flaking after the release of gofail@v0.2.0, which allowed a new type of failpoint.

This PR should help us ensure that the gofail version is set properly on older branches.

Before this change, the build logs below contain the line "downgraded go.etcd.io/gofail v0.2.0 => v0.1.0":

$ make /tmp/etcd-release-3.5-failpoints/bin 
rm -rf /tmp/etcd-release-3.5-failpoints/
mkdir -p /tmp/etcd-release-3.5-failpoints/
cd /tmp/etcd-release-3.5-failpoints/; \
  git clone --depth 1 --branch release-3.5 https://github.com/etcd-io/etcd.git .; \
  go get go.etcd.io/gofail@v0.2.0; \
  (cd server; go get go.etcd.io/gofail@v0.2.0); \
  (cd etcdctl; go get go.etcd.io/gofail@v0.2.0); \
  (cd etcdutl; go get go.etcd.io/gofail@v0.2.0); \
  FAILPOINTS=true ./build;
Cloning into '.'...
remote: Enumerating objects: 1656, done.
remote: Counting objects: 100% (1656/1656), done.
remote: Compressing objects: 100% (1465/1465), done.
remote: Total 1656 (delta 355), reused 667 (delta 149), pack-reused 0
Receiving objects: 100% (1656/1656), 4.26 MiB | 15.26 MiB/s, done.
Resolving deltas: 100% (355/355), done.
go: upgraded go.etcd.io/gofail v0.1.0 => v0.2.0
go: upgraded github.com/stretchr/testify v1.8.4 => v1.9.0
go: upgraded go.etcd.io/gofail v0.1.0 => v0.2.0
DEPRECATED!!! Use build.sh script instead.

% 'gofail' 'enable' 'server/etcdserver/' 'server/lease/leasehttp' 'server/mvcc/' 'server/wal/' 'server/mvcc/backend/'
go: removed go.etcd.io/etcd/etcdctl/v3 v3.5.15
go: removed go.etcd.io/etcd/etcdutl/v3 v3.5.15
go: removed go.etcd.io/etcd/server/v3 v3.5.15
go: removed go.etcd.io/etcd/tests/v3 v3.5.15
go: downgraded go.etcd.io/gofail v0.2.0 => v0.1.0
go: downgraded go.etcd.io/gofail v0.2.0 => v0.1.0
go: downgraded go.etcd.io/gofail v0.2.0 => v0.1.0
go: downgraded go.etcd.io/gofail v0.2.0 => v0.1.0
go: upgraded github.com/stretchr/testify v1.8.4 => v1.9.0
% 'gofail' 'enable' 'server/etcdserver/' 'server/lease/leasehttp' 'server/mvcc/' 'server/wal/' 'server/mvcc/backend/'
% 'rm' '-f' 'bin/etcd'
% (cd server && 'env' 'CGO_ENABLED=0' 'GO_BUILD_FLAGS=' 'GOOS=linux' 'GOARCH=amd64' 'go' 'build' '-trimpath' '-installsuffix=cgo' '-ldflags=-X=go.etcd.io/etcd/api/v3/version.GitSHA=6abcc18-FAILPOINTS' '-o=../bin/etcd' '.')
% 'rm' '-f' 'bin/etcdutl'
% (cd etcdutl && 'env' 'GO_BUILD_FLAGS=' 'CGO_ENABLED=0' 'GO_BUILD_FLAGS=' 'GOOS=linux' 'GOARCH=amd64' 'go' 'build' '-trimpath' '-installsuffix=cgo' '-ldflags=-X=go.etcd.io/etcd/api/v3/version.GitSHA=6abcc18-FAILPOINTS' '-o=../bin/etcdutl' '.')
% 'rm' '-f' 'bin/etcdctl'
% (cd etcdctl && 'env' 'GO_BUILD_FLAGS=' 'CGO_ENABLED=0' 'GO_BUILD_FLAGS=' 'GOOS=linux' 'GOARCH=amd64' 'go' 'build' '-trimpath' '-installsuffix=cgo' '-ldflags=-X=go.etcd.io/etcd/api/v3/version.GitSHA=6abcc18-FAILPOINTS' '-o=../bin/etcdctl' '.')
SUCCESS: etcd_build (GOARCH=amd64)

With the change, the downgrade line no longer appears:

make /tmp/etcd-release-3.5-failpoints/bin 
rm -rf /tmp/etcd-release-3.5-failpoints/
mkdir -p /tmp/etcd-release-3.5-failpoints/
cd /tmp/etcd-release-3.5-failpoints/; \
  git clone --depth 1 --branch release-3.5 https://github.com/etcd-io/etcd.git .; \
  go get go.etcd.io/gofail@v0.2.0; \
  (cd tools/mod; go get go.etcd.io/gofail@v0.2.0); \
  FAILPOINTS=true ./build;
Cloning into '.'...
remote: Enumerating objects: 1656, done.
remote: Counting objects: 100% (1656/1656), done.
remote: Compressing objects: 100% (1465/1465), done.
remote: Total 1656 (delta 355), reused 667 (delta 149), pack-reused 0
Receiving objects: 100% (1656/1656), 4.26 MiB | 9.02 MiB/s, done.
Resolving deltas: 100% (355/355), done.
go: upgraded go.etcd.io/gofail v0.1.0 => v0.2.0
go: upgraded go.etcd.io/gofail v0.1.0 => v0.2.0
DEPRECATED!!! Use build.sh script instead.

% 'gofail' 'enable' 'server/etcdserver/' 'server/lease/leasehttp' 'server/mvcc/' 'server/wal/' 'server/mvcc/backend/'
go: upgraded github.com/stretchr/testify v1.8.4 => v1.9.0
go: upgraded go.etcd.io/gofail v0.1.0 => v0.2.0
go: upgraded github.com/stretchr/testify v1.8.4 => v1.9.0
go: upgraded go.etcd.io/gofail v0.1.0 => v0.2.0
% 'gofail' 'enable' 'server/etcdserver/' 'server/lease/leasehttp' 'server/mvcc/' 'server/wal/' 'server/mvcc/backend/'
% 'rm' '-f' 'bin/etcd'
% (cd server && 'env' 'CGO_ENABLED=0' 'GO_BUILD_FLAGS=' 'GOOS=linux' 'GOARCH=amd64' 'go' 'build' '-trimpath' '-installsuffix=cgo' '-ldflags=-X=go.etcd.io/etcd/api/v3/version.GitSHA=6abcc18-FAILPOINTS' '-o=../bin/etcd' '.')
% 'rm' '-f' 'bin/etcdutl'
% (cd etcdutl && 'env' 'GO_BUILD_FLAGS=' 'CGO_ENABLED=0' 'GO_BUILD_FLAGS=' 'GOOS=linux' 'GOARCH=amd64' 'go' 'build' '-trimpath' '-installsuffix=cgo' '-ldflags=-X=go.etcd.io/etcd/api/v3/version.GitSHA=6abcc18-FAILPOINTS' '-o=../bin/etcdutl' '.')
% 'rm' '-f' 'bin/etcdctl'
% (cd etcdctl && 'env' 'GO_BUILD_FLAGS=' 'CGO_ENABLED=0' 'GO_BUILD_FLAGS=' 'GOOS=linux' 'GOARCH=amd64' 'go' 'build' '-trimpath' '-installsuffix=cgo' '-ldflags=-X=go.etcd.io/etcd/api/v3/version.GitSHA=6abcc18-FAILPOINTS' '-o=../bin/etcdctl' '.')
SUCCESS: etcd_build (GOARCH=amd64)
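
The before/after comparison above hinges on a single log line. As a minimal sketch, assuming the build output has been saved to a file, the check could be scripted as follows. The function name has_gofail_downgrade and the file name build.log are hypothetical and are not part of the etcd Makefile.

```shell
#!/bin/sh
# Hypothetical helper: reads a build log on stdin and succeeds (exit 0)
# if it contains a "go: downgraded go.etcd.io/gofail ..." line.
has_gofail_downgrade() {
  grep -q '^go: downgraded go\.etcd\.io/gofail '
}

# Example usage (commented out so the sketch has no side effects):
# make /tmp/etcd-v3.5.15-failpoints/bin 2>&1 | tee build.log
# if has_gofail_downgrade < build.log; then
#   echo "gofail was downgraded during the build" >&2
# fi
```

The pattern is anchored to the start of the line so that it matches only messages printed by the go command, not the echoed build commands.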

@serathius
Member Author

cc @ah8ad3 @ahrtr @siyuanfoundation @fuweid @MadhavJivrajani

@@ -124,9 +125,7 @@ $(GOPATH)/bin/gofail: tools/mod/go.mod tools/mod/go.sum
cd /tmp/etcd-release-3.5-failpoints/; \
git clone --depth 1 --branch release-3.5 https://github.com/etcd-io/etcd.git .; \
go get go.etcd.io/gofail@${GOFAIL_VERSION}; \
(cd server; go get go.etcd.io/gofail@${GOFAIL_VERSION}); \
Contributor

Is there a reason that for /tmp/etcd-v3.5.12-beforeSendWatchResponse/bin we keep

(cd server; go get go.etcd.io/gofail@${GOFAIL_VERSION}); \
(cd etcdctl; go get go.etcd.io/gofail@${GOFAIL_VERSION}); \
(cd etcdutl; go get go.etcd.io/gofail@${GOFAIL_VERSION}); \ 

but we remove those for /tmp/etcd-release-3.5-failpoints/bin? :)

Member Author

We know that the release branch has the tools/mod directory; for tags, however, we don't know. Some of them might have it, some might not.
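
One way to cope with checkouts that may or may not contain tools/mod is to pin gofail only in directories that actually hold a Go module. This is a minimal sketch under that assumption, not what the Makefile in this PR does; the helper name existing_module_dirs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: given a checkout root ($1) and a list of candidate
# directories, print each candidate that contains a go.mod file.
existing_module_dirs() {
  root=$1
  shift
  for dir in "$@"; do
    if [ -f "$root/$dir/go.mod" ]; then
      printf '%s\n' "$dir"
    fi
  done
}

# Example usage (commented out; assumes GOFAIL_VERSION is set):
# for dir in $(existing_module_dirs . tools/mod server etcdctl etcdutl); do
#   (cd "$dir" && go get "go.etcd.io/gofail@${GOFAIL_VERSION}")
# done
```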

Contributor

But we will always have server, etcdctl, and etcdutl directories, so we will always need to go get the gofail library for them, no? :)

Member

build.sh runs the go get go.etcd.io/gofail@${GOFAIL_VERSION} commands for server, etcdutl, etcdctl, and tests, so it's OK to remove the duplicated commands from the Makefile.

etcd/build.sh

Lines 33 to 36 in d83c8bd

cd ./server && go get go.etcd.io/gofail@"${GOFAIL_VERSION}"
cd ../etcdutl && go get go.etcd.io/gofail@"${GOFAIL_VERSION}"
cd ../etcdctl && go get go.etcd.io/gofail@"${GOFAIL_VERSION}"
cd ../tests && go get go.etcd.io/gofail@"${GOFAIL_VERSION}"

@ahrtr
Member

ahrtr commented Aug 2, 2024

Before we see downgraded go.etcd.io/gofail v0.2.0 => v0.1.0 line in the following build logs

I could not reproduce this. The following is what I saw in my local environment:

$ make /tmp/etcd-release-3.5-failpoints/bin
go install go.etcd.io/gofail@v0.2.0
rm -rf /tmp/etcd-release-3.5-failpoints/
mkdir -p /tmp/etcd-release-3.5-failpoints/
cd /tmp/etcd-release-3.5-failpoints/; \
	  git clone --depth 1 --branch release-3.5 https://github.com/etcd-io/etcd.git .; \
	  go get go.etcd.io/gofail@v0.2.0; \
	  (cd server; go get go.etcd.io/gofail@v0.2.0); \
	  (cd etcdctl; go get go.etcd.io/gofail@v0.2.0); \
	  (cd etcdutl; go get go.etcd.io/gofail@v0.2.0); \
	  FAILPOINTS=true ./build;
Cloning into '.'...
remote: Enumerating objects: 1656, done.
remote: Counting objects: 100% (1656/1656), done.
remote: Compressing objects: 100% (1473/1473), done.
remote: Total 1656 (delta 356), reused 654 (delta 141), pack-reused 0
Receiving objects: 100% (1656/1656), 4.26 MiB | 4.58 MiB/s, done.
Resolving deltas: 100% (356/356), done.
go: upgraded go.etcd.io/gofail v0.1.0 => v0.2.0
\e[91mDEPRECATED!!! Use build.sh script instead.\e[0m

% gofail enable server/etcdserver/ server/lease/leasehttp server/mvcc/ server/wal/ server/mvcc/backend/
% gofail enable server/etcdserver/ server/lease/leasehttp server/mvcc/ server/wal/ server/mvcc/backend/
% rm -f bin/etcd
% (cd server && env CGO_ENABLED=0 GO_BUILD_FLAGS= GOOS=darwin GOARCH=arm64 go build -trimpath -installsuffix=cgo -ldflags=-X=go.etcd.io/etcd/api/v3/version.GitSHA=d83c8bd-FAILPOINTS -o=../bin/etcd .)
% rm -f bin/etcdutl
% (cd etcdutl && env GO_BUILD_FLAGS= CGO_ENABLED=0 GO_BUILD_FLAGS= GOOS=darwin GOARCH=arm64 go build -trimpath -installsuffix=cgo -ldflags=-X=go.etcd.io/etcd/api/v3/version.GitSHA=d83c8bd-FAILPOINTS -o=../bin/etcdutl .)
% rm -f bin/etcdctl
% (cd etcdctl && env GO_BUILD_FLAGS= CGO_ENABLED=0 GO_BUILD_FLAGS= GOOS=darwin GOARCH=arm64 go build -trimpath -installsuffix=cgo -ldflags=-X=go.etcd.io/etcd/api/v3/version.GitSHA=d83c8bd-FAILPOINTS -o=../bin/etcdctl .)
SUCCESS: etcd_build (GOARCH=arm64)

The only reason I can think of is that the GOFAIL_VERSION was v0.1.0 when executing the command go get go.etcd.io/gofail@${GOFAIL_VERSION}.

@serathius
Member Author

@ahrtr
Member

ahrtr commented Aug 2, 2024

The problem happens in my environment

Please execute the command below and let me know what you get:

$ cd /tmp/etcd-release-3.5-failpoints
$ go list -m -f '{{.Version}}' go.etcd.io/gofail
v0.2.0

Could you clean up /tmp/etcd-release-3.5-failpoints and try again?

@ahrtr
Member

ahrtr commented Aug 2, 2024

Sorry, please execute the commands below instead:

$ cd /tmp/etcd-release-3.5-failpoints
$ cd tools/mod && go list -m -f '{{.Version}}' go.etcd.io/gofail
v0.2.0

@serathius
Member Author

serathius commented Aug 2, 2024

Oh, with #18395 merged, you can no longer test it on a branch target like /tmp/etcd-release-3.5-failpoints/bin. You can still do that on the /tmp/etcd-v3.5.15-failpoints/bin target.

See my results:

etcd $ git checkout origin/main
HEAD is now at 8ac9ce832 Merge pull request #18392 from serathius/robustness-deflake-watch-progress
etcd $ rm -rf /tmp/etcd-v3.5.15-failpoints/bin/
etcd $ make /tmp/etcd-v3.5.15-failpoints/bin
rm -rf /tmp/etcd-v3.5.15-failpoints/
mkdir -p /tmp/etcd-v3.5.15-failpoints/
cd /tmp/etcd-v3.5.15-failpoints/; \
  git clone --depth 1 --branch v3.5.15 https://github.com/etcd-io/etcd.git .; \
  go get go.etcd.io/gofail@v0.2.0; \
  (cd server; go get go.etcd.io/gofail@v0.2.0); \
  (cd etcdctl; go get go.etcd.io/gofail@v0.2.0); \
  (cd etcdutl; go get go.etcd.io/gofail@v0.2.0); \
  FAILPOINTS=true ./build;
Cloning into '.'...
remote: Enumerating objects: 1664, done.
remote: Counting objects: 100% (1664/1664), done.
remote: Compressing objects: 100% (1472/1472), done.
remote: Total 1664 (delta 357), reused 672 (delta 150), pack-reused 0
Receiving objects: 100% (1664/1664), 4.26 MiB | 9.17 MiB/s, done.
Resolving deltas: 100% (357/357), done.
Note: switching to '9a5533382d84999e4e79642e1ec0f8bfa9b70ba8'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

go: upgraded go.etcd.io/gofail v0.1.0 => v0.2.0
go: upgraded github.com/stretchr/testify v1.8.4 => v1.9.0
go: upgraded go.etcd.io/gofail v0.1.0 => v0.2.0
DEPRECATED!!! Use build.sh script instead.

% 'gofail' 'enable' 'server/etcdserver/' 'server/lease/leasehttp' 'server/mvcc/' 'server/wal/' 'server/mvcc/backend/'
go: removed go.etcd.io/etcd/etcdctl/v3 v3.5.15
go: removed go.etcd.io/etcd/etcdutl/v3 v3.5.15
go: removed go.etcd.io/etcd/server/v3 v3.5.15
go: removed go.etcd.io/etcd/tests/v3 v3.5.15
go: downgraded go.etcd.io/gofail v0.2.0 => v0.1.0
go: downgraded go.etcd.io/gofail v0.2.0 => v0.1.0
go: downgraded go.etcd.io/gofail v0.2.0 => v0.1.0
go: downgraded go.etcd.io/gofail v0.2.0 => v0.1.0
go: upgraded github.com/stretchr/testify v1.8.4 => v1.9.0
% 'gofail' 'enable' 'server/etcdserver/' 'server/lease/leasehttp' 'server/mvcc/' 'server/wal/' 'server/mvcc/backend/'
% 'rm' '-f' 'bin/etcd'
% (cd server && 'env' 'CGO_ENABLED=0' 'GO_BUILD_FLAGS=' 'GOOS=linux' 'GOARCH=amd64' 'go' 'build' '-trimpath' '-installsuffix=cgo' '-ldflags=-X=go.etcd.io/etcd/api/v3/version.GitSHA=9a55333-FAILPOINTS' '-o=../bin/etcd' '.')
% 'rm' '-f' 'bin/etcdutl'
% (cd etcdutl && 'env' 'GO_BUILD_FLAGS=' 'CGO_ENABLED=0' 'GO_BUILD_FLAGS=' 'GOOS=linux' 'GOARCH=amd64' 'go' 'build' '-trimpath' '-installsuffix=cgo' '-ldflags=-X=go.etcd.io/etcd/api/v3/version.GitSHA=9a55333-FAILPOINTS' '-o=../bin/etcdutl' '.')
% 'rm' '-f' 'bin/etcdctl'
% (cd etcdctl && 'env' 'GO_BUILD_FLAGS=' 'CGO_ENABLED=0' 'GO_BUILD_FLAGS=' 'GOOS=linux' 'GOARCH=amd64' 'go' 'build' '-trimpath' '-installsuffix=cgo' '-ldflags=-X=go.etcd.io/etcd/api/v3/version.GitSHA=9a55333-FAILPOINTS' '-o=../bin/etcdctl' '.')
SUCCESS: etcd_build (GOARCH=amd64)
etcd $ cd  /tmp/etcd-v3.5.15-failpoints/tools/mod/
mod $ go list -m -f '{{.Version}}' go.etcd.io/gofail
v0.1.0
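
The manual check above could be turned into a small guard that fails when the resolved gofail version differs from the expected one. This is a minimal sketch; the function name gofail_version_matches is hypothetical, and it relies only on the go list invocation already shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: compare the gofail version resolved in a module
# directory ($2) with the expected version ($1).
# Returns 0 on match, 1 on mismatch, 2 if the version cannot be determined.
gofail_version_matches() {
  expected=$1
  module_dir=$2
  actual=$(cd "$module_dir" && go list -m -f '{{.Version}}' go.etcd.io/gofail) || return 2
  if [ "$actual" != "$expected" ]; then
    echo "gofail in $module_dir is $actual, expected $expected" >&2
    return 1
  fi
}

# Example usage (commented out):
# gofail_version_matches v0.2.0 /tmp/etcd-v3.5.15-failpoints/tools/mod
```

Run against the output shown above, such a check would report a mismatch, because tools/mod still resolves gofail to v0.1.0.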

@ahrtr
Member

ahrtr commented Aug 2, 2024

/tmp/etcd-v3.5.15-failpoints/bin

It makes sense now. Your previous example was based on /tmp/etcd-release-3.5-failpoints/bin.

Member

@ahrtr ahrtr left a comment

LGTM

Thanks

Member

@fuweid fuweid left a comment

LGTM

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahrtr, fuweid, henrybear327, serathius


@serathius serathius merged commit e758ffc into etcd-io:main Aug 2, 2024
77 checks passed
5 participants