Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Cache tests in CI #1818

Merged
merged 21 commits into from
Jun 6, 2024
Merged

Conversation

michaljurecko
Copy link
Contributor

@michaljurecko michaljurecko commented Jun 4, 2024

Jira:

Intro:
In last weeks, @Matovidlo intensively dealt with how to speed up our CI pipeline and cache tests (if the code, ENVs or external files have not changed). During the review, I found out that a lot of good work was done, but some things were still missing to make the pipeline even faster. It wasn't much work to complete/debug it, so I prepared this PR.

Changes:

  • Unit and E2E test are cached.
  • Cache is stored in our S3 bucket using runs-on/cache@v4 action (an in-place replacement for actions/cache@v4).
    • So there is no more 10GB limit.
    • Expiration is set to 14day in the S3 bucket and can be modified.

Commented per commit.


Original PRs


First run - no cache at all - empty S3 bucket
https://github.com/keboola/keboola-as-code/actions/runs/9379476266/attempts/2

image


Second run - cached:
https://github.com/keboola/keboola-as-code/actions/runs/9379476266/attempts/3

image

@michaljurecko michaljurecko changed the base branch from main to mv-PSGO-621-change-test-kbc-project-environment-to-refer-to-path June 4, 2024 10:33
@michaljurecko michaljurecko force-pushed the michaljurecko-PSGO-565-test-cache branch from 518ce96 to a976ae7 Compare June 4, 2024 10:37
Base automatically changed from mv-PSGO-621-change-test-kbc-project-environment-to-refer-to-path to main June 4, 2024 10:46
@michaljurecko michaljurecko force-pushed the michaljurecko-PSGO-565-test-cache branch 9 times, most recently from 7d0dcd1 to acae08b Compare June 4, 2024 15:50
@michaljurecko michaljurecko force-pushed the michaljurecko-PSGO-565-test-cache branch 11 times, most recently from 4d04479 to 3e90569 Compare June 5, 2024 06:16
@michaljurecko michaljurecko force-pushed the michaljurecko-PSGO-565-test-cache branch 2 times, most recently from 8c64959 to d868378 Compare June 5, 2024 06:44
@michaljurecko michaljurecko force-pushed the michaljurecko-PSGO-565-test-cache branch from d868378 to c8d4461 Compare June 5, 2024 06:45
Comment on lines +33 to +63
# GOPATH - Should be empty in our setup, see GOMODCACHE and GOBIN.
# GOCACHE - Build and test cache.
# GOMODCACHE - The directory where the go command will store downloaded modules and related files.
# GOBIN - Compiled binaries from "go install ...", we need a directory outside GOPATH to cache only installed tools.
- name: Set and export Go envs
shell: bash
run: |
if [ "$RUNNER_OS" == "Windows" ]; then
# C: is slow: https://github.com/actions/runner-images/issues/8755
GODIRS=D:/tmp/go
elif [ "$RUNNER_OS" == "macOS" ]; then
GODIRS=/Users/runner/.cache/go
else
GODIRS=/home/runner/.cache/go
fi

mkdir -p $GODIRS/path
mkdir -p $GODIRS/cache
mkdir -p $GODIRS/modcache
mkdir -p $GODIRS/lintcache
mkdir -p $GODIRS/bin

go env -w \
GOPATH=$GODIRS/path \
GOCACHE=$GODIRS/cache \
GOMODCACHE=$GODIRS/modcache \
GOBIN=$GODIRS/bin \

echo "GODIRS=$GODIRS" >> $GITHUB_ENV
echo "GOLANGCI_LINT_CACHE=$GODIRS/lintcache" >> $GITHUB_ENV
echo "$GODIRS/bin" >> $GITHUB_PATH
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is important to know which directories is Go using, and which should be cached.

On Windows they must be present on the faster D: disk.

Without customization, without customization, most files are saved in GOPATH.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So judging by the code below we're caching GOCACHE, GOMODCACHE, GOBIN and GOLANGCI_LINT_CACHE right?

What's left in GOPATH now with all these directories separate? In a comment above you wrote that "Should be empty in our setup" - but that means we're still caching everything that used to be in GOPATH?

What exactly are we trying to exclude from caching with all this setup then? Or what's the point of all this splitting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's left in GOPATH now with all these directories separate?

There will be nothing left in GOPATH that we should cache. I didn't check if there is anything there at all, you can find out if you want.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't understand what was the reason for this change if it wasn't to exclude some things that are in GOPATH but shouldn't be cached. 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GOPATH is some older/default/legacy variable, GOCACHE and GOMODCACHE are some "new" ENVs for fine tuning. We can discuss it on the Standup call.

@michaljurecko michaljurecko marked this pull request as ready for review June 5, 2024 07:17
Copy link
Contributor

@Matovidlo Matovidlo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM just small adjustements

Comment on lines +33 to +63
# GOPATH - Should be empty in our setup, see GOMODCACHE and GOBIN.
# GOCACHE - Build and test cache.
# GOMODCACHE - The directory where the go command will store downloaded modules and related files.
# GOBIN - Compiled binaries from "go install ...", we need a directory outside GOPATH to cache only installed tools.
- name: Set and export Go envs
shell: bash
run: |
if [ "$RUNNER_OS" == "Windows" ]; then
# C: is slow: https://github.com/actions/runner-images/issues/8755
GODIRS=D:/tmp/go
elif [ "$RUNNER_OS" == "macOS" ]; then
GODIRS=/Users/runner/.cache/go
else
GODIRS=/home/runner/.cache/go
fi

mkdir -p $GODIRS/path
mkdir -p $GODIRS/cache
mkdir -p $GODIRS/modcache
mkdir -p $GODIRS/lintcache
mkdir -p $GODIRS/bin

go env -w \
GOPATH=$GODIRS/path \
GOCACHE=$GODIRS/cache \
GOMODCACHE=$GODIRS/modcache \
GOBIN=$GODIRS/bin \

echo "GODIRS=$GODIRS" >> $GITHUB_ENV
echo "GOLANGCI_LINT_CACHE=$GODIRS/lintcache" >> $GITHUB_ENV
echo "$GODIRS/bin" >> $GITHUB_PATH

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So judging by the code below we're caching GOCACHE, GOMODCACHE, GOBIN and GOLANGCI_LINT_CACHE right?

What's left in GOPATH now with all these directories separate? In a comment above you wrote that "Should be empty in our setup" - but that means we're still caching everything that used to be in GOPATH?

What exactly are we trying to exclude from caching with all this setup then? Or what's the point of all this splitting?

outDir := filesystem.Join(r.testsDir, ".out", testDirName)
require.NoError(t, os.RemoveAll(outDir))
require.NoError(t, os.MkdirAll(outDir, 0o755))
require.NoError(t, aferofs.CopyFs2Fs(nil, workingDir, nil, outDir))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go doesn't have a built-in copy directory function huh? Apparently os.CopyFS() has been accepted for Go 1.23.
I assume we'll be able to use that instead when we update?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would spend time on it in different PR where we will replace it everywhere in project. 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course you can change it then.

Well, it's not a complete replacement. aferofs is a library for virtual file systems, we use it to lock operations only to the working directory (something like a chroot), and we also use in-memory fs.

But in this place it can be replaced.

id: go-cache-tools
uses: actions/cache@v4
uses: runs-on/cache@v4 # replacement for actions/cache@v4
with:
path: |

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay so this is in order to use AWS S3 as cache storage.

But why? Did you run into some issues when using actions/cache? Or do you expect some of the limitations mentioned by RunsOn documentation to matter for us in the near future?

@@ -77,13 +80,15 @@ runs:
# Load compiled tools from cache to skip installation
- name: Load Go tools cache
id: go-cache-tools
uses: actions/cache@v4
uses: runs-on/cache@v4 # replacement for actions/cache@v4

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity what runners are we currently using for GitHub Actions? Are we using GutHub's free runners because our repository is open-source? Or is Keboola paying for some runners?

Asking because why checking RunsOn it claims to have lot cheaper runners than GitHub.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are using free runners that GitHub Actions offers us. They run on Azure. You can have open-source repo and pay for better runners when you need more capacity. Also it is possible to use dedicated runners but I suppose we do not own any infrasctructure that we would run it on (We run in cloud).
As we have free runners and it seems to suite our project well I would skip it, but when the load increases, we will have discussion about it.
Can you please put this runners preview into our confluence under go team or other space? Thanks !

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are now using free runnes, self-hosted Linux runners are easy and used by other teams, but Windows and MacOS is more hard to setup.

@michaljurecko
Copy link
Contributor Author

@keboola/go team, I have taken your comments into account:

@michaljurecko michaljurecko requested a review from Matovidlo June 6, 2024 08:12
Copy link
Contributor

@Matovidlo Matovidlo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@michaljurecko michaljurecko merged commit 2348433 into main Jun 6, 2024
13 checks passed
@michaljurecko michaljurecko deleted the michaljurecko-PSGO-565-test-cache branch June 6, 2024 12:04
Matovidlo added a commit that referenced this pull request Jun 10, 2024
…test-cache"

This reverts commit 2348433, reversing
changes made to 212dccb.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants