-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve CI performance #133
Open
joneshf
wants to merge
7
commits into
main
Choose a base branch
from
joneshf/improve-ci-performance
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
CI is back to being slow again. Since removing a bunch of parallelism in #126, things take a long time to run. Some of this is expected when there are large changes. However, when adding a new Data Source/Resource, it seems like the entire test suite runs again from scratch. It seems like there should be some kind of caching in the tests that isn't showing up, and it looks like it's due to the change in build flags/environment variables used to run integration tests. Because we use the same cache for all the tests, we end up doing something like: 1. Run the unit tests with one set of build flags/environment variables. 2. It creates a cached test run. 3. Run the integration tests a different set of build flags/environment variables. This causes the cached tests from the unit tests to be invalidated. 4. It creates a new cached test run. 5. In a future commit, run the unit tests with the original set of build flags/environment variables. This causes the cached tests from the integration tests to be invalidated. Because we're always alternating between unit tests and integration tests, the cached tests are never being used. In order to address that, we're splitting the test jobs between integration test and unit tests. We make a different cache for each job, so they can reuse their cached tests when possible. This should allow the tests to only run what has changed between commits, instead of needing to re-run everything, all the time.
We mostly removed these fallback keys so we wouldn't get a hit on the new jobs. Now that we've got caches for the different jobs, we can add back the fallback keys.
We want the caches to update whenever there are changes to the files. This way, when we alter any of the tests, they should update the cache with their latest result. We do this by appending a hash of the go files to the cache key. We still want to fallback to the nearest cache that works though, so we can bring it any test results that might be successfully cached, and only worry about the new stuff. We do this by making sure one of the restore keys drops the appended hash.
It's not clear why we're re-running the tests only for the acceptance tests. But, this is the main thing that's making things flaky. Maybe we can get some insights by debugging the cache decisions.
Not sure this is going to provide more useful information, but maybe it will surface something.
We weren't doing a recursive search of Go files, just whatever was at the top-level. We want to make sure we get the files within directories as well.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
CI is back to being slow again. Since removing a bunch of parallelism in
#126, things
take a long time to run. Some of this is expected when there are large
changes. However, when adding a new Data Source/Resource, it seems like
the entire test suite runs again from scratch.
It seems like there should be some kind of caching in the tests that
isn't showing up, and it looks like it's due to the change in build
flags/environment variables used to run integration tests. Because we
use the same cache for all the tests, we end up doing something like:
variables. This causes the cached tests from the unit tests to be
invalidated.
flags/environment variables. This causes the cached tests from the
integration tests to be invalidated.
Because we're always alternating between unit tests and integration
tests, the cached tests are never being used.
In order to address that, we're splitting the test jobs between
integration test and unit tests. We make a different cache for each job,
so they can reuse their cached tests when possible. This should allow
the tests to only run what has changed between commits, instead of
needing to re-run everything, all the time.