Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI Tweaks, Fix #38, and Single Cache #37

Merged
merged 13 commits into from
Nov 4, 2023

Conversation

niklasdewally
Copy link
Contributor

@niklasdewally niklasdewally commented Nov 3, 2023

As per #36.

This PR adds / will add:

  • Job names are all fully qualified (Minion: Ubuntu Build, not Ubuntu Build) so that they are easier to read in PRs.
  • All CI use a single cache, with a hash dynamically generated from git submodule commits.
  • CI to build and test Conjure Oxide.
  • Add directory index to github pages coverage

@niklasdewally niklasdewally force-pushed the pr/ci-fixes branch 4 times, most recently from 11a667a to acbb5a2 Compare November 3, 2023 20:06
@ozgurakgun
Copy link
Contributor

I have a crazy idea. Hashes are timestamped after the prefix we give to it. When fetching a cache, the newest one is selected within the prefix-matching candidates.

Why don't we just use a very plain prefix (just OS-toolname, i.e. linux-minion/macos-chuffed etc)?

We could extend to OS-compilername-toolname etc if needed. Basically only mention something in the hash key if there is a specific entry for it in the build matrix + the name of the thing we are building.

AFAICS the only thing we are losing is if we have version A of something in build 1 then B in build 2 then A in build 3 again, the 3rd build won't be able to find the 1st cache. Since it's not content prefixed, just timestamped. But this is quite unlikely in practice anyway?

@ozgurakgun
Copy link
Contributor

To be very explicit, yes you read it right: I am suggesting we remove all SHAs and hashFiless from the cache-keys.

@niklasdewally
Copy link
Contributor Author

niklasdewally commented Nov 3, 2023

To be very explicit, yes you read it right: I am suggesting we remove all SHAs and hashFiless from the cache-keys.

I agree - the newest cache on a branch would always be used by Github, and we don't need cache invalidation.
I just don't get why github / articles on the internet reccomend you use a hash.

Given the caching rules:

key: nightly-${{ runner.os }}-gitmodules-${{ steps.cache-vars.outputs.submodule_sha }}-cargo-${{ hashFiles('**/Cargo.lock') }}
restore-keys: nightly-${{ runner.os }}

My guess is that if (eg) I am working on a PR based on an old Cargo lock file, it would find an older cache from main with that cargo lock file and not use the latest one.

However, if I had a new Cargo.lock file in my PR, then it would use the latest one that matches nightly-${{ runner.os }}.

If we did not have the first key, every PR using different versions would change the cache and another PR would change it back and so on. Whether this matters or not at this scale is another question (https://stackoverflow.com/a/77194233)

@ozgurakgun
Copy link
Contributor

Hashes are good in general because they allow you to fetch by content as opposed just the most recent.

I am just thinking it's very unlikely to exactly match for us.

Also, if the second hashFiles matches in your example but the earlier submodule_sha's do not, we have a cache miss anyway, don't we?

Maybe we can just try without the hashes and see how it behaves.

@niklasdewally niklasdewally force-pushed the pr/ci-fixes branch 3 times, most recently from 8dd4d30 to b41c943 Compare November 4, 2023 00:24
@ozgurakgun
Copy link
Contributor

A quick question: do we need to vendor genindex.py or can we set it up in a different way? I couldn't see a link to the source of this file either (there is a name and a license at the top, but I couldn't find the source with this info, yet).

@ozgurakgun
Copy link
Contributor

A quick question: do we need to vendor genindex.py or can we set it up in a different way? I couldn't see a link to the source of this file either (there is a name and a license at the top, but I couldn't find the source with this info, yet).

oh it's this one of course: https://github.com/glowinthedark/index-html-generator/

you linked to it from #36

there are no releases/packages etc. so short of fetching the git repo every time I don't see any other option. we should add the source link clearly somewhere though so we can check for newer versions in the future.

@niklasdewally
Copy link
Contributor Author

Hashes are good in general because they allow you to fetch by content as opposed just the most recent.

I am just thinking it's very unlikely to exactly match for us.

Also, if the second hashFiles matches in your example but the earlier submodule_sha's do not, we have a cache miss anyway, don't we?

Maybe we can just try without the hashes and see how it behaves.

I'll try this.

My assumption was that we could have one cache that all jobs update, giving us a really complete blob of the entire project for future CI, but it seems like caches are read only and can't be updated.

@niklasdewally
Copy link
Contributor Author

niklasdewally commented Nov 4, 2023

I imagine with that limitation the way to do a full cache properly is to have a full build CI for the entire project which keeps a full cache of all the project at every commit on main, which can then be used for PRs....

@niklasdewally niklasdewally changed the title CI Tweaks and Single Cache CI Tweaks, Fix #38, and Single Cache Nov 4, 2023
@niklasdewally niklasdewally force-pushed the pr/ci-fixes branch 4 times, most recently from 6358b6c to 3470f85 Compare November 4, 2023 09:57
@ozgurakgun
Copy link
Contributor

My assumption was that we could have one cache that all jobs update, giving us a really complete blob of the entire project for future CI, but it seems like caches are read only and can't be updated.

I don't think I follow this but you have been spending a lot of effort into this so maybe you should set it up in the best way according to your understanding and we can review after that?

OK a short version: I thought we could have action A fetch from the cache and push to the cache once finished and this can be (as long as the keys match) fetched by action B later etc. Of course if they are at the same time they won't benefit from the cache updates of each other but that is inevitable. Is this not the case?

@niklasdewally
Copy link
Contributor Author

niklasdewally commented Nov 4, 2023

I thought we could have action A fetch from the cache and push to the cache once finished and this can be (as long as the keys match) fetched by action B later etc

This is what i was hoping for - however, action A could only build minion and not chuffed, and then B wouldnt be able to update the cache with chuffed as the cache is read only and so on. Centralising cache building makes this a bit clearer in my mind - a cache is built for the entire project on main then consumed by actions. This centralised cache forces build of everything in the workspace, so should be complete. Timestamped caches could also solve this somewhat.

I don't think I follow this

Tbh I am not quite sure what's going on or whats best anymore, but I think its better than it was before so could be merged regardless and we revisit this in the future.

I've got 3099 deadlines soon so it might be best to get this in now in a good but not perfect state, then revisit it, as opposed to the PR sitting for two weeks...

@niklasdewally
Copy link
Contributor Author

niklasdewally commented Nov 4, 2023

In particular, all actions just go fetch the full build of the project from the last commit on main and do stuff based on that.

etc/ci/genindex.py Outdated Show resolved Hide resolved
Copy link
Contributor

@ozgurakgun ozgurakgun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good!

.github/workflows/code-coverage.yml Outdated Show resolved Hide resolved
.github/workflows/code-coverage-deploy.yml Outdated Show resolved Hide resolved
@niklasdewally niklasdewally mentioned this pull request Nov 4, 2023
@ozgurakgun
Copy link
Contributor

Thanks!

@ozgurakgun ozgurakgun merged commit 6ae77bc into conjure-cp:main Nov 4, 2023
5 checks passed
@niklasdewally niklasdewally deleted the pr/ci-fixes branch November 4, 2023 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants